All of lore.kernel.org
 help / color / mirror / Atom feed
* qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 12:49 ` Björn Töpel
  0 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-24 12:49 UTC (permalink / raw)
  To: qemu-devel, Daniel Henrique Barboza, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis

Hi!

I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
("target/riscv/cpu.c: restrict 'marchid' value")

Reverting that commit, or the hack below solves the boot issue:

--8<--
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8cbfc7e781ad..e18596c8a55a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
     cpu->cfg.ext_xtheadsync = true;
 
     cpu->cfg.mvendorid = THEAD_VENDOR_ID;
+    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
+                        (QEMU_VERSION_MINOR << 8)  |
+                        (QEMU_VERSION_MICRO));
 #ifndef CONFIG_USER_ONLY
     set_satp_mode_max_supported(cpu, VM_1_10_SV39);
 #endif
--8<--

I'm unsure what the correct qemu way of adding a default value is,
or if c906 should have a proper marchid.

Maybe Christoph or Zhiwei can answer?

qemu command-line:
qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
   -cpu thead-c906 ...


Thanks,
Björn

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 12:49 ` Björn Töpel
  0 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-24 12:49 UTC (permalink / raw)
  To: qemu-devel, Daniel Henrique Barboza, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis

Hi!

I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
("target/riscv/cpu.c: restrict 'marchid' value")

Reverting that commit, or the hack below solves the boot issue:

--8<--
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8cbfc7e781ad..e18596c8a55a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
     cpu->cfg.ext_xtheadsync = true;
 
     cpu->cfg.mvendorid = THEAD_VENDOR_ID;
+    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
+                        (QEMU_VERSION_MINOR << 8)  |
+                        (QEMU_VERSION_MICRO));
 #ifndef CONFIG_USER_ONLY
     set_satp_mode_max_supported(cpu, VM_1_10_SV39);
 #endif
--8<--

I'm unsure what the correct qemu way of adding a default value is,
or if c906 should have a proper marchid.

Maybe Christoph or Zhiwei can answer?

qemu command-line:
qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
   -cpu thead-c906 ...


Thanks,
Björn


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 12:49 ` Björn Töpel
@ 2024-01-24 13:13   ` Conor Dooley
  -1 siblings, 0 replies; 18+ messages in thread
From: Conor Dooley @ 2024-01-24 13:13 UTC (permalink / raw)
  To: Björn Töpel
  Cc: qemu-devel, Daniel Henrique Barboza, Christoph Müllner,
	linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis


[-- Attachment #1.1: Type: text/plain, Size: 1703 bytes --]

On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
> Hi!
> 
> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> ("target/riscv/cpu.c: restrict 'marchid' value")
> 
> Reverting that commit, or the hack below solves the boot issue:
> 
> --8<--
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8cbfc7e781ad..e18596c8a55a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>      cpu->cfg.ext_xtheadsync = true;
>  
>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> +                        (QEMU_VERSION_MINOR << 8)  |
> +                        (QEMU_VERSION_MICRO));
>  #ifndef CONFIG_USER_ONLY
>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>  #endif
> --8<--
> 
> I'm unsure what the correct qemu way of adding a default value is,
> or if c906 should have a proper marchid.

The "correct" marchid/mimpid values for the c906 are zero.

I haven't looked into the code at all, so I am "assuming" that it is
being zero intialised at present. Linux applies the errata fixups for
the c906 when archid and impid are both zero - so your patch will avoid
these fixups being applied.
Do you think that perhaps the emulation in QEMU does not support what
the kernel uses once then errata fixups are enabled?

> 
> Maybe Christoph or Zhiwei can answer?
> 
> qemu command-line:
> qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
>    -cpu thead-c906 ...
> 
> 
> Thanks,
> Björn
> 

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 13:13   ` Conor Dooley
  0 siblings, 0 replies; 18+ messages in thread
From: Conor Dooley @ 2024-01-24 13:13 UTC (permalink / raw)
  To: Björn Töpel
  Cc: qemu-devel, Daniel Henrique Barboza, Christoph Müllner,
	linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis

[-- Attachment #1: Type: text/plain, Size: 1703 bytes --]

On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
> Hi!
> 
> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> ("target/riscv/cpu.c: restrict 'marchid' value")
> 
> Reverting that commit, or the hack below solves the boot issue:
> 
> --8<--
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8cbfc7e781ad..e18596c8a55a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>      cpu->cfg.ext_xtheadsync = true;
>  
>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> +                        (QEMU_VERSION_MINOR << 8)  |
> +                        (QEMU_VERSION_MICRO));
>  #ifndef CONFIG_USER_ONLY
>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>  #endif
> --8<--
> 
> I'm unsure what the correct qemu way of adding a default value is,
> or if c906 should have a proper marchid.

The "correct" marchid/mimpid values for the c906 are zero.

I haven't looked into the code at all, so I am "assuming" that it is
being zero intialised at present. Linux applies the errata fixups for
the c906 when archid and impid are both zero - so your patch will avoid
these fixups being applied.
Do you think that perhaps the emulation in QEMU does not support what
the kernel uses once then errata fixups are enabled?

> 
> Maybe Christoph or Zhiwei can answer?
> 
> qemu command-line:
> qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
>    -cpu thead-c906 ...
> 
> 
> Thanks,
> Björn
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 13:13   ` Conor Dooley
@ 2024-01-24 13:27     ` Björn Töpel
  -1 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-24 13:27 UTC (permalink / raw)
  To: Conor Dooley
  Cc: qemu-devel, Daniel Henrique Barboza, Christoph Müllner,
	linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis

Conor Dooley <conor@kernel.org> writes:

> On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
>> Hi!
>> 
>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>> ("target/riscv/cpu.c: restrict 'marchid' value")
>> 
>> Reverting that commit, or the hack below solves the boot issue:
>> 
>> --8<--
>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> index 8cbfc7e781ad..e18596c8a55a 100644
>> --- a/target/riscv/cpu.c
>> +++ b/target/riscv/cpu.c
>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>      cpu->cfg.ext_xtheadsync = true;
>>  
>>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>> +                        (QEMU_VERSION_MINOR << 8)  |
>> +                        (QEMU_VERSION_MICRO));
>>  #ifndef CONFIG_USER_ONLY
>>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>  #endif
>> --8<--
>> 
>> I'm unsure what the correct qemu way of adding a default value is,
>> or if c906 should have a proper marchid.
>
> The "correct" marchid/mimpid values for the c906 are zero.

Ok! Thanks for clearing that up for me.

> I haven't looked into the code at all, so I am "assuming" that it is
> being zero intialised at present. Linux applies the errata fixups for
> the c906 when archid and impid are both zero - so your patch will avoid
> these fixups being applied.

I'm also assuming 0, -- will double-check. Hmm, that means that the
*previous* marchid was incorrect (pre d6a427e2c0b2).

> Do you think that perhaps the emulation in QEMU does not support what
> the kernel uses once then errata fixups are enabled?

Did a quick look at the c906 "in_asm,int" logs:

| 0x80201040:  12000073          sfence.vma              zero,zero
| 0x80201044:  18051073          csrrw                   zero,satp,a0
| 
| riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0x0000000080201048, tval:0x0000000080201048, desc=exec_page_fault
| riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0xffffffff80001048, tval:0xffffffff80001048, desc=exec_page_fault
| ...cont forever

So it looks like we're tripping over the page tables, when we're turning
on paging.

Hmm, maybe it's not qemu, but the c906 that has been broken for a while?

I'll disable it temporarily from CI anyhow, and will continue digging.


Thanks for the pointers/clarifications, Conor!
Björn

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 13:27     ` Björn Töpel
  0 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-24 13:27 UTC (permalink / raw)
  To: Conor Dooley
  Cc: qemu-devel, Daniel Henrique Barboza, Christoph Müllner,
	linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis

Conor Dooley <conor@kernel.org> writes:

> On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
>> Hi!
>> 
>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>> ("target/riscv/cpu.c: restrict 'marchid' value")
>> 
>> Reverting that commit, or the hack below solves the boot issue:
>> 
>> --8<--
>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> index 8cbfc7e781ad..e18596c8a55a 100644
>> --- a/target/riscv/cpu.c
>> +++ b/target/riscv/cpu.c
>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>      cpu->cfg.ext_xtheadsync = true;
>>  
>>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>> +                        (QEMU_VERSION_MINOR << 8)  |
>> +                        (QEMU_VERSION_MICRO));
>>  #ifndef CONFIG_USER_ONLY
>>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>  #endif
>> --8<--
>> 
>> I'm unsure what the correct qemu way of adding a default value is,
>> or if c906 should have a proper marchid.
>
> The "correct" marchid/mimpid values for the c906 are zero.

Ok! Thanks for clearing that up for me.

> I haven't looked into the code at all, so I am "assuming" that it is
> being zero intialised at present. Linux applies the errata fixups for
> the c906 when archid and impid are both zero - so your patch will avoid
> these fixups being applied.

I'm also assuming 0, -- will double-check. Hmm, that means that the
*previous* marchid was incorrect (pre d6a427e2c0b2).

> Do you think that perhaps the emulation in QEMU does not support what
> the kernel uses once then errata fixups are enabled?

Did a quick look at the c906 "in_asm,int" logs:

| 0x80201040:  12000073          sfence.vma              zero,zero
| 0x80201044:  18051073          csrrw                   zero,satp,a0
| 
| riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0x0000000080201048, tval:0x0000000080201048, desc=exec_page_fault
| riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0xffffffff80001048, tval:0xffffffff80001048, desc=exec_page_fault
| ...cont forever

So it looks like we're tripping over the page tables, when we're turning
on paging.

Hmm, maybe it's not qemu, but the c906 that has been broken for a while?

I'll disable it temporarily from CI anyhow, and will continue digging.


Thanks for the pointers/clarifications, Conor!
Björn


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 12:49 ` Björn Töpel
@ 2024-01-24 13:38   ` Daniel Henrique Barboza
  -1 siblings, 0 replies; 18+ messages in thread
From: Daniel Henrique Barboza @ 2024-01-24 13:38 UTC (permalink / raw)
  To: Björn Töpel, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis



On 1/24/24 09:49, Björn Töpel wrote:
> Hi!
> 
> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> ("target/riscv/cpu.c: restrict 'marchid' value")
> 
> Reverting that commit, or the hack below solves the boot issue:
> 
> --8<--
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8cbfc7e781ad..e18596c8a55a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>       cpu->cfg.ext_xtheadsync = true;
>   
>       cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> +                        (QEMU_VERSION_MINOR << 8)  |
> +                        (QEMU_VERSION_MICRO));
>   #ifndef CONFIG_USER_ONLY
>       set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>   #endif
> --8<--
> 
> I'm unsure what the correct qemu way of adding a default value is,
> or if c906 should have a proper marchid.

In case you need to set a 'marchid' different than zero for c906, this hack would
be a proper fix. As mentioned in the commit msg of the patch you mentioned:

"Named CPUs should set 'marchid' to a meaningful value instead, and generic
  CPUs can set to any valid value."

That means that any specific marchid value that the CPU uses must to be set
in its own cpu_init() function.


Thanks,

Daniel


> 
> Maybe Christoph or Zhiwei can answer?
> 
> qemu command-line:
> qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
>     -cpu thead-c906 ...
> 
> 
> Thanks,
> Björn

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 13:38   ` Daniel Henrique Barboza
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Henrique Barboza @ 2024-01-24 13:38 UTC (permalink / raw)
  To: Björn Töpel, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis



On 1/24/24 09:49, Björn Töpel wrote:
> Hi!
> 
> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> ("target/riscv/cpu.c: restrict 'marchid' value")
> 
> Reverting that commit, or the hack below solves the boot issue:
> 
> --8<--
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8cbfc7e781ad..e18596c8a55a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>       cpu->cfg.ext_xtheadsync = true;
>   
>       cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> +                        (QEMU_VERSION_MINOR << 8)  |
> +                        (QEMU_VERSION_MICRO));
>   #ifndef CONFIG_USER_ONLY
>       set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>   #endif
> --8<--
> 
> I'm unsure what the correct qemu way of adding a default value is,
> or if c906 should have a proper marchid.

In case you need to set a 'marchid' different than zero for c906, this hack would
be a proper fix. As mentioned in the commit msg of the patch you mentioned:

"Named CPUs should set 'marchid' to a meaningful value instead, and generic
  CPUs can set to any valid value."

That means that any specific marchid value that the CPU uses must to be set
in its own cpu_init() function.


Thanks,

Daniel


> 
> Maybe Christoph or Zhiwei can answer?
> 
> qemu command-line:
> qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
>     -cpu thead-c906 ...
> 
> 
> Thanks,
> Björn


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 13:27     ` Björn Töpel
@ 2024-01-24 13:49       ` Conor Dooley
  -1 siblings, 0 replies; 18+ messages in thread
From: Conor Dooley @ 2024-01-24 13:49 UTC (permalink / raw)
  To: Björn Töpel
  Cc: qemu-devel, Daniel Henrique Barboza, Christoph Müllner,
	linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis


[-- Attachment #1.1: Type: text/plain, Size: 3443 bytes --]

On Wed, Jan 24, 2024 at 02:27:10PM +0100, Björn Töpel wrote:
> Conor Dooley <conor@kernel.org> writes:
> 
> > On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
> >> Hi!
> >> 
> >> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> >> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> >> ("target/riscv/cpu.c: restrict 'marchid' value")
> >> 
> >> Reverting that commit, or the hack below solves the boot issue:
> >> 
> >> --8<--
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index 8cbfc7e781ad..e18596c8a55a 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
> >>      cpu->cfg.ext_xtheadsync = true;
> >>  
> >>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> >> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> >> +                        (QEMU_VERSION_MINOR << 8)  |
> >> +                        (QEMU_VERSION_MICRO));
> >>  #ifndef CONFIG_USER_ONLY
> >>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
> >>  #endif
> >> --8<--
> >> 
> >> I'm unsure what the correct qemu way of adding a default value is,
> >> or if c906 should have a proper marchid.
> >
> > The "correct" marchid/mimpid values for the c906 are zero.
> 
> Ok! Thanks for clearing that up for me.
> 
> > I haven't looked into the code at all, so I am "assuming" that it is
> > being zero intialised at present. Linux applies the errata fixups for
> > the c906 when archid and impid are both zero - so your patch will avoid
> > these fixups being applied.
> 
> I'm also assuming 0, -- will double-check. Hmm, that means that the
> *previous* marchid was incorrect (pre d6a427e2c0b2).
> 
> > Do you think that perhaps the emulation in QEMU does not support what
> > the kernel uses once then errata fixups are enabled?
> 
> Did a quick look at the c906 "in_asm,int" logs:
> 
> | 0x80201040:  12000073          sfence.vma              zero,zero
> | 0x80201044:  18051073          csrrw                   zero,satp,a0
> | 
> | riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0x0000000080201048, tval:0x0000000080201048, desc=exec_page_fault
> | riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0xffffffff80001048, tval:0xffffffff80001048, desc=exec_page_fault
> | ...cont forever
> 
> So it looks like we're tripping over the page tables, when we're turning
> on paging.
> 
> Hmm, maybe it's not qemu, but the c906 that has been broken for a while?

I didn't know what you mean by "not qemu, but the c906", so I went and
boot tested my d1 nezha. On today's next (6.8.0-rc1-next-20240124) it
booted into my initramfs with no problems. Obivously though my config is
unlikely to match yours, but that seems like a core thing that should be
hit regardless of config.
So perhaps this is a c906-in-QEMU problem? Lacking emulation for
something the kernel uses perhaps? I know nothing about the capabilities
of its emulation in QEMU, so I am of no help.

Cheers,
Conor.

> 
> I'll disable it temporarily from CI anyhow, and will continue digging.
> 
> 
> Thanks for the pointers/clarifications, Conor!
> Björn
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 13:49       ` Conor Dooley
  0 siblings, 0 replies; 18+ messages in thread
From: Conor Dooley @ 2024-01-24 13:49 UTC (permalink / raw)
  To: Björn Töpel
  Cc: qemu-devel, Daniel Henrique Barboza, Christoph Müllner,
	linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis

[-- Attachment #1: Type: text/plain, Size: 3443 bytes --]

On Wed, Jan 24, 2024 at 02:27:10PM +0100, Björn Töpel wrote:
> Conor Dooley <conor@kernel.org> writes:
> 
> > On Wed, Jan 24, 2024 at 01:49:51PM +0100, Björn Töpel wrote:
> >> Hi!
> >> 
> >> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> >> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> >> ("target/riscv/cpu.c: restrict 'marchid' value")
> >> 
> >> Reverting that commit, or the hack below solves the boot issue:
> >> 
> >> --8<--
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index 8cbfc7e781ad..e18596c8a55a 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
> >>      cpu->cfg.ext_xtheadsync = true;
> >>  
> >>      cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> >> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> >> +                        (QEMU_VERSION_MINOR << 8)  |
> >> +                        (QEMU_VERSION_MICRO));
> >>  #ifndef CONFIG_USER_ONLY
> >>      set_satp_mode_max_supported(cpu, VM_1_10_SV39);
> >>  #endif
> >> --8<--
> >> 
> >> I'm unsure what the correct qemu way of adding a default value is,
> >> or if c906 should have a proper marchid.
> >
> > The "correct" marchid/mimpid values for the c906 are zero.
> 
> Ok! Thanks for clearing that up for me.
> 
> > I haven't looked into the code at all, so I am "assuming" that it is
> > being zero intialised at present. Linux applies the errata fixups for
> > the c906 when archid and impid are both zero - so your patch will avoid
> > these fixups being applied.
> 
> I'm also assuming 0, -- will double-check. Hmm, that means that the
> *previous* marchid was incorrect (pre d6a427e2c0b2).
> 
> > Do you think that perhaps the emulation in QEMU does not support what
> > the kernel uses once then errata fixups are enabled?
> 
> Did a quick look at the c906 "in_asm,int" logs:
> 
> | 0x80201040:  12000073          sfence.vma              zero,zero
> | 0x80201044:  18051073          csrrw                   zero,satp,a0
> | 
> | riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0x0000000080201048, tval:0x0000000080201048, desc=exec_page_fault
> | riscv_cpu_do_interrupt: hart:0, async:0, cause:000000000000000c, epc:0xffffffff80001048, tval:0xffffffff80001048, desc=exec_page_fault
> | ...cont forever
> 
> So it looks like we're tripping over the page tables, when we're turning
> on paging.
> 
> Hmm, maybe it's not qemu, but the c906 that has been broken for a while?

I didn't know what you mean by "not qemu, but the c906", so I went and
boot tested my d1 nezha. On today's next (6.8.0-rc1-next-20240124) it
booted into my initramfs with no problems. Obivously though my config is
unlikely to match yours, but that seems like a core thing that should be
hit regardless of config.
So perhaps this is a c906-in-QEMU problem? Lacking emulation for
something the kernel uses perhaps? I know nothing about the capabilities
of its emulation in QEMU, so I am of no help.

Cheers,
Conor.

> 
> I'll disable it temporarily from CI anyhow, and will continue digging.
> 
> 
> Thanks for the pointers/clarifications, Conor!
> Björn
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 13:38   ` Daniel Henrique Barboza
@ 2024-01-24 19:26     ` Björn Töpel
  -1 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-24 19:26 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis,
	Conor Dooley, Palmer Dabbelt

Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:

> On 1/24/24 09:49, Björn Töpel wrote:
>> Hi!
>> 
>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>> ("target/riscv/cpu.c: restrict 'marchid' value")
>> 
>> Reverting that commit, or the hack below solves the boot issue:
>> 
>> --8<--
>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> index 8cbfc7e781ad..e18596c8a55a 100644
>> --- a/target/riscv/cpu.c
>> +++ b/target/riscv/cpu.c
>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>       cpu->cfg.ext_xtheadsync = true;
>>   
>>       cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>> +                        (QEMU_VERSION_MINOR << 8)  |
>> +                        (QEMU_VERSION_MICRO));
>>   #ifndef CONFIG_USER_ONLY
>>       set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>   #endif
>> --8<--
>> 
>> I'm unsure what the correct qemu way of adding a default value is,
>> or if c906 should have a proper marchid.
>
> In case you need to set a 'marchid' different than zero for c906, this hack would
> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>
> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>   CPUs can set to any valid value."
>
> That means that any specific marchid value that the CPU uses must to be set
> in its own cpu_init() function.

Got it. Thanks, Daniel!

For completeness (since it came up on the weekly PW call); Conor pointed
out that zero *is* indeed the right marchid for c906, and in fact, the
non-zero marchid pre commit d6a427e2c0b2 was incorrect.

Post commit d6a427e2c0b2, the correct alternative is picked up, and
ERRATA_THEAD_PBMT (using non-standard memory type bits in
page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
c906 support, which then traps.

That's the theory. Maybe Christoph knows if the non-standard bits are
implemented or not?

Regardless; I removed booting Qemu T-head c906 from the CI, and the
build/boot passes nicely ;-) [1]


Björn

[1] https://github.com/linux-riscv/linux-riscv/actions/runs/7641764759/job/20819801235?pr=447

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 19:26     ` Björn Töpel
  0 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-24 19:26 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis,
	Conor Dooley, Palmer Dabbelt

Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:

> On 1/24/24 09:49, Björn Töpel wrote:
>> Hi!
>> 
>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>> ("target/riscv/cpu.c: restrict 'marchid' value")
>> 
>> Reverting that commit, or the hack below solves the boot issue:
>> 
>> --8<--
>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> index 8cbfc7e781ad..e18596c8a55a 100644
>> --- a/target/riscv/cpu.c
>> +++ b/target/riscv/cpu.c
>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>       cpu->cfg.ext_xtheadsync = true;
>>   
>>       cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>> +                        (QEMU_VERSION_MINOR << 8)  |
>> +                        (QEMU_VERSION_MICRO));
>>   #ifndef CONFIG_USER_ONLY
>>       set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>   #endif
>> --8<--
>> 
>> I'm unsure what the correct qemu way of adding a default value is,
>> or if c906 should have a proper marchid.
>
> In case you need to set a 'marchid' different than zero for c906, this hack would
> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>
> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>   CPUs can set to any valid value."
>
> That means that any specific marchid value that the CPU uses must to be set
> in its own cpu_init() function.

Got it. Thanks, Daniel!

For completeness (since it came up on the weekly PW call); Conor pointed
out that zero *is* indeed the right marchid for c906, and in fact, the
non-zero marchid pre commit d6a427e2c0b2 was incorrect.

Post commit d6a427e2c0b2, the correct alternative is picked up, and
ERRATA_THEAD_PBMT (using non-standard memory type bits in
page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
c906 support, which then traps.

That's the theory. Maybe Christoph knows if the non-standard bits are
implemented or not?

Regardless; I removed booting Qemu T-head c906 from the CI, and the
build/boot passes nicely ;-) [1]


Björn

[1] https://github.com/linux-riscv/linux-riscv/actions/runs/7641764759/job/20819801235?pr=447


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 19:26     ` Björn Töpel
@ 2024-01-24 20:02       ` Daniel Henrique Barboza
  -1 siblings, 0 replies; 18+ messages in thread
From: Daniel Henrique Barboza @ 2024-01-24 20:02 UTC (permalink / raw)
  To: Björn Töpel, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis,
	Conor Dooley, Palmer Dabbelt



On 1/24/24 16:26, Björn Töpel wrote:
> Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:
> 
>> On 1/24/24 09:49, Björn Töpel wrote:
>>> Hi!
>>>
>>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>>> ("target/riscv/cpu.c: restrict 'marchid' value")
>>>
>>> Reverting that commit, or the hack below solves the boot issue:
>>>
>>> --8<--
>>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>>> index 8cbfc7e781ad..e18596c8a55a 100644
>>> --- a/target/riscv/cpu.c
>>> +++ b/target/riscv/cpu.c
>>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>>        cpu->cfg.ext_xtheadsync = true;
>>>    
>>>        cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>>> +                        (QEMU_VERSION_MINOR << 8)  |
>>> +                        (QEMU_VERSION_MICRO));
>>>    #ifndef CONFIG_USER_ONLY
>>>        set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>>    #endif
>>> --8<--
>>>
>>> I'm unsure what the correct qemu way of adding a default value is,
>>> or if c906 should have a proper marchid.
>>
>> In case you need to set a 'marchid' different than zero for c906, this hack would
>> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>>
>> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>>    CPUs can set to any valid value."
>>
>> That means that any specific marchid value that the CPU uses must to be set
>> in its own cpu_init() function.
> 
> Got it. Thanks, Daniel!
> 
> For completeness (since it came up on the weekly PW call); Conor pointed
> out that zero *is* indeed the right marchid for c906, and in fact, the
> non-zero marchid pre commit d6a427e2c0b2 was incorrect.
> 
> Post commit d6a427e2c0b2, the correct alternative is picked up, and
> ERRATA_THEAD_PBMT (using non-standard memory type bits in
> page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
> c906 support, which then traps.


This looks like a very good reason to actually push what you called 'hack' as
a fix. Yeah, in theory that commit did nothing wrong, but the side effect
(missing support for non-standard memory type bits) is kind of a QEMU problem.

You're welcome to format that hack into a patch, explaining in the commit msg why
we need to set marchid for c906 to that specific value. I'd even add a TODO
tag in rv64_thead_c906_cpu_init() to remind us that this is a band-aid and
that we should remove it once we implement the needed support.



> 
> That's the theory. Maybe Christoph knows if the non-standard bits are
> implemented or not?
> 
> Regardless; I removed booting Qemu T-head c906 from the CI, and the
> build/boot passes nicely ;-) [1]

I vote for setting marchid in c906 cpu_init and re-enable it in the CI.


Thanks,

Daniel

> 
> 
> Björn
> 
> [1] https://github.com/linux-riscv/linux-riscv/actions/runs/7641764759/job/20819801235?pr=447

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-24 20:02       ` Daniel Henrique Barboza
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Henrique Barboza @ 2024-01-24 20:02 UTC (permalink / raw)
  To: Björn Töpel, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis,
	Conor Dooley, Palmer Dabbelt



On 1/24/24 16:26, Björn Töpel wrote:
> Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:
> 
>> On 1/24/24 09:49, Björn Töpel wrote:
>>> Hi!
>>>
>>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>>> ("target/riscv/cpu.c: restrict 'marchid' value")
>>>
>>> Reverting that commit, or the hack below solves the boot issue:
>>>
>>> --8<--
>>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>>> index 8cbfc7e781ad..e18596c8a55a 100644
>>> --- a/target/riscv/cpu.c
>>> +++ b/target/riscv/cpu.c
>>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>>        cpu->cfg.ext_xtheadsync = true;
>>>    
>>>        cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>>> +                        (QEMU_VERSION_MINOR << 8)  |
>>> +                        (QEMU_VERSION_MICRO));
>>>    #ifndef CONFIG_USER_ONLY
>>>        set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>>    #endif
>>> --8<--
>>>
>>> I'm unsure what the correct qemu way of adding a default value is,
>>> or if c906 should have a proper marchid.
>>
>> In case you need to set a 'marchid' different than zero for c906, this hack would
>> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>>
>> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>>    CPUs can set to any valid value."
>>
>> That means that any specific marchid value that the CPU uses must to be set
>> in its own cpu_init() function.
> 
> Got it. Thanks, Daniel!
> 
> For completeness (since it came up on the weekly PW call); Conor pointed
> out that zero *is* indeed the right marchid for c906, and in fact, the
> non-zero marchid pre commit d6a427e2c0b2 was incorrect.
> 
> Post commit d6a427e2c0b2, the correct alternative is picked up, and
> ERRATA_THEAD_PBMT (using non-standard memory type bits in
> page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
> c906 support, which then traps.


This looks like a very good reason to actually push what you called 'hack' as
a fix. Yeah, in theory that commit did nothing wrong, but the side effect
(missing support for non-standard memory type bits) is kind of a QEMU problem.

You're welcome to format that hack into a patch, explaining in the commit msg why
we need to set marchid for c906 to that specific value. I'd even add a TODO
tag in rv64_thead_c906_cpu_init() to remind us that this is a band-aid and
that we should remove it once we implement the needed support.



> 
> That's the theory. Maybe Christoph knows if the non-standard bits are
> implemented or not?
> 
> Regardless; I removed booting Qemu T-head c906 from the CI, and the
> build/boot passes nicely ;-) [1]

I vote for setting marchid in c906 cpu_init and re-enable it in the CI.


Thanks,

Daniel

> 
> 
> Björn
> 
> [1] https://github.com/linux-riscv/linux-riscv/actions/runs/7641764759/job/20819801235?pr=447


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 20:02       ` Daniel Henrique Barboza
@ 2024-01-25  8:48         ` Björn Töpel
  -1 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-25  8:48 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis,
	Conor Dooley, Palmer Dabbelt

Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:

> On 1/24/24 16:26, Björn Töpel wrote:
>> Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:
>> 
>>> On 1/24/24 09:49, Björn Töpel wrote:
>>>> Hi!
>>>>
>>>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>>>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>>>> ("target/riscv/cpu.c: restrict 'marchid' value")
>>>>
>>>> Reverting that commit, or the hack below solves the boot issue:
>>>>
>>>> --8<--
>>>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>>>> index 8cbfc7e781ad..e18596c8a55a 100644
>>>> --- a/target/riscv/cpu.c
>>>> +++ b/target/riscv/cpu.c
>>>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>>>        cpu->cfg.ext_xtheadsync = true;
>>>>    
>>>>        cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>>>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>>>> +                        (QEMU_VERSION_MINOR << 8)  |
>>>> +                        (QEMU_VERSION_MICRO));
>>>>    #ifndef CONFIG_USER_ONLY
>>>>        set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>>>    #endif
>>>> --8<--
>>>>
>>>> I'm unsure what the correct qemu way of adding a default value is,
>>>> or if c906 should have a proper marchid.
>>>
>>> In case you need to set a 'marchid' different than zero for c906, this hack would
>>> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>>>
>>> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>>>    CPUs can set to any valid value."
>>>
>>> That means that any specific marchid value that the CPU uses must to be set
>>> in its own cpu_init() function.
>> 
>> Got it. Thanks, Daniel!
>> 
>> For completeness (since it came up on the weekly PW call); Conor pointed
>> out that zero *is* indeed the right marchid for c906, and in fact, the
>> non-zero marchid pre commit d6a427e2c0b2 was incorrect.
>> 
>> Post commit d6a427e2c0b2, the correct alternative is picked up, and
>> ERRATA_THEAD_PBMT (using non-standard memory type bits in
>> page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
>> c906 support, which then traps.
>
>
> This looks like a very good reason to actually push what you called 'hack' as
> a fix. Yeah, in theory that commit did nothing wrong, but the side effect
> (missing support for non-standard memory type bits) is kind of a QEMU problem.

For me, it'd be weird to add the hack (setting marchid to non-zero).
Claiming that it's a "thead-c906 emulation" in qemu, but w/o the proper
page-bit support. That's just cpu rv64 plus some extra instructions --
not the c906.


Björn

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-25  8:48         ` Björn Töpel
  0 siblings, 0 replies; 18+ messages in thread
From: Björn Töpel @ 2024-01-25  8:48 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel, Christoph Müllner
  Cc: linux-riscv, LIU Zhiwei, Andrew Jones, Alistair Francis,
	Conor Dooley, Palmer Dabbelt

Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:

> On 1/24/24 16:26, Björn Töpel wrote:
>> Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:
>> 
>>> On 1/24/24 09:49, Björn Töpel wrote:
>>>> Hi!
>>>>
>>>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>>>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>>>> ("target/riscv/cpu.c: restrict 'marchid' value")
>>>>
>>>> Reverting that commit, or the hack below solves the boot issue:
>>>>
>>>> --8<--
>>>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>>>> index 8cbfc7e781ad..e18596c8a55a 100644
>>>> --- a/target/riscv/cpu.c
>>>> +++ b/target/riscv/cpu.c
>>>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>>>        cpu->cfg.ext_xtheadsync = true;
>>>>    
>>>>        cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>>>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>>>> +                        (QEMU_VERSION_MINOR << 8)  |
>>>> +                        (QEMU_VERSION_MICRO));
>>>>    #ifndef CONFIG_USER_ONLY
>>>>        set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>>>    #endif
>>>> --8<--
>>>>
>>>> I'm unsure what the correct qemu way of adding a default value is,
>>>> or if c906 should have a proper marchid.
>>>
>>> In case you need to set a 'marchid' different than zero for c906, this hack would
>>> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>>>
>>> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>>>    CPUs can set to any valid value."
>>>
>>> That means that any specific marchid value that the CPU uses must to be set
>>> in its own cpu_init() function.
>> 
>> Got it. Thanks, Daniel!
>> 
>> For completeness (since it came up on the weekly PW call); Conor pointed
>> out that zero *is* indeed the right marchid for c906, and in fact, the
>> non-zero marchid pre commit d6a427e2c0b2 was incorrect.
>> 
>> Post commit d6a427e2c0b2, the correct alternative is picked up, and
>> ERRATA_THEAD_PBMT (using non-standard memory type bits in
>> page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
>> c906 support, which then traps.
>
>
> This looks like a very good reason to actually push what you called 'hack' as
> a fix. Yeah, in theory that commit did nothing wrong, but the side effect
> (missing support for non-standard memory type bits) is kind of a QEMU problem.

For me, it'd be weird to add the hack (setting marchid to non-zero).
Claiming that it's a "thead-c906 emulation" in qemu, but w/o the proper
page-bit support. That's just cpu rv64 plus some extra instructions --
not the c906.


Björn


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
  2024-01-24 12:49 ` Björn Töpel
@ 2024-01-25  9:13   ` LIU Zhiwei
  -1 siblings, 0 replies; 18+ messages in thread
From: LIU Zhiwei @ 2024-01-25  9:13 UTC (permalink / raw)
  To: Björn Töpel, qemu-devel, Daniel Henrique Barboza,
	Christoph Müllner
  Cc: linux-riscv, Andrew Jones, Alistair Francis


On 2024/1/24 20:49, Björn Töpel wrote:
> Hi!
>
> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> ("target/riscv/cpu.c: restrict 'marchid' value")
>
> Reverting that commit, or the hack below solves the boot issue:
>
> --8<--
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8cbfc7e781ad..e18596c8a55a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>       cpu->cfg.ext_xtheadsync = true;
>   
>       cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> +                        (QEMU_VERSION_MINOR << 8)  |
> +                        (QEMU_VERSION_MICRO));
>   #ifndef CONFIG_USER_ONLY
>       set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>   #endif
> --8<--
>
> I'm unsure what the correct qemu way of adding a default value is,
> or if c906 should have a proper marchid.
>
> Maybe Christoph or Zhiwei can answer?
>
> qemu command-line:
> qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
>     -cpu thead-c906 ...

Hi  Björn,

I think it is caused by an mmu extension(named XTheadMaee) not 
implemented on QEMU which is conflicts with Svpbmt, which is the reason 
for error-ta in Linux.

I will try to fix this on QEMU and at the same time give a way to 
implement vendor custom CSR(XTheadMaee is controlled by an CSR named 
mexstatus)  on QEMU.

Thanks,
Zhiwei

>
> Thanks,
> Björn

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: qemu riscv, thead c906, Linux boot regression
@ 2024-01-25  9:13   ` LIU Zhiwei
  0 siblings, 0 replies; 18+ messages in thread
From: LIU Zhiwei @ 2024-01-25  9:13 UTC (permalink / raw)
  To: Björn Töpel, qemu-devel, Daniel Henrique Barboza,
	Christoph Müllner
  Cc: linux-riscv, Andrew Jones, Alistair Francis


On 2024/1/24 20:49, Björn Töpel wrote:
> Hi!
>
> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
> ("target/riscv/cpu.c: restrict 'marchid' value")
>
> Reverting that commit, or the hack below solves the boot issue:
>
> --8<--
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8cbfc7e781ad..e18596c8a55a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>       cpu->cfg.ext_xtheadsync = true;
>   
>       cpu->cfg.mvendorid = THEAD_VENDOR_ID;
> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
> +                        (QEMU_VERSION_MINOR << 8)  |
> +                        (QEMU_VERSION_MICRO));
>   #ifndef CONFIG_USER_ONLY
>       set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>   #endif
> --8<--
>
> I'm unsure what the correct qemu way of adding a default value is,
> or if c906 should have a proper marchid.
>
> Maybe Christoph or Zhiwei can answer?
>
> qemu command-line:
> qemu-system-riscv64 -nodefaults -nographic -machine virt,acpi=off \
>     -cpu thead-c906 ...

Hi  Björn,

I think it is caused by an mmu extension(named XTheadMaee) not 
implemented on QEMU which is conflicts with Svpbmt, which is the reason 
for error-ta in Linux.

I will try to fix this on QEMU and at the same time give a way to 
implement vendor custom CSR(XTheadMaee is controlled by an CSR named 
mexstatus)  on QEMU.

Thanks,
Zhiwei

>
> Thanks,
> Björn


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-01-25  9:14 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-24 12:49 qemu riscv, thead c906, Linux boot regression Björn Töpel
2024-01-24 12:49 ` Björn Töpel
2024-01-24 13:13 ` Conor Dooley
2024-01-24 13:13   ` Conor Dooley
2024-01-24 13:27   ` Björn Töpel
2024-01-24 13:27     ` Björn Töpel
2024-01-24 13:49     ` Conor Dooley
2024-01-24 13:49       ` Conor Dooley
2024-01-24 13:38 ` Daniel Henrique Barboza
2024-01-24 13:38   ` Daniel Henrique Barboza
2024-01-24 19:26   ` Björn Töpel
2024-01-24 19:26     ` Björn Töpel
2024-01-24 20:02     ` Daniel Henrique Barboza
2024-01-24 20:02       ` Daniel Henrique Barboza
2024-01-25  8:48       ` Björn Töpel
2024-01-25  8:48         ` Björn Töpel
2024-01-25  9:13 ` LIU Zhiwei
2024-01-25  9:13   ` LIU Zhiwei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.