* Preparing isar-cip-core for RZ/Five
@ 2022-10-03 18:36 Jan Kiszka
  2022-10-03 20:12 ` Chris Paterson
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-03 18:36 UTC (permalink / raw)
  To: Chris Paterson; +Cc: cip-dev

Hi Chris,

after getting qemu-riscv64 ready (patches will come tomorrow), I also
had a quick look at the RZ/Five eval board you shared with us. But that
raised a couple of questions, maybe you can help:

 - Why is the default U-Boot hard-wired to a very special boot mode? I'm
   missing standard distro boot - which would include UEFI as well.

 - Where is the U-Boot tree (upstream support is apparently missing)?
   I'm still trying to understand the instructions on how to replace it
   in QSPI, but first I need a fixable source tree.

 - Where is the kernel tree (also no upstream support yet)?

 - Can I reconfigure the carrier board to automatically power-on after a
   power cycle? Without that, it would be impossible to add the board to
   a remote lab.

Thanks,
Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



* RE: Preparing isar-cip-core for RZ/Five
  2022-10-03 18:36 Preparing isar-cip-core for RZ/Five Jan Kiszka
@ 2022-10-03 20:12 ` Chris Paterson
  2022-10-04  7:15   ` Jan Kiszka
  2022-10-04 22:30   ` Prabhakar Mahadev Lad
  0 siblings, 2 replies; 39+ messages in thread
From: Chris Paterson @ 2022-10-03 20:12 UTC (permalink / raw)
  To: Jan Kiszka, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

Hello Jan,

> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: 03 October 2022 19:36
> 
> Hi Chris,
> 
> after getting qemu-riscv64 ready (patches will come tomorrow), I also

Huzzah

> had a quick look at the RZ/Five eval board you shared with us. But that
> raised a couple of questions, maybe you can help:

I can try :)

> 
>  - Why is the default U-Boot hard-wired to a very special boot mode? I'm
>    missing standard distro boot - which would include UEFI as well.

Good question.
@Prabhakar Mahadev Lad or @Hung Tran, could you help answer this?

> 
>  - Where is the U-Boot tree (upstream support is apparently missing)?
>    I'm still trying to understand the instructions how to replace it in
>    QSPI, but first I need a fixable source tree.

The Yocto BSP is here: https://github.com/renesas-rz/meta-rzg2/tree/dunfell/rzfive
The U-Boot recipe points to https://github.com/renesas-rz/renesas-u-boot-cip/tree/v2021.12/rzf-smarc

You're obviously welcome to fork/fixup as you like.

> 
>  - Where is the kernel tree (also no upstream support yet)?

Kernel tree used by the current BSP is here: https://github.com/renesas-rz/rz_linux-cip/tree/rzfive-5.10-cip1

Upstreaming is still in progress.
We're having some fun sharing things between arm64 and riscv, but we're getting there.
I'm hoping we'll have something bootable in v6.2. 

> 
>  - Can I reconfigure the carrier board to automatically power-on after a
>    power cycle? Without that, it would be impossible to add the board to
>    a remote lab.

Yea it's a bit of a pain.
Two changes are actually needed to get these smarc platforms into a remote lab.
1) Change the power wiring so that everything powers on as soon as the AC connector is powered
2) Power the USB serial connection from the USB so it's always on

Let me know if you want the instructions for making these mods to the board.
Note that we'll provide pre-modified boards for the CIP labs when the time comes.

Kind regards, Chris

> 
> Thanks,
> Jan
> 
> --
> Siemens AG, Technology
> Competence Center Embedded Linux


* Re: Preparing isar-cip-core for RZ/Five
  2022-10-03 20:12 ` Chris Paterson
@ 2022-10-04  7:15   ` Jan Kiszka
  2022-10-04 18:28     ` Jan Kiszka
  2022-10-04 22:30   ` Prabhakar Mahadev Lad
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-04  7:15 UTC (permalink / raw)
  To: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

On 03.10.22 22:12, Chris Paterson wrote:
> Hello Jan,
> 
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>> Sent: 03 October 2022 19:36
>>
>> Hi Chris,
>>
>> after getting qemu-riscv64 ready (patches will come tomorrow), I also
> 
> Huzzah
> 
>> had a quick look at the RZ/Five eval board you shared with us. But that
>> raised a couple of questions, maybe you can help:
> 
> I can try :)
> 
>>
>>  - Why is the default U-Boot hard-wired to a very special boot mode? I'm
>>    missing standard distro boot - which would include UEFI as well.
> 
> Good question.
> @Prabhakar Mahadev Lad or @Hung Tran, could you help answer this?
> 
>>
>>  - Where is the U-Boot tree (upstream support is apparently missing)?
>>    I'm still trying to understand the instructions how to replace it in
>>    QSPI, but first I need a fixable source tree.
> 
> The Yocto BSP is here: https://github.com/renesas-rz/meta-rzg2/tree/dunfell/rzfive
> The U-Boot recipe points to https://github.com/renesas-rz/renesas-u-boot-cip/tree/v2021.12/rzf-smarc
> 
> You're obviously welcome to fork/fixup as you like.
> 

What is the upstreaming status here? For UEFI, a U-Boot from this year
is needed.

>>
>>  - Where is the kernel tree (also no upstream support yet)?
> 
> Kernel tree used by the current BSP is here: https://github.com/renesas-rz/rz_linux-cip/tree/rzfive-5.10-cip1
> 
> Upstreaming is still in progress.
> We're having some fun sharing things between arm64 and riscv, but we're getting there.
> I'm hoping we'll have something bootable in v6.2. 
> 

I see - over 700 patches for 5.10-cip...

>>
>>  - Can I reconfigure the carrier board to automatically power-on after a
>>    power cycle? Without that, it would be impossible to add the board to
>>    a remote lab.
> 
> Yea it's a bit of a pain.
> Two changes are actually needed to get these smarc platforms into a remote lab.
> 1) Change the power wiring so that everything powers on as soon as the AC connector is powered
> 2) Power the USB serial connection from the USB so it's always on
> 
> Let me know if you want the instructions for making these mods to the board.
> Note that we'll provide pre-modified boards for the CIP labs when the time comes.

Ah, ok. Thanks!

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* Re: Preparing isar-cip-core for RZ/Five
  2022-10-04  7:15   ` Jan Kiszka
@ 2022-10-04 18:28     ` Jan Kiszka
  2022-10-04 19:36       ` Jan Kiszka
  2022-10-05  5:43       ` [cip-dev] " Biju Das
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-04 18:28 UTC (permalink / raw)
  To: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

On 04.10.22 09:15, Jan Kiszka wrote:
> On 03.10.22 22:12, Chris Paterson wrote:
>> Hello Jan,
>>
>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>> Sent: 03 October 2022 19:36
>>>
>>> Hi Chris,
>>>
>>> after getting qemu-riscv64 ready (patches will come tomorrow), I also
>>
>> Huzzah
>>
>>> had a quick look at the RZ/Five eval board you shared with us. But that
>>> raised a couple of questions, maybe you can help:
>>
>> I can try :)
>>
>>>
>>>  - Why is the default U-Boot hard-wired to a very special boot mode? I'm
>>>    missing standard distro boot - which would include UEFI as well.
>>
>> Good question.
>> @Prabhakar Mahadev Lad or @Hung Tran, could you help answer this?
>>
>>>
>>>  - Where is the U-Boot tree (upstream support is apparently missing)?
>>>    I'm still trying to understand the instructions how to replace it in
>>>    QSPI, but first I need a fixable source tree.
>>
>> The Yocto BSP is here: https://github.com/renesas-rz/meta-rzg2/tree/dunfell/rzfive
>> The U-Boot recipe points to https://github.com/renesas-rz/renesas-u-boot-cip/tree/v2021.12/rzf-smarc
>>
>> You're obviously welcome to fork/fixup as you like.
>>
> 
> How is the upstreaming status here? For UEFI, a U-Boot from this year is
> needed.
> 
>>>
>>>  - Where is the kernel tree (also no upstream support yet)?
>>
>> Kernel tree used by the current BSP is here: https://github.com/renesas-rz/rz_linux-cip/tree/rzfive-5.10-cip1
>>
>> Upstreaming is still in progress.
>> We're having some fun sharing things between arm64 and riscv, but we're getting there.
>> I'm hoping we'll have something bootable in v6.2. 
>>
> 
> I see - over 700 patches for 5.10-cip...
> 

...and it's based on an outdated CIP baseline. We are specifically
missing "riscv: fix build with binutils 2.38", compared to latest CIP
kernels. Those happily build with the Debian toolchain.

Fixing up locally for now.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* Re: Preparing isar-cip-core for RZ/Five
  2022-10-04 18:28     ` Jan Kiszka
@ 2022-10-04 19:36       ` Jan Kiszka
  2022-10-04 19:47         ` Jan Kiszka
  2022-10-05  5:43       ` [cip-dev] " Biju Das
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-04 19:36 UTC (permalink / raw)
  To: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

On 04.10.22 20:28, Jan Kiszka wrote:
> On 04.10.22 09:15, Jan Kiszka wrote:
>> On 03.10.22 22:12, Chris Paterson wrote:
>>> Hello Jan,
>>>
>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>> Sent: 03 October 2022 19:36
>>>>
>>>> Hi Chris,
>>>>
>>>> after getting qemu-riscv64 ready (patches will come tomorrow), I also
>>>
>>> Huzzah
>>>
>>>> had a quick look at the RZ/Five eval board you shared with us. But that
>>>> raised a couple of questions, maybe you can help:
>>>
>>> I can try :)
>>>
>>>>
>>>>  - Why is the default U-Boot hard-wired to a very special boot mode? I'm
>>>>    missing standard distro boot - which would include UEFI as well.
>>>
>>> Good question.
>>> @Prabhakar Mahadev Lad or @Hung Tran, could you help answer this?
>>>
>>>>
>>>>  - Where is the U-Boot tree (upstream support is apparently missing)?
>>>>    I'm still trying to understand the instructions how to replace it in
>>>>    QSPI, but first I need a fixable source tree.
>>>
>>> The Yocto BSP is here: https://github.com/renesas-rz/meta-rzg2/tree/dunfell/rzfive
>>> The U-Boot recipe points to https://github.com/renesas-rz/renesas-u-boot-cip/tree/v2021.12/rzf-smarc
>>>
>>> You're obviously welcome to fork/fixup as you like.
>>>
>>
>> How is the upstreaming status here? For UEFI, a U-Boot from this year is
>> needed.
>>
>>>>
>>>>  - Where is the kernel tree (also no upstream support yet)?
>>>
>>> Kernel tree used by the current BSP is here: https://github.com/renesas-rz/rz_linux-cip/tree/rzfive-5.10-cip1
>>>
>>> Upstreaming is still in progress.
>>> We're having some fun sharing things between arm64 and riscv, but we're getting there.
>>> I'm hoping we'll have something bootable in v6.2. 
>>>
>>
>> I see - over 700 patches for 5.10-cip...
>>
> 
> ...and it's based on an outdated CIP baseline. We are specifically
> missing "riscv: fix build with binutils 2.38", compared to latest CIP
> kernels. Those happily build with the Debian toolchain.
> 
> Fixing up locally for now.
> 

root@demo:~# cat /sys/firmware/devicetree/base/model ; echo
Renesas SMARC EVK based on r9a07g043f01
root@demo:~# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux bookworm/sid"
NAME="Debian GNU/Linux"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
BUILD_ID="698957d5-dirty"
VARIANT="CIP Core image"
VARIANT_VERSION="1.0"
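For reference, a minimal sketch of how the os-release output above could be checked from a script, for example in a lab job that wants to confirm it booted the CIP Core image. The function name parse_os_release and the embedded SAMPLE text are illustrative assumptions and not part of isar-cip-core; the sample simply repeats the fields shown above.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double quotes; strip one matching pair.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        fields[key] = value
    return fields


# Copy of the fields printed on the board (hypothetical test input).
SAMPLE = """\
PRETTY_NAME="Debian GNU/Linux bookworm/sid"
NAME="Debian GNU/Linux"
ID=debian
BUILD_ID="698957d5-dirty"
VARIANT="CIP Core image"
VARIANT_VERSION="1.0"
"""

info = parse_os_release(SAMPLE)
print(info["ID"], "|", info["VARIANT"], "|", info["BUILD_ID"])
```

On a real target the text would come from reading /etc/os-release instead of the SAMPLE string. The "-dirty" suffix in BUILD_ID suggests the image was built from a tree with uncommitted changes.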

But due to the weird U-Boot setup, I had to add some hacks, rather than
a proper boot process.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* Re: Preparing isar-cip-core for RZ/Five
  2022-10-04 19:36       ` Jan Kiszka
@ 2022-10-04 19:47         ` Jan Kiszka
  2022-10-05 18:21           ` Pavel Machek
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-04 19:47 UTC (permalink / raw)
  To: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, Pavel Machek; +Cc: cip-dev

On 04.10.22 21:36, Jan Kiszka wrote:
> On 04.10.22 20:28, Jan Kiszka wrote:
>> On 04.10.22 09:15, Jan Kiszka wrote:
>>> On 03.10.22 22:12, Chris Paterson wrote:
>>>> Hello Jan,
>>>>
>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> Sent: 03 October 2022 19:36
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> after getting qemu-riscv64 ready (patches will come tomorrow), I also
>>>>
>>>> Huzzah
>>>>
>>>>> had a quick look at the RZ/Five eval board you shared with us. But that
>>>>> raised a couple of questions, maybe you can help:
>>>>
>>>> I can try :)
>>>>
>>>>>
>>>>>  - Why is the default U-Boot hard-wired to a very special boot mode? I'm
>>>>>    missing standard distro boot - which would include UEFI as well.
>>>>
>>>> Good question.
>>>> @Prabhakar Mahadev Lad or @Hung Tran, could you help answer this?
>>>>
>>>>>
>>>>>  - Where is the U-Boot tree (upstream support is apparently missing)?
>>>>>    I'm still trying to understand the instructions how to replace it in
>>>>>    QSPI, but first I need a fixable source tree.
>>>>
>>>> The Yocto BSP is here: https://github.com/renesas-rz/meta-rzg2/tree/dunfell/rzfive
>>>> The U-Boot recipe points to https://github.com/renesas-rz/renesas-u-boot-cip/tree/v2021.12/rzf-smarc
>>>>
>>>> You're obviously welcome to fork/fixup as you like.
>>>>
>>>
>>> How is the upstreaming status here? For UEFI, a U-Boot from this year is
>>> needed.
>>>
>>>>>
>>>>>  - Where is the kernel tree (also no upstream support yet)?
>>>>
>>>> Kernel tree used by the current BSP is here: https://github.com/renesas-rz/rz_linux-cip/tree/rzfive-5.10-cip1
>>>>
>>>> Upstreaming is still in progress.
>>>> We're having some fun sharing things between arm64 and riscv, but we're getting there.
>>>> I'm hoping we'll have something bootable in v6.2. 
>>>>
>>>
>>> I see - over 700 patches for 5.10-cip...
>>>
>>
>> ...and it's based on an outdated CIP baseline. We are specifically
>> missing "riscv: fix build with binutils 2.38", compared to latest CIP
>> kernels. Those happily build with the Debian toolchain.
>>
>> Fixing up locally for now.
>>
> 
> root@demo:~# cat /sys/firmware/devicetree/base/model ; echo
> Renesas SMARC EVK based on r9a07g043f01
> root@demo:~# cat /etc/os-release
> PRETTY_NAME="Debian GNU/Linux bookworm/sid"
> NAME="Debian GNU/Linux"
> ID=debian
> HOME_URL="https://www.debian.org/"
> SUPPORT_URL="https://www.debian.org/support"
> BUG_REPORT_URL="https://bugs.debian.org/"
> BUILD_ID="698957d5-dirty"
> VARIANT="CIP Core image"
> VARIANT_VERSION="1.0"
> 
> But due to the weird U-Boot setup, I had to add some hacks, rather than
> a proper boot process.
> 

https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive

Test by flashing the image to an SD card and switching the module to
SD boot.

Maybe we can find a better solution for the boot procedure before moving 
forward with this.

Anyway, now we have a real distribution on this board. Pavel, can you
re-check whether the issues you observed persist with Debian?

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* RE: Preparing isar-cip-core for RZ/Five
  2022-10-03 20:12 ` Chris Paterson
  2022-10-04  7:15   ` Jan Kiszka
@ 2022-10-04 22:30   ` Prabhakar Mahadev Lad
  2022-10-05  5:45     ` Jan Kiszka
  1 sibling, 1 reply; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-10-04 22:30 UTC (permalink / raw)
  To: Chris Paterson, Jan Kiszka, Hung Tran; +Cc: cip-dev

Hi Jan,

> -----Original Message-----
> From: Chris Paterson <Chris.Paterson2@renesas.com>
> Sent: 03 October 2022 21:13
> To: Jan Kiszka <jan.kiszka@siemens.com>; Prabhakar Mahadev Lad
> <prabhakar.mahadev-lad.rj@bp.renesas.com>; Hung Tran
> <hung.tran.jy@renesas.com>
> Cc: cip-dev <cip-dev@lists.cip-project.org>
> Subject: RE: Preparing isar-cip-core for RZ/Five
> 
> Hello Jan,
> 
> > From: Jan Kiszka <jan.kiszka@siemens.com>
> > Sent: 03 October 2022 19:36
> >
> > Hi Chris,
> >
> > after getting qemu-riscv64 ready (patches will come tomorrow), I
> also
> 
> Huzzah
> 
> > had a quick look at the RZ/Five eval board you shared with us. But
> > that raised a couple of questions, maybe you can help:
> 
> I can try :)
> 
> >
> >  - Why is the default U-Boot hard-wired to a very special boot mode?
> I'm
> >    missing standard distro boot - which would include UEFI as well.
> 
It's just that we used the standard configs that we use for BSP releases.
Looking at the U-Boot code, enabling distro boot should be straightforward,
as EFI is supported on RISC-V.


Cheers,
Prabhakar


* RE: [cip-dev] Preparing isar-cip-core for RZ/Five
  2022-10-04 18:28     ` Jan Kiszka
  2022-10-04 19:36       ` Jan Kiszka
@ 2022-10-05  5:43       ` Biju Das
  1 sibling, 0 replies; 39+ messages in thread
From: Biju Das @ 2022-10-05  5:43 UTC (permalink / raw)
  To: cip-dev, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran

> Subject: Re: [cip-dev] Preparing isar-cip-core for RZ/Five
> 
> On 04.10.22 09:15, Jan Kiszka wrote:
> > On 03.10.22 22:12, Chris Paterson wrote:
> >> Hello Jan,
> >>
> >>> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>> Sent: 03 October 2022 19:36
> >>>
> >>> Hi Chris,
> >>>
> >>> after getting qemu-riscv64 ready (patches will come tomorrow), I
> >>> also
> >>
> >> Huzzah
> >>
> >>> had a quick look at the RZ/Five eval board you shared with us. But
> >>> that raised a couple of questions, maybe you can help:
> >>
> >> I can try :)
> >>
> >>>
> >>>  - Why is the default U-Boot hard-wired to a very special boot
> mode? I'm
> >>>    missing standard distro boot - which would include UEFI as
> well.
> >>
> >> Good question.
> >> @Prabhakar Mahadev Lad or @Hung Tran, could you help answer this?
> >>
> >>>
> >>>  - Where is the U-Boot tree (upstream support is apparently
> missing)?
> >>>    I'm still trying to understand the instructions how to replace
> it in
> >>>    QSPI, but first I need a fixable source tree.
> >>
> >> The Yocto BSP is here:
> >>
> >>
> >> You're obviously welcome to fork/fixup as you like.
> >>
> >
> > How is the upstreaming status here? For UEFI, a U-Boot from this
> year
> > is needed.
> >
> >>>
> >>>  - Where is the kernel tree (also no upstream support yet)?
> >>
> >> Kernel tree used by the current BSP is here:
> >>
> >>
> >> Upstreaming is still in progress.
> >> We're having some fun sharing things between arm64 and riscv, but
> we're getting there.
> >> I'm hoping we'll have something bootable in v6.2.
> >>
> >
> > I see - over 700 patches for 5.10-cip...

RZ/Five and RZ/G2UL share the same drivers, except for CPU, IRQ and cache.

Full RZ/G2UL support is already available in 5.10-cip, and we are going to
reuse the SoC and board dtsi from RZ/G2UL for RZ/Five.

So once we mainline the CPU, IRQ and cache support and backport it to
5.10-cip, we will have full RZ/Five support in 5.10-cip.

Cheers,
Biju



* Re: Preparing isar-cip-core for RZ/Five
  2022-10-04 22:30   ` Prabhakar Mahadev Lad
@ 2022-10-05  5:45     ` Jan Kiszka
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-05  5:45 UTC (permalink / raw)
  To: Prabhakar Mahadev Lad, Chris Paterson, Hung Tran; +Cc: cip-dev

On 05.10.22 00:30, Prabhakar Mahadev Lad wrote:
> Hi Jan,
> 
>> -----Original Message-----
>> From: Chris Paterson <Chris.Paterson2@renesas.com>
>> Sent: 03 October 2022 21:13
>> To: Jan Kiszka <jan.kiszka@siemens.com>; Prabhakar Mahadev Lad
>> <prabhakar.mahadev-lad.rj@bp.renesas.com>; Hung Tran
>> <hung.tran.jy@renesas.com>
>> Cc: cip-dev <cip-dev@lists.cip-project.org>
>> Subject: RE: Preparing isar-cip-core for RZ/Five
>>
>> Hello Jan,
>>
>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>> Sent: 03 October 2022 19:36
>>>
>>> Hi Chris,
>>>
>>> after getting qemu-riscv64 ready (patches will come tomorrow), I
>> also
>>
>> Huzzah
>>
>>> had a quick look at the RZ/Five eval board you shared with us. But
>>> that raised a couple of questions, maybe you can help:
>>
>> I can try :)
>>
>>>
>>>  - Why is the default U-Boot hard-wired to a very special boot mode?
>> I'm
>>>    missing standard distro boot - which would include UEFI as well.
>>
> Its just that we used the standard configs which we use for BSP releases. Looking at the u-boot code enabling distro boot should be straight forward as EFI is supported on RISC-V
> 

Exactly - that's why I don't understand why you are not doing it
already. The default setup should be generic and easily extensible. But
now your users need to compile their own firmware and flash it, just to
get a standard boot procedure. And you won't be SystemReady this way
either.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* Re: Preparing isar-cip-core for RZ/Five
  2022-10-04 19:47         ` Jan Kiszka
@ 2022-10-05 18:21           ` Pavel Machek
  2022-10-06  6:29             ` Jan Kiszka
  0 siblings, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-10-05 18:21 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, Pavel Machek, cip-dev

Hi!

> > But due to the weird U-Boot setup, I had to add some hacks, rather than
> > a proper boot process.
> 
> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive
> 
> Test by flashing the image to an SD-Card switching to SD booting on the 
> module.
> 
> Maybe we can find a better solution for the boot procedure before moving 
> forward with this.
> 
> Anyway, now we have a real distribution on this board. Pavel, can you 
> re-check what you observed, if those issues persist with Debian?

I believe I was running Debian; two versions and Ubuntu, IIRC :-).

Can you check if ldconfig and gcc work for you? Those were the
roadblocks I was hitting. If they do, I'll be interested in the details
of your setup.

Best regards,
								Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


* Re: Preparing isar-cip-core for RZ/Five
  2022-10-05 18:21           ` Pavel Machek
@ 2022-10-06  6:29             ` Jan Kiszka
  2022-10-06  6:49               ` Jan Kiszka
  2022-10-06 11:43               ` Pavel Machek
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-06  6:29 UTC (permalink / raw)
  To: Pavel Machek, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

On 05.10.22 20:21, Pavel Machek wrote:
> Hi!
> 
>>> But due to the weird U-Boot setup, I had to add some hacks, rather than
>>> a proper boot process.
>>
>> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive
>>
>> Test by flashing the image to an SD-Card switching to SD booting on the 
>> module.
>>
>> Maybe we can find a better solution for the boot procedure before moving 
>> forward with this.
>>
>> Anyway, now we have a real distribution on this board. Pavel, can you 
>> re-check what you observed, if those issues persist with Debian?
> 
> I believe I was running Debian; two versions and Ubuntu, IIRC :-).
> 
> Can you check if ldconfig and gcc work for you? That were the
> roadblocks I was hitting. If they do, I'll be interested in details of
> your setup.
> 

Hmm, seems the issue persists:

root@demo:~# ldconfig
Illegal instruction

[  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
[  297.146768] CPU: 0 PID: 497 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
[  297.146775] epc: 00000000000380c8 ra : 0000000000015382 sp : 0000003fffe01c10
[  297.146782]  gp : 0000000000099da8 tp : 0000003fda772800 t0 : 0000003fda7787c0
[  297.146788]  t1 : 0000003fda8065a8 t2 : 0000002acbbda7a0 s0 : 0000002afa2ec890
[  297.146794]  s1 : 0000000000000001 a0 : 0000003fffe01d18 a1 : 0000000000000001
[  297.146800]  a2 : 0000003fffe01c88 a3 : 0000000000000000 a4 : 0000003fffe01d18
[  297.146806]  a5 : 000000000009736e a6 : 0000003fffe01c80 a7 : 00000000000000dd
[  297.146812]  s2 : 0000003fffe01c88 s3 : 0000000000000000 s4 : 0000000000000000
[  297.146819]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002afa2ec850
[  297.146824]  s8 : 0000002afa2ec710 s9 : 0000000000000000 s10: 0000002acbbe39c8
[  297.146830]  s11: 0000002acbbe3938 t3 : 0000002acbaf2610 t4 : 00000000000925a8
[  297.146835]  t5 : 0000000000000004 t6 : 0000000000000000
[  297.146842] status: 0000000200004020 badaddr: 00000000a01253cf cause: 0000000000000002
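For reference, a minimal sketch that decodes the last line of the register dump above. The cause table follows the exception codes of the RISC-V privileged specification; the names SCAUSE_EXCEPTIONS and parse_trap_line are illustrative. Reading the low bits of badaddr as an opcode field is only meaningful under the assumption that this core writes the faulting instruction bits into stval on an illegal-instruction trap, which the specification permits but does not require.

```python
import re

# Synchronous exception codes from the RISC-V privileged specification.
SCAUSE_EXCEPTIONS = {
    0: "instruction address misaligned",
    1: "instruction access fault",
    2: "illegal instruction",
    3: "breakpoint",
    4: "load address misaligned",
    5: "load access fault",
    6: "store/AMO address misaligned",
    7: "store/AMO access fault",
    8: "environment call from U-mode",
    9: "environment call from S-mode",
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}

# Last line of the dump, copied from the kernel log above.
LOG_LINE = ("status: 0000000200004020 badaddr: 00000000a01253cf "
            "cause: 0000000000000002")


def parse_trap_line(line):
    """Return (badaddr, cause) as integers from a 'status: ...' log line."""
    match = re.search(r"badaddr: ([0-9a-f]+) cause: ([0-9a-f]+)", line)
    if match is None:
        raise ValueError("not a trap status line")
    return int(match.group(1), 16), int(match.group(2), 16)


badaddr, cause = parse_trap_line(LOG_LINE)
cause_name = SCAUSE_EXCEPTIONS.get(cause, "reserved/unknown")

# ASSUMPTION: stval holds the faulting instruction bits on this core.
# If so, bits [6:0] are the major opcode of the instruction that trapped.
opcode_field = badaddr & 0x7F

print(f"cause {cause}: {cause_name}")
print(f"badaddr 0x{badaddr:08x}, opcode field 0x{opcode_field:02x}")
```

The cause value 2 matches the "Illegal instruction" message and signal 4 (SIGILL) reported for ldconfig. Whether the opcode field says anything about the instruction at epc depends entirely on the stval assumption above.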

(gdb) disassemble $pc,+0x10
Dump of assembler code from 0x380c8 to 0x380d8:
=> 0x00000000000380c8:  auipc   a2,0x66
   0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
   0x00000000000380d0:  sd      a0,0(a2)
   0x00000000000380d2:  mv      a5,sp
   0x00000000000380d4:  addi    a4,sp,416
   0x00000000000380d6:  sd      zero,0(a5)
End of assembler dump.
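For reference, a minimal sketch of what the first two instructions in the dump compute, using the base U-type encoding of auipc (opcode 0x17). The helper names encode_auipc and auipc_result are illustrative; the values are taken from the gdb output above.

```python
MASK64 = (1 << 64) - 1


def encode_auipc(rd, imm20):
    """Encode 'auipc rd, imm20' as a 32-bit U-type instruction word."""
    return ((imm20 & 0xFFFFF) << 12) | ((rd & 0x1F) << 7) | 0x17


def auipc_result(pc, imm20):
    """Value written to rd: pc plus the sign-extended (imm20 << 12)."""
    offset = (imm20 & 0xFFFFF) << 12
    if offset & 0x80000000:          # sign-extend the 32-bit offset
        offset -= 1 << 32
    return (pc + offset) & MASK64


PC = 0x380C8        # address of the auipc in the dump
A2 = 12             # a2 is register x12

word = encode_auipc(A2, 0x66)
target = (auipc_result(PC, 0x66) + 2000) & MASK64   # addi a2,a2,2000

print(f"auipc a2,0x66 encodes as 0x{word:08x}")
print(f"a2 after auipc+addi = 0x{target:x}")
```

The computed address agrees with the 0x9e898 annotation that gdb prints next to the addi, so the pair is an ordinary PC-relative address calculation from the base integer instruction set.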

Do we have any instruction set restrictions that prevent the use of
distros on this CPU? Under QEMU, we pass through this code as well and
execute it without problems. Some infamous Intel CPU comes to mind at
this point - I hope history does not repeat itself here...

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* Re: Preparing isar-cip-core for RZ/Five
  2022-10-06  6:29             ` Jan Kiszka
@ 2022-10-06  6:49               ` Jan Kiszka
  2022-10-06  7:07                 ` Jan Kiszka
  2022-10-06  7:08                 ` Prabhakar Mahadev Lad
  2022-10-06 11:43               ` Pavel Machek
  1 sibling, 2 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-06  6:49 UTC (permalink / raw)
  To: Pavel Machek, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

On 06.10.22 08:29, Jan Kiszka wrote:
> On 05.10.22 20:21, Pavel Machek wrote:
>> Hi!
>>
>>>> But due to the weird U-Boot setup, I had to add some hacks, rather than
>>>> a proper boot process.
>>>
>>> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive
>>>
>>> Test by flashing the image to an SD-Card switching to SD booting on the 
>>> module.
>>>
>>> Maybe we can find a better solution for the boot procedure before moving 
>>> forward with this.
>>>
>>> Anyway, now we have a real distribution on this board. Pavel, can you 
>>> re-check what you observed, if those issues persist with Debian?
>>
>> I believe I was running Debian; two versions and Ubuntu, IIRC :-).
>>
>> Can you check if ldconfig and gcc work for you? That were the
>> roadblocks I was hitting. If they do, I'll be interested in details of
>> your setup.
>>
> 
> Hmm, seems the issue persists:
> 
> root@demo:~# ldconfig
> Illegal instruction
> 
> [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
> [  297.146768] CPU: 0 PID: 497 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
> [  297.146775] epc: 00000000000380c8 ra : 0000000000015382 sp : 0000003fffe01c10
> [  297.146782]  gp : 0000000000099da8 tp : 0000003fda772800 t0 : 0000003fda7787c0
> [  297.146788]  t1 : 0000003fda8065a8 t2 : 0000002acbbda7a0 s0 : 0000002afa2ec890
> [  297.146794]  s1 : 0000000000000001 a0 : 0000003fffe01d18 a1 : 0000000000000001
> [  297.146800]  a2 : 0000003fffe01c88 a3 : 0000000000000000 a4 : 0000003fffe01d18
> [  297.146806]  a5 : 000000000009736e a6 : 0000003fffe01c80 a7 : 00000000000000dd
> [  297.146812]  s2 : 0000003fffe01c88 s3 : 0000000000000000 s4 : 0000000000000000
> [  297.146819]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002afa2ec850
> [  297.146824]  s8 : 0000002afa2ec710 s9 : 0000000000000000 s10: 0000002acbbe39c8
> [  297.146830]  s11: 0000002acbbe3938 t3 : 0000002acbaf2610 t4 : 00000000000925a8
> [  297.146835]  t5 : 0000000000000004 t6 : 0000000000000000
> [  297.146842] status: 0000000200004020 badaddr: 00000000a01253cf cause: 0000000000000002
> 
> (gdb) disassemble $pc,+0x10
> Dump of assembler code from 0x380c8 to 0x380d8:
> => 0x00000000000380c8:  auipc   a2,0x66
>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>    0x00000000000380d0:  sd      a0,0(a2)
>    0x00000000000380d2:  mv      a5,sp
>    0x00000000000380d4:  addi    a4,sp,416
>    0x00000000000380d6:  sd      zero,0(a5)
> End of assembler dump.
> 
> Do we have any instruction set restrictions that prevents the usage of 
> distros on this CPU? Under QEMU, we come along here as well and execute 
> this without problems. Some infamous Intel CPU comes to my mind at this 
> point - hope, history does not repeat here....
> 

auipc was introduced with ISA 2.0, around 2019 - please don't tell me we
are on 1.0 here.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux




* Re: Preparing isar-cip-core for RZ/Five
  2022-10-06  6:49               ` Jan Kiszka
@ 2022-10-06  7:07                 ` Jan Kiszka
  2022-10-06  7:08                 ` Prabhakar Mahadev Lad
  1 sibling, 0 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-06  7:07 UTC (permalink / raw)
  To: Pavel Machek, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev

On 06.10.22 08:49, Jan Kiszka wrote:
> On 06.10.22 08:29, Jan Kiszka wrote:
>> On 05.10.22 20:21, Pavel Machek wrote:
>>> Hi!
>>>
>>>>> But due to the weird U-Boot setup, I had to add some hacks, rather than
>>>>> a proper boot process.
>>>>
>>>> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive
>>>>
>>>> Test by flashing the image to an SD-Card switching to SD booting on the 
>>>> module.
>>>>
>>>> Maybe we can find a better solution for the boot procedure before moving 
>>>> forward with this.
>>>>
>>>> Anyway, now we have a real distribution on this board. Pavel, can you 
>>>> re-check what you observed, if those issues persist with Debian?
>>>
>>> I believe I was running Debian; two versions and Ubuntu, IIRC :-).
>>>
>>> Can you check if ldconfig and gcc work for you? That were the
>>> roadblocks I was hitting. If they do, I'll be interested in details of
>>> your setup.
>>>
>>
>> Hmm, seems the issue persists:
>>
>> root@demo:~# ldconfig
>> Illegal instruction
>>
>> [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
>> [  297.146768] CPU: 0 PID: 497 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
>> [  297.146775] epc: 00000000000380c8 ra : 0000000000015382 sp : 0000003fffe01c10
>> [  297.146782]  gp : 0000000000099da8 tp : 0000003fda772800 t0 : 0000003fda7787c0
>> [  297.146788]  t1 : 0000003fda8065a8 t2 : 0000002acbbda7a0 s0 : 0000002afa2ec890
>> [  297.146794]  s1 : 0000000000000001 a0 : 0000003fffe01d18 a1 : 0000000000000001
>> [  297.146800]  a2 : 0000003fffe01c88 a3 : 0000000000000000 a4 : 0000003fffe01d18
>> [  297.146806]  a5 : 000000000009736e a6 : 0000003fffe01c80 a7 : 00000000000000dd
>> [  297.146812]  s2 : 0000003fffe01c88 s3 : 0000000000000000 s4 : 0000000000000000
>> [  297.146819]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002afa2ec850
>> [  297.146824]  s8 : 0000002afa2ec710 s9 : 0000000000000000 s10: 0000002acbbe39c8
>> [  297.146830]  s11: 0000002acbbe3938 t3 : 0000002acbaf2610 t4 : 00000000000925a8
>> [  297.146835]  t5 : 0000000000000004 t6 : 0000000000000000
>> [  297.146842] status: 0000000200004020 badaddr: 00000000a01253cf cause: 0000000000000002
>>
>> (gdb) disassemble $pc,+0x10
>> Dump of assembler code from 0x380c8 to 0x380d8:
>> => 0x00000000000380c8:  auipc   a2,0x66
>>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>>    0x00000000000380d0:  sd      a0,0(a2)
>>    0x00000000000380d2:  mv      a5,sp
>>    0x00000000000380d4:  addi    a4,sp,416
>>    0x00000000000380d6:  sd      zero,0(a5)
>> End of assembler dump.
>>
>> Do we have any instruction set restrictions that prevents the usage of 
>> distros on this CPU? Under QEMU, we come along here as well and execute 
>> this without problems. Some infamous Intel CPU comes to my mind at this 
>> point - hope, history does not repeat here....
>>
> 
> auipc was introduced with ISA 2.0, around 2019 - please don't tell me we
> are on 1.0 here.
> 

Nope, it must be some other condition: I've found those instructions in
working binaries as well, and I was even able to modify one on-the-fly
to run the very same op-code there, without any trap.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: Preparing isar-cip-core for RZ/Five
  2022-10-06  6:49               ` Jan Kiszka
  2022-10-06  7:07                 ` Jan Kiszka
@ 2022-10-06  7:08                 ` Prabhakar Mahadev Lad
  2022-10-06  7:26                   ` Jan Kiszka
  1 sibling, 1 reply; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-10-06  7:08 UTC (permalink / raw)
  To: Jan Kiszka, Pavel Machek, Chris Paterson, Hung Tran; +Cc: cip-dev

Hi Jan,

> -----Original Message-----
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: 06 October 2022 07:50
> To: Pavel Machek <pavel@denx.de>; Chris Paterson
> <Chris.Paterson2@renesas.com>; Prabhakar Mahadev Lad
> <prabhakar.mahadev-lad.rj@bp.renesas.com>; Hung Tran
> <hung.tran.jy@renesas.com>
> Cc: cip-dev <cip-dev@lists.cip-project.org>
> Subject: Re: Preparing isar-cip-core for RZ/Five
> 
> On 06.10.22 08:29, Jan Kiszka wrote:
> > On 05.10.22 20:21, Pavel Machek wrote:
> >> Hi!
> >>
> >>>> But due to the weird U-Boot setup, I had to add some hacks,
> rather
> >>>> than a proper boot process.
> >>>
> >>>
> >>> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive
> >>>
> >>> Test by flashing the image to an SD-Card switching to SD booting
> on
> >>> the module.
> >>>
> >>> Maybe we can find a better solution for the boot procedure before
> >>> moving forward with this.
> >>>
> >>> Anyway, now we have a real distribution on this board. Pavel, can
> >>> you re-check what you observed, if those issues persist with
> Debian?
> >>
> >> I believe I was running Debian; two versions and Ubuntu, IIRC :-).
> >>
> >> Can you check if ldconfig and gcc work for you? That were the
> >> roadblocks I was hitting. If they do, I'll be interested in details
> >> of your setup.
> >>
> >
> > Hmm, seems the issue persists:
> >
> > root@demo:~# ldconfig
> > Illegal instruction
> >
> > [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
> > [  297.146768] CPU: 0 PID: 497 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
> > [  297.146775] epc: 00000000000380c8 ra : 0000000000015382 sp : 0000003fffe01c10
> > [  297.146782]  gp : 0000000000099da8 tp : 0000003fda772800 t0 : 0000003fda7787c0
> > [  297.146788]  t1 : 0000003fda8065a8 t2 : 0000002acbbda7a0 s0 : 0000002afa2ec890
> > [  297.146794]  s1 : 0000000000000001 a0 : 0000003fffe01d18 a1 : 0000000000000001
> > [  297.146800]  a2 : 0000003fffe01c88 a3 : 0000000000000000 a4 : 0000003fffe01d18
> > [  297.146806]  a5 : 000000000009736e a6 : 0000003fffe01c80 a7 : 00000000000000dd
> > [  297.146812]  s2 : 0000003fffe01c88 s3 : 0000000000000000 s4 : 0000000000000000
> > [  297.146819]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002afa2ec850
> > [  297.146824]  s8 : 0000002afa2ec710 s9 : 0000000000000000 s10: 0000002acbbe39c8
> > [  297.146830]  s11: 0000002acbbe3938 t3 : 0000002acbaf2610 t4 : 00000000000925a8
> > [  297.146835]  t5 : 0000000000000004 t6 : 0000000000000000
> > [  297.146842] status: 0000000200004020 badaddr: 00000000a01253cf cause: 0000000000000002
> >
> > (gdb) disassemble $pc,+0x10
> > Dump of assembler code from 0x380c8 to 0x380d8:
> > => 0x00000000000380c8:  auipc   a2,0x66
> >    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
> >    0x00000000000380d0:  sd      a0,0(a2)
> >    0x00000000000380d2:  mv      a5,sp
> >    0x00000000000380d4:  addi    a4,sp,416
> >    0x00000000000380d6:  sd      zero,0(a5)
> > End of assembler dump.
> >
> > Do we have any instruction set restrictions that prevents the usage
> of
> > distros on this CPU? Under QEMU, we come along here as well and
> > execute this without problems. Some infamous Intel CPU comes to my
> > mind at this point - hope, history does not repeat here....
> >
> 
> auipc was introduced with ISA 2.0, around 2019 - please don't tell me
> we are on 1.0 here.
> 
I have seen a similar issue with the Yocto dunfell release. If I go up to the kirkstone release, I see there is a patch,
add-riscv-support.patch [0] (I haven't tried this release yet, though).

[0] https://git.openembedded.org/openembedded-core/tree/meta/recipes-core/glibc/ldconfig-native-2.12.1?h=kirkstone

Cheers,
Prabhakar


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preparing isar-cip-core for RZ/Five
  2022-10-06  7:08                 ` Prabhakar Mahadev Lad
@ 2022-10-06  7:26                   ` Jan Kiszka
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-06  7:26 UTC (permalink / raw)
  To: Prabhakar Mahadev Lad, Pavel Machek, Chris Paterson, Hung Tran; +Cc: cip-dev

On 06.10.22 09:08, Prabhakar Mahadev Lad wrote:
> Hi Jan,
> 
>> -----Original Message-----
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>> Sent: 06 October 2022 07:50
>> To: Pavel Machek <pavel@denx.de>; Chris Paterson
>> <Chris.Paterson2@renesas.com>; Prabhakar Mahadev Lad
>> <prabhakar.mahadev-lad.rj@bp.renesas.com>; Hung Tran
>> <hung.tran.jy@renesas.com>
>> Cc: cip-dev <cip-dev@lists.cip-project.org>
>> Subject: Re: Preparing isar-cip-core for RZ/Five
>>
>> On 06.10.22 08:29, Jan Kiszka wrote:
>>> On 05.10.22 20:21, Pavel Machek wrote:
>>>> Hi!
>>>>
>>>>>> But due to the weird U-Boot setup, I had to add some hacks,
>> rather
>>>>>> than a proper boot process.
>>>>>
>>>>>
>>>>> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commits/wip/rzfive
>>>>>
>>>>> Test by flashing the image to an SD-Card switching to SD booting
>> on
>>>>> the module.
>>>>>
>>>>> Maybe we can find a better solution for the boot procedure before
>>>>> moving forward with this.
>>>>>
>>>>> Anyway, now we have a real distribution on this board. Pavel, can
>>>>> you re-check what you observed, if those issues persist with
>> Debian?
>>>>
>>>> I believe I was running Debian; two versions and Ubuntu, IIRC :-).
>>>>
>>>> Can you check if ldconfig and gcc work for you? That were the
>>>> roadblocks I was hitting. If they do, I'll be interested in details
>>>> of your setup.
>>>>
>>>
>>> Hmm, seems the issue persists:
>>>
>>> root@demo:~# ldconfig
>>> Illegal instruction
>>>
>>> [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
>>> [  297.146768] CPU: 0 PID: 497 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
>>> [  297.146775] epc: 00000000000380c8 ra : 0000000000015382 sp : 0000003fffe01c10
>>> [  297.146782]  gp : 0000000000099da8 tp : 0000003fda772800 t0 : 0000003fda7787c0
>>> [  297.146788]  t1 : 0000003fda8065a8 t2 : 0000002acbbda7a0 s0 : 0000002afa2ec890
>>> [  297.146794]  s1 : 0000000000000001 a0 : 0000003fffe01d18 a1 : 0000000000000001
>>> [  297.146800]  a2 : 0000003fffe01c88 a3 : 0000000000000000 a4 : 0000003fffe01d18
>>> [  297.146806]  a5 : 000000000009736e a6 : 0000003fffe01c80 a7 : 00000000000000dd
>>> [  297.146812]  s2 : 0000003fffe01c88 s3 : 0000000000000000 s4 : 0000000000000000
>>> [  297.146819]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002afa2ec850
>>> [  297.146824]  s8 : 0000002afa2ec710 s9 : 0000000000000000 s10: 0000002acbbe39c8
>>> [  297.146830]  s11: 0000002acbbe3938 t3 : 0000002acbaf2610 t4 : 00000000000925a8
>>> [  297.146835]  t5 : 0000000000000004 t6 : 0000000000000000
>>> [  297.146842] status: 0000000200004020 badaddr: 00000000a01253cf cause: 0000000000000002
>>>
>>> (gdb) disassemble $pc,+0x10
>>> Dump of assembler code from 0x380c8 to 0x380d8:
>>> => 0x00000000000380c8:  auipc   a2,0x66
>>>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>>>    0x00000000000380d0:  sd      a0,0(a2)
>>>    0x00000000000380d2:  mv      a5,sp
>>>    0x00000000000380d4:  addi    a4,sp,416
>>>    0x00000000000380d6:  sd      zero,0(a5)
>>> End of assembler dump.
>>>
>>> Do we have any instruction set restrictions that prevents the usage
>> of
>>> distros on this CPU? Under QEMU, we come along here as well and
>>> execute this without problems. Some infamous Intel CPU comes to my
>>> mind at this point - hope, history does not repeat here....
>>>
>>
>> auipc was introduced with ISA 2.0, around 2019 - please don't tell me
>> we are on 1.0 here.
>>
> I have seen similar issue with yocto dunfell release, If I go higher up to kirkstone release I see there is patch 
> add-riscv-support.patch [0] (I haven't tried this release yet though).
> 
> [0] https://git.openembedded.org/openembedded-core/tree/meta/recipes-core/glibc/ldconfig-native-2.12.1?h=kirkstone
> 

I doubt that something that fundamental wouldn't be in the Debian
package as well. And it works fine over QEMU.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preparing isar-cip-core for RZ/Five
  2022-10-06  6:29             ` Jan Kiszka
  2022-10-06  6:49               ` Jan Kiszka
@ 2022-10-06 11:43               ` Pavel Machek
  2022-10-06 11:51                 ` Jan Kiszka
  1 sibling, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-10-06 11:43 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Pavel Machek, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

Hi!

> > Can you check if ldconfig and gcc work for you? That were the
> > roadblocks I was hitting. If they do, I'll be interested in details of
> > your setup.
> 
> Hmm, seems the issue persists:

:-(. Do you get gcc faulting, too?

> root@demo:~# ldconfig
> Illegal instruction
> 
> [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
...
> (gdb) disassemble $pc,+0x10
> Dump of assembler code from 0x380c8 to 0x380d8:
> => 0x00000000000380c8:  auipc   a2,0x66
>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>    0x00000000000380d0:  sd      a0,0(a2)

auipc is something rather simple: a2 = pc + (0x66 << 12). Not
sure how it could fault. Plus we get "illegal instruction", suggesting
it is not some other fault.
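
For reference, a minimal sketch of that address arithmetic, using only the
values from the gdb dump quoted above. The helper name is made up for
illustration; the point is that auipc plus the following addi should yield
the 0x9e898 that gdb prints in its comment.

```python
# Sketch of the address arithmetic for the auipc/addi pair in the gdb dump.
# All numeric values come from that dump; the helper name is illustrative only.

def auipc_addi_target(pc: int, imm20: int, imm12: int) -> int:
    """Address formed by `auipc rd, imm20` followed by `addi rd, rd, imm12`."""
    upper = (imm20 << 12) & 0xFFFFFFFF
    # On RV64 the 32-bit upper immediate is sign-extended before the add.
    if upper & 0x80000000:
        upper -= 1 << 32
    return (pc + upper + imm12) & 0xFFFFFFFFFFFFFFFF

# auipc a2,0x66 at 0x380c8, then addi a2,a2,2000
target = auipc_addi_target(pc=0x380C8, imm20=0x66, imm12=2000)
print(hex(target))
```

Nothing in this computation depends on machine state other than pc, which is
why a fault on the auipc itself looks so odd.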

Could some kind of self-modifying code be involved? I guess some kind
of debugging/watchpoint is not probable.

> Do we have any instruction set restrictions that prevents the usage of 
> distros on this CPU? Under QEMU, we come along here as well and execute 
> this without problems. Some infamous Intel CPU comes to my mind at this 
> point - hope, history does not repeat here....

Which Intel CPU comes to mind? Pentium with its fdiv bug? I'd say
recent ones outdid it :-).

Best regards,
								Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preparing isar-cip-core for RZ/Five
  2022-10-06 11:43               ` Pavel Machek
@ 2022-10-06 11:51                 ` Jan Kiszka
  2022-10-06 22:07                   ` ldconfig segfault on RZ/Five was " Pavel Machek
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-06 11:51 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

On 06.10.22 13:43, Pavel Machek wrote:
> Hi!
> 
>>> Can you check if ldconfig and gcc work for you? That were the
>>> roadblocks I was hitting. If they do, I'll be interested in details of
>>> your setup.
>>
>> Hmm, seems the issue persists:
> 
> :-(. Do you get gcc faulting, too?

I tried, but installation fails - illegal instruction.

> 
>> root@demo:~# ldconfig
>> Illegal instruction
>>
>> [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
> ...
>> (gdb) disassemble $pc,+0x10
>> Dump of assembler code from 0x380c8 to 0x380d8:
>> => 0x00000000000380c8:  auipc   a2,0x66
>>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>>    0x00000000000380d0:  sd      a0,0(a2)
> 
> auipc is something rather simple. a2 = pc + 0x66 << something. Not
> sure how it could fault. Plus we get "illegal instruction", suggesting
> it is not some other fault.
> 
> Could some kind of self-modifying code be involved? I guess some kind
> of debugging/watchpoint is not probable.

No idea - but why should ldconfig be self-modifying?

> 
>> Do we have any instruction set restrictions that prevents the usage of 
>> distros on this CPU? Under QEMU, we come along here as well and execute 
>> this without problems. Some infamous Intel CPU comes to my mind at this 
>> point - hope, history does not repeat here....
> 
> Which Intel CPU comes to mind? Pentium with its fdiv bug? I'd say
> recent ones outdid it :-).

Quark :D
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=738575

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-06 11:51                 ` Jan Kiszka
@ 2022-10-06 22:07                   ` Pavel Machek
  2022-10-06 22:32                     ` Pavel Machek
  0 siblings, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-10-06 22:07 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Pavel Machek, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

[-- Attachment #1: Type: text/plain, Size: 2460 bytes --]

Hi!

> >> Hmm, seems the issue persists:
> > 
> > :-(. Do you get gcc faulting, too?
> 
> I tried, but installation fails - illegal instruction.

Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
binary.

> >> root@demo:~# ldconfig
> >>
> >> [  297.146728] ldconfig[497]: unhandled signal 4 code 0x1 at 0x00000000000380c8 in ldconfig[10000+83000]
> > ...
> >> (gdb) disassemble $pc,+0x10
> >> Dump of assembler code from 0x380c8 to 0x380d8:
> >> => 0x00000000000380c8:  auipc   a2,0x66
> >>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
> >>    0x00000000000380d0:  sd      a0,0(a2)
> > 
> > auipc is something rather simple. a2 = pc + 0x66 << something. Not
> > sure how it could fault. Plus we get "illegal instruction", suggesting
> > it is not some other fault.
> > 
> > Could some kind of self-modifying code be involved? I guess some kind
> > of debugging/watchpoint is not probable.
> 
> No idea - but why should ldconfig be self-modifying?

No idea.

But I do have slightly different results than you (I think; I'm far
from a RISC-V expert). I set a breakpoint:

Breakpoint 1, 0x00000000000385d4 in ?? ()
(gdb)

Dump of assembler code from 0x385d4 to 0x385f4:
=> 0x00000000000385d4:  lb      zero,81(t1)
   0x00000000000385d8:  andi    a1,a1,25
   0x00000000000385da:  sd      zero,24(sp)
   0x00000000000385dc:  sd      zero,32(sp)

If I do the stepi, it will give the illegal instruction, because,
well, we are in the middle of the auipc instruction:

(gdb) disassemble $pc-0x10,+0x20
Dump of assembler code from 0x385c4 to 0x385e4:
   0x00000000000385c4:  .4byte  0x4881f753
   0x00000000000385c8:  li      a6,0
   0x00000000000385ca:  li      a5,0
   0x00000000000385cc:  addi    a3,a1,920
   0x00000000000385d0:  mv      a2,s8
   0x00000000000385d2:  auipc   a0,0x3f
   0x00000000000385d6:  addi    a0,a0,-1890 # 0x76e70
   0x00000000000385da:  sd      zero,24(sp)
   0x00000000000385dc:  sd      zero,32(sp)
   0x00000000000385de:  sb      t3,20(sp)
   0x00000000000385e2:  sd      s7,40(sp)
End of assembler dump.
(gdb)

Weird. But it explains why we get SIGILL here even though executing the
auipc itself does not result in a fault...
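
A small sketch to double-check the "middle of the auipc" reading. It encodes
the auipc/addi pair shown at 0x385d2 and 0x385d6 and then decodes the 32-bit
word that starts two bytes in, i.e. at 0x385d4. The encoders and the decoder
are hand-written for this illustration (they are not taken from any tool),
and they only cover what is needed here.

```python
import struct

# Hand-written encoders for the two instructions from the dump (illustrative).
def encode_auipc(rd: int, imm20: int) -> int:
    return ((imm20 & 0xFFFFF) << 12) | (rd << 7) | 0x17

def encode_addi(rd: int, rs1: int, imm12: int) -> int:
    return ((imm12 & 0xFFF) << 20) | (rs1 << 15) | (rd << 7) | 0x13

def decode_load(word: int):
    """Decode an I-type LOAD word into (mnemonic, rd, rs1, imm)."""
    assert word & 0x7F == 0x03, "not a LOAD opcode"
    names = {0: "lb", 1: "lh", 2: "lw", 3: "ld", 4: "lbu", 5: "lhu", 6: "lwu"}
    imm = word >> 20
    if imm & 0x800:          # sign-extend the 12-bit immediate
        imm -= 0x1000
    return names[(word >> 12) & 0x7], (word >> 7) & 0x1F, (word >> 15) & 0x1F, imm

A0 = 10  # x10 is a0 in the standard ABI

# 0x385d2: auipc a0,0x3f ; 0x385d6: addi a0,a0,-1890 (little-endian in memory)
code = struct.pack("<II", encode_auipc(A0, 0x3F), encode_addi(A0, A0, -1890))

# Fetch a 32-bit word starting two bytes into the pair, i.e. at 0x385d4.
(misaligned,) = struct.unpack_from("<I", code, 2)
decoded = decode_load(misaligned)
print(hex(misaligned), decoded)
```

The decoded tuple has rd = x0 (zero), rs1 = x6 (t1) and immediate 81, which
matches the "lb zero,81(t1)" that gdb shows at 0x385d4. So that listing is
consistent with the pc pointing into the second half of the auipc.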

Best regards,
								Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-06 22:07                   ` ldconfig segfault on RZ/Five was " Pavel Machek
@ 2022-10-06 22:32                     ` Pavel Machek
  2022-10-07  8:18                       ` Jan Kiszka
  0 siblings, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-10-06 22:32 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Jan Kiszka, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

[-- Attachment #1: Type: text/plain, Size: 1522 bytes --]

Hi!

> > I tried, but installation fails - illegal instruction.
> 
> Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
> binary.

It crashes rather soon after startup, so I was able to trace the complete
path.

> But I do have slightly different results then you (I think; I'm far
> from risc-v expert). I did a breakpoint:
> 
> Breakpoint 1, 0x00000000000385d4 in ?? ()

I believe it should not end at 0x00000000000385d4 at all. The
0x000000000001537e jal instruction should end up calling 0x3806a
AFAICT, but it calls 0x385d4 instead. It happens during
single-stepping, so it should not be anything subtle.

(gdb) disassemble $pc,+0x20
Dump of assembler code from 0x1537c to 0x1539c:
=> 0x000000000001537c:  mv      a0,a4
   0x000000000001537e:  jal     ra,0x3806a
   0x0000000000015382:  auipc   a5,0x8a
   0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
   0x000000000001538a:  ld      a4,0(a5)
   0x000000000001538c:  beqz    a4,0x153f0
   0x000000000001538e:  jal     ra,0x38abe
   0x0000000000015392:  ld      a0,0(s6)
   0x0000000000015396:  auipc   s7,0x85
   0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
End of assembler dump.
(gdb)
(gdb) stepi
0x000000000001537e in ?? ()
(gdb)

Program received signal SIGILL, Illegal instruction.
0x00000000000385d4 in ?? ()
(gdb)

Best regards,
								Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-06 22:32                     ` Pavel Machek
@ 2022-10-07  8:18                       ` Jan Kiszka
  2022-10-07 10:19                         ` Pavel Machek
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-07  8:18 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

On 07.10.22 00:32, Pavel Machek wrote:
> Hi!
> 
>>> I tried, but installation fails - illegal instruction.
>>
>> Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
>> binary.
> 
> It crashes rather soon after startup, so I was able to trace complete
> path.
> 
>> But I do have slightly different results then you (I think; I'm far
>> from risc-v expert). I did a breakpoint:
>>
>> Breakpoint 1, 0x00000000000385d4 in ?? ()
> 
> I believe it should not end at 0x00000000000385d4 at all. The
> 0x000000000001537e jal instruction should end up calling 0x3806a
> AFAICT, but it calls 0x385d4 instead. It happens during
> single-stepping, so it should not be anything subtle.
> 
> (gdb) disassemble $pc,+0x20
> Dump of assembler code from 0x1537c to 0x1539c:
> => 0x000000000001537c:  mv      a0,a4
>    0x000000000001537e:  jal     ra,0x3806a
>    0x0000000000015382:  auipc   a5,0x8a
>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
>    0x000000000001538a:  ld      a4,0(a5)
>    0x000000000001538c:  beqz    a4,0x153f0
>    0x000000000001538e:  jal     ra,0x38abe
>    0x0000000000015392:  ld      a0,0(s6)
>    0x0000000000015396:  auipc   s7,0x85
>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> End of assembler dump.
> (gdb)
> (gdb) stepi
> 0x000000000001537e in ?? ()
> (gdb)
> 
> Program received signal SIGILL, Illegal instruction.
> 0x00000000000385d4 in ?? ()
> (gdb)
> 
> Best regards,
> 								Pavel

Did you try comparing the call trace to QEMU, to see where we diverge?

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-07  8:18                       ` Jan Kiszka
@ 2022-10-07 10:19                         ` Pavel Machek
  2022-10-08  8:27                           ` Jan Kiszka
  0 siblings, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-10-07 10:19 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Pavel Machek, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

[-- Attachment #1: Type: text/plain, Size: 2026 bytes --]

Hi!

> >>> I tried, but installation fails - illegal instruction.
> >>
> >> Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
> >> binary.
> > 
> > It crashes rather soon after startup, so I was able to trace complete
> > path.
> > 
> >> But I do have slightly different results then you (I think; I'm far
> >> from risc-v expert). I did a breakpoint:
> >>
> >> Breakpoint 1, 0x00000000000385d4 in ?? ()
> > 
> > I believe it should not end at 0x00000000000385d4 at all. The
> > 0x000000000001537e jal instruction should end up calling 0x3806a
> > AFAICT, but it calls 0x385d4 instead. It happens during
> > single-stepping, so it should not be anything subtle.
> > 
> > (gdb) disassemble $pc,+0x20
> > Dump of assembler code from 0x1537c to 0x1539c:
> > => 0x000000000001537c:  mv      a0,a4
> >    0x000000000001537e:  jal     ra,0x3806a
> >    0x0000000000015382:  auipc   a5,0x8a
> >    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
> >    0x000000000001538a:  ld      a4,0(a5)
> >    0x000000000001538c:  beqz    a4,0x153f0
> >    0x000000000001538e:  jal     ra,0x38abe
> >    0x0000000000015392:  ld      a0,0(s6)
> >    0x0000000000015396:  auipc   s7,0x85
> >    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> > End of assembler dump.
> > (gdb)
> > (gdb) stepi
> > 0x000000000001537e in ?? ()
> > (gdb)
> > 
> > Program received signal SIGILL, Illegal instruction.
> > 0x00000000000385d4 in ?? ()
> > (gdb)
> 
> Did you try to compare the call trace to QEMU, where we divert?

Yes, that's a possible way forward, but it will require
considerable setup on my side.

If you have QEMU ready... objdump tells you ldconfig's entrypoint,
from that point you can just stepi. In less than 200 steps, you should
have sigill... and complete steps that lead to it.
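
Once both stepi logs exist, finding the first differing step is mechanical.
A minimal sketch, assuming each trace has been reduced to a list of pc
values (how they are captured is left open). The "board" list below uses the
pc values from my session; the "expected" list is only what the jal target
suggests, not an actual QEMU trace.

```python
from itertools import zip_longest

def first_divergence(trace_a, trace_b):
    """Return (step, pc_a, pc_b) for the first differing step, or None."""
    for step, (pc_a, pc_b) in enumerate(zip_longest(trace_a, trace_b)):
        if pc_a != pc_b:
            return step, pc_a, pc_b
    return None

# pc values observed while single-stepping on the board
board = [0x1537C, 0x1537E, 0x385D4]
# Hypothetical reference trace: jal ra,0x3806a is expected to land at 0x3806a
expected = [0x1537C, 0x1537E, 0x3806A]

result = first_divergence(board, expected)
if result is not None:
    step, pc_board, pc_expected = result
    print(f"step {step}: board={pc_board:#x} expected={pc_expected:#x}")
```

zip_longest is used so that a trace that simply ends early (for example
because of the signal) also shows up as a divergence.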

Best regards,
								Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-07 10:19                         ` Pavel Machek
@ 2022-10-08  8:27                           ` Jan Kiszka
  2022-10-09  8:29                             ` Jan Kiszka
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-08  8:27 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran, cip-dev

On 07.10.22 12:19, Pavel Machek wrote:
> Hi!
> 
>>>>> I tried, but installation fails - illegal instruction.
>>>>
>>>> Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
>>>> binary.
>>>
>>> It crashes rather soon after startup, so I was able to trace complete
>>> path.
>>>
>>>> But I do have slightly different results then you (I think; I'm far
>>>> from risc-v expert). I did a breakpoint:
>>>>
>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
>>>
>>> I believe it should not end at 0x00000000000385d4 at all. The
>>> 0x000000000001537e jal instruction should end up calling 0x3806a
>>> AFAICT, but it calls 0x385d4 instead. It happens during
>>> single-stepping, so it should not be anything subtle.
>>>
>>> (gdb) disassemble $pc,+0x20
>>> Dump of assembler code from 0x1537c to 0x1539c:
>>> => 0x000000000001537c:  mv      a0,a4
>>>    0x000000000001537e:  jal     ra,0x3806a
>>>    0x0000000000015382:  auipc   a5,0x8a
>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
>>>    0x000000000001538a:  ld      a4,0(a5)
>>>    0x000000000001538c:  beqz    a4,0x153f0
>>>    0x000000000001538e:  jal     ra,0x38abe
>>>    0x0000000000015392:  ld      a0,0(s6)
>>>    0x0000000000015396:  auipc   s7,0x85
>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
>>> End of assembler dump.
>>> (gdb)
>>> (gdb) stepi
>>> 0x000000000001537e in ?? ()
>>> (gdb)
>>>
>>> Program received signal SIGILL, Illegal instruction.
>>> 0x00000000000385d4 in ?? ()
>>> (gdb)
>>
>> Did you try to compare the call trace to QEMU, where we divert?
> 
> Yes, that's possible way forward, but it will require some
> considerable setup on my side.
> 
> If you have QEMU ready... objdump tells you ldconfig's entrypoint,
> from that point you can just stepi. In less than 200 steps, you should
> have sigill... and complete steps that lead to it.
> 

I've updated sid-ports (dropped the snapshot pinning), and now I'm 
getting a page fault on the instruction before the one that was causing 
SIGILL before:

[  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
[  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp : 0000003fff9e3c10
[  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 : 0000003fe9c427c0
[  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 : 0000002b079e9510
[  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 : 0000000000000001
[  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 : 0000003fff9e3d18
[  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 : 00000000000000dd
[  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 : 0000000000000000
[  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002b079c8ab0
[  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10: 0000002acb8fc9b0
[  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 : 000000000009259c
[  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
[  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause: 000000000000000f

(gdb) disassemble 0x00000000000380c6,+0x10
Dump of assembler code from 0x380c6 to 0x380d6:
   0x00000000000380c6:  addi    sp,sp,-416
   0x00000000000380c8:  auipc   a2,0x66
   0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
   0x00000000000380d0:  sd      a0,0(a2)
   0x00000000000380d2:  mv      a5,sp
   0x00000000000380d4:  addi    a4,sp,416
End of assembler dump.
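
For anyone reading the two oops dumps side by side, a small sketch that maps
the "cause:" field to the standard RISC-V exception names. The table follows
the privileged specification's exception codes; the function name is made up
for illustration.

```python
# Standard RISC-V synchronous exception codes (scause with interrupt bit clear).
EXCEPTION_CODES = {
    0: "instruction address misaligned",
    1: "instruction access fault",
    2: "illegal instruction",
    3: "breakpoint",
    4: "load address misaligned",
    5: "load access fault",
    6: "store/AMO address misaligned",
    7: "store/AMO access fault",
    8: "environment call from U-mode",
    9: "environment call from S-mode",
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}

def describe_cause(cause: int) -> str:
    """Translate the 64-bit 'cause:' value from the oops into a readable name."""
    if cause >> 63:  # top bit set means interrupt, not exception
        return f"interrupt {cause & ~(1 << 63)}"
    return EXCEPTION_CODES.get(cause, f"reserved/unknown ({cause})")

print(describe_cause(0x0000000000000002))  # cause from the earlier SIGILL dump
print(describe_cause(0x000000000000000F))  # cause from the dump above
```

So the earlier report is an illegal-instruction trap, while this one is
reported as a store/AMO page fault with badaddr 0xe1.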

I've stepped this through under qemu as well, and the control flow is 
identical. Registers are almost the same, except for some temporaries:

--- regs-qemu
+++ regs-rzfive
@@ -2,9 +2,9 @@
 ra             0x15382  0x15382
 sp             0x3ffffffbe0     0x3ffffffbe0
 gp             0x99da8  0x99da8
-tp             0x3ff7e77800     0x3ff7e77800
-t0             0x3ff7e7d7c0     274742106048
-t1             0x3ff7f0b59c     274742687132
+tp             0x3ff7e78800     0x3ff7e78800
+t0             0x3ff7e7e7c0     274742110144
+t1             0x3ff7f0c59c     274742691228
 t2             0x2aaab92c00     183252888576
 fp             0x2aaabaee00     0x2aaabaee00
 s1             0x1      1

No idea if that is normal (different machines, different memory sizes
and layouts) or a symptom of the problem.
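
A quick sketch of the deltas, using only the hex values from the diff above
(the "-" lines versus the "+" lines). The dictionary names are just labels
for the two sides of the diff.

```python
# Register values copied from the diff: "-" lines vs "+" lines.
minus_side = {"tp": 0x3FF7E77800, "t0": 0x3FF7E7D7C0, "t1": 0x3FF7F0B59C}
plus_side = {"tp": 0x3FF7E78800, "t0": 0x3FF7E7E7C0, "t1": 0x3FF7F0C59C}

# Difference per register between the two sides
deltas = {name: plus_side[name] - minus_side[name] for name in minus_side}

for name, delta in deltas.items():
    print(f"{name}: {delta:#x}")
```

All three registers differ by exactly 0x1000, i.e. one 4 KiB page. That
would fit a mapping that simply landed one page apart on the two machines,
but this is only a reading of the numbers and not confirmed.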

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-08  8:27                           ` Jan Kiszka
@ 2022-10-09  8:29                             ` Jan Kiszka
  2022-10-09  8:42                               ` [cip-dev] " Biju Das
  2022-10-09 19:20                               ` Chris Paterson
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-10-09  8:29 UTC (permalink / raw)
  To: Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev, Pavel Machek

On 08.10.22 10:27, Jan Kiszka wrote:
> On 07.10.22 12:19, Pavel Machek wrote:
>> Hi!
>>
>>>>>> I tried, but installation fails - illegal instruction.
>>>>>
>>>>> Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
>>>>> binary.
>>>>
>>>> It crashes rather soon after startup, so I was able to trace complete
>>>> path.
>>>>
>>>>> But I do have slightly different results then you (I think; I'm far
>>>>> from risc-v expert). I did a breakpoint:
>>>>>
>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
>>>>
>>>> I believe it should not end at 0x00000000000385d4 at all. The
>>>> 0x000000000001537e jal instruction should end up calling 0x3806a
>>>> AFAICT, but it calls 0x385d4 instead. It happens during
>>>> single-stepping, so it should not be anything subtle.
>>>>
>>>> (gdb) disassemble $pc,+0x20
>>>> Dump of assembler code from 0x1537c to 0x1539c:
>>>> => 0x000000000001537c:  mv      a0,a4
>>>>    0x000000000001537e:  jal     ra,0x3806a
>>>>    0x0000000000015382:  auipc   a5,0x8a
>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
>>>>    0x000000000001538a:  ld      a4,0(a5)
>>>>    0x000000000001538c:  beqz    a4,0x153f0
>>>>    0x000000000001538e:  jal     ra,0x38abe
>>>>    0x0000000000015392:  ld      a0,0(s6)
>>>>    0x0000000000015396:  auipc   s7,0x85
>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
>>>> End of assembler dump.
>>>> (gdb)
>>>> (gdb) stepi
>>>> 0x000000000001537e in ?? ()
>>>> (gdb)
>>>>
>>>> Program received signal SIGILL, Illegal instruction.
>>>> 0x00000000000385d4 in ?? ()
>>>> (gdb)
>>>
>>> Did you try to compare the call trace to QEMU, where we divert?
>>
>> Yes, that's possible way forward, but it will require some
>> considerable setup on my side.
>>
>> If you have QEMU ready... objdump tells you ldconfig's entrypoint,
>> from that point you can just stepi. In less than 200 steps, you should
>> have sigill... and complete steps that lead to it.
>>
> 
> I've updated sid-ports (dropped the snapshot pinning), and now I'm 
> getting a page fault on the instruction before the one that was causing 
> SIGILL before:
> 
> [  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
> [  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp : 0000003fff9e3c10
> [  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 : 0000003fe9c427c0
> [  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 : 0000002b079e9510
> [  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 : 0000000000000001
> [  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 : 0000003fff9e3d18
> [  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 : 00000000000000dd
> [  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 : 0000000000000000
> [  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002b079c8ab0
> [  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10: 0000002acb8fc9b0
> [  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 : 000000000009259c
> [  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
> [  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause: 000000000000000f
> 
> (gdb) disassemble 0x00000000000380c6,+0x10
> Dump of assembler code from 0x380c6 to 0x380d6:
>    0x00000000000380c6:  addi    sp,sp,-416
>    0x00000000000380c8:  auipc   a2,0x66
>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>    0x00000000000380d0:  sd      a0,0(a2)
>    0x00000000000380d2:  mv      a5,sp
>    0x00000000000380d4:  addi    a4,sp,416
> End of assembler dump.
> 
> I've stepped this through under qemu as well, and the control flow is 
> identical. Registers are almost the same, except for some temporaries:
> 
> --- regs-qemu
> +++ regs-rzfive
> @@ -2,9 +2,9 @@
>  ra             0x15382  0x15382
>  sp             0x3ffffffbe0     0x3ffffffbe0
>  gp             0x99da8  0x99da8
> -tp             0x3ff7e77800     0x3ff7e77800
> -t0             0x3ff7e7d7c0     274742106048
> -t1             0x3ff7f0b59c     274742687132
> +tp             0x3ff7e78800     0x3ff7e78800
> +t0             0x3ff7e7e7c0     274742110144
> +t1             0x3ff7f0c59c     274742691228
>  t2             0x2aaab92c00     183252888576
>  fp             0x2aaabaee00     0x2aaabaee00
>  s1             0x1      1
> 
> No idea if that is normal (different machines, different memory sizes
> and layouts) or a symptom of the problem.
> 

...
OpenEmbedded nodistro.0 smarc-rzfive ttySC0


[   12.829622] audit: type=1006 audit(1653987107.735:2): pid=156 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
root@smarc-rzfive:~# ldconfig 
[   22.278868] ldconfig[166]: unhandled signal 11 code 0x1 at 0x0000000000000088 in ldconfig[10000+68000]
[   22.290244] CPU: 0 PID: 166 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
[   22.298954] epc: 0000000000030eea ra : 00000000000145a0 sp : 0000003fff9f8aa0
[   22.306906]  gp : 000000000007fe48 tp : 0000003fd958b720 t0 : 0000000000000000
[   22.314973]  t1 : 0000002adf9c3bbc t2 : 00000000000003ff s0 : 0000003fff9f8c90
[   22.322986]  s1 : 0000000000014b0e a0 : 0000003fff9f8c98 a1 : 0000000000000000
[   22.330967]  a2 : 0000003fff9f8be8 a3 : 0000000000014a86 a4 : 000000000007e576
[   22.338936]  a5 : 0000000000000000 a6 : 0000003fff9f8be0 a7 : 0000000000000000
[   22.346897]  s2 : 0000000000000000 s3 : 0000003fd96df918 s4 : ffffffffffffffff
[   22.354905]  s5 : 0000002b01953f70 s6 : 0000002b01953c60 s7 : 0000002b019539b0
[   22.362875]  s8 : 0000002b01953b50 s9 : 0000000000000000 s10: 0000002adfa74584
[   22.370884]  s11: 0000000000000000 t3 : 0000003fd960ee18 t4 : 000000000000000f
[   22.378945]  t5 : 000000000000000f t6 : 0000000000000000
[   22.385051] status: 8000000200004020 badaddr: 0000000000000088 cause: 000000000000000d
[   22.393860] audit: type=1701 audit(1653987117.299:3): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=166 comm="ldconfig" exe="/sbin/ldconfig" sig=11 res=1
Segmentation fault


That was the version I found on eMMC.

I think you have some real homework now...
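
To make the two dumps easier to compare, the cause values can be decoded and the faulting pc checked against the reported mapping. A small Python sketch; the exception codes are the ones listed in the RISC-V privileged specification, the helper names are made up for illustration, and I'm assuming the `ldconfig[10000+68000]` notation means start address plus size:

```python
# Exception codes for synchronous traps (interrupt bit clear),
# as listed in the RISC-V privileged specification.
SCAUSE_EXCEPTIONS = {
    0x2: "illegal instruction",
    0xC: "instruction page fault",
    0xD: "load page fault",
    0xF: "store/AMO page fault",
}


def describe_cause(cause):
    """Map a 'cause:' value from the kernel dump to a readable name."""
    return SCAUSE_EXCEPTIONS.get(cause, f"other ({cause:#x})")


def in_mapping(addr, base, size):
    """True if addr lies inside [base, base + size)."""
    return base <= addr < base + size


# Values taken from the two dumps in this mail.
print("sid-ports run:", describe_cause(0xF), "badaddr 0xe1")
print("eMMC image:   ", describe_cause(0xD), "badaddr 0x88")
print("epc inside ldconfig text:", in_mapping(0x30EEA, 0x10000, 0x68000))
```

Both bad addresses are small offsets from zero, which would fit a dereference through a NULL base pointer in either case, though that is only a guess from the numbers.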

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-09  8:29                             ` Jan Kiszka
@ 2022-10-09  8:42                               ` Biju Das
  2022-10-11  9:30                                 ` Jan Kiszka
  2022-10-09 19:20                               ` Chris Paterson
  1 sibling, 1 reply; 39+ messages in thread
From: Biju Das @ 2022-10-09  8:42 UTC (permalink / raw)
  To: cip-dev, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran; +Cc: Pavel Machek

> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing
> isar-cip-core for RZ/Five
> 
> On 08.10.22 10:27, Jan Kiszka wrote:
> > On 07.10.22 12:19, Pavel Machek wrote:
> >> Hi!
> >>
> >>>>>> I tried, but installation fails - illegal instruction.
> >>>>>
> >>>>> Yeah, ldconfig is needed for installation. But I get a
> segfaulting
> >>>>> gcc binary.
> >>>>
> >>>> It crashes rather soon after startup, so I was able to trace
> >>>> complete path.
> >>>>
> >>>>> But I do have slightly different results then you (I think; I'm
> >>>>> far from risc-v expert). I did a breakpoint:
> >>>>>
> >>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
> >>>>
> >>>> I believe it should not end at 0x00000000000385d4 at all. The
> >>>> 0x000000000001537e jal instruction should end up calling 0x3806a
> >>>> AFAICT, but it calls 0x385d4 instead. It happens during
> >>>> single-stepping, so it should not be anything subtle.
> >>>>
> >>>> (gdb) disassemble $pc,+0x20
> >>>> Dump of assembler code from 0x1537c to 0x1539c:
> >>>> => 0x000000000001537c:  mv      a0,a4
> >>>>    0x000000000001537e:  jal     ra,0x3806a
> >>>>    0x0000000000015382:  auipc   a5,0x8a
> >>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
> >>>>    0x000000000001538a:  ld      a4,0(a5)
> >>>>    0x000000000001538c:  beqz    a4,0x153f0
> >>>>    0x000000000001538e:  jal     ra,0x38abe
> >>>>    0x0000000000015392:  ld      a0,0(s6)
> >>>>    0x0000000000015396:  auipc   s7,0x85
> >>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> >>>> End of assembler dump.
> >>>> (gdb)
> >>>> (gdb) stepi
> >>>> 0x000000000001537e in ?? ()
> >>>> (gdb)
> >>>>
> >>>> Program received signal SIGILL, Illegal instruction.
> >>>> 0x00000000000385d4 in ?? ()
> >>>> (gdb)
> >>>
> >>> Did you try to compare the call trace to QEMU, where we divert?
> >>
> >> Yes, that's possible way forward, but it will require some
> >> considerable setup on my side.
> >>
> >> If you have QEMU ready... objdump tells you ldconfig's entrypoint,
> >> from that point you can just stepi. In less than 200 steps, you
> >> should have sigill... and complete steps that lead to it.
> >>
> >
> > I've updated sid-ports (dropped the snapshot pinning), and now I'm
> > getting a page fault on the instruction before the one that was
> > causing SIGILL before:
> >
> > [  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
> > [  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp : 0000003fff9e3c10
> > [  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 : 0000003fe9c427c0
> > [  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 : 0000002b079e9510
> > [  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 : 0000000000000001
> > [  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 : 0000003fff9e3d18
> > [  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 : 00000000000000dd
> > [  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 : 0000000000000000
> > [  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002b079c8ab0
> > [  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10: 0000002acb8fc9b0
> > [  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 : 000000000009259c
> > [  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
> > [  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause: 000000000000000f
> >
> > (gdb) disassemble 0x00000000000380c6,+0x10
> > Dump of assembler code from 0x380c6 to 0x380d6:
> >    0x00000000000380c6:  addi    sp,sp,-416
> >    0x00000000000380c8:  auipc   a2,0x66
> >    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
> >    0x00000000000380d0:  sd      a0,0(a2)
> >    0x00000000000380d2:  mv      a5,sp
> >    0x00000000000380d4:  addi    a4,sp,416
> > End of assembler dump.
> >
> > I've stepped this through under qemu as well, and the control flow
> is
> > identical. Registers are almost the same, except for some
> temporaries:
> >
> > --- regs-qemu
> > +++ regs-rzfive
> > @@ -2,9 +2,9 @@
> >  ra             0x15382  0x15382
> >  sp             0x3ffffffbe0     0x3ffffffbe0
> >  gp             0x99da8  0x99da8
> > -tp             0x3ff7e77800     0x3ff7e77800
> > -t0             0x3ff7e7d7c0     274742106048
> > -t1             0x3ff7f0b59c     274742687132
> > +tp             0x3ff7e78800     0x3ff7e78800
> > +t0             0x3ff7e7e7c0     274742110144
> > +t1             0x3ff7f0c59c     274742691228
> >  t2             0x2aaab92c00     183252888576
> >  fp             0x2aaabaee00     0x2aaabaee00
> >  s1             0x1      1
> >
> > No idea if that is normal (different machines, different memory
> sizes
> > and layouts) or a symptom of the problem.
> >
> 
> ...
> OpenEmbedded nodistro.0 smarc-rzfive ttySC0
> 
> 
> [   12.829622] audit: type=1006 audit(1653987107.735:2): pid=156 uid=0
> old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
> root@smarc-rzfive:~# ldconfig
> [   22.278868] ldconfig[166]: unhandled signal 11 code 0x1 at
> 0x0000000000000088 in ldconfig[10000+68000]
> [   22.290244] CPU: 0 PID: 166 Comm: ldconfig Not tainted 5.10.83-
> cip1-riscv-renesas #1
> [   22.298954] epc: 0000000000030eea ra : 00000000000145a0 sp :
> 0000003fff9f8aa0
> [   22.306906]  gp : 000000000007fe48 tp : 0000003fd958b720 t0 :
> 0000000000000000
> [   22.314973]  t1 : 0000002adf9c3bbc t2 : 00000000000003ff s0 :
> 0000003fff9f8c90
> [   22.322986]  s1 : 0000000000014b0e a0 : 0000003fff9f8c98 a1 :
> 0000000000000000
> [   22.330967]  a2 : 0000003fff9f8be8 a3 : 0000000000014a86 a4 :
> 000000000007e576
> [   22.338936]  a5 : 0000000000000000 a6 : 0000003fff9f8be0 a7 :
> 0000000000000000
> [   22.346897]  s2 : 0000000000000000 s3 : 0000003fd96df918 s4 :
> ffffffffffffffff
> [   22.354905]  s5 : 0000002b01953f70 s6 : 0000002b01953c60 s7 :
> 0000002b019539b0
> [   22.362875]  s8 : 0000002b01953b50 s9 : 0000000000000000 s10:
> 0000002adfa74584
> [   22.370884]  s11: 0000000000000000 t3 : 0000003fd960ee18 t4 :
> 000000000000000f
> [   22.378945]  t5 : 000000000000000f t6 : 0000000000000000
> [   22.385051] status: 8000000200004020 badaddr: 0000000000000088
> cause: 000000000000000d
> [   22.393860] audit: type=1701 audit(1653987117.299:3):
> auid=4294967295 uid=0 gid=0 ses=4294967295 pid=166 comm="ldconfig"
> exe="/sbin/ldconfig" sig=11 res=1
> Segmentation fault
> 
> 
> That was the version I found on eMMC.
> 
> I think you have some real homework now...

What is your conclusion? Is it a toolchain-related issue, a cache-related issue, or something else?

Cheers,
Biju

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-09  8:29                             ` Jan Kiszka
  2022-10-09  8:42                               ` [cip-dev] " Biju Das
@ 2022-10-09 19:20                               ` Chris Paterson
  1 sibling, 0 replies; 39+ messages in thread
From: Chris Paterson @ 2022-10-09 19:20 UTC (permalink / raw)
  To: Jan Kiszka, Prabhakar Mahadev Lad, Hung Tran; +Cc: cip-dev, Pavel Machek

Hi Jan,

> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: 09 October 2022 09:29
> 
> On 08.10.22 10:27, Jan Kiszka wrote:
> > On 07.10.22 12:19, Pavel Machek wrote:
> >> Hi!
> >>
> >>>>>> I tried, but installation fails - illegal instruction.
> >>>>>
> >>>>> Yeah, ldconfig is needed for installation. But I get a segfaulting gcc
> >>>>> binary.
> >>>>
> >>>> It crashes rather soon after startup, so I was able to trace complete
> >>>> path.
> >>>>
> >>>>> But I do have slightly different results then you (I think; I'm far
> >>>>> from risc-v expert). I did a breakpoint:
> >>>>>
> >>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
> >>>>
> >>>> I believe it should not end at 0x00000000000385d4 at all. The
> >>>> 0x000000000001537e jal instruction should end up calling 0x3806a
> >>>> AFAICT, but it calls 0x385d4 instead. It happens during
> >>>> single-stepping, so it should not be anything subtle.
> >>>>
> >>>> (gdb) disassemble $pc,+0x20
> >>>> Dump of assembler code from 0x1537c to 0x1539c:
> >>>> => 0x000000000001537c:  mv      a0,a4
> >>>>    0x000000000001537e:  jal     ra,0x3806a
> >>>>    0x0000000000015382:  auipc   a5,0x8a
> >>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
> >>>>    0x000000000001538a:  ld      a4,0(a5)
> >>>>    0x000000000001538c:  beqz    a4,0x153f0
> >>>>    0x000000000001538e:  jal     ra,0x38abe
> >>>>    0x0000000000015392:  ld      a0,0(s6)
> >>>>    0x0000000000015396:  auipc   s7,0x85
> >>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> >>>> End of assembler dump.
> >>>> (gdb)
> >>>> (gdb) stepi
> >>>> 0x000000000001537e in ?? ()
> >>>> (gdb)
> >>>>
> >>>> Program received signal SIGILL, Illegal instruction.
> >>>> 0x00000000000385d4 in ?? ()
> >>>> (gdb)
> >>>
> >>> Did you try to compare the call trace to QEMU, where we divert?
> >>
> >> Yes, that's possible way forward, but it will require some
> >> considerable setup on my side.
> >>
> >> If you have QEMU ready... objdump tells you ldconfig's entrypoint,
> >> from that point you can just stepi. In less than 200 steps, you should
> >> have sigill... and complete steps that lead to it.
> >>
> >
> > I've updated sid-ports (dropped the snapshot pinning), and now I'm
> > getting a page fault on the instruction before the one that was causing
> > SIGILL before:
> >
> > [  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-
> riscv-renesas #1
> > [  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp :
> 0000003fff9e3c10
> > [  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 :
> 0000003fe9c427c0
> > [  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 :
> 0000002b079e9510
> > [  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 :
> 0000000000000001
> > [  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 :
> 0000003fff9e3d18
> > [  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 :
> 00000000000000dd
> > [  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 :
> 0000000000000000
> > [  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 :
> 0000002b079c8ab0
> > [  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10:
> 0000002acb8fc9b0
> > [  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 :
> 000000000009259c
> > [  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
> > [  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause:
> 000000000000000f
> >
> > (gdb) disassemble 0x00000000000380c6,+0x10
> > Dump of assembler code from 0x380c6 to 0x380d6:
> >    0x00000000000380c6:  addi    sp,sp,-416
> >    0x00000000000380c8:  auipc   a2,0x66
> >    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
> >    0x00000000000380d0:  sd      a0,0(a2)
> >    0x00000000000380d2:  mv      a5,sp
> >    0x00000000000380d4:  addi    a4,sp,416
> > End of assembler dump.
> >
> > I've stepped this through under qemu as well, and the control flow is
> > identical. Registers are almost the same, except for some temporaries:
> >
> > --- regs-qemu
> > +++ regs-rzfive
> > @@ -2,9 +2,9 @@
> >  ra             0x15382  0x15382
> >  sp             0x3ffffffbe0     0x3ffffffbe0
> >  gp             0x99da8  0x99da8
> > -tp             0x3ff7e77800     0x3ff7e77800
> > -t0             0x3ff7e7d7c0     274742106048
> > -t1             0x3ff7f0b59c     274742687132
> > +tp             0x3ff7e78800     0x3ff7e78800
> > +t0             0x3ff7e7e7c0     274742110144
> > +t1             0x3ff7f0c59c     274742691228
> >  t2             0x2aaab92c00     183252888576
> >  fp             0x2aaabaee00     0x2aaabaee00
> >  s1             0x1      1
> >
> > No idea if that is normal (different machines, different memory sizes
> > and layouts) or a symptom of the problem.
> >
> 
> ...
> OpenEmbedded nodistro.0 smarc-rzfive ttySC0
> 
> 
> [   12.829622] audit: type=1006 audit(1653987107.735:2): pid=156 uid=0 old-
> auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
> root@smarc-rzfive:~# ldconfig
> [   22.278868] ldconfig[166]: unhandled signal 11 code 0x1 at
> 0x0000000000000088 in ldconfig[10000+68000]
> [   22.290244] CPU: 0 PID: 166 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-
> renesas #1
> [   22.298954] epc: 0000000000030eea ra : 00000000000145a0 sp :
> 0000003fff9f8aa0
> [   22.306906]  gp : 000000000007fe48 tp : 0000003fd958b720 t0 :
> 0000000000000000
> [   22.314973]  t1 : 0000002adf9c3bbc t2 : 00000000000003ff s0 :
> 0000003fff9f8c90
> [   22.322986]  s1 : 0000000000014b0e a0 : 0000003fff9f8c98 a1 :
> 0000000000000000
> [   22.330967]  a2 : 0000003fff9f8be8 a3 : 0000000000014a86 a4 :
> 000000000007e576
> [   22.338936]  a5 : 0000000000000000 a6 : 0000003fff9f8be0 a7 :
> 0000000000000000
> [   22.346897]  s2 : 0000000000000000 s3 : 0000003fd96df918 s4 : ffffffffffffffff
> [   22.354905]  s5 : 0000002b01953f70 s6 : 0000002b01953c60 s7 :
> 0000002b019539b0
> [   22.362875]  s8 : 0000002b01953b50 s9 : 0000000000000000 s10:
> 0000002adfa74584
> [   22.370884]  s11: 0000000000000000 t3 : 0000003fd960ee18 t4 :
> 000000000000000f
> [   22.378945]  t5 : 000000000000000f t6 : 0000000000000000
> [   22.385051] status: 8000000200004020 badaddr: 0000000000000088 cause:
> 000000000000000d
> [   22.393860] audit: type=1701 audit(1653987117.299:3): auid=4294967295
> uid=0 gid=0 ses=4294967295 pid=166 comm="ldconfig" exe="/sbin/ldconfig"
> sig=11 res=1
> Segmentation fault
> 
> 
> That was the version I found on eMMC.
> 
> I think you have some real homework now...

Thanks, we'll take a look.

Kind regards, Chris

> 
> Jan
> 
> --
> Siemens AG, Technology
> Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-09  8:42                               ` [cip-dev] " Biju Das
@ 2022-10-11  9:30                                 ` Jan Kiszka
  2022-10-11 10:34                                   ` Biju Das
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-11  9:30 UTC (permalink / raw)
  To: Biju Das, cip-dev, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran
  Cc: Pavel Machek

On 09.10.22 10:42, Biju Das wrote:
>> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing
>> isar-cip-core for RZ/Five
>>
>> On 08.10.22 10:27, Jan Kiszka wrote:
>>> On 07.10.22 12:19, Pavel Machek wrote:
>>>> Hi!
>>>>
>>>>>>>> I tried, but installation fails - illegal instruction.
>>>>>>>
>>>>>>> Yeah, ldconfig is needed for installation. But I get a
>> segfaulting
>>>>>>> gcc binary.
>>>>>>
>>>>>> It crashes rather soon after startup, so I was able to trace
>>>>>> complete path.
>>>>>>
>>>>>>> But I do have slightly different results then you (I think; I'm
>>>>>>> far from risc-v expert). I did a breakpoint:
>>>>>>>
>>>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
>>>>>>
>>>>>> I believe it should not end at 0x00000000000385d4 at all. The
>>>>>> 0x000000000001537e jal instruction should end up calling 0x3806a
>>>>>> AFAICT, but it calls 0x385d4 instead. It happens during
>>>>>> single-stepping, so it should not be anything subtle.
>>>>>>
>>>>>> (gdb) disassemble $pc,+0x20
>>>>>> Dump of assembler code from 0x1537c to 0x1539c:
>>>>>> => 0x000000000001537c:  mv      a0,a4
>>>>>>    0x000000000001537e:  jal     ra,0x3806a
>>>>>>    0x0000000000015382:  auipc   a5,0x8a
>>>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
>>>>>>    0x000000000001538a:  ld      a4,0(a5)
>>>>>>    0x000000000001538c:  beqz    a4,0x153f0
>>>>>>    0x000000000001538e:  jal     ra,0x38abe
>>>>>>    0x0000000000015392:  ld      a0,0(s6)
>>>>>>    0x0000000000015396:  auipc   s7,0x85
>>>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
>>>>>> End of assembler dump.
>>>>>> (gdb)
>>>>>> (gdb) stepi
>>>>>> 0x000000000001537e in ?? ()
>>>>>> (gdb)
>>>>>>
>>>>>> Program received signal SIGILL, Illegal instruction.
>>>>>> 0x00000000000385d4 in ?? ()
>>>>>> (gdb)
>>>>>
>>>>> Did you try to compare the call trace to QEMU, where we divert?
>>>>
>>>> Yes, that's possible way forward, but it will require some
>>>> considerable setup on my side.
>>>>
>>>> If you have QEMU ready... objdump tells you ldconfig's entrypoint,
>>>> from that point you can just stepi. In less than 200 steps, you
>>>> should have sigill... and complete steps that lead to it.
>>>>
>>>
>>> I've updated sid-ports (dropped the snapshot pinning), and now I'm
>>> getting a page fault on the instruction before the one that was
>>> causing SIGILL before:
>>>
>>> [  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
>>> [  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp : 0000003fff9e3c10
>>> [  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 : 0000003fe9c427c0
>>> [  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 : 0000002b079e9510
>>> [  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 : 0000000000000001
>>> [  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 : 0000003fff9e3d18
>>> [  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 : 00000000000000dd
>>> [  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 : 0000000000000000
>>> [  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002b079c8ab0
>>> [  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10: 0000002acb8fc9b0
>>> [  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 : 000000000009259c
>>> [  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
>>> [  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause: 000000000000000f
>>>
>>> (gdb) disassemble 0x00000000000380c6,+0x10
>>> Dump of assembler code from 0x380c6 to 0x380d6:
>>>    0x00000000000380c6:  addi    sp,sp,-416
>>>    0x00000000000380c8:  auipc   a2,0x66
>>>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>>>    0x00000000000380d0:  sd      a0,0(a2)
>>>    0x00000000000380d2:  mv      a5,sp
>>>    0x00000000000380d4:  addi    a4,sp,416
>>> End of assembler dump.
>>>
>>> I've stepped this through under qemu as well, and the control flow
>> is
>>> identical. Registers are almost the same, except for some
>> temporaries:
>>>
>>> --- regs-qemu
>>> +++ regs-rzfive
>>> @@ -2,9 +2,9 @@
>>>  ra             0x15382  0x15382
>>>  sp             0x3ffffffbe0     0x3ffffffbe0
>>>  gp             0x99da8  0x99da8
>>> -tp             0x3ff7e77800     0x3ff7e77800
>>> -t0             0x3ff7e7d7c0     274742106048
>>> -t1             0x3ff7f0b59c     274742687132
>>> +tp             0x3ff7e78800     0x3ff7e78800
>>> +t0             0x3ff7e7e7c0     274742110144
>>> +t1             0x3ff7f0c59c     274742691228
>>>  t2             0x2aaab92c00     183252888576
>>>  fp             0x2aaabaee00     0x2aaabaee00
>>>  s1             0x1      1
>>>
>>> No idea if that is normal (different machines, different memory
>> sizes
>>> and layouts) or a symptom of the problem.
>>>
>>
>> ...
>> OpenEmbedded nodistro.0 smarc-rzfive ttySC0
>>
>>
>> [   12.829622] audit: type=1006 audit(1653987107.735:2): pid=156 uid=0
>> old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
>> root@smarc-rzfive:~# ldconfig
>> [   22.278868] ldconfig[166]: unhandled signal 11 code 0x1 at
>> 0x0000000000000088 in ldconfig[10000+68000]
>> [   22.290244] CPU: 0 PID: 166 Comm: ldconfig Not tainted 5.10.83-
>> cip1-riscv-renesas #1
>> [   22.298954] epc: 0000000000030eea ra : 00000000000145a0 sp :
>> 0000003fff9f8aa0
>> [   22.306906]  gp : 000000000007fe48 tp : 0000003fd958b720 t0 :
>> 0000000000000000
>> [   22.314973]  t1 : 0000002adf9c3bbc t2 : 00000000000003ff s0 :
>> 0000003fff9f8c90
>> [   22.322986]  s1 : 0000000000014b0e a0 : 0000003fff9f8c98 a1 :
>> 0000000000000000
>> [   22.330967]  a2 : 0000003fff9f8be8 a3 : 0000000000014a86 a4 :
>> 000000000007e576
>> [   22.338936]  a5 : 0000000000000000 a6 : 0000003fff9f8be0 a7 :
>> 0000000000000000
>> [   22.346897]  s2 : 0000000000000000 s3 : 0000003fd96df918 s4 :
>> ffffffffffffffff
>> [   22.354905]  s5 : 0000002b01953f70 s6 : 0000002b01953c60 s7 :
>> 0000002b019539b0
>> [   22.362875]  s8 : 0000002b01953b50 s9 : 0000000000000000 s10:
>> 0000002adfa74584
>> [   22.370884]  s11: 0000000000000000 t3 : 0000003fd960ee18 t4 :
>> 000000000000000f
>> [   22.378945]  t5 : 000000000000000f t6 : 0000000000000000
>> [   22.385051] status: 8000000200004020 badaddr: 0000000000000088
>> cause: 000000000000000d
>> [   22.393860] audit: type=1701 audit(1653987117.299:3):
>> auid=4294967295 uid=0 gid=0 ses=4294967295 pid=166 comm="ldconfig"
>> exe="/sbin/ldconfig" sig=11 res=1
>> Segmentation fault
>>
>>
>> That was the version I found on eMMC.
>>
>> I think you have some real homework now...
> 
> What is your conclusion? Is it tool chain related issue? Or cache related issue?
> 
> Or
> 
> Something else ?
> 

I have no idea, and I still have only limited knowledge of the architecture
and this SoC. What we can rule out by now is that the issue is specific to
Debian: the OpenEmbedded image found on the eMMC crashes in ldconfig as well.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-11  9:30                                 ` Jan Kiszka
@ 2022-10-11 10:34                                   ` Biju Das
  2022-10-11 18:51                                     ` Florian Bezdeka
  0 siblings, 1 reply; 39+ messages in thread
From: Biju Das @ 2022-10-11 10:34 UTC (permalink / raw)
  To: Jan Kiszka, cip-dev, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran
  Cc: Pavel Machek

> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re:
> Preparing isar-cip-core for RZ/Five
> 
> On 09.10.22 10:42, Biju Das wrote:
> >> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re:
> Preparing
> >> isar-cip-core for RZ/Five
> >>
> >> On 08.10.22 10:27, Jan Kiszka wrote:
> >>> On 07.10.22 12:19, Pavel Machek wrote:
> >>>> Hi!
> >>>>
> >>>>>>>> I tried, but installation fails - illegal instruction.
> >>>>>>>
> >>>>>>> Yeah, ldconfig is needed for installation. But I get a
> >> segfaulting
> >>>>>>> gcc binary.
> >>>>>>
> >>>>>> It crashes rather soon after startup, so I was able to trace
> >>>>>> complete path.
> >>>>>>
> >>>>>>> But I do have slightly different results then you (I think;
> I'm
> >>>>>>> far from risc-v expert). I did a breakpoint:
> >>>>>>>
> >>>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
> >>>>>>
> >>>>>> I believe it should not end at 0x00000000000385d4 at all. The
> >>>>>> 0x000000000001537e jal instruction should end up calling
> 0x3806a
> >>>>>> AFAICT, but it calls 0x385d4 instead. It happens during
> >>>>>> single-stepping, so it should not be anything subtle.
> >>>>>>
> >>>>>> (gdb) disassemble $pc,+0x20
> >>>>>> Dump of assembler code from 0x1537c to 0x1539c:
> >>>>>> => 0x000000000001537c:  mv      a0,a4
> >>>>>>    0x000000000001537e:  jal     ra,0x3806a
> >>>>>>    0x0000000000015382:  auipc   a5,0x8a
> >>>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
> >>>>>>    0x000000000001538a:  ld      a4,0(a5)
> >>>>>>    0x000000000001538c:  beqz    a4,0x153f0
> >>>>>>    0x000000000001538e:  jal     ra,0x38abe
> >>>>>>    0x0000000000015392:  ld      a0,0(s6)
> >>>>>>    0x0000000000015396:  auipc   s7,0x85
> >>>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> >>>>>> End of assembler dump.
> >>>>>> (gdb)
> >>>>>> (gdb) stepi
> >>>>>> 0x000000000001537e in ?? ()
> >>>>>> (gdb)
> >>>>>>
> >>>>>> Program received signal SIGILL, Illegal instruction.
> >>>>>> 0x00000000000385d4 in ?? ()
> >>>>>> (gdb)
> >>>>>
> >>>>> Did you try to compare the call trace to QEMU, where we divert?
> >>>>
> >>>> Yes, that's possible way forward, but it will require some
> >>>> considerable setup on my side.
> >>>>
> >>>> If you have QEMU ready... objdump tells you ldconfig's
> entrypoint,
> >>>> from that point you can just stepi. In less than 200 steps, you
> >>>> should have sigill... and complete steps that lead to it.
> >>>>
> >>>
> >>> I've updated sid-ports (dropped the snapshot pinning), and now I'm
> >>> getting a page fault on the instruction before the one that was
> >>> causing SIGILL before:
> >>>
> >>> [  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
> >>> [  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp : 0000003fff9e3c10
> >>> [  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 : 0000003fe9c427c0
> >>> [  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 : 0000002b079e9510
> >>> [  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 : 0000000000000001
> >>> [  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 : 0000003fff9e3d18
> >>> [  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 : 00000000000000dd
> >>> [  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 : 0000000000000000
> >>> [  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002b079c8ab0
> >>> [  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10: 0000002acb8fc9b0
> >>> [  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 : 000000000009259c
> >>> [  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
> >>> [  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause: 000000000000000f
> >>>
> >>> (gdb) disassemble 0x00000000000380c6,+0x10
> >>> Dump of assembler code from 0x380c6 to 0x380d6:
> >>>    0x00000000000380c6:  addi    sp,sp,-416
> >>>    0x00000000000380c8:  auipc   a2,0x66
> >>>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
> >>>    0x00000000000380d0:  sd      a0,0(a2)
> >>>    0x00000000000380d2:  mv      a5,sp
> >>>    0x00000000000380d4:  addi    a4,sp,416
> >>> End of assembler dump.
> >>>
> >>> I've stepped this through under qemu as well, and the control flow
> >> is
> >>> identical. Registers are almost the same, except for some
> >> temporaries:
> >>>
> >>> --- regs-qemu
> >>> +++ regs-rzfive
> >>> @@ -2,9 +2,9 @@
> >>>  ra             0x15382  0x15382
> >>>  sp             0x3ffffffbe0     0x3ffffffbe0
> >>>  gp             0x99da8  0x99da8
> >>> -tp             0x3ff7e77800     0x3ff7e77800
> >>> -t0             0x3ff7e7d7c0     274742106048
> >>> -t1             0x3ff7f0b59c     274742687132
> >>> +tp             0x3ff7e78800     0x3ff7e78800
> >>> +t0             0x3ff7e7e7c0     274742110144
> >>> +t1             0x3ff7f0c59c     274742691228
> >>>  t2             0x2aaab92c00     183252888576
> >>>  fp             0x2aaabaee00     0x2aaabaee00
> >>>  s1             0x1      1
> >>>
> >>> No idea if that is normal (different machines, different memory
> >> sizes
> >>> and layouts) or a symptom of the problem.
> >>>
> >>
> >> ...
> >> OpenEmbedded nodistro.0 smarc-rzfive ttySC0
> >>
> >>
> >> [   12.829622] audit: type=1006 audit(1653987107.735:2): pid=156
> uid=0
> >> old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1
> res=1
> >> root@smarc-rzfive:~# ldconfig
> >> [   22.278868] ldconfig[166]: unhandled signal 11 code 0x1 at
> >> 0x0000000000000088 in ldconfig[10000+68000]
> >> [   22.290244] CPU: 0 PID: 166 Comm: ldconfig Not tainted 5.10.83-
> >> cip1-riscv-renesas #1
> >> [   22.298954] epc: 0000000000030eea ra : 00000000000145a0 sp :
> >> 0000003fff9f8aa0
> >> [   22.306906]  gp : 000000000007fe48 tp : 0000003fd958b720 t0 :
> >> 0000000000000000
> >> [   22.314973]  t1 : 0000002adf9c3bbc t2 : 00000000000003ff s0 :
> >> 0000003fff9f8c90
> >> [   22.322986]  s1 : 0000000000014b0e a0 : 0000003fff9f8c98 a1 :
> >> 0000000000000000
> >> [   22.330967]  a2 : 0000003fff9f8be8 a3 : 0000000000014a86 a4 :
> >> 000000000007e576
> >> [   22.338936]  a5 : 0000000000000000 a6 : 0000003fff9f8be0 a7 :
> >> 0000000000000000
> >> [   22.346897]  s2 : 0000000000000000 s3 : 0000003fd96df918 s4 :
> >> ffffffffffffffff
> >> [   22.354905]  s5 : 0000002b01953f70 s6 : 0000002b01953c60 s7 :
> >> 0000002b019539b0
> >> [   22.362875]  s8 : 0000002b01953b50 s9 : 0000000000000000 s10:
> >> 0000002adfa74584
> >> [   22.370884]  s11: 0000000000000000 t3 : 0000003fd960ee18 t4 :
> >> 000000000000000f
> >> [   22.378945]  t5 : 000000000000000f t6 : 0000000000000000
> >> [   22.385051] status: 8000000200004020 badaddr: 0000000000000088
> >> cause: 000000000000000d
> >> [   22.393860] audit: type=1701 audit(1653987117.299:3):
> >> auid=4294967295 uid=0 gid=0 ses=4294967295 pid=166 comm="ldconfig"
> >> exe="/sbin/ldconfig" sig=11 res=1
> >> Segmentation fault
> >>
> >>
> >> That was the version I found on eMMC.
> >>
> >> I think you have some real homework now...
> >
> > What is your conclusion? Is it tool chain related issue? Or cache
> related issue?
> >
> > Or
> >
> > Something else ?
> >
> 
> I have no idea and still only limited knowledge about the arch and
> this SoC. We can just rule out by now that the issue is Debian-
> exclusive.

Thanks for your feedback.

Cheers,
Biju

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-11 10:34                                   ` Biju Das
@ 2022-10-11 18:51                                     ` Florian Bezdeka
  2022-10-11 20:15                                       ` Jan Kiszka
  0 siblings, 1 reply; 39+ messages in thread
From: Florian Bezdeka @ 2022-10-11 18:51 UTC (permalink / raw)
  To: cip-dev, Jan Kiszka, Chris Paterson, Prabhakar Mahadev Lad, Hung Tran
  Cc: Pavel Machek

On 11.10.22 12:34, Biju Das via lists.cip-project.org wrote:
>> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re:
>> Preparing isar-cip-core for RZ/Five
>>
>> On 09.10.22 10:42, Biju Das wrote:
>>>> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re:
>> Preparing
>>>> isar-cip-core for RZ/Five
>>>>
>>>> On 08.10.22 10:27, Jan Kiszka wrote:
>>>>> On 07.10.22 12:19, Pavel Machek wrote:
>>>>>> Hi!
>>>>>>
>>>>>>>>>> I tried, but installation fails - illegal instruction.
>>>>>>>>>
>>>>>>>>> Yeah, ldconfig is needed for installation. But I get a
>>>> segfaulting
>>>>>>>>> gcc binary.
>>>>>>>>
>>>>>>>> It crashes rather soon after startup, so I was able to trace
>>>>>>>> complete path.
>>>>>>>>
>>>>>>>>> But I do have slightly different results then you (I think;
>> I'm
>>>>>>>>> far from risc-v expert). I did a breakpoint:
>>>>>>>>>
>>>>>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
>>>>>>>>
>>>>>>>> I believe it should not end at 0x00000000000385d4 at all. The
>>>>>>>> 0x000000000001537e jal instruction should end up calling
>> 0x3806a
>>>>>>>> AFAICT, but it calls 0x385d4 instead. It happens during
>>>>>>>> single-stepping, so it should not be anything subtle.
>>>>>>>>
>>>>>>>> (gdb) disassemble $pc,+0x20
>>>>>>>> Dump of assembler code from 0x1537c to 0x1539c:
>>>>>>>> => 0x000000000001537c:  mv      a0,a4
>>>>>>>>    0x000000000001537e:  jal     ra,0x3806a
>>>>>>>>    0x0000000000015382:  auipc   a5,0x8a
>>>>>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
>>>>>>>>    0x000000000001538a:  ld      a4,0(a5)
>>>>>>>>    0x000000000001538c:  beqz    a4,0x153f0
>>>>>>>>    0x000000000001538e:  jal     ra,0x38abe
>>>>>>>>    0x0000000000015392:  ld      a0,0(s6)
>>>>>>>>    0x0000000000015396:  auipc   s7,0x85
>>>>>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
>>>>>>>> End of assembler dump.
>>>>>>>> (gdb)
>>>>>>>> (gdb) stepi
>>>>>>>> 0x000000000001537e in ?? ()
>>>>>>>> (gdb)
>>>>>>>>
>>>>>>>> Program received signal SIGILL, Illegal instruction.
>>>>>>>> 0x00000000000385d4 in ?? ()
>>>>>>>> (gdb)
>>>>>>>
>>>>>>> Did you try to compare the call trace to QEMU, where we divert?
>>>>>>
>>>>>> Yes, that's possible way forward, but it will require some
>>>>>> considerable setup on my side.
>>>>>>
>>>>>> If you have QEMU ready... objdump tells you ldconfig's
>> entrypoint,
>>>>>> from that point you can just stepi. In less than 200 steps, you
>>>>>> should have sigill... and complete steps that lead to it.
>>>>>>
>>>>>
>>>>> I've updated sid-ports (dropped the snapshot pinning), and now I'm
>>>>> getting a page fault on the instruction before the one that was
>>>>> causing SIGILL before:

In case the requested page is a page with PROT_WRITE only (no PROT_READ) 
it might be related to
https://lore.kernel.org/linux-riscv/20220915193702.2201018-1-abrestic@rivosinc.com/

AFAIR all stable branches have that problem currently.
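
To check which kind of fault each of the two register dumps quoted in
this thread reports, the "cause" field can be decoded by hand. Below is
a minimal Python sketch of that lookup. The table follows the exception
codes of the RISC-V privileged specification (interrupt bit clear); the
names SCAUSE_EXCEPTIONS and decode_cause are made up for illustration
and are not taken from any kernel or tool.

```python
# Synchronous exception codes as listed in the RISC-V privileged spec
# (scause with the interrupt bit clear). Reserved codes are omitted.
SCAUSE_EXCEPTIONS = {
    0x0: "instruction address misaligned",
    0x1: "instruction access fault",
    0x2: "illegal instruction",
    0x3: "breakpoint",
    0x4: "load address misaligned",
    0x5: "load access fault",
    0x6: "store/AMO address misaligned",
    0x7: "store/AMO access fault",
    0x8: "environment call from U-mode",
    0x9: "environment call from S-mode",
    0xC: "instruction page fault",
    0xD: "load page fault",
    0xF: "store/AMO page fault",
}


def decode_cause(cause_hex: str) -> str:
    """Map the hex string printed after 'cause:' to a readable name."""
    value = int(cause_hex, 16)
    if value >> 63:
        # Top bit set means an interrupt, not a synchronous exception.
        return f"interrupt {value & ((1 << 63) - 1)}"
    return SCAUSE_EXCEPTIONS.get(value, f"reserved/unknown ({value:#x})")


if __name__ == "__main__":
    # Values copied from the two dumps quoted in this thread.
    for label, cause in (("sid-ports ldconfig", "000000000000000f"),
                         ("eMMC image ldconfig", "000000000000000d")):
        print(f"{label}: cause {cause} -> {decode_cause(cause)}")
```

So the sid-ports binary takes a store/AMO page fault and the binary from
the eMMC image takes a load page fault. The dumps alone do not say
whether a write-only mapping is involved in either case.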

>>>>>
>>>>> [  558.490689] CPU: 0 PID: 3212 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
>>>>> [  558.490697] epc: 00000000000380c6 ra : 0000000000015382 sp : 0000003fff9e3c10
>>>>> [  558.490703]  gp : 0000000000099da8 tp : 0000003fe9c3c800 t0 : 0000003fe9c427c0
>>>>> [  558.490710]  t1 : 0000003fe9cd059c t2 : 0000002acb8f2c00 s0 : 0000002b079e9510
>>>>> [  558.490716]  s1 : 0000000000000001 a0 : 0000003fff9e3d18 a1 : 0000000000000001
>>>>> [  558.490722]  a2 : 0000003fff9e3c88 a3 : 0000000000000000 a4 : 0000003fff9e3d18
>>>>> [  558.490728]  a5 : 000000000009736e a6 : 0000003fff9e3c80 a7 : 00000000000000dd
>>>>> [  558.490734]  s2 : 0000003fff9e3c88 s3 : 0000000000000000 s4 : 0000000000000000
>>>>> [  558.490740]  s5 : 00000000000105a4 s6 : 000000000009e670 s7 : 0000002b079c8ab0
>>>>> [  558.490746]  s8 : 0000002b079e91c0 s9 : 0000000000000000 s10: 0000002acb8fc9b0
>>>>> [  558.490752]  s11: 0000002acb8fc920 t3 : 0000002acb80f5d8 t4 : 000000000009259c
>>>>> [  558.490758]  t5 : 0000000000000004 t6 : 0000002b0799c010
>>>>> [  558.490764] status: 0000000200004020 badaddr: 00000000000000e1 cause: 000000000000000f
>>>>>
>>>>> (gdb) disassemble 0x00000000000380c6,+0x10
>>>>> Dump of assembler code from 0x380c6 to 0x380d6:
>>>>>    0x00000000000380c6:  addi    sp,sp,-416
>>>>>    0x00000000000380c8:  auipc   a2,0x66
>>>>>    0x00000000000380cc:  addi    a2,a2,2000 # 0x9e898
>>>>>    0x00000000000380d0:  sd      a0,0(a2)
>>>>>    0x00000000000380d2:  mv      a5,sp
>>>>>    0x00000000000380d4:  addi    a4,sp,416
>>>>> End of assembler dump.
>>>>>
>>>>> I've stepped this through under qemu as well, and the control flow
>>>> is
>>>>> identical. Registers are almost the same, except for some
>>>> temporaries:
>>>>>
>>>>> --- regs-qemu
>>>>> +++ regs-rzfive
>>>>> @@ -2,9 +2,9 @@
>>>>>  ra             0x15382  0x15382
>>>>>  sp             0x3ffffffbe0     0x3ffffffbe0
>>>>>  gp             0x99da8  0x99da8
>>>>> -tp             0x3ff7e77800     0x3ff7e77800
>>>>> -t0             0x3ff7e7d7c0     274742106048
>>>>> -t1             0x3ff7f0b59c     274742687132
>>>>> +tp             0x3ff7e78800     0x3ff7e78800
>>>>> +t0             0x3ff7e7e7c0     274742110144
>>>>> +t1             0x3ff7f0c59c     274742691228
>>>>>  t2             0x2aaab92c00     183252888576
>>>>>  fp             0x2aaabaee00     0x2aaabaee00
>>>>>  s1             0x1      1
>>>>>
>>>>> No idea if that is normal (different machines, different memory
>>>> sizes
>>>>> and layouts) or a symptom of the problem.
>>>>>
>>>>
>>>> ...
>>>> OpenEmbedded nodistro.0 smarc-rzfive ttySC0
>>>>
>>>>
>>>> [   12.829622] audit: type=1006 audit(1653987107.735:2): pid=156
>> uid=0
>>>> old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1
>> res=1
>>>> root@smarc-rzfive:~# ldconfig
>>>> [   22.278868] ldconfig[166]: unhandled signal 11 code 0x1 at 0x0000000000000088 in ldconfig[10000+68000]
>>>> [   22.290244] CPU: 0 PID: 166 Comm: ldconfig Not tainted 5.10.83-cip1-riscv-renesas #1
>>>> [   22.298954] epc: 0000000000030eea ra : 00000000000145a0 sp : 0000003fff9f8aa0
>>>> [   22.306906]  gp : 000000000007fe48 tp : 0000003fd958b720 t0 : 0000000000000000
>>>> [   22.314973]  t1 : 0000002adf9c3bbc t2 : 00000000000003ff s0 : 0000003fff9f8c90
>>>> [   22.322986]  s1 : 0000000000014b0e a0 : 0000003fff9f8c98 a1 : 0000000000000000
>>>> [   22.330967]  a2 : 0000003fff9f8be8 a3 : 0000000000014a86 a4 : 000000000007e576
>>>> [   22.338936]  a5 : 0000000000000000 a6 : 0000003fff9f8be0 a7 : 0000000000000000
>>>> [   22.346897]  s2 : 0000000000000000 s3 : 0000003fd96df918 s4 : ffffffffffffffff
>>>> [   22.354905]  s5 : 0000002b01953f70 s6 : 0000002b01953c60 s7 : 0000002b019539b0
>>>> [   22.362875]  s8 : 0000002b01953b50 s9 : 0000000000000000 s10: 0000002adfa74584
>>>> [   22.370884]  s11: 0000000000000000 t3 : 0000003fd960ee18 t4 : 000000000000000f
>>>> [   22.378945]  t5 : 000000000000000f t6 : 0000000000000000
>>>> [   22.385051] status: 8000000200004020 badaddr: 0000000000000088 cause: 000000000000000d
>>>> [   22.393860] audit: type=1701 audit(1653987117.299:3): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=166 comm="ldconfig" exe="/sbin/ldconfig" sig=11 res=1
>>>> Segmentation fault
>>>>
>>>>
>>>> That was the version I found on eMMC.
>>>>
>>>> I think you have some real homework now...
>>>
>>> What is your conclusion? Is it tool chain related issue? Or cache
>> related issue?
>>>
>>> Or
>>>
>>> Something else ?
>>>
>>
>> I have no idea and still only limited knowledge about the arch and
>> this SoC. We can just rule out by now that the issue is Debian-
>> exclusive.
> 
> Thanks for your feedback.
> 
> Cheers,
> Biju
> 



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-11 18:51                                     ` Florian Bezdeka
@ 2022-10-11 20:15                                       ` Jan Kiszka
  2022-10-11 20:48                                         ` Prabhakar Mahadev Lad
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Kiszka @ 2022-10-11 20:15 UTC (permalink / raw)
  To: Florian Bezdeka, cip-dev, Chris Paterson, Prabhakar Mahadev Lad,
	Hung Tran
  Cc: Pavel Machek

On 11.10.22 20:51, Florian Bezdeka wrote:
> On 11.10.22 12:34, Biju Das via lists.cip-project.org wrote:
>>> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re:
>>> Preparing isar-cip-core for RZ/Five
>>>
>>> On 09.10.22 10:42, Biju Das wrote:
>>>>> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re:
>>> Preparing
>>>>> isar-cip-core for RZ/Five
>>>>>
>>>>> On 08.10.22 10:27, Jan Kiszka wrote:
>>>>>> On 07.10.22 12:19, Pavel Machek wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>>>>>> I tried, but installation fails - illegal instruction.
>>>>>>>>>>
>>>>>>>>>> Yeah, ldconfig is needed for installation. But I get a
>>>>> segfaulting
>>>>>>>>>> gcc binary.
>>>>>>>>>
>>>>>>>>> It crashes rather soon after startup, so I was able to trace
>>>>>>>>> complete path.
>>>>>>>>>
>>>>>>>>>> But I do have slightly different results then you (I think;
>>> I'm
>>>>>>>>>> far from risc-v expert). I did a breakpoint:
>>>>>>>>>>
>>>>>>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
>>>>>>>>>
>>>>>>>>> I believe it should not end at 0x00000000000385d4 at all. The
>>>>>>>>> 0x000000000001537e jal instruction should end up calling
>>> 0x3806a
>>>>>>>>> AFAICT, but it calls 0x385d4 instead. It happens during
>>>>>>>>> single-stepping, so it should not be anything subtle.
>>>>>>>>>
>>>>>>>>> (gdb) disassemble $pc,+0x20
>>>>>>>>> Dump of assembler code from 0x1537c to 0x1539c:
>>>>>>>>> => 0x000000000001537c:  mv      a0,a4
>>>>>>>>>    0x000000000001537e:  jal     ra,0x3806a
>>>>>>>>>    0x0000000000015382:  auipc   a5,0x8a
>>>>>>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
>>>>>>>>>    0x000000000001538a:  ld      a4,0(a5)
>>>>>>>>>    0x000000000001538c:  beqz    a4,0x153f0
>>>>>>>>>    0x000000000001538e:  jal     ra,0x38abe
>>>>>>>>>    0x0000000000015392:  ld      a0,0(s6)
>>>>>>>>>    0x0000000000015396:  auipc   s7,0x85
>>>>>>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
>>>>>>>>> End of assembler dump.
>>>>>>>>> (gdb)
>>>>>>>>> (gdb) stepi
>>>>>>>>> 0x000000000001537e in ?? ()
>>>>>>>>> (gdb)
>>>>>>>>>
>>>>>>>>> Program received signal SIGILL, Illegal instruction.
>>>>>>>>> 0x00000000000385d4 in ?? ()
>>>>>>>>> (gdb)
>>>>>>>>
>>>>>>>> Did you try to compare the call trace to QEMU, where we divert?
>>>>>>>
>>>>>>> Yes, that's possible way forward, but it will require some
>>>>>>> considerable setup on my side.
>>>>>>>
>>>>>>> If you have QEMU ready... objdump tells you ldconfig's
>>> entrypoint,
>>>>>>> from that point you can just stepi. In less than 200 steps, you
>>>>>>> should have sigill... and complete steps that lead to it.
>>>>>>>
>>>>>>
>>>>>> I've updated sid-ports (dropped the snapshot pinning), and now I'm
>>>>>> getting a page fault on the instruction before the one that was
>>>>>> causing SIGILL before:
> 
> In case the requested page is a page with PROT_WRITE only (no PROT_READ) 
> it might be related to
> https://lore.kernel.org/linux-riscv/20220915193702.2201018-1-abrestic@rivosinc.com/
> 
> AFAIR all stable branches have that problem currently.
> 

Nice idea. I quickly hacked that on top of the rzfive kernel, but it
didn't change the picture, unfortunately.

That said, being able to test linus/master would be very valuable here.
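
On the qemu versus RZ/Five register diff quoted earlier in this thread
(tp, t0 and t1 differ), the size of the difference can be checked with a
few lines of Python. This is only a sketch: the values are copied from
the quoted diff, the 4 KiB page size is an assumption, and the names
REGS and deltas are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages on both systems

# (qemu value, rzfive value), copied from the '-' and '+' lines of the diff.
REGS = {
    "tp": (0x3FF7E77800, 0x3FF7E78800),
    "t0": (0x3FF7E7D7C0, 0x3FF7E7E7C0),
    "t1": (0x3FF7F0B59C, 0x3FF7F0C59C),
}


def deltas(regs):
    """Return rzfive minus qemu for every register in the table."""
    return {name: rzfive - qemu for name, (qemu, rzfive) in regs.items()}


if __name__ == "__main__":
    for name, delta in deltas(REGS).items():
        pages = delta / PAGE_SIZE
        print(f"{name}: delta {delta:#x} ({pages:g} page(s))")
```

All three registers come out with the same delta of 0x1000, i.e. one
page under the 4 KiB assumption. That would fit a mapping placed one
page apart on the two systems, but the diff does not prove that.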

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-11 20:15                                       ` Jan Kiszka
@ 2022-10-11 20:48                                         ` Prabhakar Mahadev Lad
  2022-10-12  9:50                                           ` Prabhakar Mahadev Lad
  0 siblings, 1 reply; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-10-11 20:48 UTC (permalink / raw)
  To: Jan Kiszka, Florian Bezdeka, cip-dev, Chris Paterson, Hung Tran
  Cc: Pavel Machek

Hi Jan,

> -----Original Message-----
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: 11 October 2022 21:15
> To: Florian Bezdeka <florian.bezdeka@siemens.com>; cip-dev@lists.cip-project.org; Chris Paterson
> <Chris.Paterson2@renesas.com>; Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com>; Hung
> Tran <hung.tran.jy@renesas.com>
> Cc: Pavel Machek <pavel@denx.de>
> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
> 
> On 11.10.22 20:51, Florian Bezdeka wrote:
> > On 11.10.22 12:34, Biju Das via lists.cip-project.org wrote:
> >>> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re:
> >>> Preparing isar-cip-core for RZ/Five
> >>>
> >>> On 09.10.22 10:42, Biju Das wrote:
> >>>>> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re:
> >>> Preparing
> >>>>> isar-cip-core for RZ/Five
> >>>>>
> >>>>> On 08.10.22 10:27, Jan Kiszka wrote:
> >>>>>> On 07.10.22 12:19, Pavel Machek wrote:
> >>>>>>> Hi!
> >>>>>>>
> >>>>>>>>>>> I tried, but installation fails - illegal instruction.
> >>>>>>>>>>
> >>>>>>>>>> Yeah, ldconfig is needed for installation. But I get a
> >>>>> segfaulting
> >>>>>>>>>> gcc binary.
> >>>>>>>>>
> >>>>>>>>> It crashes rather soon after startup, so I was able to trace
> >>>>>>>>> complete path.
> >>>>>>>>>
> >>>>>>>>>> But I do have slightly different results then you (I think;
> >>> I'm
> >>>>>>>>>> far from risc-v expert). I did a breakpoint:
> >>>>>>>>>>
> >>>>>>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
> >>>>>>>>>
> >>>>>>>>> I believe it should not end at 0x00000000000385d4 at all. The
> >>>>>>>>> 0x000000000001537e jal instruction should end up calling
> >>> 0x3806a
> >>>>>>>>> AFAICT, but it calls 0x385d4 instead. It happens during
> >>>>>>>>> single-stepping, so it should not be anything subtle.
> >>>>>>>>>
> >>>>>>>>> (gdb) disassemble $pc,+0x20
> >>>>>>>>> Dump of assembler code from 0x1537c to 0x1539c:
> >>>>>>>>> => 0x000000000001537c:  mv      a0,a4
> >>>>>>>>>    0x000000000001537e:  jal     ra,0x3806a
> >>>>>>>>>    0x0000000000015382:  auipc   a5,0x8a
> >>>>>>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
> >>>>>>>>>    0x000000000001538a:  ld      a4,0(a5)
> >>>>>>>>>    0x000000000001538c:  beqz    a4,0x153f0
> >>>>>>>>>    0x000000000001538e:  jal     ra,0x38abe
> >>>>>>>>>    0x0000000000015392:  ld      a0,0(s6)
> >>>>>>>>>    0x0000000000015396:  auipc   s7,0x85
> >>>>>>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> >>>>>>>>> End of assembler dump.
> >>>>>>>>> (gdb)
> >>>>>>>>> (gdb) stepi
> >>>>>>>>> 0x000000000001537e in ?? ()
> >>>>>>>>> (gdb)
> >>>>>>>>>
> >>>>>>>>> Program received signal SIGILL, Illegal instruction.
> >>>>>>>>> 0x00000000000385d4 in ?? ()
> >>>>>>>>> (gdb)
> >>>>>>>>
> >>>>>>>> Did you try to compare the call trace to QEMU, where we divert?
> >>>>>>>
> >>>>>>> Yes, that's possible way forward, but it will require some
> >>>>>>> considerable setup on my side.
> >>>>>>>
> >>>>>>> If you have QEMU ready... objdump tells you ldconfig's
> >>> entrypoint,
> >>>>>>> from that point you can just stepi. In less than 200 steps, you
> >>>>>>> should have sigill... and complete steps that lead to it.
> >>>>>>>
> >>>>>>
> >>>>>> I've updated sid-ports (dropped the snapshot pinning), and now
> >>>>>> I'm getting a page fault on the instruction before the one that
> >>>>>> was causing SIGILL before:
> >
> > In case the requested page is a page with PROT_WRITE only (no
> > PROT_READ) it might be related to
> > https://lore.kernel.org/linux-riscv/20220915193702.2201018-1-abrestic@rivosinc.com/
> >
> > AFAIR all stable branches have that problem currently.
> >
> 
> Nice idea. I quickly hacked that on top of the rzfive kernel, but it didn't change the picture,
> unfortunately.
> 
Thanks for the quick test.

> That said, being able to test linus/master would be very valuable here.
> 
I will test this on top of v6.0 and update the results.

Cheers,
Prabhakar


^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-11 20:48                                         ` Prabhakar Mahadev Lad
@ 2022-10-12  9:50                                           ` Prabhakar Mahadev Lad
  2022-10-13  8:36                                             ` Ulrich Hecht
  0 siblings, 1 reply; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-10-12  9:50 UTC (permalink / raw)
  To: Jan Kiszka, Florian Bezdeka, cip-dev, Chris Paterson, Hung Tran
  Cc: Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 5397 bytes --]

Hi Jan,

> -----Original Message-----
> From: Prabhakar Mahadev Lad
> Sent: 11 October 2022 21:49
> To: Jan Kiszka <jan.kiszka@siemens.com>; Florian Bezdeka
> <florian.bezdeka@siemens.com>; cip-dev@lists.cip-project.org; Chris
> Paterson <Chris.Paterson2@renesas.com>; Hung Tran
> <hung.tran.jy@renesas.com>
> Cc: Pavel Machek <pavel@denx.de>
> Subject: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing
> isar-cip-core for RZ/Five
> 
> Hi Jan,
> 
> > -----Original Message-----
> > From: Jan Kiszka <jan.kiszka@siemens.com>
> > Sent: 11 October 2022 21:15
> > To: Florian Bezdeka <florian.bezdeka@siemens.com>;
> > cip-dev@lists.cip-project.org; Chris Paterson
> > <Chris.Paterson2@renesas.com>; Prabhakar Mahadev Lad
> > <prabhakar.mahadev-lad.rj@bp.renesas.com>; Hung Tran
> > <hung.tran.jy@renesas.com>
> > Cc: Pavel Machek <pavel@denx.de>
> > Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re:
> Preparing
> > isar-cip-core for RZ/Five
> >
> > On 11.10.22 20:51, Florian Bezdeka wrote:
> > > On 11.10.22 12:34, Biju Das via lists.cip-project.org wrote:
> > >>> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re:
> > >>> Preparing isar-cip-core for RZ/Five
> > >>>
> > >>> On 09.10.22 10:42, Biju Das wrote:
> > >>>>> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re:
> > >>> Preparing
> > >>>>> isar-cip-core for RZ/Five
> > >>>>>
> > >>>>> On 08.10.22 10:27, Jan Kiszka wrote:
> > >>>>>> On 07.10.22 12:19, Pavel Machek wrote:
> > >>>>>>> Hi!
> > >>>>>>>
> > >>>>>>>>>>> I tried, but installation fails - illegal instruction.
> > >>>>>>>>>>
> > >>>>>>>>>> Yeah, ldconfig is needed for installation. But I get a
> > >>>>> segfaulting
> > >>>>>>>>>> gcc binary.
> > >>>>>>>>>
> > >>>>>>>>> It crashes rather soon after startup, so I was able to
> trace
> > >>>>>>>>> complete path.
> > >>>>>>>>>
> > >>>>>>>>>> But I do have slightly different results then you (I
> think;
> > >>> I'm
> > >>>>>>>>>> far from risc-v expert). I did a breakpoint:
> > >>>>>>>>>>
> > >>>>>>>>>> Breakpoint 1, 0x00000000000385d4 in ?? ()
> > >>>>>>>>>
> > >>>>>>>>> I believe it should not end at 0x00000000000385d4 at all.
> > >>>>>>>>> The 0x000000000001537e jal instruction should end up
> calling
> > >>> 0x3806a
> > >>>>>>>>> AFAICT, but it calls 0x385d4 instead. It happens during
> > >>>>>>>>> single-stepping, so it should not be anything subtle.
> > >>>>>>>>>
> > >>>>>>>>> (gdb) disassemble $pc,+0x20
> > >>>>>>>>> Dump of assembler code from 0x1537c to 0x1539c:
> > >>>>>>>>> => 0x000000000001537c:  mv      a0,a4
> > >>>>>>>>>    0x000000000001537e:  jal     ra,0x3806a
> > >>>>>>>>>    0x0000000000015382:  auipc   a5,0x8a
> > >>>>>>>>>    0x0000000000015386:  addi    a5,a5,1342 # 0x9f8c0
> > >>>>>>>>>    0x000000000001538a:  ld      a4,0(a5)
> > >>>>>>>>>    0x000000000001538c:  beqz    a4,0x153f0
> > >>>>>>>>>    0x000000000001538e:  jal     ra,0x38abe
> > >>>>>>>>>    0x0000000000015392:  ld      a0,0(s6)
> > >>>>>>>>>    0x0000000000015396:  auipc   s7,0x85
> > >>>>>>>>>    0x000000000001539a:  ld      s7,-406(s7) # 0x9a200
> > >>>>>>>>> End of assembler dump.
> > >>>>>>>>> (gdb)
> > >>>>>>>>> (gdb) stepi
> > >>>>>>>>> 0x000000000001537e in ?? ()
> > >>>>>>>>> (gdb)
> > >>>>>>>>>
> > >>>>>>>>> Program received signal SIGILL, Illegal instruction.
> > >>>>>>>>> 0x00000000000385d4 in ?? ()
> > >>>>>>>>> (gdb)
> > >>>>>>>>
> > >>>>>>>> Did you try to compare the call trace to QEMU, where we
> divert?
> > >>>>>>>
> > >>>>>>> Yes, that's possible way forward, but it will require some
> > >>>>>>> considerable setup on my side.
> > >>>>>>>
> > >>>>>>> If you have QEMU ready... objdump tells you ldconfig's
> > >>> entrypoint,
> > >>>>>>> from that point you can just stepi. In less than 200 steps,
> > >>>>>>> you should have sigill... and complete steps that lead to
> it.
> > >>>>>>>
> > >>>>>>
> > >>>>>> I've updated sid-ports (dropped the snapshot pinning), and
> now
> > >>>>>> I'm getting a page fault on the instruction before the one
> that
> > >>>>>> was causing SIGILL before:
> > >
> > > In case the requested page is a page with PROT_WRITE only (no
> > > PROT_READ) it might be related to
> > > https://lore.kernel.org/linux-riscv/20220915193702.2201018-1-abrestic@rivosinc.com/
> > >
> > > AFAIR all stable branches have that problem currently.
> > >
> >
> > Nice idea. I quickly hacked that on top of the rzfive kernel, but it
> > didn't change the picture, unfortunately.
> >
> Thanks for the quick test.
> 
> > That said, being able to test linus/master would be very valuable
> here.
> >
> I will test this on top of v6.0 and update the results.
> 
I did a quick test with the patches pointed out by Florian, but unfortunately ldconfig still fails.

Cheers,
Prabhakar

[-- Attachment #2: ldconfig.txt --]
[-- Type: text/plain, Size: 25745 bytes --]

U-Boot 2022.10-00188-gbd74fb2a78-dirty (Oct 12 2022 - 10:10:51 +0100)

CPU:   rv64imafdc
Model: smarc-rzf
DRAM:  896 MiB
SW_ET0_EN: OFF
Core:  29 devices, 17 uclasses, devicetree: separate
MMC:   sd@11c00000: 0, sd@11c10000: 1
Loading Environment from MMC... OK
In:    serial@1004b800
Out:   serial@1004b800
Err:   serial@1004b800
Net:   eth0: ethernet@11c30000
Hit any key to stop autoboot:  0
ethernet@11c30000 Waiting for PHY auto negotiation to complete... done
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 192.168.10.96 (1398 ms)
Using ethernet@11c30000 device
TFTP from server 192.168.10.1; our IP address is 192.168.10.96
Filename 'rzf/Image.gz'.
Load address: 0x4a080000
Loading: #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         ######################
         2 MiB/s
done
Bytes transferred = 6038644 (5c2474 hex)
Using ethernet@11c30000 device
TFTP from server 192.168.10.1; our IP address is 192.168.10.96
Filename 'rzf/r9a07g043f01-smarc.dtb'.
Load address: 0x48000000
Loading: ##
         1.3 MiB/s
done
Bytes transferred = 22129 (5671 hex)
Uncompressed size: 19847168 = 0x12ED800
Moving Image from 0x48080000 to 0x48200000, end=49564000
## Flattened Device Tree blob at 48000000
   Booting using the fdt blob at 0x48000000
   Loading Device Tree to 0000000057ff7000, end 0000000057fff670 ... OK

Starting kernel ...

[    0.000000] Linux version 6.0.0-rc7+ (prasmi@prasmi) (riscv64-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #91 SMP Wed Oct 12 10:46:04 BST 2022
[    0.000000] OF: fdt: Ignoring memory range 0x48000000 - 0x48200000
[    0.000000] Machine model: Renesas SMARC EVK based on r9a07g043f01
[    0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[    0.000000] printk: bootconsole [sbi0] enabled
[    0.000000] efi: UEFI not found.
[    0.000000] Reserved memory: created DMA memory pool at 0x0000000058000000, size 128 MiB
[    0.000000] OF: reserved mem: initialized node linux,cma@58000000, compatible id shared-dma-pool
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000048200000-0x000000007fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000048200000-0x0000000057ffffff]
[    0.000000]   node   0: [mem 0x0000000058000000-0x000000005fffffff]
[    0.000000]   node   0: [mem 0x0000000060000000-0x000000007fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000048200000-0x000000007fffffff]
[    0.000000] SBI specification v0.3 detected
[    0.000000] SBI implementation ID=0x1 Version=0x10000
[    0.000000] SBI TIME extension detected
[    0.000000] SBI IPI extension detected
[    0.000000] SBI RFENCE extension detected
[    0.000000] SBI HSM extension detected
[    0.000000] riscv: base ISA extensions acdfim
[    0.000000] riscv: ELF capabilities acdfim
[    0.000000] percpu: Embedded 18 pages/cpu s34680 r8192 d30856 u73728
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 225288
[    0.000000] Kernel command line: root=/dev/nfs rw rootwait ip=dhcp nfsroot=192.168.10.1:/mnt/rzfive,vers=4,tcp console=ttySC0,115200n8 earlycon=sbi debug loglevel=7 deferred_probe_timeout=5
[    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Virtual kernel memory layout:
[    0.000000]       fixmap : 0xffffffc6fee00000 - 0xffffffc6ff000000   (2048 kB)
[    0.000000]       pci io : 0xffffffc6ff000000 - 0xffffffc700000000   (  16 MB)
[    0.000000]      vmemmap : 0xffffffc700000000 - 0xffffffc800000000   (4096 MB)
[    0.000000]      vmalloc : 0xffffffc800000000 - 0xffffffd800000000   (  64 GB)
[    0.000000]      modules : 0xffffffff01364000 - 0xffffffff80000000   (2028 MB)
[    0.000000]       lowmem : 0xffffffd800000000 - 0xffffffd837e00000   ( 894 MB)
[    0.000000]       kernel : 0xffffffff80000000 - 0xffffffffffffffff   (2047 MB)
[    0.000000] Memory: 747816K/915456K available (7387K kernel code, 4898K rwdata, 4096K rodata, 2195K init, 468K bss, 167640K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
[    0.000000] rcu:     RCU debug extended QS entry/exit.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] riscv-intc: 64 local interrupts mapped
[    0.000000] plic: interrupt-controller@12c00000: mapped 512 interrupts with 1 handlers for 2 contexts.
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] riscv-timer: riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [0]
[    0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x2c47f4ee7, max_idle_ns: 440795202497 ns
[    0.000003] sched_clock: 64 bits at 12MHz, resolution 83ns, wraps every 4398046511096ns
[    0.008296] Console: colour dummy device 80x25
[    0.012643] Calibrating delay loop (skipped), value calculated using timer frequency.. 24.00 BogoMIPS (lpj=48000)
[    0.022817] pid_max: default: 32768 minimum: 301
[    0.027597] LSM: Security Framework initializing
[    0.032356] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.039595] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.050391] cblist_init_generic: Setting adjustable number of callback queues.
[    0.056911] cblist_init_generic: Setting shift to 0 and lim to 1.
[    0.063260] ASID allocator using 9 bits (512 entries)
[    0.068339] rcu: Hierarchical SRCU implementation.
[    0.072928] rcu:     Max phase no-delay instances is 1000.
[    0.078807] Detected Renesas RZ/Five r9a07g043 Rev 0
[    0.083256] EFI services will not be available.
[    0.088300] smp: Bringing up secondary CPUs ...
[    0.092378] smp: Brought up 1 node, 1 CPU
[    0.097346] devtmpfs: initialized
[    0.107364] DMA: default coherent area is set
[    0.111049] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.120775] futex hash table entries: 256 (order: 2, 16384 bytes, linear)
[    0.127965] pinctrl core: initialized pinctrl subsystem
[    0.134968] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.140713] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
[    0.147264] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    0.155267] audit: initializing netlink subsys (disabled)
[    0.161582] thermal_sys: Registered thermal governor 'fair_share'
[    0.161600] thermal_sys: Registered thermal governor 'bang_bang'
[    0.166967] thermal_sys: Registered thermal governor 'step_wise'
[    0.172979] thermal_sys: Registered thermal governor 'user_space'
[    0.179442] audit: type=2000 audit(0.148:1): state=initialized audit_enabled=0 res=1
[    0.192948] cpuidle: using governor menu
[    0.224033] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[    0.230156] HugeTLB: 0 KiB vmemmap can be freed for a 2.00 MiB page
[    0.240720] SCSI subsystem initialized
[    0.244047] usbcore: registered new interface driver usbfs
[    0.249355] usbcore: registered new interface driver hub
[    0.254689] usbcore: registered new device driver usb
[    0.260841] Advanced Linux Sound Architecture Driver Initialized.
[    0.267776] clocksource: Switched to clocksource riscv_clocksource
[    0.286651] NET: Registered PF_INET protocol family
[    0.291124] IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.300275] tcp_listen_portaddr_hash hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.308092] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.315725] TCP established hash table entries: 8192 (order: 4, 65536 bytes, linear)
[    0.323562] TCP bind hash table entries: 8192 (order: 6, 262144 bytes, linear)
[    0.331852] TCP: Hash tables configured (established 8192 bind 8192)
[    0.337662] UDP hash table entries: 512 (order: 3, 49152 bytes, linear)
[    0.344297] UDP-Lite hash table entries: 512 (order: 3, 49152 bytes, linear)
[    0.351644] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.358081] RPC: Registered named UNIX socket transport module.
[    0.363229] RPC: Registered udp transport module.
[    0.368063] RPC: Registered tcp transport module.
[    0.372756] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.381500] workingset: timestamp_bits=46 max_order=18 bucket_order=0
[    0.400118] NFS: Registering the id_resolver key type
[    0.404598] Key type id_resolver registered
[    0.408637] Key type id_legacy registered
[    0.412898] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    0.419439] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
[    0.426956] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
[    0.434376] NET: Registered PF_ALG protocol family
[    0.438489] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[    0.445898] io scheduler mq-deadline registered
[    0.450428] io scheduler kyber registered
[    0.551411] SuperH (H)SCI(F) driver initialized
[    0.556018] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    0.571166] loop: module loaded
[    0.577021] CAN device driver interface
[    0.580720] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    0.586657] ehci-platform: EHCI generic platform driver
[    0.592153] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    0.598183] ohci-platform: OHCI generic platform driver
[    0.604452] usbcore: registered new interface driver uas
[    0.609112] usbcore: registered new interface driver usb-storage
[    0.616140] mousedev: PS/2 mouse device common for all mice
[    0.621347] i2c_dev: i2c /dev entries driver
[    0.744626] sdhci: Secure Digital Host Controller Interface driver
[    0.750090] sdhci: Copyright(c) Pierre Ossman
[    0.754782] sdhci-pltfm: SDHCI platform and OF driver helper
[    0.760567] usbcore: registered new interface driver usbhid
[    0.765756] usbhid: USB HID core driver
[    0.770529] riscv-pmu-sbi: SBI PMU extension is available
[    0.775223] riscv-pmu-sbi: 15 firmware and 6 hardware counters
[    0.781049] riscv-pmu-sbi: Perf sampling/filtering is not supported as sscof extension is not available
[    0.793470] NET: Registered PF_INET6 protocol family
[    0.799682] Segment Routing with IPv6
[    0.802649] In-situ OAM (IOAM) with IPv6
[    0.806737] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[    0.813480] NET: Registered PF_PACKET protocol family
[    0.817777] can: controller area network core
[    0.822465] NET: Registered PF_CAN protocol family
[    0.827037] can: raw protocol
[    0.830066] can: broadcast manager protocol
[    0.834272] can: netlink gateway - max_hops=1
[    0.838850] can: isotp protocol
[    0.842037] Key type dns_resolver registered
[    0.847347] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[    0.873380] gpio-378 (can0_stb): hogged as output/low
[    0.877694] gpio-379 (can1_stb): hogged as output/low
[    0.882910] gpio-363 (sd1_pwr_en): hogged as output/high
[    0.888948] pinctrl-rzg2l 11030000.pinctrl: pinctrl-rzg2l support registered
[    0.900041] 1004b800.serial: ttySC0 at MMIO 0x1004b800 (irq = 23, base_baud = 0) is
[    0.907687] printk: console [ttySC0] enabled
[    0.907687] printk: console [ttySC0] enabled
[    0.915460] printk: bootconsole [sbi0] disabled
[    0.915460] printk: bootconsole [sbi0] disabled
[    0.926442] renesas_spi 1004b000.spi: DMA available
[    0.932064] renesas_spi 1004b000.spi: probed
[    0.939869] rcar_canfd 10050000.can: can_clk rate is 50000000
[    0.946658] rcar_canfd 10050000.can: device registered (channel 0)
[    0.952951] rcar_canfd 10050000.can: can_clk rate is 50000000
[    0.959660] rcar_canfd 10050000.can: device registered (channel 1)
[    0.965902] rcar_canfd 10050000.can: global operational state (clk 0, fdmode 1)
[    0.977610] ravb 11c30000.ethernet eth0: Base address at 0x11c30000, 70:b3:d5:1a:70:06, IRQ 35.
[    0.987826] renesas_usbhs 11c60000.usb: host probed
[    0.992738] renesas_usbhs 11c60000.usb: no transceiver found
[    0.998654] renesas_usbhs 11c60000.usb: gadget probed
[    1.005689] ehci-platform 11c70100.usb: EHCI Host Controller
[    1.011461] ehci-platform 11c70100.usb: new USB bus registered, assigned bus number 1
[    1.019690] ehci-platform 11c70100.usb: irq 38, io mem 0x11c70100
[    1.027590] ohci-platform 11c70000.usb: Generic Platform OHCI controller
[    1.034455] ohci-platform 11c70000.usb: new USB bus registered, assigned bus number 2
[    1.042560] ohci-platform 11c70000.usb: irq 40, io mem 0x11c70000
[    1.048753] ehci-platform 11c50100.usb: EHCI Host Controller
[    1.054424] ehci-platform 11c50100.usb: new USB bus registered, assigned bus number 3
[    1.062633] ehci-platform 11c50100.usb: irq 37, io mem 0x11c50100
[    1.068934] ohci-platform 11c50000.usb: Generic Platform OHCI controller
[    1.075676] ohci-platform 11c50000.usb: new USB bus registered, assigned bus number 4
[    1.083749] ohci-platform 11c50000.usb: irq 39, io mem 0x11c50000
[    1.090125] renesas_usbhs 11c60000.usb: probed
[    1.096446] ehci-platform 11c70100.usb: USB 2.0 started, EHCI 1.10
[    1.102911] i2c-riic 10058000.i2c: registered with 100000Hz bus speed
[    1.112283] hub 1-0:1.0: USB hub found
[    1.146266] ehci-platform 11c50100.usb: USB 2.0 started, EHCI 1.10
[    1.183315] hub 1-0:1.0: 1 port detected
[    1.187456] i2c-riic 10058400.i2c: registered with 100000Hz bus speed
[    1.196768] rz-ssi-pcm-audio 1004a000.ssi: DMA enabled
[    1.209549] hub 3-0:1.0: USB hub found
[    1.213466] hub 3-0:1.0: 1 port detected
[    1.219645] hub 2-0:1.0: USB hub found
[    1.223682] hub 2-0:1.0: 1 port detected
[    1.232014] hub 4-0:1.0: USB hub found
[    1.236492] hub 4-0:1.0: 1 port detected
[    1.380294] renesas_sdhi_internal_dmac 11c00000.mmc: mmc0 base at 0x0000000011c00000, max clock rate 133 MHz
[    1.390969] renesas_sdhi_internal_dmac 11c10000.mmc: mmc1 base at 0x0000000011c10000, max clock rate 133 MHz
[    1.467884] Microchip KSZ9131 Gigabit PHY 11c30000.ethernet-ffffffff:07: attached PHY driver (mii_bus:phy_addr=11c30000.ethernet-ffffffff:07, irq=POLL)
[    1.641499] mmc0: new HS200 MMC card at address 0001
[    1.647551] mmcblk0: mmc0:0001 G1M15M 59.3 GiB
[    1.654616]  mmcblk0: p1
[    1.658020] mmcblk0boot0: mmc0:0001 G1M15M 31.5 MiB
[    1.664904] mmcblk0boot1: mmc0:0001 G1M15M 31.5 MiB
[    1.671804] mmcblk0rpmb: mmc0:0001 G1M15M 4.00 MiB, chardev (247:0)
[    1.699776] usb 3-1: new high-speed USB device number 2 using ehci-platform
[    1.874263] usb-storage 3-1:1.0: USB Mass Storage device detected
[    1.881340] scsi host0: usb-storage 3-1:1.0
[    2.091446] mmc1: new ultra high speed SDR104 SDHC card at address aaaa
[    2.099077] mmcblk1: mmc1:aaaa SC16G 14.8 GiB
[    2.109136]  mmcblk1: p1 p2
[    2.917213] scsi 0:0:0:0: Direct-Access     General  USB Flash Disk   1.00 PQ: 0 ANSI: 2
[    2.927555] sd 0:0:0:0: [sda] 7831552 512-byte logical blocks: (4.01 GB/3.73 GiB)
[    2.935923] sd 0:0:0:0: [sda] Write Protect is off
[    2.941411] sd 0:0:0:0: [sda] No Caching mode page found
[    2.946786] sd 0:0:0:0: [sda] Assuming drive cache: write through
[    2.957869]  sda: sda1 sda2
[    2.961889] sd 0:0:0:0: [sda] Attached SCSI removable disk
[    5.573750] ravb 11c30000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[    5.581459] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[    5.607776] Sending DHCP requests ., OK
[    5.627637] IP-Config: Got DHCP answer from 192.168.10.1, my address is 192.168.10.96
[    5.635490] IP-Config: Complete:
[    5.638705]      device=eth0, hwaddr=70:b3:d5:1a:70:06, ipaddr=192.168.10.96, mask=255.255.255.0, gw=192.168.10.1
[    5.648965]      host=192.168.10.96, domain=example.org, nis-domain=(none)
[    5.655836]      bootserver=192.168.10.1, rootserver=192.168.10.1, rootpath=
[    5.655847]      nameserver0=192.168.10.1
[    5.667250] ALSA device list:
[    5.670255]   #0: rz-ssi-dai-wm8978-hifi
[    5.715235] VFS: Mounted root (nfs4 filesystem) on device 0:17.
[    5.722626] devtmpfs: mounted
[    5.726875] Freeing unused kernel image (initmem) memory: 2192K
[    5.743790] Run /sbin/init as init process
[    6.652859] systemd[1]: System time before build time, advancing clock.
[    6.784169] systemd[1]: systemd 244 running in system mode. (+PAM -AUDIT -SELINUX +IMA -APPARMOR -SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP -GCRYPT -GNUTLS +ACL +XZ -LZ4 -SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN -PCRE2 default-hierarchy=hybrid)
[    6.806118] systemd[1]: Detected architecture riscv64.

Welcome to OpenEmbedded nodistro.0!

[    6.844381] systemd[1]: Set hostname to <smarc-rzfive>.
[   10.035768] random: crng init done
[  OK  ] Created slice system-getty.slice.
[  OK  ] Created slice system-serial\x2dgetty.slice.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Started Dispatch Password …ts to Console Directory Watch.
[  OK  ] Started Forward Password R…uests to Wall Directory Watch.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Reached target Slices.
[  OK  ] Reached target Swap.
[  OK  ] Listening on Syslog Socket.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
[  OK  ] Listening on Network Service Netlink Socket.
[  OK  ] Listening on udev Control Socket.
[  OK  ] Listening on udev Kernel Socket.
         Mounting Huge Pages File System...
         Mounting POSIX Message Queue File System...
         Mounting Kernel Debug File System...
         Mounting Temporary Directory (/tmp)...
         Starting Journal Service...
         Mounting Kernel Configuration File System...
         Starting Remount Root and Kernel File Systems...
         Starting Apply Kernel Variables...
         Starting udev Coldplug all Devices...
[  OK  ] Mounted Huge Pages File System.
[  OK  ] Mounted POSIX Message Queue File System.
[  OK  ] Mounted Kernel Debug File System.
[  OK  ] Mounted Temporary Directory (/tmp).
[  OK  ] Mounted Kernel Configuration File System.
[  OK  ] Started Remount Root and Kernel File Systems.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Started Create Static Device Nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
         Mounting /var/volatile...
[   11.464818] audit: type=1334 audit(1660757910.807:2): prog-id=5 op=LOAD
         Starting udev Kernel Device Manager...
[   11.495994] audit: type=1334 audit(1660757910.827:3): prog-id=6 op=LOAD
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Mounted /var/volatile.
[   11.765030] systemd-journald[97]: Received client request to flush runtime journal.
         Starting Load/Save Random Seed...
[  OK  ] Reached target Local File Systems.
[  OK  ] Started Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started udev Kernel Device Manager.
         Starting Network Service...
[  OK  ] Started Create Volatile Files and Directories.
         Starting Network Time Synchronization...
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started Network Service.
         Starting Network Name Resolution...
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Started Network Time Synchronization.
[  OK  ] Reached target System Time Set.
[  OK  ] Reached target System Time Synchronized.
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Network.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Started udev Coldplug all Devices.
[  OK  ] Reached target Hardware activated USB gadget.
         Starting udev Wait for Complete Device Initialization...
[  OK  ] Started udev Wait for Complete Device Initialization.
[  OK  ] Started Hardware RNG Entropy Gatherer Daemon.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
         Starting sshd.socket.
[  OK  ] Listening on sshd.socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting Save/Restore Sound Card State...
[  OK  ] Started Kernel Logging Service.
[  OK  ] Started System Logging Service.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Respond to IPv6 Node Information Queries.
[  OK  ] Started Network Router Discovery Daemon.
[   23.431612] audit: type=1334 audit(1660757922.771:4): prog-id=7 op=LOAD
[   23.462529] audit: type=1334 audit(1660757922.791:5): prog-id=8 op=LOAD
         Starting Login Service...
         Starting Permit User Sessions...
[  OK  ] Started Save/Restore Sound Card State.
[  OK  ] Reached target Sound Card.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Getty on tty1.
[  OK  ] Started Serial Getty on hvc0.
[  OK  ] Started Serial Getty on ttySC0.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Login Service.
[  OK  ] Reached target Multi-User System.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

OpenEmbedded nodistro.0 smarc-rzfive hvc0


OpenEmbedded nodistro.0 smarc-rzfive ttySC0



Last login: Wed Aug 17 17:38:45 UTC 2022 on hvc0
[   26.650166] audit: type=1006 audit(1660757925.991:6): pid=164 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
[   26.662564] audit: type=1300 audit(1660757925.991:6): arch=c00000f3 syscall=64 success=yes exit=1 a0=8 a1=3fc7b40de0 a2=1 a3=0 items=0 ppid=1 pid=164 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=1 comm="(systemd)" exe="/lib/systemd/systemd" key=(null)
[   26.769367] audit: type=1327 audit(1660757925.991:6): proctitle="(systemd)"
[   26.788963] audit: type=1334 audit(1660757926.119:7): prog-id=9 op=LOAD
[   26.795930] audit: type=1300 audit(1660757926.119:7): arch=c00000f3 syscall=280 success=yes exit=8 a0=5 a1=3fe6f196e8 a2=70 a3=0 items=0 ppid=1 pid=164 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=1 comm="systemd" exe="/lib/systemd/systemd" key=(null)
[   26.821044] audit: type=1327 audit(1660757926.119:7): proctitle=2F6C69622F73797374656D642F73797374656D64002D2D75736572
[   26.831778] audit: type=1334 audit(1660757926.119:8): prog-id=0 op=UNLOAD
[   26.838553] audit: type=1334 audit(1660757926.119:9): prog-id=10 op=LOAD
root@smarc-rzfive:/lava-testing# root@smarc-rzfive:/lava-testing#
root@smarc-rzfive:/lava-testing#
root@smarc-rzfive:/lava-testing# ldconfig
[   34.232712] do_trap: 3 callbacks suppressed
[   34.232737] ldconfig[177]: unhandled signal 4 code 0x1 at 0x00000000000311f0 in ldconfig[10000+68000]
[   34.246285] CPU: 0 PID: 177 Comm: ldconfig Not tainted 6.0.0-rc7+ #91
[   34.252748] Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
[   34.259334] epc : 00000000000311f0 ra : 00000000000145a0 sp : 0000003fcdb3ca80
[   34.266567]  gp : 000000000007fe48 tp : 0000003fb2834720 t0 : 0000000000000000
[   34.273792]  t1 : 0000002ae0447bbc t2 : 00000000000003ff s0 : 0000000000014a86
[   34.281002]  s1 : 0000000000014b0e a0 : 0000003fcdb3cc80 a1 : 0000000000000001
[   34.288236]  a2 : 0000003fcdb3cbc8 a3 : 0000000000014a86 a4 : 000000000007e576
[   34.295427]  a5 : 0000000000000000 a6 : 0000003fcdb3cbc0 a7 : 0000000000000000
[   34.302646]  s2 : 0000000000000000 s3 : 0000003fb298a918 s4 : ffffffffffffffff
[   34.309866]  s5 : 0000002ae053d860 s6 : 0000002ae053d490 s7 : 0000002ae0523720
[   34.317086]  s8 : 0000002ae053d380 s9 : 0000000000000000 s10: 0000002ae04f8584
[   34.324306]  s11: 0000000000000000 t3 : 0000006a92df52a8 t4 : 000000000000000f
[   34.331516]  t5 : 000000000000000f t6 : 0000000000000000
[   34.336833] status: 0000000200004020 badaddr: 000000006dbffe9b cause: 0000000000000002
[   34.359023] audit: type=1701 audit(1660757933.695:11): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=177 comm="ldconfig" exe="/sbin/ldconfig" sig=4 res=1
Illegal instruction
root@smarc-rzfive:/lava-testing#

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-12  9:50                                           ` Prabhakar Mahadev Lad
@ 2022-10-13  8:36                                             ` Ulrich Hecht
  2022-10-13 10:35                                               ` Pavel Machek
  2022-10-13 21:47                                               ` Pavel Machek
  0 siblings, 2 replies; 39+ messages in thread
From: Ulrich Hecht @ 2022-10-13  8:36 UTC (permalink / raw)
  To: cip-dev, Lad Prabhakar, Jan Kiszka, Florian Bezdeka,
	Chris Paterson, Hung Tran
  Cc: Pavel Machek


> On 10/12/2022 11:50 AM CEST Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> wrote:
> I did a quick test with the patches pointed by Florian but unfortunately ldconfig still fails.

I did some experiments on RZ/Five with this issue, and I'm almost positive that something is wrong with the icache handling on this SoC (or that it doesn't work as documented).

1. The issue only affects non-PIE executables (there are very few of those, basically just ldconfig, gcc, cpp and gcov* on the Debian system), and it occurs very early during the execution of the program. According to the datasheet, the cache on the ax45mp-1c core is virtually indexed, so it is unlikely that a PIE executable will ever hit anything in the cache when newly loaded, but it is much more likely with non-PIE executables.

2. Setting a breakpoint before the illegal/segfaulting instruction doesn't work, and what is executed is clearly not what we're seeing through the dcache (the offending instructions are neither illegal, nor are they able to cause segfaults), so instruction fetches must see something different.

3. Neither manually calling __vdso_flush_icache() from gdb (which executes a "fence.i" instruction) nor patching a "fence.i" into the ldconfig binary seems to do anything. According to the ax45mp-1c datasheet, "fence.i" should flush the dcache and invalidate the icache.

My educated guess is that, in spite of the claims in the core manual, the "fence.i" instruction is not implemented, or not implemented correctly. (The datasheet does acknowledge that "fence", without the ".i", is a nop.)

The RISC-V ISA manual says that "fence.i" is part of the optional "Zifencei" extension, which I don't see mentioned in the core datasheet anywhere. (And at least at first glance, I couldn't find any other mechanism to invalidate the icache there either.)
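
To check point 1 on a given system, one can look at the e_type field of the ELF header: ET_EXEC (2) is a fixed-address, non-PIE executable, while ET_DYN (3) is a PIE or a shared object. The following is only a minimal sketch and was not run as part of these experiments; the helper names elf_type and is_non_pie are made up for illustration.

```python
import struct

ET_EXEC = 2  # fixed-address executable (non-PIE)
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header: bytes) -> int:
    """Return e_type, given at least the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is byte 5: 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field directly after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def is_non_pie(path: str) -> bool:
    """True if the file at `path` is an ET_EXEC (non-PIE) executable."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_EXEC
```

Calling is_non_pie("/sbin/ldconfig") on the target would be expected to return True if the observation above holds, but that is an assumption about the installed binary, not a measured result.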

CU
Uli


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-13  8:36                                             ` Ulrich Hecht
@ 2022-10-13 10:35                                               ` Pavel Machek
  2022-10-13 21:47                                               ` Pavel Machek
  1 sibling, 0 replies; 39+ messages in thread
From: Pavel Machek @ 2022-10-13 10:35 UTC (permalink / raw)
  To: Ulrich Hecht
  Cc: cip-dev, Lad Prabhakar, Jan Kiszka, Florian Bezdeka,
	Chris Paterson, Hung Tran, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 1589 bytes --]

Hi!

(Can I get you to wrap emails at ~72 columns or so?)

> > I did a quick test with the patches pointed by Florian but unfortunately ldconfig still fails.
> 
> I did some experiments on RZ/Five with this issue, and I'm almost positive that there is something wrong (or doesn't work as documented) with the icache handling on this SoC.

>  1. The issue only affects non-PIE executables (there are very few
> of those, basically just ldconfig, gcc, cpp and gcov* on the Debian
> system), and it occurs very early during the execution of the
> program. According to the datasheet, the cache on the ax45mp-1c core
> is virtually indexed, so it is unlikely that a PIE executable will
> ever hit anything in the cache when newly loaded, but it is much
> more likely with non-PIE executables.

Ah, I was wondering what gcc and ldconfig have in common...

> 2. Setting a breakpoint before the illegal/segfaulting instruction
> doesn't work, and what is executed is clearly not what we're seeing
> through the dcache (the offending instructions are neither illegal,
> nor are they able to cause segfaults), so instruction fetches must
> see something different.

In my testing, I was able to stepi from the start, and then I was able
to put a breakpoint at the preceding instruction (which was a jump). It
looked like we jumped into the middle of an instruction, which would
explain the fault.

Best regards,
								Pavel

-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-13  8:36                                             ` Ulrich Hecht
  2022-10-13 10:35                                               ` Pavel Machek
@ 2022-10-13 21:47                                               ` Pavel Machek
  2022-11-29 18:57                                                 ` Prabhakar Mahadev Lad
  1 sibling, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-10-13 21:47 UTC (permalink / raw)
  To: Ulrich Hecht
  Cc: cip-dev, Lad Prabhakar, Jan Kiszka, Florian Bezdeka,
	Chris Paterson, Hung Tran, Pavel Machek

[-- Attachment #1: Type: text/plain, Size: 2906 bytes --]

Hi!

> > On 10/12/2022 11:50 AM CEST Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> wrote:
> > I did a quick test with the patches pointed by Florian but unfortunately ldconfig still fails.
> 
> I did some experiments on RZ/Five with this issue, and I'm almost positive that there is something wrong (or doesn't work as documented) with the icache handling on this SoC.
> 
> 1. The issue only affects non-PIE executables (there are very few of those, basically just ldconfig, gcc, cpp and gcov* on the Debian system), and it occurs very early during the execution of the program. According to the datasheet, the cache on the ax45mp-1c core is virtually indexed, so it is unlikely that a PIE executable will ever hit anything in the cache when newly loaded, but it is much more likely with non-PIE executables.
>

This is a very good observation. Thanks!

And indeed it looks like _any_ non-PIE executable fails. See:

root@smarc-rzfive:/my# cat mytest.c 
#include <stdio.h>

void main(void) { printf("ahoj svete\n"); } 
root@smarc-rzfive:/my# clang mytest.c -fno-pie -static
mytest.c:3:1: warning: return type of 'main' is not 'int' [-Wmain-return-type]
void main(void) { printf("ahoj svete\n"); } 
^
mytest.c:3:1: note: change return type to 'int'
void main(void) { printf("ahoj svete\n"); } 
^~~~
int
1 warning generated.
root@smarc-rzfive:/my# ./a.out 
[  279.010424] a.out[214]: unhandled signal 11 code 0x1 at 0xffffff8c38bd1524


(It might be useful to add -O3 -g to the clang command line.)

Then, in gdb, you can do:

b _dl_discover_osversion
run

(gdb) disassemble /r
Dump of assembler code for function _dl_discover_osversion:
   0x000000000002538a <+0>:     41 71   addi    sp,sp,-496
   0x000000000002538c <+2>:     a8 00   addi    a0,sp,72
   0x000000000002538e <+4>:     86 f7   sd      ra,488(sp)
   0x0000000000025390 <+6>:     a2 f3   sd      s0,480(sp)
   0x0000000000025392 <+8>:     a6 ef   sd      s1,472(sp)
   0x0000000000025394 <+10>:    ca eb   sd      s2,464(sp)
=> 0x0000000000025396 <+12>:    ef 60 a1 5c     jal     ra,0x3b960 <uname>
   0x000000000002539a <+16>:    93 05 a1 0c     addi    a1,sp,202
   0x000000000002539e <+20>:    49 e5   bnez    a0,0x25428 <_dl_discover_osversion+158>
   0x00000000000253a0 <+22>:    81 48   li      a7,0
   0x00000000000253a2 <+24>:    01 45   li      a0,0
   0x00000000000253a4 <+26>:    25 48   li      a6,9
   0x00000000000253a6 <+28>:    13 03 e0 02     li      t1,46

It clearly tries to call uname, which... it should, according to the
source code. But somehow it ends up in a completely different function:

(gdb) stepi

Program received signal SIGILL, Illegal instruction.
0x000000000003b2fe in wcsrtombs ()
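
The raw bytes in the dump can be cross-checked against the disassembly. The sketch below assumes the standard RISC-V J-type encoding and decodes "ef 60 a1 5c" at 0x25396; decode_jal is a made-up helper name, not a gdb or binutils function.

```python
def decode_jal(insn_bytes: bytes, pc: int):
    """Decode a 32-bit RISC-V JAL instruction; return (rd, target)."""
    insn = int.from_bytes(insn_bytes, "little")
    if insn & 0x7F != 0x6F:
        raise ValueError("not a JAL instruction")
    rd = (insn >> 7) & 0x1F
    # The J-type immediate is scattered across the instruction word
    imm = (
        ((insn >> 31) & 0x1) << 20      # imm[20]
        | ((insn >> 21) & 0x3FF) << 1   # imm[10:1]
        | ((insn >> 20) & 0x1) << 11    # imm[11]
        | ((insn >> 12) & 0xFF) << 12   # imm[19:12]
    )
    if imm & (1 << 20):                 # sign-extend the 21-bit offset
        imm -= 1 << 21
    return rd, pc + imm


rd, target = decode_jal(bytes.fromhex("ef60a15c"), 0x25396)
print(f"rd=x{rd} target={target:#x}")   # prints "rd=x1 target=0x3b960"
```

So the bytes visible through gdb do encode a jump to 0x3b960 with ra (x1) as the link register, matching the disassembly.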

Best regards,
							Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-10-13 21:47                                               ` Pavel Machek
@ 2022-11-29 18:57                                                 ` Prabhakar Mahadev Lad
  2022-12-10  7:23                                                   ` Jan Kiszka
  0 siblings, 1 reply; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-11-29 18:57 UTC (permalink / raw)
  To: Pavel Machek, Ulrich Hecht
  Cc: cip-dev, Jan Kiszka, Florian Bezdeka, Chris Paterson, Hung Tran

[-- Attachment #1: Type: text/plain, Size: 4032 bytes --]

Hi All,

> -----Original Message-----
> From: Pavel Machek <pavel@denx.de>
> Sent: 13 October 2022 22:48
> To: Ulrich Hecht <uli@fpond.eu>
> Cc: cip-dev@lists.cip-project.org; Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com>;
> Jan Kiszka <jan.kiszka@siemens.com>; Florian Bezdeka <florian.bezdeka@siemens.com>; Chris Paterson
> <Chris.Paterson2@renesas.com>; Hung Tran <hung.tran.jy@renesas.com>; Pavel Machek <pavel@denx.de>
> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
> 
> Hi!
> 
> > > On 10/12/2022 11:50 AM CEST Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> wrote:
> > > I did a quick test with the patches pointed by Florian but unfortunately ldconfig still fails.
> >
> > I did some experiments on RZ/Five with this issue, and I'm almost positive that there is something
> wrong (or doesn't work as documented) with the icache handling on this SoC.
> >
> > 1. The issue only affects non-PIE executables (there are very few of those, basically just ldconfig,
> gcc, cpp and gcov* on the Debian system), and it occurs very early during the execution of the
> program. According to the datasheet, the cache on the ax45mp-1c core is virtually indexed, so it is
> unlikely that a PIE executable will ever hit anything in the cache when newly loaded, but it is much
> more likely with non-PIE executables.
> >
> 
> This is very good observation. Thanks!
> 
> And indeed it looks like _any_ non-PIE executable fails. See:
> 

Here is a brief summary of the issue and the solution:

TEXT_START_ADDR is the start address of an application's text segment. It is set to 0x10000 for RISC-V platforms.

So when an application is compiled with the static flag, it is loaded starting at 0x10000 and extends up to an end address that depends on the size of the application:

Entry point 0x101c0
There are 5 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
                 0x0000000000059b48 0x0000000000059b48  R E    0x1000
  LOAD           0x0000000000059b60 0x000000000006ab60 0x000000000006ab60
                 0x0000000000001f68 0x0000000000003528  RW     0x1000

So for the above statically compiled application, we can see that the entry point is 0x101c0 and the load address is 0x0000000000010000.

Andes cores have local memories (ILM and DLM) that are mapped to the region H'0_0003_0000 - H'0_0004_FFFF on the RZ/Five SoC. When a virtual address falls in this range, the MMU doesn't trigger a page fault and instead treats the virtual address as a physical address, so the application fails to run (it crashes somewhere).

To avoid this issue, we set TEXT_START_ADDR to 0x50000 so that the virtual addresses of a statically compiled application don't fall in the range H'0_0003_0000 - H'0_0004_FFFF.

Elf file type is EXEC (Executable file)
Entry point 0x504e4
There are 5 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000050000 0x0000000000050000
                 0x0000000000057dc8 0x0000000000057dc8  R E    0x1000
  LOAD           0x00000000000585b8 0x00000000000a95b8 0x00000000000a95b8
                 0x0000000000004ee0 0x00000000000064b0  RW     0x1000
  NOTE           0x0000000000000158 0x0000000000050158 0x0000000000050158
                 0x0000000000000044 0x0000000000000044  R      0x4

So now, with the fix, we can see that the statically compiled application is shifted: the entry point is 0x504e4 and the first LOAD segment starts at 0x0000000000050000. With this we can be sure the MMU will always trigger a page fault.
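
To illustrate the overlap, here is a minimal Python sketch (not part of the attached patch). It takes the (VirtAddr, MemSiz) pairs from the two readelf outputs above and checks them against the ILM/DLM window. The helper names are made up for this example, and treating the window as the inclusive range 0x30000 - 0x4FFFF is my reading of the addresses quoted above.

```python
# Sketch only: check whether ELF segments overlap the ILM/DLM window.
# Segment values are copied from the readelf outputs above;
# the function and variable names are hypothetical.

LOCAL_MEM_START = 0x30000  # H'0_0003_0000
LOCAL_MEM_END = 0x4FFFF    # H'0_0004_FFFF (inclusive)


def overlaps_local_memory(vaddr, memsz):
    """Return True if [vaddr, vaddr + memsz) intersects the ILM/DLM window."""
    if memsz == 0:
        return False
    last = vaddr + memsz - 1
    return vaddr <= LOCAL_MEM_END and last >= LOCAL_MEM_START


def any_overlap(segments):
    """Return True if any (vaddr, memsz) pair touches the window."""
    return any(overlaps_local_memory(v, m) for v, m in segments)


# (VirtAddr, MemSiz) with the default TEXT_START_ADDR=0x10000
before = [(0x10000, 0x59B48), (0x6AB60, 0x3528)]
# (VirtAddr, MemSiz) with TEXT_START_ADDR=0x50000
after = [(0x50000, 0x57DC8), (0xA95B8, 0x64B0), (0x50158, 0x44)]

print(any_overlap(before))  # True: the first LOAD segment covers the window
print(any_overlap(after))   # False: everything starts at or above 0x50000
```

The same check could be run on any binary by feeding it the LOAD segments reported by readelf.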

I have attached a binutils patch to this email. We plan to upstream it soon.

Cheers,
Prabhakar

[-- Attachment #2: 0001-ld-emulparams-elf32lriscv-defs.sh-Adjust-TEXT_START_.patch --]
[-- Type: application/octet-stream, Size: 1307 bytes --]

From 91b9d727d696701bc0fa09a66a91fbfe3a639e55 Mon Sep 17 00:00:00 2001
From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Date: Tue, 29 Nov 2022 11:14:08 +0000
Subject: [PATCH] ld: emulparams: elf32lriscv-defs.sh: Adjust TEXT_START_ADDR
 for RZ/Five
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With applications compiled with a static flag the virtual address for the
applications may fall within H’0_0003_0000 - H’0_0004_FFFF where the ILM
and DLM blocks of the AX45MP exist.

The MMU won't trigger page faults when the virtual address falls in the
range of AX45MP local memory. So to make sure statically compiled
applications run successfully, adjust the TEXT_START_ADDR to 0x50000.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
 ld/emulparams/elf32lriscv-defs.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ld/emulparams/elf32lriscv-defs.sh b/ld/emulparams/elf32lriscv-defs.sh
index 91015d44..bba34c17 100644
--- a/ld/emulparams/elf32lriscv-defs.sh
+++ b/ld/emulparams/elf32lriscv-defs.sh
@@ -26,7 +26,7 @@ case "$target" in
     ;;
 esac
 
-TEXT_START_ADDR=0x10000
+TEXT_START_ADDR=0x50000
 MAXPAGESIZE="CONSTANT (MAXPAGESIZE)"
 COMMONPAGESIZE="CONSTANT (COMMONPAGESIZE)"
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-11-29 18:57                                                 ` Prabhakar Mahadev Lad
@ 2022-12-10  7:23                                                   ` Jan Kiszka
  2022-12-10 20:25                                                     ` Pavel Machek
  2022-12-12 13:24                                                     ` Prabhakar Mahadev Lad
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Kiszka @ 2022-12-10  7:23 UTC (permalink / raw)
  To: Prabhakar Mahadev Lad, Pavel Machek, Ulrich Hecht
  Cc: cip-dev, Florian Bezdeka, Chris Paterson, Hung Tran

On 29.11.22 19:57, Prabhakar Mahadev Lad wrote:
> Hi All,
> 
>> -----Original Message-----
>> From: Pavel Machek <pavel@denx.de>
>> Sent: 13 October 2022 22:48
>> To: Ulrich Hecht <uli@fpond.eu>
>> Cc: cip-dev@lists.cip-project.org; Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com>;
>> Jan Kiszka <jan.kiszka@siemens.com>; Florian Bezdeka <florian.bezdeka@siemens.com>; Chris Paterson
>> <Chris.Paterson2@renesas.com>; Hung Tran <hung.tran.jy@renesas.com>; Pavel Machek <pavel@denx.de>
>> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
>>
>> Hi!
>>
>>>> On 10/12/2022 11:50 AM CEST Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> wrote:
>>>> I did a quick test with the patches pointed by Florian but unfortunately ldconfig still fails.
>>>
>>> I did some experiments on RZ/Five with this issue, and I'm almost positive that there is something
>> wrong (or doesn't work as documented) with the icache handling on this SoC.
>>>
>>> 1. The issue only affects non-PIE executables (there are very few of those, basically just ldconfig,
>> gcc, cpp and gcov* on the Debian system), and it occurs very early during the execution of the
>> program. According to the datasheet, the cache on the ax45mp-1c core is virtually indexed, so it is
>> unlikely that a PIE executable will ever hit anything in the cache when newly loaded, but it is much
>> more likely with non-PIE executables.
>>>
>>
>> This is very good observation. Thanks!
>>
>> And indeed it looks like _any_ non-PIE executable fails. See:
>>
> 
> Just a brief about the issue and solution:
> 
> TEXT_START_ADDR is the start of text segment of an application. This is being set to 0x10000 for RISCV platforms.
> 
> So when an application is compiled with the static flag the load would start from 0x10000 - xyz (depending on size of the application)
> 
> Entry point 0x101c0
> There are 5 program headers, starting at offset 64Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
>                  0x0000000000059b48 0x0000000000059b48  R E    0x1000
>   LOAD           0x0000000000059b60 0x000000000006ab60 0x000000000006ab60
>                  0x0000000000001f68 0x0000000000003528  RW     0x1000
> So for the above application which is compiled statically we can see the entry point is 0x101c0 and load 0x0000000000010000.
> 
> Andes cores have local memories ILM and DLM that are mapped in the region H'0_0003_0000 - H'0_0004_FFFF on the RZ/Five SoC. When the virtual address falls in this range the MMU doesnt trigger a page fault and assume the virtual address as physical address and hence the application fails to run (panics somewhere).
> 
> So to avoid this issue we set the TEXT_START_ADDR to 0x50000 so that virtual address of any statically compiled application doesnt fall in the range of H'0_0003_0000 - H'0_0004_FFFF.
> 
> Elf file type is EXEC (Executable file)
> Entry point 0x504e4
> There are 5 program headers, starting at offset 64
> 
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000000000 0x0000000000050000 0x0000000000050000
>                  0x0000000000057dc8 0x0000000000057dc8  R E    0x1000
>   LOAD           0x00000000000585b8 0x00000000000a95b8 0x00000000000a95b8
>                  0x0000000000004ee0 0x00000000000064b0  RW     0x1000
>   NOTE           0x0000000000000158 0x0000000000050158 0x0000000000050158
>                  0x0000000000000044 0x0000000000000044  R      0x4
> 
> So now with the fix for statically compiled application we can see its offsetted and entry point is 0x504e4 and load is at 0x0000000000050000. So with this we are for sure the MMU will always trigger a page fault.
> 
> I have attached a patch for binutils to the email. We plan to upstream this patch to binutils soon. 
> 

Good that the issue is understood and likely solved now. Make sure to
upstream this as quickly as possible. It targets a fundamental tool and
requires recompilation of many components. And Debian will freeze the
toolchain in early January - although:

"It is unlikely that the release arch of bookworm will include riscv64."
[1] :(

Jan

[1] https://lists.debian.org/debian-riscv/2022/12/msg00009.html

-- 
Siemens AG, Technology
Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-12-10  7:23                                                   ` Jan Kiszka
@ 2022-12-10 20:25                                                     ` Pavel Machek
  2022-12-12 13:51                                                       ` Prabhakar Mahadev Lad
  2022-12-12 13:24                                                     ` Prabhakar Mahadev Lad
  1 sibling, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2022-12-10 20:25 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Prabhakar Mahadev Lad, Pavel Machek, Ulrich Hecht, cip-dev,
	Florian Bezdeka, Chris Paterson, Hung Tran

[-- Attachment #1: Type: text/plain, Size: 2751 bytes --]

Hi!

> >> This is very good observation. Thanks!
> >>
> >> And indeed it looks like _any_ non-PIE executable fails. See:
> >>
> > 
> > Just a brief about the issue and solution:
> > 
> > TEXT_START_ADDR is the start of text segment of an application. This is being set to 0x10000 for RISCV platforms.
> > 
> > So when an application is compiled with the static flag the load would start from 0x10000 - xyz (depending on size of the application)
> > 
> > Entry point 0x101c0
> > There are 5 program headers, starting at offset 64Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
> >                  0x0000000000059b48 0x0000000000059b48  R E    0x1000
> >   LOAD           0x0000000000059b60 0x000000000006ab60 0x000000000006ab60
> >                  0x0000000000001f68 0x0000000000003528  RW     0x1000
> > So for the above application which is compiled statically we can see the entry point is 0x101c0 and load 0x0000000000010000.
> > 
> > Andes cores have local memories ILM and DLM that are mapped in the
> >>region H'0_0003_0000 - H'0_0004_FFFF on the RZ/Five SoC. When the
> >>virtual address falls in this range the MMU doesnt trigger a page
> >>fault and assume the virtual address as physical address and hence
> >>the application fails to run (panics somewhere).

...

> Good that the issue is understood and likely solved now. Make sure to
> upstream this as quickly as possible. It targets a fundamental tool and
> requires recompilation of many components. And Debian will freeze the
> toolchain in early January - although:
> 
> "It is unlikely that the release arch of bookworm will include riscv64."
> [1] :(

I'm pretty sure this is not a complete fix. Yes, we should change the
toolchain, but the problem is really in the hardware: you can't just
take part of the _virtual_ address space and reserve it. Not if you want
to claim the board is riscv64 compatible. Someone else (manual mmap, some
kind of JIT, some kind of emulator) might want normal RAM there.

I believe this is quite important and should be solved in hardware (at
least in the next generation).

Can ILM/DLM be disabled?

If we cannot fix it at the hardware level, we'll really need to prevent
attempts to map anything in that virtual memory range. A clear -EPERM
from mmap is better than strange behaviour at runtime, and it is
a must-have from a security perspective.

Best regards,
								Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-12-10  7:23                                                   ` Jan Kiszka
  2022-12-10 20:25                                                     ` Pavel Machek
@ 2022-12-12 13:24                                                     ` Prabhakar Mahadev Lad
  1 sibling, 0 replies; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-12-12 13:24 UTC (permalink / raw)
  To: Jan Kiszka, Pavel Machek, Ulrich Hecht
  Cc: cip-dev, Florian Bezdeka, Chris Paterson, Hung Tran

Hi Jan,

> -----Original Message-----
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: 10 December 2022 07:23
> To: Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com>; Pavel Machek <pavel@denx.de>;
> Ulrich Hecht <uli@fpond.eu>
> Cc: cip-dev@lists.cip-project.org; Florian Bezdeka <florian.bezdeka@siemens.com>; Chris Paterson
> <Chris.Paterson2@renesas.com>; Hung Tran <hung.tran.jy@renesas.com>
> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
> 
> On 29.11.22 19:57, Prabhakar Mahadev Lad wrote:
> > Hi All,
> >
> >> -----Original Message-----
> >> From: Pavel Machek <pavel@denx.de>
> >> Sent: 13 October 2022 22:48
> >> To: Ulrich Hecht <uli@fpond.eu>
> >> Cc: cip-dev@lists.cip-project.org; Prabhakar Mahadev Lad
> >> <prabhakar.mahadev-lad.rj@bp.renesas.com>;
> >> Jan Kiszka <jan.kiszka@siemens.com>; Florian Bezdeka
> >> <florian.bezdeka@siemens.com>; Chris Paterson
> >> <Chris.Paterson2@renesas.com>; Hung Tran <hung.tran.jy@renesas.com>;
> >> Pavel Machek <pavel@denx.de>
> >> Subject: Re: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing
> >> isar-cip-core for RZ/Five
> >>
> >> Hi!
> >>
> >>>> On 10/12/2022 11:50 AM CEST Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> wrote:
> >>>> I did a quick test with the patches pointed by Florian but unfortunately ldconfig still fails.
> >>>
> >>> I did some experiments on RZ/Five with this issue, and I'm almost
> >>> positive that there is something
> >> wrong (or doesn't work as documented) with the icache handling on this SoC.
> >>>
> >>> 1. The issue only affects non-PIE executables (there are very few of
> >>> those, basically just ldconfig,
> >> gcc, cpp and gcov* on the Debian system), and it occurs very early
> >> during the execution of the program. According to the datasheet, the
> >> cache on the ax45mp-1c core is virtually indexed, so it is unlikely
> >> that a PIE executable will ever hit anything in the cache when newly loaded, but it is much more
> likely with non-PIE executables.
> >>>
> >>
> >> This is very good observation. Thanks!
> >>
> >> And indeed it looks like _any_ non-PIE executable fails. See:
> >>
> >
> > Just a brief about the issue and solution:
> >
> > TEXT_START_ADDR is the start of text segment of an application. This is being set to 0x10000 for
> RISCV platforms.
> >
> > So when an application is compiled with the static flag the load would
> > start from 0x10000 - xyz (depending on size of the application)
> >
> > Entry point 0x101c0
> > There are 5 program headers, starting at offset 64Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
> >                  0x0000000000059b48 0x0000000000059b48  R E    0x1000
> >   LOAD           0x0000000000059b60 0x000000000006ab60 0x000000000006ab60
> >                  0x0000000000001f68 0x0000000000003528  RW     0x1000
> > So for the above application which is compiled statically we can see the entry point is 0x101c0 and
> load 0x0000000000010000.
> >
> > Andes cores have local memories ILM and DLM that are mapped in the region H'0_0003_0000 -
> H'0_0004_FFFF on the RZ/Five SoC. When the virtual address falls in this range the MMU doesnt trigger
> a page fault and assume the virtual address as physical address and hence the application fails to run
> (panics somewhere).
> >
> > So to avoid this issue we set the TEXT_START_ADDR to 0x50000 so that virtual address of any
> statically compiled application doesnt fall in the range of H'0_0003_0000 - H'0_0004_FFFF.
> >
> > Elf file type is EXEC (Executable file)
> > Entry point 0x504e4
> > There are 5 program headers, starting at offset 64
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   LOAD           0x0000000000000000 0x0000000000050000 0x0000000000050000
> >                  0x0000000000057dc8 0x0000000000057dc8  R E    0x1000
> >   LOAD           0x00000000000585b8 0x00000000000a95b8 0x00000000000a95b8
> >                  0x0000000000004ee0 0x00000000000064b0  RW     0x1000
> >   NOTE           0x0000000000000158 0x0000000000050158 0x0000000000050158
> >                  0x0000000000000044 0x0000000000000044  R      0x4
> >
> > So now with the fix for statically compiled application we can see its offsetted and entry point is
> 0x504e4 and load is at 0x0000000000050000. So with this we are for sure the MMU will always trigger a
> page fault.
> >
> > I have attached a patch for binutils to the email. We plan to upstream this patch to binutils soon.
> >
> 
> Good that the issue is understood and likely solved now. Make sure to upstream this as quickly as
> possible. It targets a fundamental tool and requires recompilation of many components. And Debian will
> freeze the toolchain in early January - although:
> 
Yes, the plan is to get the TEXT_START_ADDR change upstream ASAP.

Cheers,
Prabhakar

> "It is unlikely that the release arch of bookworm will include riscv64."
> [1] :(
> 
> Jan
> 
> [1] https://lists.debian.org/debian-riscv/2022/12/msg00009.html
> 
> --
> Siemens AG, Technology
> Competence Center Embedded Linux



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
  2022-12-10 20:25                                                     ` Pavel Machek
@ 2022-12-12 13:51                                                       ` Prabhakar Mahadev Lad
  0 siblings, 0 replies; 39+ messages in thread
From: Prabhakar Mahadev Lad @ 2022-12-12 13:51 UTC (permalink / raw)
  To: Pavel Machek, Jan Kiszka
  Cc: Ulrich Hecht, cip-dev, Florian Bezdeka, Chris Paterson, Hung Tran

Hi Pavel,

> -----Original Message-----
> From: Pavel Machek <pavel@denx.de>
> Sent: 10 December 2022 20:26
> To: Jan Kiszka <jan.kiszka@siemens.com>
> Cc: Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com>; Pavel Machek <pavel@denx.de>;
> Ulrich Hecht <uli@fpond.eu>; cip-dev@lists.cip-project.org; Florian Bezdeka
> <florian.bezdeka@siemens.com>; Chris Paterson <Chris.Paterson2@renesas.com>; Hung Tran
> <hung.tran.jy@renesas.com>
> Subject: Re: RE: [cip-dev] ldconfig segfault on RZ/Five was Re: Preparing isar-cip-core for RZ/Five
> 
> Hi!
> 
> > >> This is very good observation. Thanks!
> > >>
> > >> And indeed it looks like _any_ non-PIE executable fails. See:
> > >>
> > >
> > > Just a brief about the issue and solution:
> > >
> > > TEXT_START_ADDR is the start of text segment of an application. This is being set to 0x10000 for
> RISCV platforms.
> > >
> > > So when an application is compiled with the static flag the load
> > > would start from 0x10000 - xyz (depending on size of the
> > > application)
> > >
> > > Entry point 0x101c0
> > > There are 5 program headers, starting at offset 64Program Headers:
> > >   Type           Offset             VirtAddr           PhysAddr
> > >                  FileSiz            MemSiz              Flags  Align
> > >   LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
> > >                  0x0000000000059b48 0x0000000000059b48  R E    0x1000
> > >   LOAD           0x0000000000059b60 0x000000000006ab60 0x000000000006ab60
> > >                  0x0000000000001f68 0x0000000000003528  RW     0x1000
> > > So for the above application which is compiled statically we can see the entry point is 0x101c0
> and load 0x0000000000010000.
> > >
> > > Andes cores have local memories ILM and DLM that are mapped in the
> > >>region H'0_0003_0000 - H'0_0004_FFFF on the RZ/Five SoC. When the
> > >>virtual address falls in this range the MMU doesnt trigger a page
> > >>fault and assume the virtual address as physical address and hence
> > >>the application fails to run (panics somewhere).
> 
> ...
> 
> > Good that the issue is understood and likely solved now. Make sure to
> > upstream this as quickly as possible. It targets a fundamental tool
> > and requires recompilation of many components. And Debian will freeze
> > the toolchain in early January - although:
> >
> > "It is unlikely that the release arch of bookworm will include riscv64."
> > [1] :(
> 
> I'm pretty sure this is not complete fix. Yes, we should change the toolchain, but the problem is
> really in the hardware: you can't just take part of _virtual_ address space and reserve it. Not if you
> want to claim board is riscv64 compatible. Someone else (manual mmap, some kind of JIT, some kind of
> emulator) might want normal RAM there.
> 
> I believe this is quite important and should be solved in hardware (at least in next generation).
> 
Agreed.

> Can ILM/DLM be disabled?
> 
Unfortunately, ILM/DLM cannot be disabled in the current version of the HW.

> If we can not fix it at hardware level, we'll really need to prevent attempts to map anything at that
> virtual memory range. Clear -EPERM from mmap is better than strange behaviour at runtime, and it is
> must-have from security perspective.
> 
Yes, we need to come up with a solution that stops users from mapping within this virtual memory range.
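
As a rough illustration of the kind of check this would need, here is a small Python sketch. It is not kernel code and not an existing interface: the function name and the choice of return values are assumptions made for the example. It rejects any requested mapping that touches H'0_0003_0000 - H'0_0004_FFFF with -EPERM, as suggested above.

```python
# Sketch only: a hypothetical policy check, not an existing kernel API.
import errno

RESERVED_START = 0x30000  # H'0_0003_0000
RESERVED_END = 0x50000    # exclusive end, i.e. H'0_0004_FFFF + 1


def check_mapping(addr, length):
    """Return 0 if the mapping is allowed, or a negative errno value."""
    if length <= 0:
        return -errno.EINVAL
    end = addr + length  # exclusive end of the requested mapping
    # Two half-open ranges intersect if each starts before the other ends.
    if addr < RESERVED_END and end > RESERVED_START:
        return -errno.EPERM
    return 0


print(check_mapping(0x10000, 0x1000))   # allowed, below the window
print(check_mapping(0x2F000, 0x2000))   # rejected, runs into the window
print(check_mapping(0x50000, 0x1000))   # allowed, starts right after it
```

Where such a check would actually live (mmap path, ELF loader, or both) is still to be decided.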

Cheers,
Prabhakar


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2022-12-12 13:52 UTC | newest]

Thread overview: 39+ messages
2022-10-03 18:36 Preparing isar-cip-core for RZ/Five Jan Kiszka
2022-10-03 20:12 ` Chris Paterson
2022-10-04  7:15   ` Jan Kiszka
2022-10-04 18:28     ` Jan Kiszka
2022-10-04 19:36       ` Jan Kiszka
2022-10-04 19:47         ` Jan Kiszka
2022-10-05 18:21           ` Pavel Machek
2022-10-06  6:29             ` Jan Kiszka
2022-10-06  6:49               ` Jan Kiszka
2022-10-06  7:07                 ` Jan Kiszka
2022-10-06  7:08                 ` Prabhakar Mahadev Lad
2022-10-06  7:26                   ` Jan Kiszka
2022-10-06 11:43               ` Pavel Machek
2022-10-06 11:51                 ` Jan Kiszka
2022-10-06 22:07                   ` ldconfig segfault on RZ/Five was " Pavel Machek
2022-10-06 22:32                     ` Pavel Machek
2022-10-07  8:18                       ` Jan Kiszka
2022-10-07 10:19                         ` Pavel Machek
2022-10-08  8:27                           ` Jan Kiszka
2022-10-09  8:29                             ` Jan Kiszka
2022-10-09  8:42                               ` [cip-dev] " Biju Das
2022-10-11  9:30                                 ` Jan Kiszka
2022-10-11 10:34                                   ` Biju Das
2022-10-11 18:51                                     ` Florian Bezdeka
2022-10-11 20:15                                       ` Jan Kiszka
2022-10-11 20:48                                         ` Prabhakar Mahadev Lad
2022-10-12  9:50                                           ` Prabhakar Mahadev Lad
2022-10-13  8:36                                             ` Ulrich Hecht
2022-10-13 10:35                                               ` Pavel Machek
2022-10-13 21:47                                               ` Pavel Machek
2022-11-29 18:57                                                 ` Prabhakar Mahadev Lad
2022-12-10  7:23                                                   ` Jan Kiszka
2022-12-10 20:25                                                     ` Pavel Machek
2022-12-12 13:51                                                       ` Prabhakar Mahadev Lad
2022-12-12 13:24                                                     ` Prabhakar Mahadev Lad
2022-10-09 19:20                               ` Chris Paterson
2022-10-05  5:43       ` [cip-dev] " Biju Das
2022-10-04 22:30   ` Prabhakar Mahadev Lad
2022-10-05  5:45     ` Jan Kiszka
