All of lore.kernel.org
 help / color / mirror / Atom feed
From: <Conor.Dooley@microchip.com>
To: <nathan@kernel.org>
Cc: <ndesaulniers@google.com>, <llvm@lists.linux.dev>,
	<mturquette@baylibre.com>, <sboyd@kernel.org>,
	<robh+dt@kernel.org>, <krzysztof.kozlowski+dt@linaro.org>,
	<paul.walmsley@sifive.com>, <aou@eecs.berkeley.edu>,
	<linux-clk@vger.kernel.org>, <devicetree@vger.kernel.org>,
	<p.zabel@pengutronix.de>, <linux-kernel@vger.kernel.org>,
	<linux-riscv@lists.infradead.org>, <Daire.McNamara@microchip.com>,
	<palmer@dabbelt.com>
Subject: Re: [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups
Date: Wed, 17 Aug 2022 12:17:11 +0000	[thread overview]
Message-ID: <62bb08d7-0a14-13e4-5ca2-edb5ef3b9793@microchip.com> (raw)
In-Reply-To: <42354fef-8ca8-a81d-6af9-f250ecfd1924@microchip.com>

On 14/08/2022 12:41, Conor.Dooley@microchip.com wrote:
>>
>> Doesn't really matter since thats long enough to get past the switch
>> out of earlycon which is where the clang built kernel dies.

I had forgotten something obvious: keep_bootcon. With that, on the sbi
console I get a nice:

[    0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1
[    0.591520] Oops [#1]
[    0.594045] Modules linked in:
[    0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1
[    0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
[    0.613012] epc : __clk_register+0x4a6/0x85c
[    0.617759]  ra : __clk_register+0x49e/0x85c
[    0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720
[    0.630466]  gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800
[    0.638443]  t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0
[    0.646420]  s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000
[    0.654396]  a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000
[    0.662373]  a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006
[    0.670350]  s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028
[    0.678327]  s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100
[    0.686304]  s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008
[    0.694281]  s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007
[    0.702258]  t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0
[    0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d
[    0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa
[    0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244
[    0.729592] [<ffffffff80457dea>] platform_probe+0x82/0xa6
[    0.735581] [<ffffffff8045593c>] call_driver_probe+0x22/0xa4
[    0.741848] [<ffffffff804557da>] really_probe+0x13a/0x27a
[    0.747819] [<ffffffff804549f8>] __driver_probe_device+0xc4/0xee
[    0.754460] [<ffffffff804554d0>] driver_probe_device+0x3c/0x196
[    0.761013] [<ffffffff804552cc>] __device_attach_driver+0xa2/0x18c
[    0.767853] [<ffffffff8045284a>] bus_for_each_drv+0x76/0xba
[    0.774016] [<ffffffff804547ca>] __device_attach+0xa0/0x15a
[    0.780179] [<ffffffff80454896>] device_initial_probe+0x12/0x1a
[    0.786732] [<ffffffff804529c6>] bus_probe_device+0x2e/0x7c
[    0.792895] [<ffffffff8044f290>] device_add+0x21a/0x3aa
[    0.798693] [<ffffffff805c323a>] of_device_add+0x28/0x32
[    0.804578] [<ffffffff805c3c54>] of_platform_device_create_pdata+0x88/0xb2
[    0.812181] [<ffffffff805c3e70>] of_platform_bus_create+0x14a/0x1a8
[    0.819117] [<ffffffff805c3ea6>] of_platform_bus_create+0x180/0x1a8
[    0.826053] [<ffffffff805c3f2a>] of_platform_populate+0x5c/0x96
[    0.832606] [<ffffffff8082bbaa>] of_platform_default_populate_init+0xc2/0xd4
[    0.840400] [<ffffffff800020e8>] do_one_initcall+0xbc/0x216
[    0.846563] [<ffffffff8080126c>] do_initcall_level+0x82/0x94
[    0.852830] [<ffffffff808011ae>] do_initcalls+0x5c/0x98
[    0.858611] [<ffffffff8080114a>] do_basic_setup+0x20/0x28
[    0.864583] [<ffffffff808010e0>] kernel_init_freeable+0xba/0x104
[    0.871223] [<ffffffff80746856>] kernel_init+0x22/0x1ba
[    0.877013] [<ffffffff80003700>] ret_from_exception+0x0/0xc
[    0.883218] ---[ end trace 0000000000000000 ]---

At least I had somewhere to start. Since QEMU repros, I continued there with:
	$(QEMU)/qemu-system-riscv64 -s -S -M microchip-icicle-kit \
		-m 2G -smp 5 \
		-kernel $(vmlinux_bin) \
		-dtb $(dtb) \
		-initrd $(initramfs) \
		-display none -serial null \
		-serial stdio

Dumping the backtrace at the devm_clk_hw_register() callsites shows that
it passes for:
- the main pll in mpfs_clk_register_mssplls()
- the "cfg" clocks in mpfs_clk_register_cfgs()
- the first 4 periph clocks...

It fails on "clk_periph_timer" - which uses a different parent, that it
tries to find using the macro:
#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw)

If parent is RTCREF, that becomes: &mpfs_cfg_clks[33].cfg.hw

And yeah, thats well beyond the end of the array!

Booting with gcc, and looking at the clock summary:
  mssrefclk                            1        1        0   125000000          0     0  50000         Y
     clk_rtcref                        0        0        0     1000000          0     0  50000         Y
        clk_periph_timer               0        0        0     1000000          0     0  50000         N

This is reported correctly, so how this works for either gcc or clang now, or
gcc with the patches applied, I am not really too sure..

On v6.0-rc1, gcc and clang show:
  mssrefclk                            1        1        0   125000000          0     0  50000         Y
     clk_rtcref                        0        0        0     1000000          0     0  50000         Y

I guess this may fail gracefully as the pointer is actually null, but
the extra level of indirection added by my patch started tripping up
the error checking. I must just not have noticed because I added RTCREF
in a bit of a hurry trying to fix a bug in time for 5.18:
https://lore.kernel.org/linux-clk/986c73df-9634-d18b-eed3-37584fa2ea89@conchuod.ie/#t

Obviously I've got to fix the bug itself now, but now I am left wondering
about GCC's behaviour rather than clang/llvm's!

Thanks Nathan - if even just for being a rubber duck :)

Conor.

>>
>> Didn't get a chance to look at disassembly etc today, but as I said
>> last night it reproduces with GNU binutils.
>>
>> Thanks,
>> Conor.
>>
>> On another note, brought up our QEMU port's state today so fixing
>> it is now on the good ole, ever expanding todo list :)
> 


WARNING: multiple messages have this Message-ID (diff)
From: <Conor.Dooley@microchip.com>
To: <nathan@kernel.org>
Cc: <ndesaulniers@google.com>, <llvm@lists.linux.dev>,
	<mturquette@baylibre.com>, <sboyd@kernel.org>,
	<robh+dt@kernel.org>, <krzysztof.kozlowski+dt@linaro.org>,
	<paul.walmsley@sifive.com>, <aou@eecs.berkeley.edu>,
	<linux-clk@vger.kernel.org>, <devicetree@vger.kernel.org>,
	<p.zabel@pengutronix.de>, <linux-kernel@vger.kernel.org>,
	<linux-riscv@lists.infradead.org>, <Daire.McNamara@microchip.com>,
	<palmer@dabbelt.com>
Subject: Re: [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups
Date: Wed, 17 Aug 2022 12:17:11 +0000	[thread overview]
Message-ID: <62bb08d7-0a14-13e4-5ca2-edb5ef3b9793@microchip.com> (raw)
In-Reply-To: <42354fef-8ca8-a81d-6af9-f250ecfd1924@microchip.com>

On 14/08/2022 12:41, Conor.Dooley@microchip.com wrote:
>>
>> Doesn't really matter since thats long enough to get past the switch
>> out of earlycon which is where the clang built kernel dies.

I had forgotten something obvious: keep_bootcon. With that, on the sbi
console I get a nice:

[    0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1
[    0.591520] Oops [#1]
[    0.594045] Modules linked in:
[    0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1
[    0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
[    0.613012] epc : __clk_register+0x4a6/0x85c
[    0.617759]  ra : __clk_register+0x49e/0x85c
[    0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720
[    0.630466]  gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800
[    0.638443]  t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0
[    0.646420]  s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000
[    0.654396]  a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000
[    0.662373]  a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006
[    0.670350]  s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028
[    0.678327]  s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100
[    0.686304]  s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008
[    0.694281]  s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007
[    0.702258]  t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0
[    0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d
[    0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa
[    0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244
[    0.729592] [<ffffffff80457dea>] platform_probe+0x82/0xa6
[    0.735581] [<ffffffff8045593c>] call_driver_probe+0x22/0xa4
[    0.741848] [<ffffffff804557da>] really_probe+0x13a/0x27a
[    0.747819] [<ffffffff804549f8>] __driver_probe_device+0xc4/0xee
[    0.754460] [<ffffffff804554d0>] driver_probe_device+0x3c/0x196
[    0.761013] [<ffffffff804552cc>] __device_attach_driver+0xa2/0x18c
[    0.767853] [<ffffffff8045284a>] bus_for_each_drv+0x76/0xba
[    0.774016] [<ffffffff804547ca>] __device_attach+0xa0/0x15a
[    0.780179] [<ffffffff80454896>] device_initial_probe+0x12/0x1a
[    0.786732] [<ffffffff804529c6>] bus_probe_device+0x2e/0x7c
[    0.792895] [<ffffffff8044f290>] device_add+0x21a/0x3aa
[    0.798693] [<ffffffff805c323a>] of_device_add+0x28/0x32
[    0.804578] [<ffffffff805c3c54>] of_platform_device_create_pdata+0x88/0xb2
[    0.812181] [<ffffffff805c3e70>] of_platform_bus_create+0x14a/0x1a8
[    0.819117] [<ffffffff805c3ea6>] of_platform_bus_create+0x180/0x1a8
[    0.826053] [<ffffffff805c3f2a>] of_platform_populate+0x5c/0x96
[    0.832606] [<ffffffff8082bbaa>] of_platform_default_populate_init+0xc2/0xd4
[    0.840400] [<ffffffff800020e8>] do_one_initcall+0xbc/0x216
[    0.846563] [<ffffffff8080126c>] do_initcall_level+0x82/0x94
[    0.852830] [<ffffffff808011ae>] do_initcalls+0x5c/0x98
[    0.858611] [<ffffffff8080114a>] do_basic_setup+0x20/0x28
[    0.864583] [<ffffffff808010e0>] kernel_init_freeable+0xba/0x104
[    0.871223] [<ffffffff80746856>] kernel_init+0x22/0x1ba
[    0.877013] [<ffffffff80003700>] ret_from_exception+0x0/0xc
[    0.883218] ---[ end trace 0000000000000000 ]---

At least I had somewhere to start. Since QEMU repros, I continued there with:
	$(QEMU)/qemu-system-riscv64 -s -S -M microchip-icicle-kit \
		-m 2G -smp 5 \
		-kernel $(vmlinux_bin) \
		-dtb $(dtb) \
		-initrd $(initramfs) \
		-display none -serial null \
		-serial stdio

Dumping the backtrace at the devm_clk_hw_register() callsites shows that
it passes for:
- the main pll in mpfs_clk_register_mssplls()
- the "cfg" clocks in mpfs_clk_register_cfgs()
- the first 4 periph clocks...

It fails on "clk_periph_timer" - which uses a different parent, that it
tries to find using the macro:
#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw)

If parent is RTCREF, that becomes: &mpfs_cfg_clks[33].cfg.hw

And yeah, thats well beyond the end of the array!

Booting with gcc, and looking at the clock summary:
  mssrefclk                            1        1        0   125000000          0     0  50000         Y
     clk_rtcref                        0        0        0     1000000          0     0  50000         Y
        clk_periph_timer               0        0        0     1000000          0     0  50000         N

This is reported correctly, so how this works for either gcc or clang now, or
gcc with the patches applied, I am not really too sure..

On v6.0-rc1, gcc and clang show:
  mssrefclk                            1        1        0   125000000          0     0  50000         Y
     clk_rtcref                        0        0        0     1000000          0     0  50000         Y

I guess this may fail gracefully as the pointer is actually null, but
the extra level of indirection added by my patch started tripping up
the error checking. I must just not have noticed because I added RTCREF
in a bit of a hurry trying to fix a bug in time for 5.18:
https://lore.kernel.org/linux-clk/986c73df-9634-d18b-eed3-37584fa2ea89@conchuod.ie/#t

Obviously I've got to fix the bug itself now, but now I am left wondering
about GCC's behaviour rather than clang/llvm's!

Thanks Nathan - if even just for being a rubber duck :)

Conor.

>>
>> Didn't get a chance to look at disassembly etc today, but as I said
>> last night it reproduces with GNU binutils.
>>
>> Thanks,
>> Conor.
>>
>> On another note, brought up our QEMU port's state today so fixing
>> it is now on the good ole, ever expanding todo list :)
> 

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2022-08-17 12:17 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-04 12:15 [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups Conor Dooley
2022-07-04 12:15 ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 01/12] dt-bindings: clk: microchip: mpfs: add reset controller support Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 02/12] clk: microchip: mpfs: add reset controller Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 03/12] reset: add polarfire soc reset support Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-18 11:34   ` Conor.Dooley
2022-07-18 11:34     ` Conor.Dooley
2022-07-18 15:42     ` Philipp Zabel
2022-07-18 15:42       ` Philipp Zabel
2022-07-18 15:44       ` Conor.Dooley
2022-07-18 15:44         ` Conor.Dooley
2022-07-04 12:15 ` [PATCH v2 04/12] MAINTAINERS: add polarfire soc reset controller Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 05/12] riscv: dts: microchip: add mpfs specific macb reset support Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 06/12] clk: microchip: mpfs: add module_authors entries Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 07/12] clk: microchip: mpfs: add MSS pll's set & round rate Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 08/12] clk: microchip: mpfs: move id & offset out of clock structs Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 09/12] clk: microchip: mpfs: simplify control reg access Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 10/12] clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo() Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 11/12] clk: microchip: mpfs: convert cfg_clk to clk_divider Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-04 12:15 ` [PATCH v2 12/12] clk: microchip: mpfs: convert periph_clk to clk_gate Conor Dooley
2022-07-04 12:15   ` Conor Dooley
2022-07-20 13:46 ` [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups Conor.Dooley
2022-07-20 13:46   ` Conor.Dooley
2022-08-09 23:05 ` Conor.Dooley
2022-08-09 23:05   ` Conor.Dooley
2022-08-10 18:56   ` Nathan Chancellor
2022-08-10 18:56     ` Nathan Chancellor
2022-08-10 19:20     ` Conor.Dooley
2022-08-10 19:20       ` Conor.Dooley
2022-08-10 19:32       ` Nathan Chancellor
2022-08-10 19:32         ` Nathan Chancellor
2022-08-10 19:43         ` Conor.Dooley
2022-08-10 19:43           ` Conor.Dooley
2022-08-11 13:13           ` Conor.Dooley
2022-08-11 13:13             ` Conor.Dooley
2022-08-14 11:41             ` Conor.Dooley
2022-08-14 11:41               ` Conor.Dooley
2022-08-17 12:17               ` Conor.Dooley [this message]
2022-08-17 12:17                 ` Conor.Dooley
2022-08-17 16:18                 ` Nathan Chancellor
2022-08-17 16:18                   ` Nathan Chancellor
2022-08-10 11:50 ` Daire.McNamara
2022-08-10 11:50   ` Daire.McNamara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=62bb08d7-0a14-13e4-5ca2-edb5ef3b9793@microchip.com \
    --to=conor.dooley@microchip.com \
    --cc=Daire.McNamara@microchip.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=devicetree@vger.kernel.org \
    --cc=krzysztof.kozlowski+dt@linaro.org \
    --cc=linux-clk@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=llvm@lists.linux.dev \
    --cc=mturquette@baylibre.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=p.zabel@pengutronix.de \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=robh+dt@kernel.org \
    --cc=sboyd@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.