From: <Conor.Dooley@microchip.com> To: <nathan@kernel.org> Cc: <ndesaulniers@google.com>, <llvm@lists.linux.dev>, <mturquette@baylibre.com>, <sboyd@kernel.org>, <robh+dt@kernel.org>, <krzysztof.kozlowski+dt@linaro.org>, <paul.walmsley@sifive.com>, <aou@eecs.berkeley.edu>, <linux-clk@vger.kernel.org>, <devicetree@vger.kernel.org>, <p.zabel@pengutronix.de>, <linux-kernel@vger.kernel.org>, <linux-riscv@lists.infradead.org>, <Daire.McNamara@microchip.com>, <palmer@dabbelt.com> Subject: Re: [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups Date: Wed, 17 Aug 2022 12:17:11 +0000 [thread overview] Message-ID: <62bb08d7-0a14-13e4-5ca2-edb5ef3b9793@microchip.com> (raw) In-Reply-To: <42354fef-8ca8-a81d-6af9-f250ecfd1924@microchip.com> On 14/08/2022 12:41, Conor.Dooley@microchip.com wrote: >> >> Doesn't really matter since thats long enough to get past the switch >> out of earlycon which is where the clang built kernel dies. I had forgotten something obvious: keep_bootcon. With that, on the sbi console I get a nice: [ 0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1 [ 0.591520] Oops [#1] [ 0.594045] Modules linked in: [ 0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1 [ 0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) [ 0.613012] epc : __clk_register+0x4a6/0x85c [ 0.617759] ra : __clk_register+0x49e/0x85c [ 0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720 [ 0.630466] gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800 [ 0.638443] t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0 [ 0.646420] s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000 [ 0.654396] a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.662373] a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006 [ 0.670350] s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028 [ 0.678327] s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100 [ 0.686304] s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008 [ 0.694281] s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007 [ 0.702258] t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0 [ 0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d [ 0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa [ 0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244 [ 0.729592] [<ffffffff80457dea>] platform_probe+0x82/0xa6 [ 0.735581] [<ffffffff8045593c>] call_driver_probe+0x22/0xa4 [ 0.741848] [<ffffffff804557da>] really_probe+0x13a/0x27a [ 0.747819] [<ffffffff804549f8>] __driver_probe_device+0xc4/0xee [ 0.754460] [<ffffffff804554d0>] driver_probe_device+0x3c/0x196 [ 0.761013] [<ffffffff804552cc>] __device_attach_driver+0xa2/0x18c [ 0.767853] [<ffffffff8045284a>] bus_for_each_drv+0x76/0xba [ 0.774016] [<ffffffff804547ca>] __device_attach+0xa0/0x15a [ 0.780179] [<ffffffff80454896>] device_initial_probe+0x12/0x1a [ 0.786732] [<ffffffff804529c6>] bus_probe_device+0x2e/0x7c [ 0.792895] [<ffffffff8044f290>] device_add+0x21a/0x3aa [ 0.798693] [<ffffffff805c323a>] of_device_add+0x28/0x32 [ 0.804578] [<ffffffff805c3c54>] of_platform_device_create_pdata+0x88/0xb2 [ 0.812181] [<ffffffff805c3e70>] of_platform_bus_create+0x14a/0x1a8 [ 0.819117] [<ffffffff805c3ea6>] of_platform_bus_create+0x180/0x1a8 [ 0.826053] [<ffffffff805c3f2a>] of_platform_populate+0x5c/0x96 [ 0.832606] [<ffffffff8082bbaa>] of_platform_default_populate_init+0xc2/0xd4 [ 0.840400] [<ffffffff800020e8>] do_one_initcall+0xbc/0x216 [ 0.846563] [<ffffffff8080126c>] do_initcall_level+0x82/0x94 [ 0.852830] [<ffffffff808011ae>] do_initcalls+0x5c/0x98 [ 0.858611] [<ffffffff8080114a>] do_basic_setup+0x20/0x28 [ 0.864583] [<ffffffff808010e0>] kernel_init_freeable+0xba/0x104 [ 0.871223] [<ffffffff80746856>] kernel_init+0x22/0x1ba [ 0.877013] [<ffffffff80003700>] ret_from_exception+0x0/0xc [ 0.883218] ---[ end trace 0000000000000000 ]--- At least I had somewhere to start. Since QEMU repros, I continued there with: $(QEMU)/qemu-system-riscv64 -s -S -M microchip-icicle-kit \ -m 2G -smp 5 \ -kernel $(vmlinux_bin) \ -dtb $(dtb) \ -initrd $(initramfs) \ -display none -serial null \ -serial stdio Dumping the backtrace at the devm_clk_hw_register() callsites shows that it passes for: - the main pll in mpfs_clk_register_mssplls() - the "cfg" clocks in mpfs_clk_register_cfgs() - the first 4 periph clocks... It fails on "clk_periph_timer" - which uses a different parent, that it tries to find using the macro: #define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw) If parent is RTCREF, that becomes: &mpfs_cfg_clks[33].cfg.hw And yeah, thats well beyond the end of the array! Booting with gcc, and looking at the clock summary: mssrefclk 1 1 0 125000000 0 0 50000 Y clk_rtcref 0 0 0 1000000 0 0 50000 Y clk_periph_timer 0 0 0 1000000 0 0 50000 N This is reported correctly, so how this works for either gcc or clang now, or gcc with the patches applied, I am not really too sure.. On v6.0-rc1, gcc and clang show: mssrefclk 1 1 0 125000000 0 0 50000 Y clk_rtcref 0 0 0 1000000 0 0 50000 Y I guess this may fail gracefully as the pointer is actually null, but the extra level of indirection added by my patch started tripping up the error checking. I must just not have noticed because I added RTCREF in a bit of a hurry trying to fix a bug in time for 5.18: https://lore.kernel.org/linux-clk/986c73df-9634-d18b-eed3-37584fa2ea89@conchuod.ie/#t Obviously I've got to fix the bug itself now, but now I am left wondering about GCC's behaviour rather than clang/llvm's! Thanks Nathan - if even just for being a rubber duck :) Conor. >> >> Didn't get a chance to look at disassembly etc today, but as I said >> last night it reproduces with GNU binutils. >> >> Thanks, >> Conor. >> >> On another note, brought up our QEMU port's state today so fixing >> it is now on the good ole, ever expanding todo list :) >
WARNING: multiple messages have this Message-ID (diff)
From: <Conor.Dooley@microchip.com> To: <nathan@kernel.org> Cc: <ndesaulniers@google.com>, <llvm@lists.linux.dev>, <mturquette@baylibre.com>, <sboyd@kernel.org>, <robh+dt@kernel.org>, <krzysztof.kozlowski+dt@linaro.org>, <paul.walmsley@sifive.com>, <aou@eecs.berkeley.edu>, <linux-clk@vger.kernel.org>, <devicetree@vger.kernel.org>, <p.zabel@pengutronix.de>, <linux-kernel@vger.kernel.org>, <linux-riscv@lists.infradead.org>, <Daire.McNamara@microchip.com>, <palmer@dabbelt.com> Subject: Re: [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups Date: Wed, 17 Aug 2022 12:17:11 +0000 [thread overview] Message-ID: <62bb08d7-0a14-13e4-5ca2-edb5ef3b9793@microchip.com> (raw) In-Reply-To: <42354fef-8ca8-a81d-6af9-f250ecfd1924@microchip.com> On 14/08/2022 12:41, Conor.Dooley@microchip.com wrote: >> >> Doesn't really matter since thats long enough to get past the switch >> out of earlycon which is where the clang built kernel dies. I had forgotten something obvious: keep_bootcon. With that, on the sbi console I get a nice: [ 0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1 [ 0.591520] Oops [#1] [ 0.594045] Modules linked in: [ 0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1 [ 0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) [ 0.613012] epc : __clk_register+0x4a6/0x85c [ 0.617759] ra : __clk_register+0x49e/0x85c [ 0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720 [ 0.630466] gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800 [ 0.638443] t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0 [ 0.646420] s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000 [ 0.654396] a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.662373] a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006 [ 0.670350] s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028 [ 0.678327] s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100 [ 0.686304] s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008 [ 0.694281] s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007 [ 0.702258] t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0 [ 0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d [ 0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa [ 0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244 [ 0.729592] [<ffffffff80457dea>] platform_probe+0x82/0xa6 [ 0.735581] [<ffffffff8045593c>] call_driver_probe+0x22/0xa4 [ 0.741848] [<ffffffff804557da>] really_probe+0x13a/0x27a [ 0.747819] [<ffffffff804549f8>] __driver_probe_device+0xc4/0xee [ 0.754460] [<ffffffff804554d0>] driver_probe_device+0x3c/0x196 [ 0.761013] [<ffffffff804552cc>] __device_attach_driver+0xa2/0x18c [ 0.767853] [<ffffffff8045284a>] bus_for_each_drv+0x76/0xba [ 0.774016] [<ffffffff804547ca>] __device_attach+0xa0/0x15a [ 0.780179] [<ffffffff80454896>] device_initial_probe+0x12/0x1a [ 0.786732] [<ffffffff804529c6>] bus_probe_device+0x2e/0x7c [ 0.792895] [<ffffffff8044f290>] device_add+0x21a/0x3aa [ 0.798693] [<ffffffff805c323a>] of_device_add+0x28/0x32 [ 0.804578] [<ffffffff805c3c54>] of_platform_device_create_pdata+0x88/0xb2 [ 0.812181] [<ffffffff805c3e70>] of_platform_bus_create+0x14a/0x1a8 [ 0.819117] [<ffffffff805c3ea6>] of_platform_bus_create+0x180/0x1a8 [ 0.826053] [<ffffffff805c3f2a>] of_platform_populate+0x5c/0x96 [ 0.832606] [<ffffffff8082bbaa>] of_platform_default_populate_init+0xc2/0xd4 [ 0.840400] [<ffffffff800020e8>] do_one_initcall+0xbc/0x216 [ 0.846563] [<ffffffff8080126c>] do_initcall_level+0x82/0x94 [ 0.852830] [<ffffffff808011ae>] do_initcalls+0x5c/0x98 [ 0.858611] [<ffffffff8080114a>] do_basic_setup+0x20/0x28 [ 0.864583] [<ffffffff808010e0>] kernel_init_freeable+0xba/0x104 [ 0.871223] [<ffffffff80746856>] kernel_init+0x22/0x1ba [ 0.877013] [<ffffffff80003700>] ret_from_exception+0x0/0xc [ 0.883218] ---[ end trace 0000000000000000 ]--- At least I had somewhere to start. Since QEMU repros, I continued there with: $(QEMU)/qemu-system-riscv64 -s -S -M microchip-icicle-kit \ -m 2G -smp 5 \ -kernel $(vmlinux_bin) \ -dtb $(dtb) \ -initrd $(initramfs) \ -display none -serial null \ -serial stdio Dumping the backtrace at the devm_clk_hw_register() callsites shows that it passes for: - the main pll in mpfs_clk_register_mssplls() - the "cfg" clocks in mpfs_clk_register_cfgs() - the first 4 periph clocks... It fails on "clk_periph_timer" - which uses a different parent, that it tries to find using the macro: #define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw) If parent is RTCREF, that becomes: &mpfs_cfg_clks[33].cfg.hw And yeah, thats well beyond the end of the array! Booting with gcc, and looking at the clock summary: mssrefclk 1 1 0 125000000 0 0 50000 Y clk_rtcref 0 0 0 1000000 0 0 50000 Y clk_periph_timer 0 0 0 1000000 0 0 50000 N This is reported correctly, so how this works for either gcc or clang now, or gcc with the patches applied, I am not really too sure.. On v6.0-rc1, gcc and clang show: mssrefclk 1 1 0 125000000 0 0 50000 Y clk_rtcref 0 0 0 1000000 0 0 50000 Y I guess this may fail gracefully as the pointer is actually null, but the extra level of indirection added by my patch started tripping up the error checking. I must just not have noticed because I added RTCREF in a bit of a hurry trying to fix a bug in time for 5.18: https://lore.kernel.org/linux-clk/986c73df-9634-d18b-eed3-37584fa2ea89@conchuod.ie/#t Obviously I've got to fix the bug itself now, but now I am left wondering about GCC's behaviour rather than clang/llvm's! Thanks Nathan - if even just for being a rubber duck :) Conor. >> >> Didn't get a chance to look at disassembly etc today, but as I said >> last night it reproduces with GNU binutils. >> >> Thanks, >> Conor. >> >> On another note, brought up our QEMU port's state today so fixing >> it is now on the good ole, ever expanding todo list :) > _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
next prev parent reply other threads:[~2022-08-17 12:17 UTC|newest] Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-07-04 12:15 [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 01/12] dt-bindings: clk: microchip: mpfs: add reset controller support Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 02/12] clk: microchip: mpfs: add reset controller Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 03/12] reset: add polarfire soc reset support Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-18 11:34 ` Conor.Dooley 2022-07-18 11:34 ` Conor.Dooley 2022-07-18 15:42 ` Philipp Zabel 2022-07-18 15:42 ` Philipp Zabel 2022-07-18 15:44 ` Conor.Dooley 2022-07-18 15:44 ` Conor.Dooley 2022-07-04 12:15 ` [PATCH v2 04/12] MAINTAINERS: add polarfire soc reset controller Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 05/12] riscv: dts: microchip: add mpfs specific macb reset support Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 06/12] clk: microchip: mpfs: add module_authors entries Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 07/12] clk: microchip: mpfs: add MSS pll's set & round rate Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 08/12] clk: microchip: mpfs: move id & offset out of clock structs Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 09/12] clk: microchip: mpfs: simplify control reg access Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 10/12] clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo() Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 11/12] clk: microchip: mpfs: convert cfg_clk to clk_divider Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-04 12:15 ` [PATCH v2 12/12] clk: microchip: mpfs: convert periph_clk to clk_gate Conor Dooley 2022-07-04 12:15 ` Conor Dooley 2022-07-20 13:46 ` [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups Conor.Dooley 2022-07-20 13:46 ` Conor.Dooley 2022-08-09 23:05 ` Conor.Dooley 2022-08-09 23:05 ` Conor.Dooley 2022-08-10 18:56 ` Nathan Chancellor 2022-08-10 18:56 ` Nathan Chancellor 2022-08-10 19:20 ` Conor.Dooley 2022-08-10 19:20 ` Conor.Dooley 2022-08-10 19:32 ` Nathan Chancellor 2022-08-10 19:32 ` Nathan Chancellor 2022-08-10 19:43 ` Conor.Dooley 2022-08-10 19:43 ` Conor.Dooley 2022-08-11 13:13 ` Conor.Dooley 2022-08-11 13:13 ` Conor.Dooley 2022-08-14 11:41 ` Conor.Dooley 2022-08-14 11:41 ` Conor.Dooley 2022-08-17 12:17 ` Conor.Dooley [this message] 2022-08-17 12:17 ` Conor.Dooley 2022-08-17 16:18 ` Nathan Chancellor 2022-08-17 16:18 ` Nathan Chancellor 2022-08-10 11:50 ` Daire.McNamara 2022-08-10 11:50 ` Daire.McNamara
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=62bb08d7-0a14-13e4-5ca2-edb5ef3b9793@microchip.com \ --to=conor.dooley@microchip.com \ --cc=Daire.McNamara@microchip.com \ --cc=aou@eecs.berkeley.edu \ --cc=devicetree@vger.kernel.org \ --cc=krzysztof.kozlowski+dt@linaro.org \ --cc=linux-clk@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-riscv@lists.infradead.org \ --cc=llvm@lists.linux.dev \ --cc=mturquette@baylibre.com \ --cc=nathan@kernel.org \ --cc=ndesaulniers@google.com \ --cc=p.zabel@pengutronix.de \ --cc=palmer@dabbelt.com \ --cc=paul.walmsley@sifive.com \ --cc=robh+dt@kernel.org \ --cc=sboyd@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.