* Re: Slow booting on x15 [not found] ` <20201001082256.GA3722@pendragon.ideasonboard.com> @ 2020-10-01 12:56 ` Grygorii Strashko 2020-10-01 13:11 ` Geert Uytterhoeven 2020-10-01 18:24 ` Saravana Kannan 0 siblings, 2 replies; 26+ messages in thread From: Grygorii Strashko @ 2020-10-01 12:56 UTC (permalink / raw) To: Laurent Pinchart, Tony Lindgren, Saravana Kannan, Rafael J. Wysocki, Ulf Hansson, Rob Herring Cc: Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, linux-pm, Geert Uytterhoeven On 01/10/2020 11:22, Laurent Pinchart wrote: > Hi Tony, > > On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: >> * Tony Lindgren <tony@atomide.com> [201001 07:53]: >>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: >>>> Fwiw on my beagle x15 >>>> >>>> v5.8 >>>> [ 9.908787] Run /sbin/init as init process >>>> >>>> v5.9-rc7 >>>> [ 15.085373] Run /sbin/init as init process >>>> >>>> >>>> It appears to be 'fixed' in next-20200928: the board does not even boot. >>> >>> Yeah so it seems :( >>> >>>> next-20200928 on omap5 >>>> [ 9.936806] Run /sbin/init as init process >>>> >>>> >>>> -rc7 spends most of it's time: >>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) >>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off >>>> [ 15.005211] IP-Config: Complete: >>> >>> Booting with initcall_debug I see this with current Linux next: >>> >>> ... 
>>> [ 1.697313] cpuidle: using governor menu >>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs >>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 >>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs >>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 >>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs >>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 >>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs >>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 >>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs >>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 >>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs >>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 >>> [ 3.835583] No ATAGs? >> >> And the long time above with customize_machine() ends up being >> pdata_quirks_init() calling of_platform_populate(). > > That's what the delay is for me (I think I've reported that initially). > >>> Laurent & Tomi, care to check what you guys see in the slow booting case >>> after booting with initcall_debug? >> >> But maybe the long delay is something else for you guys so please check. > It's all devlink :( It looks like improvements (PM) can sometimes become so complicated that the time required to execute such algorithms may completely eliminate all the expected benefits. I will not be surprised if PM consumption has also increased instead of decreasing in some cases. I'm not sure if it's 100% correct, but the diff below reduces boot time from 7.6 sec to 3.7 sec :P before: [ 0.053870] cpuidle: using governor menu [ 2.505971] No ATAGs? ... [ 7.562317] Freeing unused kernel memory: 1024K after: [ 0.053800] cpuidle: using governor menu [ 0.136853] No ATAGs? 
[ 3.716218] devtmpfs: mounted [ 3.719628] Freeing unused kernel memory: 1024K [ 3.724266] Run /sbin/init as init process ---- diff --git a/drivers/of/platform.c b/drivers/of/platform.c index 071f04da32c8..e0cc37ed46ca 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, pr_debug(" starting at: %pOF\n", root); device_links_supplier_sync_state_pause(); + fw_devlink_pause(); for_each_child_of_node(root, child) { rc = of_platform_bus_create(child, matches, lookup, parent, true); if (rc) { @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, break; } } + fw_devlink_resume(); device_links_supplier_sync_state_resume(); of_node_set_flag(root, OF_POPULATED_BUS); @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) } /* Populate everything else. */ - fw_devlink_pause(); of_platform_default_populate(NULL, NULL, NULL); - fw_devlink_resume(); return 0; } -- Best regards, grygorii ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 12:56 ` Slow booting on x15 Grygorii Strashko @ 2020-10-01 13:11 ` Geert Uytterhoeven 2020-10-01 13:49 ` Grygorii Strashko 2020-10-01 18:24 ` Saravana Kannan 1 sibling, 1 reply; 26+ messages in thread From: Geert Uytterhoeven @ 2020-10-01 13:11 UTC (permalink / raw) To: Grygorii Strashko Cc: Laurent Pinchart, Tony Lindgren, Saravana Kannan, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM list, Geert Uytterhoeven Hi Grygorii et al, On Thu, Oct 1, 2020 at 2:56 PM Grygorii Strashko <grygorii.strashko@ti.com> wrote: > On 01/10/2020 11:22, Laurent Pinchart wrote: > > On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: > >>>> -rc7 spends most of it's time: > >>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) > >>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off > >>>> [ 15.005211] IP-Config: Complete: 1. Is irq=POLL normal behavior for your board? 2. As this is a Micrel PHY, perhaps you are affected by the changes in the configuration defaults in commit bcf3440c6dd78bfe ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY")? See also the performance drop figures in the description of quick fix 9b23203c32ee02cd ("ravb: Mask PHY mode to avoid inserting delays twice") (and better solution https://lore.kernel.org/linux-renesas-soc/20201001101008.14365-1-geert+renesas@glider.be) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 13:11 ` Geert Uytterhoeven @ 2020-10-01 13:49 ` Grygorii Strashko 0 siblings, 0 replies; 26+ messages in thread From: Grygorii Strashko @ 2020-10-01 13:49 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Laurent Pinchart, Tony Lindgren, Saravana Kannan, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM list, Geert Uytterhoeven hi Geert, On 01/10/2020 16:11, Geert Uytterhoeven wrote: > Hi Grygorii et al, > > On Thu, Oct 1, 2020 at 2:56 PM Grygorii Strashko > <grygorii.strashko@ti.com> wrote: >> On 01/10/2020 11:22, Laurent Pinchart wrote: >>> On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: >>>>>> -rc7 spends most of it's time: >>>>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) >>>>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off >>>>>> [ 15.005211] IP-Config: Complete: > > 1. Is irq=POLL normal behavior for your board? > 2. As this is a Micrel PHY, perhaps you are affected by the changes in > the configuration defaults in commit bcf3440c6dd78bfe ("net: phy: > micrel: add phy-mode support for the KSZ9031 PHY")? > > See also the performance drop figures in the description of quick fix > 9b23203c32ee02cd ("ravb: Mask PHY mode to avoid inserting delays twice") > (and better solution > https://lore.kernel.org/linux-renesas-soc/20201001101008.14365-1-geert+renesas@glider.be) It's not about the Ethernet PHY, and I've also tried a different board, am571x-idk. 
The boot delay is first introduced very early during boot: >> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 >> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs >> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 >> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs >> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 >> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs >> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 >> [ 3.835583] No ATAGs? > > And the long time above with customize_machine() ends up being > pdata_quirks_init() calling of_platform_populate(). And it was narrowed down to ^ of_platform_populate(): the customize_machine() call above costs 2 sec. Related discussion: https://lkml.org/lkml/2020/6/17/452 -- Best regards, grygorii ^ permalink raw reply [flat|nested] 26+ messages in thread
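The way the 2 sec cost was read off the initcall_debug output above can be sketched in a few lines of C. This is only an illustration, not part of the thread: the helper names parse_initcall_line() and slowest_initcall() are made up here, and the parser assumes built-in initcalls printed as "initcall NAME+0xOFF/0xSIZE returned R after N usecs", as in the log quoted above (a trailing module tag after the symbol would not be handled).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one dmesg line produced with initcall_debug.
 * Returns 1 and fills name/usecs for an "initcall ... returned ... after
 * N usecs" line, 0 for any other line (e.g. the "calling ..." lines).
 */
int parse_initcall_line(const char *line, char *name, size_t name_len,
			unsigned long *usecs)
{
	const char *p;
	char sym[128];
	int ret;
	unsigned long us;

	assert(line && name && usecs && name_len > 0);

	p = strstr(line, "initcall ");
	if (!p)
		return 0;
	if (sscanf(p, "initcall %127s returned %d after %lu usecs",
		   sym, &ret, &us) != 3)
		return 0;

	/* Drop the "+0x0/0x30" offset/size suffix from the symbol. */
	sym[strcspn(sym, "+")] = '\0';
	snprintf(name, name_len, "%s", sym);
	*usecs = us;
	return 1;
}

/*
 * Hypothetical helper: scan n log lines and report the initcall with the
 * largest reported duration. Returns 0 if no initcall line was found.
 */
int slowest_initcall(const char *const *lines, size_t n,
		     char *name, size_t name_len, unsigned long *usecs)
{
	char cur[128];
	unsigned long us;
	size_t i;
	int found = 0;

	for (i = 0; i < n; i++) {
		if (!parse_initcall_line(lines[i], cur, sizeof(cur), &us))
			continue;
		if (!found || us > *usecs) {
			snprintf(name, name_len, "%s", cur);
			*usecs = us;
			found = 1;
		}
	}
	return found;
}
```

Fed the lines quoted above, slowest_initcall() would single out customize_machine with 2011718 usecs, which is the figure the discussion is based on.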
* Re: Slow booting on x15 2020-10-01 12:56 ` Slow booting on x15 Grygorii Strashko 2020-10-01 13:11 ` Geert Uytterhoeven @ 2020-10-01 18:24 ` Saravana Kannan 2020-10-01 19:43 ` Grygorii Strashko 1 sibling, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-01 18:24 UTC (permalink / raw) To: Grygorii Strashko Cc: Laurent Pinchart, Tony Lindgren, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM, Geert Uytterhoeven On Thu, Oct 1, 2020 at 5:56 AM Grygorii Strashko <grygorii.strashko@ti.com> wrote: > > > > On 01/10/2020 11:22, Laurent Pinchart wrote: > > Hi Tony, > > > > On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: > >> * Tony Lindgren <tony@atomide.com> [201001 07:53]: > >>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: > >>>> Fwiw on my beagle x15 > >>>> > >>>> v5.8 > >>>> [ 9.908787] Run /sbin/init as init process > >>>> > >>>> v5.9-rc7 > >>>> [ 15.085373] Run /sbin/init as init process > >>>> > >>>> > >>>> It appears to be 'fixed' in next-20200928: the board does not even boot. > >>> > >>> Yeah so it seems :( > >>> > >>>> next-20200928 on omap5 > >>>> [ 9.936806] Run /sbin/init as init process > >>>> > >>>> > >>>> -rc7 spends most of it's time: > >>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) > >>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off > >>>> [ 15.005211] IP-Config: Complete: > >>> > >>> Booting with initcall_debug I see this with current Linux next: > >>> > >>> ... 
> >>> [ 1.697313] cpuidle: using governor menu > >>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs > >>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 > >>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs > >>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 > >>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs > >>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 > >>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs > >>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 > >>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs > >>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 > >>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs > >>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 > >>> [ 3.835583] No ATAGs? > >> > >> And the long time above with customize_machine() ends up being > >> pdata_quirks_init() calling of_platform_populate(). > > > > That's what the delay is for me (I think I've reported that initially). > > > >>> Laurent & Tomi, care to check what you guys see in the slow booting case > >>> after booting with initcall_debug? > >> > >> But maybe the long delay is something else for you guys so please check. > > > > It's all devlink :( Looks like sometimes, improvements (PM) could became so complicated > that time required to execute such algorithms may completely eliminate all expected benefits. > Will not be surprised if PM consumption also increased instead of decreasing in some cases. > > not sure if it's 100% correct, but below diff reduces boot time > from 7.6sec to 3.7sec :P > > before: > [ 0.053870] cpuidle: using governor menu > [ 2.505971] No ATAGs? > ... > [ 7.562317] Freeing unused kernel memory: 1024K > > after: > [ 0.053800] cpuidle: using governor menu > [ 0.136853] No ATAGs? 
> [ 3.716218] devtmpfs: mounted > [ 3.719628] Freeing unused kernel memory: 1024K > [ 3.724266] Run /sbin/init as init process > > ---- > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > index 071f04da32c8..e0cc37ed46ca 100644 > --- a/drivers/of/platform.c > +++ b/drivers/of/platform.c > @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, > pr_debug(" starting at: %pOF\n", root); > > device_links_supplier_sync_state_pause(); > + fw_devlink_pause(); > for_each_child_of_node(root, child) { > rc = of_platform_bus_create(child, matches, lookup, parent, true); > if (rc) { > @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, > break; > } > } > + fw_devlink_resume(); > device_links_supplier_sync_state_resume(); > > of_node_set_flag(root, OF_POPULATED_BUS); > @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) > } > > /* Populate everything else. */ > - fw_devlink_pause(); > of_platform_default_populate(NULL, NULL, NULL); > - fw_devlink_resume(); Your analysis is right, but this change is not safe. You'll get an unlocked linked list trampling if you call it outside of where it's called now. That's explicitly why I didn't do it the way this patch does it. To explain more, if you call fw_devlink_pause/resume() inside of_platform_populate() you can end up calling it in the context of another device's probe function. When a device's probe function is called, it has a bunch of other locks held and you'll cause a deadlock. To avoid that, I had to use defer_fw_devlink_lock to manage the list used by fw_devlink_pause/resume(). I'll add more details later. But yeah, this patch isn't safe as is. -Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 18:24 ` Saravana Kannan @ 2020-10-01 19:43 ` Grygorii Strashko 2020-10-01 22:22 ` Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Grygorii Strashko @ 2020-10-01 19:43 UTC (permalink / raw) To: Saravana Kannan Cc: Laurent Pinchart, Tony Lindgren, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM, Geert Uytterhoeven On 01/10/2020 21:24, Saravana Kannan wrote: > On Thu, Oct 1, 2020 at 5:56 AM Grygorii Strashko > <grygorii.strashko@ti.com> wrote: >> >> >> >> On 01/10/2020 11:22, Laurent Pinchart wrote: >>> Hi Tony, >>> >>> On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: >>>> * Tony Lindgren <tony@atomide.com> [201001 07:53]: >>>>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: >>>>>> Fwiw on my beagle x15 >>>>>> >>>>>> v5.8 >>>>>> [ 9.908787] Run /sbin/init as init process >>>>>> >>>>>> v5.9-rc7 >>>>>> [ 15.085373] Run /sbin/init as init process >>>>>> >>>>>> >>>>>> It appears to be 'fixed' in next-20200928: the board does not even boot. >>>>> >>>>> Yeah so it seems :( >>>>> >>>>>> next-20200928 on omap5 >>>>>> [ 9.936806] Run /sbin/init as init process >>>>>> >>>>>> >>>>>> -rc7 spends most of it's time: >>>>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) >>>>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off >>>>>> [ 15.005211] IP-Config: Complete: >>>>> >>>>> Booting with initcall_debug I see this with current Linux next: >>>>> >>>>> ... 
>>>>> [ 1.697313] cpuidle: using governor menu >>>>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs >>>>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 >>>>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs >>>>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 >>>>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs >>>>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 >>>>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs >>>>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 >>>>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs >>>>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 >>>>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs >>>>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 >>>>> [ 3.835583] No ATAGs? >>>> >>>> And the long time above with customize_machine() ends up being >>>> pdata_quirks_init() calling of_platform_populate(). >>> >>> That's what the delay is for me (I think I've reported that initially). >>> >>>>> Laurent & Tomi, care to check what you guys see in the slow booting case >>>>> after booting with initcall_debug? >>>> >>>> But maybe the long delay is something else for you guys so please check. >>> >> >> It's all devlink :( Looks like sometimes, improvements (PM) could became so complicated >> that time required to execute such algorithms may completely eliminate all expected benefits. >> Will not be surprised if PM consumption also increased instead of decreasing in some cases. >> >> not sure if it's 100% correct, but below diff reduces boot time >> from 7.6sec to 3.7sec :P >> >> before: >> [ 0.053870] cpuidle: using governor menu >> [ 2.505971] No ATAGs? >> ... >> [ 7.562317] Freeing unused kernel memory: 1024K >> >> after: >> [ 0.053800] cpuidle: using governor menu >> [ 0.136853] No ATAGs? 
>> [ 3.716218] devtmpfs: mounted >> [ 3.719628] Freeing unused kernel memory: 1024K >> [ 3.724266] Run /sbin/init as init process >> >> ---- >> diff --git a/drivers/of/platform.c b/drivers/of/platform.c >> index 071f04da32c8..e0cc37ed46ca 100644 >> --- a/drivers/of/platform.c >> +++ b/drivers/of/platform.c >> @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, >> pr_debug(" starting at: %pOF\n", root); >> >> device_links_supplier_sync_state_pause(); >> + fw_devlink_pause(); >> for_each_child_of_node(root, child) { >> rc = of_platform_bus_create(child, matches, lookup, parent, true); >> if (rc) { >> @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, >> break; >> } >> } >> + fw_devlink_resume(); >> device_links_supplier_sync_state_resume(); >> >> of_node_set_flag(root, OF_POPULATED_BUS); >> @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) >> } >> >> /* Populate everything else. */ >> - fw_devlink_pause(); >> of_platform_default_populate(NULL, NULL, NULL); >> - fw_devlink_resume(); > > Your analysis is right, but this change is not safe. You'll get an > unlocked linked list trampling if you call it outside of where it's > called now. That's explicitly why I didn't do it the way this patch > does it. > > To explain more, if you call fw_devlink_pause/resume() inside > of_platform_populate() you can end up calling it in the context of > another device's probe function. When a device's probe function is > called, a has a bunch of other locks held and you'll cause a deadlock. > To avoid that, I had to use defer_fw_devlink_lock to manage the list > used by fw_devlink_pause/resume(). > > I'll add more details later. But yeah, this patch isn't safe as is. I assume you are going to fix it your way, right? Just FYI. 
On ARM, the main DT population path is: arch_initcall(customize_machine) machine_desc->init_machine(); of_platform_default_populate(); -- or -- of_platform_populate(); of_platform_default_populate_init() is called later due to make/linker dependencies. As a result, the existing pause/resume optimizations don't fully work. Thank you. -- Best regards, grygorii ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 19:43 ` Grygorii Strashko @ 2020-10-01 22:22 ` Saravana Kannan 2020-10-01 22:30 ` Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-01 22:22 UTC (permalink / raw) To: Grygorii Strashko Cc: Laurent Pinchart, Tony Lindgren, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM, Geert Uytterhoeven On Thu, Oct 1, 2020 at 12:43 PM Grygorii Strashko <grygorii.strashko@ti.com> wrote: > > > > On 01/10/2020 21:24, Saravana Kannan wrote: > > On Thu, Oct 1, 2020 at 5:56 AM Grygorii Strashko > > <grygorii.strashko@ti.com> wrote: > >> > >> > >> > >> On 01/10/2020 11:22, Laurent Pinchart wrote: > >>> Hi Tony, > >>> > >>> On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: > >>>> * Tony Lindgren <tony@atomide.com> [201001 07:53]: > >>>>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: > >>>>>> Fwiw on my beagle x15 > >>>>>> > >>>>>> v5.8 > >>>>>> [ 9.908787] Run /sbin/init as init process > >>>>>> > >>>>>> v5.9-rc7 > >>>>>> [ 15.085373] Run /sbin/init as init process > >>>>>> > >>>>>> > >>>>>> It appears to be 'fixed' in next-20200928: the board does not even boot. > >>>>> > >>>>> Yeah so it seems :( > >>>>> > >>>>>> next-20200928 on omap5 > >>>>>> [ 9.936806] Run /sbin/init as init process > >>>>>> > >>>>>> > >>>>>> -rc7 spends most of it's time: > >>>>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) > >>>>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off > >>>>>> [ 15.005211] IP-Config: Complete: > >>>>> > >>>>> Booting with initcall_debug I see this with current Linux next: > >>>>> > >>>>> ... 
> >>>>> [ 1.697313] cpuidle: using governor menu > >>>>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs > >>>>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 > >>>>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs > >>>>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 > >>>>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs > >>>>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 > >>>>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs > >>>>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 > >>>>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs > >>>>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 > >>>>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs > >>>>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 > >>>>> [ 3.835583] No ATAGs? > >>>> > >>>> And the long time above with customize_machine() ends up being > >>>> pdata_quirks_init() calling of_platform_populate(). > >>> > >>> That's what the delay is for me (I think I've reported that initially). > >>> > >>>>> Laurent & Tomi, care to check what you guys see in the slow booting case > >>>>> after booting with initcall_debug? > >>>> > >>>> But maybe the long delay is something else for you guys so please check. > >>> > >> > >> It's all devlink :( Looks like sometimes, improvements (PM) could became so complicated > >> that time required to execute such algorithms may completely eliminate all expected benefits. > >> Will not be surprised if PM consumption also increased instead of decreasing in some cases. > >> > >> not sure if it's 100% correct, but below diff reduces boot time > >> from 7.6sec to 3.7sec :P > >> > >> before: > >> [ 0.053870] cpuidle: using governor menu > >> [ 2.505971] No ATAGs? > >> ... > >> [ 7.562317] Freeing unused kernel memory: 1024K > >> > >> after: > >> [ 0.053800] cpuidle: using governor menu > >> [ 0.136853] No ATAGs? 
> >> [ 3.716218] devtmpfs: mounted > >> [ 3.719628] Freeing unused kernel memory: 1024K > >> [ 3.724266] Run /sbin/init as init process > >> > >> ---- > >> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > >> index 071f04da32c8..e0cc37ed46ca 100644 > >> --- a/drivers/of/platform.c > >> +++ b/drivers/of/platform.c > >> @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, > >> pr_debug(" starting at: %pOF\n", root); > >> > >> device_links_supplier_sync_state_pause(); > >> + fw_devlink_pause(); > >> for_each_child_of_node(root, child) { > >> rc = of_platform_bus_create(child, matches, lookup, parent, true); > >> if (rc) { > >> @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, > >> break; > >> } > >> } > >> + fw_devlink_resume(); > >> device_links_supplier_sync_state_resume(); > >> > >> of_node_set_flag(root, OF_POPULATED_BUS); > >> @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) > >> } > >> > >> /* Populate everything else. */ > >> - fw_devlink_pause(); > >> of_platform_default_populate(NULL, NULL, NULL); > >> - fw_devlink_resume(); > > > > Your analysis is right, but this change is not safe. You'll get an > > unlocked linked list trampling if you call it outside of where it's > > called now. That's explicitly why I didn't do it the way this patch > > does it. > > > > To explain more, if you call fw_devlink_pause/resume() inside > > of_platform_populate() you can end up calling it in the context of > > another device's probe function. When a device's probe function is > > called, a has a bunch of other locks held and you'll cause a deadlock. > > To avoid that, I had to use defer_fw_devlink_lock to manage the list > > used by fw_devlink_pause/resume(). > > > > I'll add more details later. But yeah, this patch isn't safe as is. > > I assume you going to fix it your way, right? I'll be glad to fix this if I find a way out. 
But currently, there's no "my way" for a generic fix to use inside of_platform_populate(). However... > Just FYI. On ARM main DT population path is: > > arch_initcall(customize_machine) > machine_desc->init_machine(); > of_platform_default_populate(); > -- or -- > of_platform_populate(); > > the of_platform_default_populate_init() called later due to make/linker dependencies. > > As result, existing pause/resume optimizations don't fully working. We can just add this optimization around all the other places the top level addition of DT devices is done. I wasn't aware of (more like forgot) the init_machine() path. Can you give the call path in your case? Just to make it a bit easier for me. Thanks, Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 22:22 ` Saravana Kannan @ 2020-10-01 22:30 ` Saravana Kannan 2020-10-01 22:38 ` Laurent Pinchart 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-01 22:30 UTC (permalink / raw) To: Grygorii Strashko Cc: Laurent Pinchart, Tony Lindgren, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM, Geert Uytterhoeven On Thu, Oct 1, 2020 at 3:22 PM Saravana Kannan <saravanak@google.com> wrote: > > On Thu, Oct 1, 2020 at 12:43 PM Grygorii Strashko > <grygorii.strashko@ti.com> wrote: > > > > > > > > On 01/10/2020 21:24, Saravana Kannan wrote: > > > On Thu, Oct 1, 2020 at 5:56 AM Grygorii Strashko > > > <grygorii.strashko@ti.com> wrote: > > >> > > >> > > >> > > >> On 01/10/2020 11:22, Laurent Pinchart wrote: > > >>> Hi Tony, > > >>> > > >>> On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: > > >>>> * Tony Lindgren <tony@atomide.com> [201001 07:53]: > > >>>>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: > > >>>>>> Fwiw on my beagle x15 > > >>>>>> > > >>>>>> v5.8 > > >>>>>> [ 9.908787] Run /sbin/init as init process > > >>>>>> > > >>>>>> v5.9-rc7 > > >>>>>> [ 15.085373] Run /sbin/init as init process > > >>>>>> > > >>>>>> > > >>>>>> It appears to be 'fixed' in next-20200928: the board does not even boot. > > >>>>> > > >>>>> Yeah so it seems :( > > >>>>> > > >>>>>> next-20200928 on omap5 > > >>>>>> [ 9.936806] Run /sbin/init as init process > > >>>>>> > > >>>>>> > > >>>>>> -rc7 spends most of it's time: > > >>>>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) > > >>>>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off > > >>>>>> [ 15.005211] IP-Config: Complete: > > >>>>> > > >>>>> Booting with initcall_debug I see this with current Linux next: > > >>>>> > > >>>>> ... 
> > >>>>> [ 1.697313] cpuidle: using governor menu > > >>>>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs > > >>>>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 > > >>>>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs > > >>>>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 > > >>>>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs > > >>>>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 > > >>>>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs > > >>>>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 > > >>>>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs > > >>>>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 > > >>>>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs > > >>>>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 > > >>>>> [ 3.835583] No ATAGs? > > >>>> > > >>>> And the long time above with customize_machine() ends up being > > >>>> pdata_quirks_init() calling of_platform_populate(). > > >>> > > >>> That's what the delay is for me (I think I've reported that initially). > > >>> > > >>>>> Laurent & Tomi, care to check what you guys see in the slow booting case > > >>>>> after booting with initcall_debug? > > >>>> > > >>>> But maybe the long delay is something else for you guys so please check. > > >>> > > >> > > >> It's all devlink :( Looks like sometimes, improvements (PM) could became so complicated > > >> that time required to execute such algorithms may completely eliminate all expected benefits. > > >> Will not be surprised if PM consumption also increased instead of decreasing in some cases. > > >> > > >> not sure if it's 100% correct, but below diff reduces boot time > > >> from 7.6sec to 3.7sec :P > > >> > > >> before: > > >> [ 0.053870] cpuidle: using governor menu > > >> [ 2.505971] No ATAGs? > > >> ... 
> > >> [ 7.562317] Freeing unused kernel memory: 1024K > > >> > > >> after: > > >> [ 0.053800] cpuidle: using governor menu > > >> [ 0.136853] No ATAGs? > > >> [ 3.716218] devtmpfs: mounted > > >> [ 3.719628] Freeing unused kernel memory: 1024K > > >> [ 3.724266] Run /sbin/init as init process > > >> > > >> ---- > > >> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > >> index 071f04da32c8..e0cc37ed46ca 100644 > > >> --- a/drivers/of/platform.c > > >> +++ b/drivers/of/platform.c > > >> @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, > > >> pr_debug(" starting at: %pOF\n", root); > > >> > > >> device_links_supplier_sync_state_pause(); > > >> + fw_devlink_pause(); > > >> for_each_child_of_node(root, child) { > > >> rc = of_platform_bus_create(child, matches, lookup, parent, true); > > >> if (rc) { > > >> @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, > > >> break; > > >> } > > >> } > > >> + fw_devlink_resume(); > > >> device_links_supplier_sync_state_resume(); > > >> > > >> of_node_set_flag(root, OF_POPULATED_BUS); > > >> @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) > > >> } > > >> > > >> /* Populate everything else. */ > > >> - fw_devlink_pause(); > > >> of_platform_default_populate(NULL, NULL, NULL); > > >> - fw_devlink_resume(); > > > > > > Your analysis is right, but this change is not safe. You'll get an > > > unlocked linked list trampling if you call it outside of where it's > > > called now. That's explicitly why I didn't do it the way this patch > > > does it. > > > > > > To explain more, if you call fw_devlink_pause/resume() inside > > > of_platform_populate() you can end up calling it in the context of > > > another device's probe function. When a device's probe function is > > > called, a has a bunch of other locks held and you'll cause a deadlock. > > > To avoid that, I had to use defer_fw_devlink_lock to manage the list > > > used by fw_devlink_pause/resume(). 
> > > > > > I'll add more details later. But yeah, this patch isn't safe as is. > > > > I assume you going to fix it your way, right? > > I'll be glad to fix this if I find a way out. But currently, there's > no "my way" for a generic fix to use inside of_platform_populate(). > However... > > > Just FYI. On ARM main DT population path is: > > > > arch_initcall(customize_machine) > > machine_desc->init_machine(); > > of_platform_default_populate(); > > -- or -- > > of_platform_populate(); > > > > the of_platform_default_populate_init() called later due to make/linker dependencies. > > > > As result, existing pause/resume optimizations don't fully working. > > We can just add this optimization around all the other places the top > level addition of DT devices is done. I wasn't aware of (more like > forgot) the init_machine() path. > > Can you give the call path in your case? Just to make it a bit easier for me. I think I have a simple solution. I can just move the fw_devlink_pause/resume() inside of_platform_default_populate() and call those only if "root" == NULL. -Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 22:30 ` Saravana Kannan @ 2020-10-01 22:38 ` Laurent Pinchart 2020-10-01 22:44 ` Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Laurent Pinchart @ 2020-10-01 22:38 UTC (permalink / raw) To: Saravana Kannan Cc: Grygorii Strashko, Tony Lindgren, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM, Geert Uytterhoeven Hi Saravana, On Thu, Oct 01, 2020 at 03:30:39PM -0700, Saravana Kannan wrote: > On Thu, Oct 1, 2020 at 3:22 PM Saravana Kannan wrote: > > On Thu, Oct 1, 2020 at 12:43 PM Grygorii Strashko wrote: > > > On 01/10/2020 21:24, Saravana Kannan wrote: > > > > On Thu, Oct 1, 2020 at 5:56 AM Grygorii Strashko wrote: > > > >> On 01/10/2020 11:22, Laurent Pinchart wrote: > > > >>> On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: > > > >>>> * Tony Lindgren <tony@atomide.com> [201001 07:53]: > > > >>>>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: > > > >>>>>> Fwiw on my beagle x15 > > > >>>>>> > > > >>>>>> v5.8 > > > >>>>>> [ 9.908787] Run /sbin/init as init process > > > >>>>>> > > > >>>>>> v5.9-rc7 > > > >>>>>> [ 15.085373] Run /sbin/init as init process > > > >>>>>> > > > >>>>>> > > > >>>>>> It appears to be 'fixed' in next-20200928: the board does not even boot. > > > >>>>> > > > >>>>> Yeah so it seems :( > > > >>>>> > > > >>>>>> next-20200928 on omap5 > > > >>>>>> [ 9.936806] Run /sbin/init as init process > > > >>>>>> > > > >>>>>> > > > >>>>>> -rc7 spends most of it's time: > > > >>>>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) > > > >>>>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off > > > >>>>>> [ 15.005211] IP-Config: Complete: > > > >>>>> > > > >>>>> Booting with initcall_debug I see this with current Linux next: > > > >>>>> > > > >>>>> ... 
> > > >>>>> [ 1.697313] cpuidle: using governor menu > > > >>>>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs > > > >>>>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 > > > >>>>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs > > > >>>>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 > > > >>>>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs > > > >>>>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 > > > >>>>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs > > > >>>>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 > > > >>>>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs > > > >>>>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 > > > >>>>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs > > > >>>>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 > > > >>>>> [ 3.835583] No ATAGs? > > > >>>> > > > >>>> And the long time above with customize_machine() ends up being > > > >>>> pdata_quirks_init() calling of_platform_populate(). > > > >>> > > > >>> That's what the delay is for me (I think I've reported that initially). > > > >>> > > > >>>>> Laurent & Tomi, care to check what you guys see in the slow booting case > > > >>>>> after booting with initcall_debug? > > > >>>> > > > >>>> But maybe the long delay is something else for you guys so please check. > > > >>> > > > >> > > > >> It's all devlink :( Looks like sometimes, improvements (PM) could became so complicated > > > >> that time required to execute such algorithms may completely eliminate all expected benefits. > > > >> Will not be surprised if PM consumption also increased instead of decreasing in some cases. > > > >> > > > >> not sure if it's 100% correct, but below diff reduces boot time > > > >> from 7.6sec to 3.7sec :P > > > >> > > > >> before: > > > >> [ 0.053870] cpuidle: using governor menu > > > >> [ 2.505971] No ATAGs? > > > >> ... 
> > > >> [ 7.562317] Freeing unused kernel memory: 1024K > > > >> > > > >> after: > > > >> [ 0.053800] cpuidle: using governor menu > > > >> [ 0.136853] No ATAGs? > > > >> [ 3.716218] devtmpfs: mounted > > > >> [ 3.719628] Freeing unused kernel memory: 1024K > > > >> [ 3.724266] Run /sbin/init as init process > > > >> > > > >> ---- > > > >> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > >> index 071f04da32c8..e0cc37ed46ca 100644 > > > >> --- a/drivers/of/platform.c > > > >> +++ b/drivers/of/platform.c > > > >> @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, > > > >> pr_debug(" starting at: %pOF\n", root); > > > >> > > > >> device_links_supplier_sync_state_pause(); > > > >> + fw_devlink_pause(); > > > >> for_each_child_of_node(root, child) { > > > >> rc = of_platform_bus_create(child, matches, lookup, parent, true); > > > >> if (rc) { > > > >> @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, > > > >> break; > > > >> } > > > >> } > > > >> + fw_devlink_resume(); > > > >> device_links_supplier_sync_state_resume(); > > > >> > > > >> of_node_set_flag(root, OF_POPULATED_BUS); > > > >> @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) > > > >> } > > > >> > > > >> /* Populate everything else. */ > > > >> - fw_devlink_pause(); > > > >> of_platform_default_populate(NULL, NULL, NULL); > > > >> - fw_devlink_resume(); > > > > > > > > Your analysis is right, but this change is not safe. You'll get an > > > > unlocked linked list trampling if you call it outside of where it's > > > > called now. That's explicitly why I didn't do it the way this patch > > > > does it. > > > > > > > > To explain more, if you call fw_devlink_pause/resume() inside > > > > of_platform_populate() you can end up calling it in the context of > > > > another device's probe function. When a device's probe function is > > > > called, a has a bunch of other locks held and you'll cause a deadlock. 
> > > > To avoid that, I had to use defer_fw_devlink_lock to manage the list
> > > > used by fw_devlink_pause/resume().
> > > >
> > > > I'll add more details later. But yeah, this patch isn't safe as is.
> > >
> > > I assume you going to fix it your way, right?
> >
> > I'll be glad to fix this if I find a way out. But currently, there's
> > no "my way" for a generic fix to use inside of_platform_populate().
> > However...
> >
> > > Just FYI. On ARM main DT population path is:
> > >
> > > arch_initcall(customize_machine)
> > >    machine_desc->init_machine();
> > >        of_platform_default_populate();
> > >        -- or --
> > >        of_platform_populate();
> > >
> > > the of_platform_default_populate_init() called later due to make/linker dependencies.
> > >
> > > As result, existing pause/resume optimizations don't fully working.
> >
> > We can just add this optimization around all the other places the top
> > level addition of DT devices is done. I wasn't aware of (more like
> > forgot) the init_machine() path.
> >
> > Can you give the call path in your case? Just to make it a bit easier for me.
>
> I think I have a simple solution. I can just move the
> fw_devlink_pause/resume() inside of_platform_default_populate() and
> call those only if "root" == NULL.

I'd be happy to test a patch. On my device, the initial
of_platform_populate() call from machine_desc->init_machine() takes
between 6 and 10 seconds to run with v5.9-rc5, compared to 200ms on
v5.7. That's a fairly bad regression; all the people who have worked
hard to reduce boot time would really hate this :-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Slow booting on x15 2020-10-01 22:38 ` Laurent Pinchart @ 2020-10-01 22:44 ` Saravana Kannan 2020-10-01 22:59 ` [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-01 22:44 UTC (permalink / raw) To: Laurent Pinchart Cc: Grygorii Strashko, Tony Lindgren, Rafael J. Wysocki, Ulf Hansson, Rob Herring, Peter Ujfalusi, Tomi Valkeinen, Linux-OMAP, Greg Kroah-Hartman, Linux PM, Geert Uytterhoeven On Thu, Oct 1, 2020 at 3:39 PM Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Saravana, > > On Thu, Oct 01, 2020 at 03:30:39PM -0700, Saravana Kannan wrote: > > On Thu, Oct 1, 2020 at 3:22 PM Saravana Kannan wrote: > > > On Thu, Oct 1, 2020 at 12:43 PM Grygorii Strashko wrote: > > > > On 01/10/2020 21:24, Saravana Kannan wrote: > > > > > On Thu, Oct 1, 2020 at 5:56 AM Grygorii Strashko wrote: > > > > >> On 01/10/2020 11:22, Laurent Pinchart wrote: > > > > >>> On Thu, Oct 01, 2020 at 11:17:48AM +0300, Tony Lindgren wrote: > > > > >>>> * Tony Lindgren <tony@atomide.com> [201001 07:53]: > > > > >>>>> * Peter Ujfalusi <peter.ujfalusi@ti.com> [200930 12:41]: > > > > >>>>>> Fwiw on my beagle x15 > > > > >>>>>> > > > > >>>>>> v5.8 > > > > >>>>>> [ 9.908787] Run /sbin/init as init process > > > > >>>>>> > > > > >>>>>> v5.9-rc7 > > > > >>>>>> [ 15.085373] Run /sbin/init as init process > > > > >>>>>> > > > > >>>>>> > > > > >>>>>> It appears to be 'fixed' in next-20200928: the board does not even boot. 
> > > > >>>>> > > > > >>>>> Yeah so it seems :( > > > > >>>>> > > > > >>>>>> next-20200928 on omap5 > > > > >>>>>> [ 9.936806] Run /sbin/init as init process > > > > >>>>>> > > > > >>>>>> > > > > >>>>>> -rc7 spends most of it's time: > > > > >>>>>> [ 7.635530] Micrel KSZ9031 Gigabit PHY 48485000.mdio:01: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=48485000.mdio:01, irq=POLL) > > > > >>>>>> [ 14.956671] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off > > > > >>>>>> [ 15.005211] IP-Config: Complete: > > > > >>>>> > > > > >>>>> Booting with initcall_debug I see this with current Linux next: > > > > >>>>> > > > > >>>>> ... > > > > >>>>> [ 1.697313] cpuidle: using governor menu > > > > >>>>> [ 1.701353] initcall init_menu+0x0/0xc returned 0 after 0 usecs > > > > >>>>> [ 1.707458] calling gpmc_init+0x0/0x10 @ 1 > > > > >>>>> [ 1.711784] initcall gpmc_init+0x0/0x10 returned 0 after 0 usecs > > > > >>>>> [ 1.717974] calling omap3_l3_init+0x0/0x10 @ 1 > > > > >>>>> [ 1.722653] initcall omap3_l3_init+0x0/0x10 returned 0 after 0 usecs > > > > >>>>> [ 1.729201] calling omap_l3_init+0x0/0x10 @ 1 > > > > >>>>> [ 1.733791] initcall omap_l3_init+0x0/0x10 returned 0 after 0 usecs > > > > >>>>> [ 1.740314] calling gate_vma_init+0x0/0x70 @ 1 > > > > >>>>> [ 1.744976] initcall gate_vma_init+0x0/0x70 returned 0 after 0 usecs > > > > >>>>> [ 1.751522] calling customize_machine+0x0/0x30 @ 1 > > > > >>>>> [ 3.823114] initcall customize_machine+0x0/0x30 returned 0 after 2011718 usecs > > > > >>>>> [ 3.830566] calling init_atags_procfs+0x0/0xec @ 1 > > > > >>>>> [ 3.835583] No ATAGs? > > > > >>>> > > > > >>>> And the long time above with customize_machine() ends up being > > > > >>>> pdata_quirks_init() calling of_platform_populate(). > > > > >>> > > > > >>> That's what the delay is for me (I think I've reported that initially). 
> > > > >>> > > > > >>>>> Laurent & Tomi, care to check what you guys see in the slow booting case > > > > >>>>> after booting with initcall_debug? > > > > >>>> > > > > >>>> But maybe the long delay is something else for you guys so please check. > > > > >>> > > > > >> > > > > >> It's all devlink :( Looks like sometimes, improvements (PM) could became so complicated > > > > >> that time required to execute such algorithms may completely eliminate all expected benefits. > > > > >> Will not be surprised if PM consumption also increased instead of decreasing in some cases. > > > > >> > > > > >> not sure if it's 100% correct, but below diff reduces boot time > > > > >> from 7.6sec to 3.7sec :P > > > > >> > > > > >> before: > > > > >> [ 0.053870] cpuidle: using governor menu > > > > >> [ 2.505971] No ATAGs? > > > > >> ... > > > > >> [ 7.562317] Freeing unused kernel memory: 1024K > > > > >> > > > > >> after: > > > > >> [ 0.053800] cpuidle: using governor menu > > > > >> [ 0.136853] No ATAGs? 
> > > > >> [ 3.716218] devtmpfs: mounted > > > > >> [ 3.719628] Freeing unused kernel memory: 1024K > > > > >> [ 3.724266] Run /sbin/init as init process > > > > >> > > > > >> ---- > > > > >> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > > >> index 071f04da32c8..e0cc37ed46ca 100644 > > > > >> --- a/drivers/of/platform.c > > > > >> +++ b/drivers/of/platform.c > > > > >> @@ -481,6 +481,7 @@ int of_platform_populate(struct device_node *root, > > > > >> pr_debug(" starting at: %pOF\n", root); > > > > >> > > > > >> device_links_supplier_sync_state_pause(); > > > > >> + fw_devlink_pause(); > > > > >> for_each_child_of_node(root, child) { > > > > >> rc = of_platform_bus_create(child, matches, lookup, parent, true); > > > > >> if (rc) { > > > > >> @@ -488,6 +489,7 @@ int of_platform_populate(struct device_node *root, > > > > >> break; > > > > >> } > > > > >> } > > > > >> + fw_devlink_resume(); > > > > >> device_links_supplier_sync_state_resume(); > > > > >> > > > > >> of_node_set_flag(root, OF_POPULATED_BUS); > > > > >> @@ -538,9 +540,7 @@ static int __init of_platform_default_populate_init(void) > > > > >> } > > > > >> > > > > >> /* Populate everything else. */ > > > > >> - fw_devlink_pause(); > > > > >> of_platform_default_populate(NULL, NULL, NULL); > > > > >> - fw_devlink_resume(); > > > > > > > > > > Your analysis is right, but this change is not safe. You'll get an > > > > > unlocked linked list trampling if you call it outside of where it's > > > > > called now. That's explicitly why I didn't do it the way this patch > > > > > does it. > > > > > > > > > > To explain more, if you call fw_devlink_pause/resume() inside > > > > > of_platform_populate() you can end up calling it in the context of > > > > > another device's probe function. When a device's probe function is > > > > > called, a has a bunch of other locks held and you'll cause a deadlock. 
> > > > > To avoid that, I had to use defer_fw_devlink_lock to manage the list > > > > > used by fw_devlink_pause/resume(). > > > > > > > > > > I'll add more details later. But yeah, this patch isn't safe as is. > > > > > > > > I assume you going to fix it your way, right? > > > > > > I'll be glad to fix this if I find a way out. But currently, there's > > > no "my way" for a generic fix to use inside of_platform_populate(). > > > However... > > > > > > > Just FYI. On ARM main DT population path is: > > > > > > > > arch_initcall(customize_machine) > > > > machine_desc->init_machine(); > > > > of_platform_default_populate(); > > > > -- or -- > > > > of_platform_populate(); > > > > > > > > the of_platform_default_populate_init() called later due to make/linker dependencies. > > > > > > > > As result, existing pause/resume optimizations don't fully working. > > > > > > We can just add this optimization around all the other places the top > > > level addition of DT devices is done. I wasn't aware of (more like > > > forgot) the init_machine() path. > > > > > > Can you give the call path in your case? Just to make it a bit easier for me. > > > > I think I have a simple solution. I can just move the > > fw_devlink_pause/resume() inside of_platform_default_populate() and > > call those only if "root" == NULL. > > I'd be happy to test a patch. On my device, the initial > of_platform_populate() call from machine_desc->init_machine() takes > between 6 and 10 seconds to run with v5.9-rc5, compared to 200ms on > v5.7. That's a fairly bad regression, all the people who have worked > hard so reduce boot time would really hate this :-) Thanks! Patch coming up in a few minutes. -Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path
  2020-10-01 22:44 ` Saravana Kannan
@ 2020-10-01 22:59 ` Saravana Kannan
  2020-10-01 23:19   ` Laurent Pinchart
  2020-10-02 14:07   ` Rob Herring
  0 siblings, 2 replies; 26+ messages in thread
From: Saravana Kannan @ 2020-10-01 22:59 UTC (permalink / raw)
  To: saravanak, Rob Herring, Frank Rowand
  Cc: geert+renesas, gregkh, grygorii.strashko, laurent.pinchart, linux-omap, linux-pm, peter.ujfalusi, rjw, tomi.valkeinen, tony, ulf.hansson, kernel-team, devicetree, linux-kernel

When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when
adding all top level devices") optimized the fwnode parsing when all top
level devices are added, it missed out optimizing this for platform
where the top level devices are added through the init_machine() path.

This commit does the optimization for all paths by simply moving the
fw_devlink_pause/resume() inside of_platform_default_populate().

Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/platform.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 071f04da32c8..79972e49b539 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root,
 				 const struct of_dev_auxdata *lookup,
 				 struct device *parent)
 {
-	return of_platform_populate(root, of_default_bus_match_table, lookup,
-				    parent);
+	int ret;
+
+	/*
+	 * fw_devlink_pause/resume() are only safe to be called around top
+	 * level device addition due to locking constraints.
+	 */
+	if (!root)
+		fw_devlink_pause();
+
+	ret = of_platform_populate(root, of_default_bus_match_table, lookup,
+				   parent);
+
+	if (!root)
+		fw_devlink_resume();
+	return ret;
 }
 EXPORT_SYMBOL_GPL(of_platform_default_populate);
 
@@ -538,9 +551,7 @@ static int __init of_platform_default_populate_init(void)
 	}
 
 	/* Populate everything else. */
-	fw_devlink_pause();
 	of_platform_default_populate(NULL, NULL, NULL);
-	fw_devlink_resume();
 
 	return 0;
 }
-- 
2.28.0.709.gb0816b6eb0-goog

^ permalink raw reply related	[flat|nested] 26+ messages in thread
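The gating in the patch above can be modelled in userspace. This is only a rough sketch under stated assumptions: every model_* name is made up for illustration, the counters stand in for the real fw_devlink bookkeeping, and none of it is kernel API. It only shows that the pause/resume pair is taken for a root == NULL walk and skipped when a specific root node is passed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct device_node. */
struct model_node {
	const char *name;
};

static int pause_depth;	/* > 0 while (modelled) link parsing is deferred */
int batched_adds;	/* devices added while parsing was paused */
int unbatched_adds;	/* devices added with parsing active */

static void model_pause(void)
{
	pause_depth++;
}

static void model_resume(void)
{
	assert(pause_depth > 0);
	pause_depth--;
}

/* Stand-in for of_platform_populate(): "adds" ndev devices under root. */
int model_populate(const struct model_node *root, int ndev)
{
	(void)root;
	for (int i = 0; i < ndev; i++) {
		if (pause_depth > 0)
			batched_adds++;
		else
			unbatched_adds++;
	}
	return 0;
}

/* Same shape as the patched helper: batch only when root == NULL. */
int model_default_populate(const struct model_node *root, int ndev)
{
	int ret;

	if (!root)
		model_pause();

	ret = model_populate(root, ndev);

	if (!root)
		model_resume();
	return ret;
}
```

Calling model_default_populate(NULL, n) counts every device as batched, while passing a non-NULL node leaves them unbatched, which is the behaviour the comment in the patch describes.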
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path
  2020-10-01 22:59 ` [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path Saravana Kannan
@ 2020-10-01 23:19 ` Laurent Pinchart
  2020-10-02 11:40   ` Grygorii Strashko
  2020-10-02 14:07 ` Rob Herring
  1 sibling, 1 reply; 26+ messages in thread
From: Laurent Pinchart @ 2020-10-01 23:19 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Frank Rowand, geert+renesas, gregkh, grygorii.strashko, linux-omap, linux-pm, peter.ujfalusi, rjw, tomi.valkeinen, tony, ulf.hansson, kernel-team, devicetree, linux-kernel

Hi Saravana,

Thank you for the patch.

On Thu, Oct 01, 2020 at 03:59:51PM -0700, Saravana Kannan wrote:
> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when
> adding all top level devices") optimized the fwnode parsing when all top
> level devices are added, it missed out optimizing this for platform
> where the top level devices are added through the init_machine() path.
>
> This commit does the optimization for all paths by simply moving the
> fw_devlink_pause/resume() inside of_platform_default_populate().

Based on v5.9-rc5, before the patch:

[    0.652887] cpuidle: using governor menu
[   12.349476] No ATAGs?

After the patch:

[    0.650460] cpuidle: using governor menu
[   12.262101] No ATAGs?

:-(

> Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/of/platform.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 071f04da32c8..79972e49b539 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root,
>  				 const struct of_dev_auxdata *lookup,
>  				 struct device *parent)
>  {
> -	return of_platform_populate(root, of_default_bus_match_table, lookup,
> -				    parent);
> +	int ret;
> +
> +	/*
> +	 * fw_devlink_pause/resume() are only safe to be called around top
> +	 * level device addition due to locking constraints.
> +	 */
> +	if (!root)
> +		fw_devlink_pause();
> +
> +	ret = of_platform_populate(root, of_default_bus_match_table, lookup,
> +				   parent);
> +
> +	if (!root)
> +		fw_devlink_resume();
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(of_platform_default_populate);
>
> @@ -538,9 +551,7 @@ static int __init of_platform_default_populate_init(void)
>  	}
>
>  	/* Populate everything else. */
> -	fw_devlink_pause();
>  	of_platform_default_populate(NULL, NULL, NULL);
> -	fw_devlink_resume();
>
>  	return 0;
>  }

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path
  2020-10-01 23:19 ` Laurent Pinchart
@ 2020-10-02 11:40 ` Grygorii Strashko
  2020-10-02 15:03   ` Grygorii Strashko
  0 siblings, 1 reply; 26+ messages in thread
From: Grygorii Strashko @ 2020-10-02 11:40 UTC (permalink / raw)
  To: Laurent Pinchart, Saravana Kannan
  Cc: Rob Herring, Frank Rowand, geert+renesas, gregkh, linux-omap, linux-pm, peter.ujfalusi, rjw, tomi.valkeinen, tony, ulf.hansson, kernel-team, devicetree, linux-kernel

On 02/10/2020 02:19, Laurent Pinchart wrote:
> Hi Saravana,
>
> Thank you for the patch.
>
> On Thu, Oct 01, 2020 at 03:59:51PM -0700, Saravana Kannan wrote:
>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when
>> adding all top level devices") optimized the fwnode parsing when all top
>> level devices are added, it missed out optimizing this for platform
>> where the top level devices are added through the init_machine() path.
>>
>> This commit does the optimization for all paths by simply moving the
>> fw_devlink_pause/resume() inside of_platform_default_populate().
>
> Based on v5.9-rc5, before the patch:
>
> [    0.652887] cpuidle: using governor menu
> [   12.349476] No ATAGs?
>
> After the patch:
>
> [    0.650460] cpuidle: using governor menu
> [   12.262101] No ATAGs?
>
> :-(

This is kinda expected :( because the omap2 arch doesn't call of_platform_default_populate().

Call path:
board-generic.c
  DT_MACHINE_START()
    .init_machine = omap_generic_init,

  omap_generic_init()
    pdata_quirks_init(omap_dt_match_table);
      of_platform_populate(NULL, omap_dt_match_table,
                           omap_auxdata_lookup, NULL);

Other affected platforms:
  arm: mach-ux500
  some mips
  some powerpc

There are also cases where a lot of devices are placed under a bus node; in such cases
of_platform_populate() calls from bus drivers will also suffer from this issue.

I think one option could be to add some parameter to _populate() or introduce a new API.

By the way, is there an option to disable this feature altogether?
Is there a Kconfig option?
Is there any reason why such complex and time-consuming code was added to the kernel and not implemented at the DTC level?


Also, I've come up with another diff, please check.

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 5.9.0-rc6-01791-g9acba6b38757-dirty (grygorii@grygorii-XPS-13-9370) (arm-linux-gnueabihf-gcc (GNU Toolcha0
[    0.000000] CPU: ARMv7 Processor [412fc0f2] revision 2 (ARMv7), cr=10c5387d
[    0.000000] CPU: div instructions available: patching division code
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] OF: fdt: Machine model: TI AM5718 IDK
...
[    0.053443] cpuidle: using governor ladder
[    0.053470] cpuidle: using governor menu
[    0.089304] No ATAGs?
...
[    3.092291] devtmpfs: mounted
[    3.095804] Freeing unused kernel memory: 1024K
[    3.100483] Run /sbin/init as init process



------ >< ---
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 071f04da32c8..4521b26e7745 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -514,6 +514,13 @@ static const struct of_device_id reserved_mem_matches[] = {
 	{}
 };
 
+static int __init of_platform_fw_devlink_pause(void)
+{
+	fw_devlink_pause();
+	return 0;
+}
+core_initcall(of_platform_fw_devlink_pause);
+
 static int __init of_platform_default_populate_init(void)
 {
 	struct device_node *node;
@@ -538,9 +545,7 @@ static int __init of_platform_default_populate_init(void)
 	}
 
 	/* Populate everything else. */
-	fw_devlink_pause();
 	of_platform_default_populate(NULL, NULL, NULL);
-	fw_devlink_resume();
 
 	return 0;
 }
@@ -548,6 +553,7 @@ arch_initcall_sync(of_platform_default_populate_init);
 
 static int __init of_platform_sync_state_init(void)
 {
+	fw_devlink_resume();
 	device_links_supplier_sync_state_resume();
 	return 0;
 }

-- 
Best regards,
grygorii

^ permalink raw reply related	[flat|nested] 26+ messages in thread
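To make the call path in this message concrete, here is a small userspace sketch, assuming made-up model_* names throughout (nothing below is kernel code, and the flag only imitates the effect of the pause). It models an init_machine() hook that goes straight to the generic populate call, as the omap path is described as doing, next to one that goes through the default helper.

```c
#include <assert.h>
#include <stddef.h>

/* Model-only state; none of this is kernel code. */
int paused;		/* set while the (modelled) fw_devlink pause is active */
int populate_calls;	/* top-level populate calls seen */
int paused_calls;	/* ... of which ran with the pause active */

/* Stand-in for of_platform_populate(NULL, ...). */
static void model_populate(void)
{
	populate_calls++;
	if (paused)
		paused_calls++;
}

/* Stand-in for the patched default helper, reduced to the root == NULL case. */
static void model_default_populate(void)
{
	paused = 1;
	model_populate();
	paused = 0;
}

/* omap-style init_machine(): calls the generic populate directly. */
void model_omap_generic_init(void)
{
	model_populate();
}

/* A platform whose init_machine() uses the default helper. */
void model_other_init(void)
{
	model_default_populate();
}

/* Hypothetical, trimmed-down machine descriptor. */
struct model_machine_desc {
	void (*init_machine)(void);
};

/* Stand-in for arch_initcall(customize_machine). */
void model_customize_machine(const struct model_machine_desc *m)
{
	assert(m != NULL);
	if (m->init_machine)
		m->init_machine();
}
```

With the omap-style hook the populate call never runs under the pause, which would be consistent with the unchanged timings reported for the v1 patch.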
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 11:40 ` Grygorii Strashko @ 2020-10-02 15:03 ` Grygorii Strashko 2020-10-02 17:48 ` Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Grygorii Strashko @ 2020-10-02 15:03 UTC (permalink / raw) To: Laurent Pinchart, Saravana Kannan Cc: Rob Herring, Frank Rowand, geert+renesas, gregkh, linux-omap, linux-pm, peter.ujfalusi, rjw, tomi.valkeinen, tony, ulf.hansson, kernel-team, devicetree, linux-kernel On 02/10/2020 14:40, Grygorii Strashko wrote: > > > On 02/10/2020 02:19, Laurent Pinchart wrote: >> Hi Saravana, >> >> Thank you for the patch. >> >> On Thu, Oct 01, 2020 at 03:59:51PM -0700, Saravana Kannan wrote: >>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when >>> adding all top level devices") optimized the fwnode parsing when all top >>> level devices are added, it missed out optimizing this for platform >>> where the top level devices are added through the init_machine() path. >>> >>> This commit does the optimization for all paths by simply moving the >>> fw_devlink_pause/resume() inside of_platform_default_populate(). >> >> Based on v5.9-rc5, before the patch: >> >> [ 0.652887] cpuidle: using governor menu >> [ 12.349476] No ATAGs? >> >> After the patch: >> >> [ 0.650460] cpuidle: using governor menu >> [ 12.262101] No ATAGs? >> >> :-( > > This is kinda expected :( because omap2 arch doesn't call of_platform_default_populate() > > Call path: > board-generic.c > DT_MACHINE_START() > .init_machine = omap_generic_init, > > omap_generic_init() > pdata_quirks_init(omap_dt_match_table); > of_platform_populate(NULL, omap_dt_match_table, > omap_auxdata_lookup, NULL); > > Other affected platforms > arm: mach-ux500 > some mips > some powerpc > > there are also case when a lot of devices placed under bus node, in such case > of_platform_populate() calls from bus drivers will also suffer from this issue. 
> > I think one option could be to add some parameter to _populate() or introduce new api. > > By the way, is there option to disable this feature at all? > Is there Kconfig option? > Is there any reasons why such complex and time consuming code added to the kernel and not implemented on DTC level? > > > Also, I've came with another diff, pls check. > > [ 0.000000] Booting Linux on physical CPU 0x0 > [ 0.000000] Linux version 5.9.0-rc6-01791-g9acba6b38757-dirty (grygorii@grygorii-XPS-13-9370) (arm-linux-gnueabihf-gcc (GNU Toolcha0 > [ 0.000000] CPU: ARMv7 Processor [412fc0f2] revision 2 (ARMv7), cr=10c5387d > [ 0.000000] CPU: div instructions available: patching division code > [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache > [ 0.000000] OF: fdt: Machine model: TI AM5718 IDK > ... > [ 0.053443] cpuidle: using governor ladder > [ 0.053470] cpuidle: using governor menu > [ 0.089304] No ATAGs? > ... > [ 3.092291] devtmpfs: mounted > [ 3.095804] Freeing unused kernel memory: 1024K > [ 3.100483] Run /sbin/init as init process > > > > ------ >< --- > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > index 071f04da32c8..4521b26e7745 100644 > --- a/drivers/of/platform.c > +++ b/drivers/of/platform.c > @@ -514,6 +514,12 @@ static const struct of_device_id reserved_mem_matches[] = { > {} > }; > > +static int __init of_platform_fw_devlink_pause(void) > +{ > + fw_devlink_pause(); > +} > +core_initcall(of_platform_fw_devlink_pause); > + > static int __init of_platform_default_populate_init(void) > { > struct device_node *node; > @@ -538,9 +544,7 @@ static int __init of_platform_default_populate_init(void) > } > > /* Populate everything else. 
*/
> -	fw_devlink_pause();
>  	of_platform_default_populate(NULL, NULL, NULL);
> -	fw_devlink_resume();
>
>  	return 0;
>  }
> @@ -548,6 +552,7 @@ arch_initcall_sync(of_platform_default_populate_init);
>
>  static int __init of_platform_sync_state_init(void)
>  {
> +	fw_devlink_resume();

^ it seems this has to be done earlier, like

+static int __init of_platform_fw_devlink_resume(void)
+{
+	fw_devlink_resume();
+	return 0;
+}
+device_initcall_sync(of_platform_fw_devlink_resume);

>  	device_links_supplier_sync_state_resume();
>  	return 0;
>  }
>
>

-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 15:03 ` Grygorii Strashko @ 2020-10-02 17:48 ` Saravana Kannan 2020-10-02 18:11 ` Grygorii Strashko 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-02 17:48 UTC (permalink / raw) To: Grygorii Strashko Cc: Laurent Pinchart, Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Linux-OMAP, Linux PM, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, LKML On Fri, Oct 2, 2020 at 8:03 AM 'Grygorii Strashko' via kernel-team <kernel-team@android.com> wrote: > > > > On 02/10/2020 14:40, Grygorii Strashko wrote: > > > > > > On 02/10/2020 02:19, Laurent Pinchart wrote: > >> Hi Saravana, > >> > >> Thank you for the patch. > >> > >> On Thu, Oct 01, 2020 at 03:59:51PM -0700, Saravana Kannan wrote: > >>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > >>> adding all top level devices") optimized the fwnode parsing when all top > >>> level devices are added, it missed out optimizing this for platform > >>> where the top level devices are added through the init_machine() path. > >>> > >>> This commit does the optimization for all paths by simply moving the > >>> fw_devlink_pause/resume() inside of_platform_default_populate(). > >> > >> Based on v5.9-rc5, before the patch: > >> > >> [ 0.652887] cpuidle: using governor menu > >> [ 12.349476] No ATAGs? > >> > >> After the patch: > >> > >> [ 0.650460] cpuidle: using governor menu > >> [ 12.262101] No ATAGs? 
> >> > >> :-( > > > > This is kinda expected :( because omap2 arch doesn't call of_platform_default_populate() > > > > Call path: > > board-generic.c > > DT_MACHINE_START() > > .init_machine = omap_generic_init, > > > > omap_generic_init() > > pdata_quirks_init(omap_dt_match_table); > > of_platform_populate(NULL, omap_dt_match_table, > > omap_auxdata_lookup, NULL); > > > > Other affected platforms > > arm: mach-ux500 > > some mips > > some powerpc > > > > there are also case when a lot of devices placed under bus node, in such case > > of_platform_populate() calls from bus drivers will also suffer from this issue. > > > > I think one option could be to add some parameter to _populate() or introduce new api. > > > > By the way, is there option to disable this feature at all? > > Is there Kconfig option? > > Is there any reasons why such complex and time consuming code added to the kernel and not implemented on DTC level? > > > > > > Also, I've came with another diff, pls check. > > > > [ 0.000000] Booting Linux on physical CPU 0x0 > > [ 0.000000] Linux version 5.9.0-rc6-01791-g9acba6b38757-dirty (grygorii@grygorii-XPS-13-9370) (arm-linux-gnueabihf-gcc (GNU Toolcha0 > > [ 0.000000] CPU: ARMv7 Processor [412fc0f2] revision 2 (ARMv7), cr=10c5387d > > [ 0.000000] CPU: div instructions available: patching division code > > [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache > > [ 0.000000] OF: fdt: Machine model: TI AM5718 IDK > > ... > > [ 0.053443] cpuidle: using governor ladder > > [ 0.053470] cpuidle: using governor menu > > [ 0.089304] No ATAGs? > > ... 
> > [ 3.092291] devtmpfs: mounted > > [ 3.095804] Freeing unused kernel memory: 1024K > > [ 3.100483] Run /sbin/init as init process > > > > > > > > ------ >< --- > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > index 071f04da32c8..4521b26e7745 100644 > > --- a/drivers/of/platform.c > > +++ b/drivers/of/platform.c > > @@ -514,6 +514,12 @@ static const struct of_device_id reserved_mem_matches[] = { > > {} > > }; > > > > +static int __init of_platform_fw_devlink_pause(void) > > +{ > > + fw_devlink_pause(); > > +} > > +core_initcall(of_platform_fw_devlink_pause); > > + > > static int __init of_platform_default_populate_init(void) > > { > > struct device_node *node; > > @@ -538,9 +544,7 @@ static int __init of_platform_default_populate_init(void) > > } > > > > /* Populate everything else. */ > > - fw_devlink_pause(); > > of_platform_default_populate(NULL, NULL, NULL); > > - fw_devlink_resume(); > > > > return 0; > > } > > @@ -548,6 +552,7 @@ arch_initcall_sync(of_platform_default_populate_init); > > > > static int __init of_platform_sync_state_init(void) > > { > > + fw_devlink_resume(); > > ^ it seems has to be done earlier, like > +static int __init of_platform_fw_devlink_resume(void) > +{ > + fw_devlink_resume(); > + return 0; > +} > +device_initcall_sync(of_platform_fw_devlink_resume); This will mean no device will probe until device_initcall_sync(). Unfortunately, I don't think we can make such a sweeping assumption. -Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
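To make the objection above concrete: the proposed diff pauses fw_devlink at core_initcall and the follow-up suggestion resumes it at device_initcall_sync. The small user-space sketch below only lists the kernel's initcall levels in boot order and counts how many of them would run inside such a pause window. The helper names initcall_position() and levels_inside_window() are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Initcall levels in the order the kernel runs them at boot. */
static const char *const initcall_order[] = {
	"pure_initcall",
	"core_initcall", "core_initcall_sync",
	"postcore_initcall", "postcore_initcall_sync",
	"arch_initcall", "arch_initcall_sync",
	"subsys_initcall", "subsys_initcall_sync",
	"fs_initcall", "fs_initcall_sync",
	"rootfs_initcall",
	"device_initcall", "device_initcall_sync",
	"late_initcall", "late_initcall_sync",
};

/* Hypothetical helper: position of a level in boot order, or -1. */
int initcall_position(const char *name)
{
	size_t n = sizeof(initcall_order) / sizeof(initcall_order[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(initcall_order[i], name) == 0)
			return (int)i;
	return -1;
}

/*
 * Hypothetical helper: number of initcall levels that run strictly
 * between the level that pauses and the level that resumes.
 */
int levels_inside_window(const char *pause_at, const char *resume_at)
{
	int a = initcall_position(pause_at);
	int b = initcall_position(resume_at);

	assert(a >= 0 && b >= 0 && a < b);
	return b - a - 1;
}
```

With these tables, levels_inside_window("core_initcall", "device_initcall_sync") covers the postcore, arch, subsys, fs, rootfs and device levels, which is where most drivers register and probe. In the unpatched code, by contrast, pause and resume both sit inside a single arch_initcall_sync function.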
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 17:48 ` Saravana Kannan @ 2020-10-02 18:11 ` Grygorii Strashko 0 siblings, 0 replies; 26+ messages in thread From: Grygorii Strashko @ 2020-10-02 18:11 UTC (permalink / raw) To: Saravana Kannan Cc: Laurent Pinchart, Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Linux-OMAP, Linux PM, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, LKML On 02/10/2020 20:48, Saravana Kannan wrote: > On Fri, Oct 2, 2020 at 8:03 AM 'Grygorii Strashko' via kernel-team > <kernel-team@android.com> wrote: >> >> >> >> On 02/10/2020 14:40, Grygorii Strashko wrote: >>> >>> >>> On 02/10/2020 02:19, Laurent Pinchart wrote: >>>> Hi Saravana, >>>> >>>> Thank you for the patch. >>>> >>>> On Thu, Oct 01, 2020 at 03:59:51PM -0700, Saravana Kannan wrote: >>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when >>>>> adding all top level devices") optimized the fwnode parsing when all top >>>>> level devices are added, it missed out optimizing this for platform >>>>> where the top level devices are added through the init_machine() path. >>>>> >>>>> This commit does the optimization for all paths by simply moving the >>>>> fw_devlink_pause/resume() inside of_platform_default_populate(). >>>> >>>> Based on v5.9-rc5, before the patch: >>>> >>>> [ 0.652887] cpuidle: using governor menu >>>> [ 12.349476] No ATAGs? >>>> >>>> After the patch: >>>> >>>> [ 0.650460] cpuidle: using governor menu >>>> [ 12.262101] No ATAGs? 
>>>> >>>> :-( >>> >>> This is kinda expected :( because omap2 arch doesn't call of_platform_default_populate() >>> >>> Call path: >>> board-generic.c >>> DT_MACHINE_START() >>> .init_machine = omap_generic_init, >>> >>> omap_generic_init() >>> pdata_quirks_init(omap_dt_match_table); >>> of_platform_populate(NULL, omap_dt_match_table, >>> omap_auxdata_lookup, NULL); >>> >>> Other affected platforms >>> arm: mach-ux500 >>> some mips >>> some powerpc >>> >>> there are also case when a lot of devices placed under bus node, in such case >>> of_platform_populate() calls from bus drivers will also suffer from this issue. >>> >>> I think one option could be to add some parameter to _populate() or introduce new api. >>> >>> By the way, is there option to disable this feature at all? >>> Is there Kconfig option? >>> Is there any reasons why such complex and time consuming code added to the kernel and not implemented on DTC level? >>> >>> >>> Also, I've came with another diff, pls check. >>> >>> [ 0.000000] Booting Linux on physical CPU 0x0 >>> [ 0.000000] Linux version 5.9.0-rc6-01791-g9acba6b38757-dirty (grygorii@grygorii-XPS-13-9370) (arm-linux-gnueabihf-gcc (GNU Toolcha0 >>> [ 0.000000] CPU: ARMv7 Processor [412fc0f2] revision 2 (ARMv7), cr=10c5387d >>> [ 0.000000] CPU: div instructions available: patching division code >>> [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache >>> [ 0.000000] OF: fdt: Machine model: TI AM5718 IDK >>> ... >>> [ 0.053443] cpuidle: using governor ladder >>> [ 0.053470] cpuidle: using governor menu >>> [ 0.089304] No ATAGs? >>> ... 
>>> [ 3.092291] devtmpfs: mounted >>> [ 3.095804] Freeing unused kernel memory: 1024K >>> [ 3.100483] Run /sbin/init as init process >>> >>> >>> >>> ------ >< --- >>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c >>> index 071f04da32c8..4521b26e7745 100644 >>> --- a/drivers/of/platform.c >>> +++ b/drivers/of/platform.c >>> @@ -514,6 +514,12 @@ static const struct of_device_id reserved_mem_matches[] = { >>> {} >>> }; >>> >>> +static int __init of_platform_fw_devlink_pause(void) >>> +{ >>> + fw_devlink_pause(); >>> +} >>> +core_initcall(of_platform_fw_devlink_pause); >>> + >>> static int __init of_platform_default_populate_init(void) >>> { >>> struct device_node *node; >>> @@ -538,9 +544,7 @@ static int __init of_platform_default_populate_init(void) >>> } >>> >>> /* Populate everything else. */ >>> - fw_devlink_pause(); >>> of_platform_default_populate(NULL, NULL, NULL); >>> - fw_devlink_resume(); >>> >>> return 0; >>> } >>> @@ -548,6 +552,7 @@ arch_initcall_sync(of_platform_default_populate_init); >>> >>> static int __init of_platform_sync_state_init(void) >>> { >>> + fw_devlink_resume(); >> >> ^ it seems has to be done earlier, like >> +static int __init of_platform_fw_devlink_resume(void) >> +{ >> + fw_devlink_resume(); >> + return 0; >> +} >> +device_initcall_sync(of_platform_fw_devlink_resume); > > This will mean no device will probe until device_initcall_sync(). > Unfortunately, I don't think we can make such a sweeping assumption. Could you answer below questions, pls? >>> By the way, is there option to disable this feature at all? >>> Is there Kconfig option? -- Best regards, grygorii ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-01 22:59 ` [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path Saravana Kannan 2020-10-01 23:19 ` Laurent Pinchart @ 2020-10-02 14:07 ` Rob Herring 2020-10-02 17:51 ` Saravana Kannan 1 sibling, 1 reply; 26+ messages in thread From: Rob Herring @ 2020-10-02 14:07 UTC (permalink / raw) To: Saravana Kannan Cc: Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Grygorii Strashko, Laurent Pinchart, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, devicetree, linux-kernel On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > adding all top level devices") optimized the fwnode parsing when all top > level devices are added, it missed out optimizing this for platform > where the top level devices are added through the init_machine() path. > > This commit does the optimization for all paths by simply moving the > fw_devlink_pause/resume() inside of_platform_default_populate(). > > Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > Signed-off-by: Saravana Kannan <saravanak@google.com> > --- > drivers/of/platform.c | 19 +++++++++++++++---- > 1 file changed, 15 insertions(+), 4 deletions(-) > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > index 071f04da32c8..79972e49b539 100644 > --- a/drivers/of/platform.c > +++ b/drivers/of/platform.c > @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > const struct of_dev_auxdata *lookup, > struct device *parent) > { > - return of_platform_populate(root, of_default_bus_match_table, lookup, > - parent); > + int ret; > + > + /* > + * fw_devlink_pause/resume() are only safe to be called around top > + * level device addition due to locking constraints. 
> + */ > + if (!root) > + fw_devlink_pause(); > + > + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > + parent); of_platform_default_populate() vs. of_platform_populate() is just a different match table. I don't think the behavior should otherwise be different. There's also of_platform_probe() which has slightly different matching behavior. It should not behave differently either with respect to devlinks. Rob ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 14:07 ` Rob Herring @ 2020-10-02 17:51 ` Saravana Kannan 2020-10-02 17:54 ` Laurent Pinchart 2020-10-02 20:29 ` Rob Herring 0 siblings, 2 replies; 26+ messages in thread From: Saravana Kannan @ 2020-10-02 17:51 UTC (permalink / raw) To: Rob Herring Cc: Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Grygorii Strashko, Laurent Pinchart, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > > > When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > adding all top level devices") optimized the fwnode parsing when all top > > level devices are added, it missed out optimizing this for platform > > where the top level devices are added through the init_machine() path. > > > > This commit does the optimization for all paths by simply moving the > > fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > > > Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > --- > > drivers/of/platform.c | 19 +++++++++++++++---- > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > index 071f04da32c8..79972e49b539 100644 > > --- a/drivers/of/platform.c > > +++ b/drivers/of/platform.c > > @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > const struct of_dev_auxdata *lookup, > > struct device *parent) > > { > > - return of_platform_populate(root, of_default_bus_match_table, lookup, > > - parent); > > + int ret; > > + > > + /* > > + * fw_devlink_pause/resume() are only safe to be called around top > > + * level device addition due to locking constraints. > > + */ > > + if (!root) > > + fw_devlink_pause(); > > + > > + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > + parent); > > of_platform_default_populate() vs. of_platform_populate() is just a > different match table. I don't think the behavior should otherwise be > different. > > There's also of_platform_probe() which has slightly different matching > behavior. It should not behave differently either with respect to > devlinks. So I'm trying to do this only when the top level devices are added for the first time. of_platform_default_populate() seems to be the most common path. For other cases, I think we just need to call fw_devlink_pause/resume() wherever the top level devices are added for the first time. As I said in the other email, we can't add fw_devlink_pause/resume() by default to of_platform_populate(). Do you have other ideas for achieving "call fw_devlink_pause/resume() only when top level devices are added for the first time"? -Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 17:51 ` Saravana Kannan @ 2020-10-02 17:54 ` Laurent Pinchart 2020-10-02 17:58 ` Saravana Kannan 2020-10-02 20:29 ` Rob Herring 1 sibling, 1 reply; 26+ messages in thread From: Laurent Pinchart @ 2020-10-02 17:54 UTC (permalink / raw) To: Saravana Kannan Cc: Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Grygorii Strashko, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel Hi Saravana, On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > > adding all top level devices") optimized the fwnode parsing when all top > > > level devices are added, it missed out optimizing this for platform > > > where the top level devices are added through the init_machine() path. > > > > > > This commit does the optimization for all paths by simply moving the > > > fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > > > > > Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > > --- > > > drivers/of/platform.c | 19 +++++++++++++++---- > > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > index 071f04da32c8..79972e49b539 100644 > > > --- a/drivers/of/platform.c > > > +++ b/drivers/of/platform.c > > > @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > > const struct of_dev_auxdata *lookup, > > > struct device *parent) > > > { > > > - return of_platform_populate(root, of_default_bus_match_table, lookup, > > > - parent); > > > + int ret; > > > + > > > + /* > > > + * fw_devlink_pause/resume() are only safe to be called around top > > > + * level device addition due to locking constraints. > > > + */ > > > + if (!root) > > > + fw_devlink_pause(); > > > + > > > + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > > + parent); > > > > of_platform_default_populate() vs. of_platform_populate() is just a > > different match table. I don't think the behavior should otherwise be > > different. > > > > There's also of_platform_probe() which has slightly different matching > > behavior. It should not behave differently either with respect to > > devlinks. > > So I'm trying to do this only when the top level devices are added for > the first time. of_platform_default_populate() seems to be the most > common path. For other cases, I think we just need to call > fw_devlink_pause/resume() wherever the top level devices are added for > the first time. As I said in the other email, we can't add > fw_devlink_pause/resume() by default to of_platform_populate(). > > Do you have other ideas for achieving "call fw_devlink_pause/resume() > only when top level devices are added for the first time"? 
I'm not an expert in this domain, but before investigating it, would you
be able to share a hack patch that implements this (in the simplest way)
to check if it actually fixes the delays I experience on my system?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 17:54 ` Laurent Pinchart @ 2020-10-02 17:58 ` Saravana Kannan 2020-10-02 18:27 ` Laurent Pinchart 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-02 17:58 UTC (permalink / raw) To: Laurent Pinchart Cc: Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Grygorii Strashko, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Saravana, > > On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > > On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > > On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > > > adding all top level devices") optimized the fwnode parsing when all top > > > > level devices are added, it missed out optimizing this for platform > > > > where the top level devices are added through the init_machine() path. > > > > > > > > This commit does the optimization for all paths by simply moving the > > > > fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > > > > > > > Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > > > --- > > > > drivers/of/platform.c | 19 +++++++++++++++---- > > > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > > index 071f04da32c8..79972e49b539 100644 > > > > --- a/drivers/of/platform.c > > > > +++ b/drivers/of/platform.c > > > > @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > > > const struct of_dev_auxdata *lookup, > > > > struct device *parent) > > > > { > > > > - return of_platform_populate(root, of_default_bus_match_table, lookup, > > > > - parent); > > > > + int ret; > > > > + > > > > + /* > > > > + * fw_devlink_pause/resume() are only safe to be called around top > > > > + * level device addition due to locking constraints. > > > > + */ > > > > + if (!root) > > > > + fw_devlink_pause(); > > > > + > > > > + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > > > + parent); > > > > > > of_platform_default_populate() vs. of_platform_populate() is just a > > > different match table. I don't think the behavior should otherwise be > > > different. > > > > > > There's also of_platform_probe() which has slightly different matching > > > behavior. It should not behave differently either with respect to > > > devlinks. > > > > So I'm trying to do this only when the top level devices are added for > > the first time. of_platform_default_populate() seems to be the most > > common path. For other cases, I think we just need to call > > fw_devlink_pause/resume() wherever the top level devices are added for > > the first time. As I said in the other email, we can't add > > fw_devlink_pause/resume() by default to of_platform_populate(). > > > > Do you have other ideas for achieving "call fw_devlink_pause/resume() > > only when top level devices are added for the first time"? 
>
> I'm not an expert in this domain, but before investigating it, would you
> be able to share a hack patch that implements this (in the most simple
> way) to check if it actually fixes the delays I experience on my system
> ?

So I take it the patch I sent out didn't work for you? Can you tell me
what machine/DT you are using?

-Saravana

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 17:58 ` Saravana Kannan @ 2020-10-02 18:27 ` Laurent Pinchart 2020-10-02 18:35 ` Grygorii Strashko 0 siblings, 1 reply; 26+ messages in thread From: Laurent Pinchart @ 2020-10-02 18:27 UTC (permalink / raw) To: Saravana Kannan Cc: Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Grygorii Strashko, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel Hi Saravana, On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote: > On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote: > > On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > > > On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > > > On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > > > When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > > > > adding all top level devices") optimized the fwnode parsing when all top > > > > > level devices are added, it missed out optimizing this for platform > > > > > where the top level devices are added through the init_machine() path. > > > > > > > > > > This commit does the optimization for all paths by simply moving the > > > > > fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > > > > > > > > > Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > > > > --- > > > > > drivers/of/platform.c | 19 +++++++++++++++---- > > > > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > > > > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > > > index 071f04da32c8..79972e49b539 100644 > > > > > --- a/drivers/of/platform.c > > > > > +++ b/drivers/of/platform.c > > > > > @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > > > > const struct of_dev_auxdata *lookup, > > > > > struct device *parent) > > > > > { > > > > > - return of_platform_populate(root, of_default_bus_match_table, lookup, > > > > > - parent); > > > > > + int ret; > > > > > + > > > > > + /* > > > > > + * fw_devlink_pause/resume() are only safe to be called around top > > > > > + * level device addition due to locking constraints. > > > > > + */ > > > > > + if (!root) > > > > > + fw_devlink_pause(); > > > > > + > > > > > + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > > > > + parent); > > > > > > > > of_platform_default_populate() vs. of_platform_populate() is just a > > > > different match table. I don't think the behavior should otherwise be > > > > different. > > > > > > > > There's also of_platform_probe() which has slightly different matching > > > > behavior. It should not behave differently either with respect to > > > > devlinks. > > > > > > So I'm trying to do this only when the top level devices are added for > > > the first time. of_platform_default_populate() seems to be the most > > > common path. For other cases, I think we just need to call > > > fw_devlink_pause/resume() wherever the top level devices are added for > > > the first time. As I said in the other email, we can't add > > > fw_devlink_pause/resume() by default to of_platform_populate(). 
> > >
> > > Do you have other ideas for achieving "call fw_devlink_pause/resume()
> > > only when top level devices are added for the first time"?
> >
> > I'm not an expert in this domain, but before investigating it, would you
> > be able to share a hack patch that implements this (in the most simple
> > way) to check if it actually fixes the delays I experience on my system
> > ?
>
> So I take it the patch I sent out didn't work for you? Can you tell me
> what machine/DT you are using?

I've replied to the patch:

Based on v5.9-rc5, before the patch:

[ 0.652887] cpuidle: using governor menu
[ 12.349476] No ATAGs?

After the patch:

[ 0.650460] cpuidle: using governor menu
[ 12.262101] No ATAGs?

I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially
an am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few
additional nodes for GPIO keys, LCD panel, backlight and touchscreen.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 18:27 ` Laurent Pinchart @ 2020-10-02 18:35 ` Grygorii Strashko 2020-10-02 19:56 ` Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Grygorii Strashko @ 2020-10-02 18:35 UTC (permalink / raw) To: Laurent Pinchart, Saravana Kannan Cc: Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel hi Saravana, On 02/10/2020 21:27, Laurent Pinchart wrote: > Hi Saravana, > > On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote: >> On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote: >>> On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: >>>> On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: >>>>> On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: >>>>>> >>>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when >>>>>> adding all top level devices") optimized the fwnode parsing when all top >>>>>> level devices are added, it missed out optimizing this for platform >>>>>> where the top level devices are added through the init_machine() path. >>>>>> >>>>>> This commit does the optimization for all paths by simply moving the >>>>>> fw_devlink_pause/resume() inside of_platform_default_populate(). 
>>>>>> >>>>>> Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> >>>>>> Signed-off-by: Saravana Kannan <saravanak@google.com> >>>>>> --- >>>>>> drivers/of/platform.c | 19 +++++++++++++++---- >>>>>> 1 file changed, 15 insertions(+), 4 deletions(-) >>>>>> >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c >>>>>> index 071f04da32c8..79972e49b539 100644 >>>>>> --- a/drivers/of/platform.c >>>>>> +++ b/drivers/of/platform.c >>>>>> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, >>>>>> const struct of_dev_auxdata *lookup, >>>>>> struct device *parent) >>>>>> { >>>>>> - return of_platform_populate(root, of_default_bus_match_table, lookup, >>>>>> - parent); >>>>>> + int ret; >>>>>> + >>>>>> + /* >>>>>> + * fw_devlink_pause/resume() are only safe to be called around top >>>>>> + * level device addition due to locking constraints. >>>>>> + */ >>>>>> + if (!root) >>>>>> + fw_devlink_pause(); >>>>>> + >>>>>> + ret = of_platform_populate(root, of_default_bus_match_table, lookup, >>>>>> + parent); >>>>> >>>>> of_platform_default_populate() vs. of_platform_populate() is just a >>>>> different match table. I don't think the behavior should otherwise be >>>>> different. >>>>> >>>>> There's also of_platform_probe() which has slightly different matching >>>>> behavior. It should not behave differently either with respect to >>>>> devlinks. >>>> >>>> So I'm trying to do this only when the top level devices are added for >>>> the first time. of_platform_default_populate() seems to be the most >>>> common path. For other cases, I think we just need to call >>>> fw_devlink_pause/resume() wherever the top level devices are added for >>>> the first time. As I said in the other email, we can't add >>>> fw_devlink_pause/resume() by default to of_platform_populate(). >>>> >>>> Do you have other ideas for achieving "call fw_devlink_pause/resume() >>>> only when top level devices are added for the first time"? 
>>>
>>> I'm not an expert in this domain, but before investigating it, would you
>>> be able to share a hack patch that implements this (in the most simple
>>> way) to check if it actually fixes the delays I experience on my system
>>> ?
>>
>> So I take it the patch I sent out didn't work for you? Can you tell me
>> what machine/DT you are using?
>
> I've replied to the patch:
>
> Based on v5.9-rc5, before the patch:
>
> [ 0.652887] cpuidle: using governor menu
> [ 12.349476] No ATAGs?
>
> After the patch:
>
> [ 0.650460] cpuidle: using governor menu
> [ 12.262101] No ATAGs?
>
> I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially
> a am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few
> additional nodes for GPIO keys, LCD panel, backlight and touchscreen.
>

I hope you are receiving my mails, as I've already provided you with all
the required information [1].

With the diff below:
[ 4.177231] Freeing unused kernel memory: 1024K
[ 4.181892] Run /sbin/init as init process

The best time, with [2], is
[ 3.100483] Run /sbin/init as init process

That is still a 1 sec loss.

Please understand the issue: the requirements here are something like a
500 ms boot with CAN, Ethernet, camera and display on ;(

[1] https://lore.kernel.org/patchwork/patch/1316134/#1511276
[2] https://lore.kernel.org/patchwork/patch/1316134/#1511435

diff --git a/arch/arm/mach-omap2/pdata-quirks.c b/arch/arm/mach-omap2/pdata-quirks.c
index 2a4fe3e68b82..ac1ab8928190 100644
--- a/arch/arm/mach-omap2/pdata-quirks.c
+++ b/arch/arm/mach-omap2/pdata-quirks.c
@@ -591,7 +591,9 @@ void __init pdata_quirks_init(const struct of_device_id *omap_dt_match_table)
 	if (of_machine_is_compatible("ti,omap3"))
 		omap3_mcbsp_init();
 	pdata_quirks_check(auxdata_quirks);
+	fw_devlink_pause();
 	of_platform_populate(NULL, omap_dt_match_table,
 			     omap_auxdata_lookup, NULL);
+	fw_devlink_resume();
 	pdata_quirks_check(pdata_quirks);
 }

-- 
Best regards,
grygorii

^ permalink raw reply related [flat|nested] 26+ messages in thread
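As a rough illustration of why wrapping of_platform_populate() in fw_devlink_pause()/fw_devlink_resume() can shorten boot, here is a toy user-space model of batching. It rests on a simplifying assumption that is not stated in the thread: without batching, every newly added device triggers another parsing pass over all devices added so far, while with batching each device is parsed once at resume time. All toy_* names are hypothetical and do not mirror the kernel's fw_devlink implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy state: hypothetical, not the kernel's data structures. */
struct toy_devlink {
	int paused;              /* non-zero while link parsing is deferred */
	size_t added;            /* devices registered so far */
	unsigned long parse_ops; /* per-device parse operations performed */
};

/* One parsing pass over every device added so far. */
static void toy_parse_all(struct toy_devlink *d)
{
	d->parse_ops += d->added;
}

void toy_pause(struct toy_devlink *d)
{
	d->paused = 1;
}

void toy_resume(struct toy_devlink *d)
{
	assert(d->paused);
	d->paused = 0;
	toy_parse_all(d); /* single deferred pass */
}

void toy_add_device(struct toy_devlink *d)
{
	d->added++;
	if (!d->paused)
		toy_parse_all(d); /* assumed worst case: re-parse on every add */
}

/* Populate ndev devices, batched or not, and return the parse count. */
unsigned long toy_populate(size_t ndev, int batch)
{
	struct toy_devlink d = {0};

	if (batch)
		toy_pause(&d);
	for (size_t i = 0; i < ndev; i++)
		toy_add_device(&d);
	if (batch)
		toy_resume(&d);
	return d.parse_ops;
}
```

Under this assumed cost model the unbatched path grows quadratically with the number of devices and the batched path linearly. The model says nothing about absolute times, and it does not capture the locking constraints that limit where the real pause/resume calls may be placed.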
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 18:35 ` Grygorii Strashko @ 2020-10-02 19:56 ` Saravana Kannan 2020-10-03 0:13 ` Laurent Pinchart 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-02 19:56 UTC (permalink / raw) To: Grygorii Strashko Cc: Laurent Pinchart, Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel On Fri, Oct 2, 2020 at 11:35 AM 'Grygorii Strashko' via kernel-team <kernel-team@android.com> wrote: > > hi Saravana, > > On 02/10/2020 21:27, Laurent Pinchart wrote: > > Hi Saravana, > > > > On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote: > >> On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote: > >>> On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > >>>> On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > >>>>> On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > >>>>>> > >>>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > >>>>>> adding all top level devices") optimized the fwnode parsing when all top > >>>>>> level devices are added, it missed out optimizing this for platform > >>>>>> where the top level devices are added through the init_machine() path. > >>>>>> > >>>>>> This commit does the optimization for all paths by simply moving the > >>>>>> fw_devlink_pause/resume() inside of_platform_default_populate(). 
> >>>>>> > >>>>>> Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > >>>>>> Signed-off-by: Saravana Kannan <saravanak@google.com> > >>>>>> --- > >>>>>> drivers/of/platform.c | 19 +++++++++++++++---- > >>>>>> 1 file changed, 15 insertions(+), 4 deletions(-) > >>>>>> > >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > >>>>>> index 071f04da32c8..79972e49b539 100644 > >>>>>> --- a/drivers/of/platform.c > >>>>>> +++ b/drivers/of/platform.c > >>>>>> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > >>>>>> const struct of_dev_auxdata *lookup, > >>>>>> struct device *parent) > >>>>>> { > >>>>>> - return of_platform_populate(root, of_default_bus_match_table, lookup, > >>>>>> - parent); > >>>>>> + int ret; > >>>>>> + > >>>>>> + /* > >>>>>> + * fw_devlink_pause/resume() are only safe to be called around top > >>>>>> + * level device addition due to locking constraints. > >>>>>> + */ > >>>>>> + if (!root) > >>>>>> + fw_devlink_pause(); > >>>>>> + > >>>>>> + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > >>>>>> + parent); > >>>>> > >>>>> of_platform_default_populate() vs. of_platform_populate() is just a > >>>>> different match table. I don't think the behavior should otherwise be > >>>>> different. > >>>>> > >>>>> There's also of_platform_probe() which has slightly different matching > >>>>> behavior. It should not behave differently either with respect to > >>>>> devlinks. > >>>> > >>>> So I'm trying to do this only when the top level devices are added for > >>>> the first time. of_platform_default_populate() seems to be the most > >>>> common path. For other cases, I think we just need to call > >>>> fw_devlink_pause/resume() wherever the top level devices are added for > >>>> the first time. As I said in the other email, we can't add > >>>> fw_devlink_pause/resume() by default to of_platform_populate(). 
> >>>> > >>>> Do you have other ideas for achieving "call fw_devlink_pause/resume() > >>>> only when top level devices are added for the first time"? > >>> > >>> I'm not an expert in this domain, but before investigating it, would you > >>> be able to share a hack patch that implements this (in the most simple > >>> way) to check if it actually fixes the delays I experience on my system > >>> ? > >> > >> So I take it the patch I sent out didn't work for you? Can you tell me > >> what machine/DT you are using? > > > > I've replied to the patch: > > > > Based on v5.9-rc5, before the patch: > > > > [ 0.652887] cpuidle: using governor menu > > [ 12.349476] No ATAGs? > > > > After the patch: > > > > [ 0.650460] cpuidle: using governor menu > > [ 12.262101] No ATAGs? > > > > I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially > > a am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few > > additional nodes for GPIO keys, LCD panel, backlight and touchscreen. > > > > hope you are receiving my mails as I've provided you with all required information already [1] Laurent/Grygorii, Looks like I'm definitely missing emails. Sorry about the confusion. I have some other urgent things on my plate right now. Is it okay if I get to this in a day or two? In the end, we'll find a solution that addresses most/all of the delay. Thanks, Saravana ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 19:56 ` Saravana Kannan @ 2020-10-03 0:13 ` Laurent Pinchart 2020-10-27 3:29 ` Saravana Kannan 0 siblings, 1 reply; 26+ messages in thread From: Laurent Pinchart @ 2020-10-03 0:13 UTC (permalink / raw) To: Saravana Kannan Cc: Grygorii Strashko, Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel Hi Saravana, On Fri, Oct 02, 2020 at 12:56:30PM -0700, Saravana Kannan wrote: > On Fri, Oct 2, 2020 at 11:35 AM 'Grygorii Strashko' via kernel-team wrote: > > On 02/10/2020 21:27, Laurent Pinchart wrote: > > > On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote: > > >> On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote: > > >>> On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > > >>>> On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > >>>>> On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > >>>>>> > > >>>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > >>>>>> adding all top level devices") optimized the fwnode parsing when all top > > >>>>>> level devices are added, it missed out optimizing this for platform > > >>>>>> where the top level devices are added through the init_machine() path. > > >>>>>> > > >>>>>> This commit does the optimization for all paths by simply moving the > > >>>>>> fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > >>>>>> > > >>>>>> Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > >>>>>> Signed-off-by: Saravana Kannan <saravanak@google.com> > > >>>>>> --- > > >>>>>> drivers/of/platform.c | 19 +++++++++++++++---- > > >>>>>> 1 file changed, 15 insertions(+), 4 deletions(-) > > >>>>>> > > >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > >>>>>> index 071f04da32c8..79972e49b539 100644 > > >>>>>> --- a/drivers/of/platform.c > > >>>>>> +++ b/drivers/of/platform.c > > >>>>>> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > >>>>>> const struct of_dev_auxdata *lookup, > > >>>>>> struct device *parent) > > >>>>>> { > > >>>>>> - return of_platform_populate(root, of_default_bus_match_table, lookup, > > >>>>>> - parent); > > >>>>>> + int ret; > > >>>>>> + > > >>>>>> + /* > > >>>>>> + * fw_devlink_pause/resume() are only safe to be called around top > > >>>>>> + * level device addition due to locking constraints. > > >>>>>> + */ > > >>>>>> + if (!root) > > >>>>>> + fw_devlink_pause(); > > >>>>>> + > > >>>>>> + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > >>>>>> + parent); > > >>>>> > > >>>>> of_platform_default_populate() vs. of_platform_populate() is just a > > >>>>> different match table. I don't think the behavior should otherwise be > > >>>>> different. > > >>>>> > > >>>>> There's also of_platform_probe() which has slightly different matching > > >>>>> behavior. It should not behave differently either with respect to > > >>>>> devlinks. > > >>>> > > >>>> So I'm trying to do this only when the top level devices are added for > > >>>> the first time. of_platform_default_populate() seems to be the most > > >>>> common path. For other cases, I think we just need to call > > >>>> fw_devlink_pause/resume() wherever the top level devices are added for > > >>>> the first time. As I said in the other email, we can't add > > >>>> fw_devlink_pause/resume() by default to of_platform_populate(). 
> > >>>> > > >>>> Do you have other ideas for achieving "call fw_devlink_pause/resume() > > >>>> only when top level devices are added for the first time"? > > >>> > > >>> I'm not an expert in this domain, but before investigating it, would you > > >>> be able to share a hack patch that implements this (in the most simple > > >>> way) to check if it actually fixes the delays I experience on my system > > >>> ? > > >> > > >> So I take it the patch I sent out didn't work for you? Can you tell me > > >> what machine/DT you are using? > > > > > > I've replied to the patch: > > > > > > Based on v5.9-rc5, before the patch: > > > > > > [ 0.652887] cpuidle: using governor menu > > > [ 12.349476] No ATAGs? > > > > > > After the patch: > > > > > > [ 0.650460] cpuidle: using governor menu > > > [ 12.262101] No ATAGs? > > > > > > I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially > > > a am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few > > > additional nodes for GPIO keys, LCD panel, backlight and touchscreen. > > > > > > > hope you are receiving my mails as I've provided you with all required information already [1] > > Laurent/Grygorii, > > Looks like I'm definitely missing emails. Sorry about the confusion. > > I have some other urgent things on my plate right now. Is it okay if I > get to this in a day or two? In the end, we'll find a solution that > addresses most/all of the delay. No issue on my side. By the way, during initial investigations, I've traced code paths to figure out if there was a particular step that would consume a large amount of time, and found out that of_platform_populate() ends up executing devlink-related code that seems to have an O(n^3) complexity on the number of devices, with a few dozens of milliseconds for each iteration. That's a very bad complexity. -- Regards, Laurent Pinchart ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-03 0:13 ` Laurent Pinchart @ 2020-10-27 3:29 ` Saravana Kannan 2020-10-28 7:34 ` Tomi Valkeinen 0 siblings, 1 reply; 26+ messages in thread From: Saravana Kannan @ 2020-10-27 3:29 UTC (permalink / raw) To: Laurent Pinchart Cc: Grygorii Strashko, Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel On Fri, Oct 2, 2020 at 5:14 PM Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Saravana, > > On Fri, Oct 02, 2020 at 12:56:30PM -0700, Saravana Kannan wrote: > > On Fri, Oct 2, 2020 at 11:35 AM 'Grygorii Strashko' via kernel-team wrote: > > > On 02/10/2020 21:27, Laurent Pinchart wrote: > > > > On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote: > > > >> On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote: > > > >>> On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > > > >>>> On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > > >>>>> On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > > >>>>>> > > > >>>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > > >>>>>> adding all top level devices") optimized the fwnode parsing when all top > > > >>>>>> level devices are added, it missed out optimizing this for platform > > > >>>>>> where the top level devices are added through the init_machine() path. > > > >>>>>> > > > >>>>>> This commit does the optimization for all paths by simply moving the > > > >>>>>> fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > > >>>>>> > > > >>>>>> Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > > >>>>>> Signed-off-by: Saravana Kannan <saravanak@google.com> > > > >>>>>> --- > > > >>>>>> drivers/of/platform.c | 19 +++++++++++++++---- > > > >>>>>> 1 file changed, 15 insertions(+), 4 deletions(-) > > > >>>>>> > > > >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > >>>>>> index 071f04da32c8..79972e49b539 100644 > > > >>>>>> --- a/drivers/of/platform.c > > > >>>>>> +++ b/drivers/of/platform.c > > > >>>>>> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > > >>>>>> const struct of_dev_auxdata *lookup, > > > >>>>>> struct device *parent) > > > >>>>>> { > > > >>>>>> - return of_platform_populate(root, of_default_bus_match_table, lookup, > > > >>>>>> - parent); > > > >>>>>> + int ret; > > > >>>>>> + > > > >>>>>> + /* > > > >>>>>> + * fw_devlink_pause/resume() are only safe to be called around top > > > >>>>>> + * level device addition due to locking constraints. > > > >>>>>> + */ > > > >>>>>> + if (!root) > > > >>>>>> + fw_devlink_pause(); > > > >>>>>> + > > > >>>>>> + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > > >>>>>> + parent); > > > >>>>> > > > >>>>> of_platform_default_populate() vs. of_platform_populate() is just a > > > >>>>> different match table. I don't think the behavior should otherwise be > > > >>>>> different. > > > >>>>> > > > >>>>> There's also of_platform_probe() which has slightly different matching > > > >>>>> behavior. It should not behave differently either with respect to > > > >>>>> devlinks. > > > >>>> > > > >>>> So I'm trying to do this only when the top level devices are added for > > > >>>> the first time. of_platform_default_populate() seems to be the most > > > >>>> common path. For other cases, I think we just need to call > > > >>>> fw_devlink_pause/resume() wherever the top level devices are added for > > > >>>> the first time. 
As I said in the other email, we can't add > > > >>>> fw_devlink_pause/resume() by default to of_platform_populate(). > > > >>>> > > > >>>> Do you have other ideas for achieving "call fw_devlink_pause/resume() > > > >>>> only when top level devices are added for the first time"? > > > >>> > > > >>> I'm not an expert in this domain, but before investigating it, would you > > > >>> be able to share a hack patch that implements this (in the most simple > > > >>> way) to check if it actually fixes the delays I experience on my system > > > >>> ? > > > >> > > > >> So I take it the patch I sent out didn't work for you? Can you tell me > > > >> what machine/DT you are using? > > > > > > > > I've replied to the patch: > > > > > > > > Based on v5.9-rc5, before the patch: > > > > > > > > [ 0.652887] cpuidle: using governor menu > > > > [ 12.349476] No ATAGs? > > > > > > > > After the patch: > > > > > > > > [ 0.650460] cpuidle: using governor menu > > > > [ 12.262101] No ATAGs? > > > > > > > > I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially > > > > a am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few > > > > additional nodes for GPIO keys, LCD panel, backlight and touchscreen. > > > > > > > > > > hope you are receiving my mails as I've provided you with all required information already [1] > > > > Laurent/Grygorii, > > > > Looks like I'm definitely missing emails. Sorry about the confusion. > > > > I have some other urgent things on my plate right now. Is it okay if I > > get to this in a day or two? In the end, we'll find a solution that > > addresses most/all of the delay. > > No issue on my side. Hi Laurent, Sorry it took awhile for me to get back to this. Can you try throwing around fw_devlink_pause/resume() around the of_platform_populate() call in arch/arm/mach-omap2/pdata-quirks.c? Just trying to verify the cause/fix. 
If it fixes the issue, then considering Rob's comments [1], a good short term solution might be to have the suggestion above and some way to do pause/resume only when the top level devices are added.

> By the way, during initial investigations, I've traced code paths to
> figure out if there was a particular step that would consume a large
> amount of time, and found out that of_platform_populate() ends up
> executing devlink-related code that seems to have an O(n^3) complexity
> on the number of devices, with a few dozens of milliseconds for each
> iteration. That's a very bad complexity.

As you said, the complexity of fw_devlink parsing can be O(N^2). There are other ways to improve it to make it O(N), but they come with a bunch of additional complexity and a memory increase. When I tried to do it that way the first time, I was questioning whether O(N^2) actually translated to a measurable difference. Looks like it does now :)

I have something in mind for how to do it with O(N) complexity, but I expect it to take a while to get in. So in the meantime, I'm thinking of using fw_devlink_pause/resume() as a short term optimization.

-Saravana

[1] - https://lore.kernel.org/linux-omap/CAL_Jsq+6mxtFei3+1ic4c5XCftJ8nZK6_S5_d15yEXQ02BTNKw@mail.gmail.com/
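The difference between resolving links on every device addition and batching them with a pause/resume pair can be illustrated with a toy model. This is only a user-space sketch, not the kernel implementation: every `toy_*` name is hypothetical, and the cost model (one pass over all known devices per addition) is an assumption chosen to show quadratic versus linear growth.

```c
/*
 * Toy user-space model of the batching idea. NOT kernel code: every
 * toy_* name is hypothetical, and the cost model (one rescan of all
 * known devices per addition) is an assumption made for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_state {
	size_t ndevs;		/* devices added so far */
	bool paused;		/* true between toy_pause() and toy_resume() */
	unsigned long steps;	/* link-resolution steps performed */
};

/* Check every known device for suppliers that may have just appeared. */
static void toy_rescan_all(struct toy_state *s)
{
	s->steps += s->ndevs;
}

static void toy_pause(struct toy_state *s)
{
	assert(!s->paused);
	s->paused = true;
}

static void toy_resume(struct toy_state *s)
{
	assert(s->paused);
	s->paused = false;
	toy_rescan_all(s);	/* one deferred pass covers all additions */
}

static void toy_add_device(struct toy_state *s)
{
	s->ndevs++;
	if (!s->paused)
		toy_rescan_all(s);	/* immediate pass on every addition */
}

/* Add n devices and return the number of link-resolution steps. */
unsigned long toy_populate(size_t n, bool batch)
{
	struct toy_state s = { 0 };
	size_t i;

	if (batch)
		toy_pause(&s);
	for (i = 0; i < n; i++)
		toy_add_device(&s);
	if (batch)
		toy_resume(&s);

	return s.steps;
}
```

In this model, `toy_populate(100, false)` counts 1 + 2 + ... + 100 = 5050 steps, while `toy_populate(100, true)` counts 100. The real code paths, locking and per-step cost differ, so the model only conveys the shape of the saving.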
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path
From: Tomi Valkeinen @ 2020-10-28 7:34 UTC
To: Saravana Kannan, Laurent Pinchart
Cc: Grygorii Strashko, Rob Herring, Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel

On 27/10/2020 05:29, Saravana Kannan wrote:
> Can you try throwing around fw_devlink_pause/resume() around the
> of_platform_populate() call in arch/arm/mach-omap2/pdata-quirks.c?
> Just trying to verify the cause/fix.

AM5 EVM on v5.10-rc1:

[ 1.139945] cpuidle: using governor menu
[ 13.126461] No ATAGs?

After adding fw_devlink_pause/resume around of_platform_populate:

[ 1.139587] cpuidle: using governor menu
[ 1.899913] No ATAGs?

Tomi

--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
* Re: [PATCH v1] of: platform: Batch fwnode parsing in the init_machine() path 2020-10-02 17:51 ` Saravana Kannan 2020-10-02 17:54 ` Laurent Pinchart @ 2020-10-02 20:29 ` Rob Herring 1 sibling, 0 replies; 26+ messages in thread From: Rob Herring @ 2020-10-02 20:29 UTC (permalink / raw) To: Saravana Kannan Cc: Frank Rowand, Geert Uytterhoeven, Greg Kroah-Hartman, Grygorii Strashko, Laurent Pinchart, linux-omap, open list:THERMAL, Peter Ujfalusi, Rafael J. Wysocki, Tomi Valkeinen, Tony Lindgren, Ulf Hansson, Android Kernel Team, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, linux-kernel On Fri, Oct 2, 2020 at 12:52 PM Saravana Kannan <saravanak@google.com> wrote: > > On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@kernel.org> wrote: > > > > On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > > adding all top level devices") optimized the fwnode parsing when all top > > > level devices are added, it missed out optimizing this for platform > > > where the top level devices are added through the init_machine() path. > > > > > > This commit does the optimization for all paths by simply moving the > > > fw_devlink_pause/resume() inside of_platform_default_populate(). 
> > > > > > Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com> > > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > > --- > > > drivers/of/platform.c | 19 +++++++++++++++---- > > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > index 071f04da32c8..79972e49b539 100644 > > > --- a/drivers/of/platform.c > > > +++ b/drivers/of/platform.c > > > @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > > const struct of_dev_auxdata *lookup, > > > struct device *parent) > > > { > > > - return of_platform_populate(root, of_default_bus_match_table, lookup, > > > - parent); > > > + int ret; > > > + > > > + /* > > > + * fw_devlink_pause/resume() are only safe to be called around top > > > + * level device addition due to locking constraints. > > > + */ > > > + if (!root) > > > + fw_devlink_pause(); > > > + > > > + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > > + parent); > > > > of_platform_default_populate() vs. of_platform_populate() is just a > > different match table. I don't think the behavior should otherwise be > > different. > > > > There's also of_platform_probe() which has slightly different matching > > behavior. It should not behave differently either with respect to > > devlinks. > > So I'm trying to do this only when the top level devices are added for > the first time. of_platform_default_populate() seems to be the most > common path. For other cases, I think we just need to call > fw_devlink_pause/resume() wherever the top level devices are added for > the first time. > As I said in the other email, we can't add > fw_devlink_pause/resume() by default to of_platform_populate(). If you detect it's the first time, you could? > > Do you have other ideas for achieving "call fw_devlink_pause/resume() > only when top level devices are added for the first time"? Eliminate the cases not using of_platform_default_populate(). 
There are 2 main reasons for the non-default cases. The first is auxdata. Really, for any modern platform that people care about (and care about the boot time), they should not be using auxdata. That's just for the DT transition. You know, a temporary thing from 9 years ago.

The 2nd is having some parent device. This is typically an soc_device. I really think this is kind of dumb. We should either have the parent device always or never. After all, everything's an SoC, right? Of course changing that will break some Android systems since they like to use non-ABI sysfs device paths.

There could also be some initcall ordering issues. IIRC, in the last round of cleanups in this area, at91 gpio/pinctrl had an issue with that. I think I have a half-done fix for that which I started.

Rob
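The "detect it's the first time" idea from this reply can be sketched as a one-shot guard around the populate call. This is a user-space illustration under stated assumptions, not a proposed patch: the `demo_*` names, the static flag and the stubbed pause/resume counters are all hypothetical, and the locking constraints of the real fw_devlink_pause/resume() are not modelled.

```c
/*
 * Toy user-space model, NOT kernel code. The demo_* names and the
 * one-shot flag are hypothetical; the real fw_devlink_pause/resume()
 * have locking constraints that this sketch does not model.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_node { int unused; };

static bool demo_top_level_done;	/* set once top-level population ran */
int demo_pause_calls;
int demo_resume_calls;

static void demo_fw_devlink_pause(void)
{
	demo_pause_calls++;
}

static void demo_fw_devlink_resume(void)
{
	/* every resume must be matched by an earlier pause */
	assert(demo_resume_calls < demo_pause_calls);
	demo_resume_calls++;
}

/* Stand-in for the real population work; always succeeds here. */
static int demo_do_populate(const struct demo_node *root)
{
	(void)root;
	return 0;
}

/* Batch only when populating from the root (NULL) for the first time. */
int demo_populate(const struct demo_node *root)
{
	bool batch = !root && !demo_top_level_done;
	int ret;

	if (batch) {
		demo_top_level_done = true;
		demo_fw_devlink_pause();
	}

	ret = demo_do_populate(root);

	if (batch)
		demo_fw_devlink_resume();

	return ret;
}
```

With this guard, repeated root-level calls and calls on a sub-node take the unbatched path, so pause/resume run at most once. Whether a plain flag is sufficient in the kernel, given concurrency and initcall ordering, is left open here.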