* Re: [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() [not found] ` <20221003092602.1323944-20-daniel.lezcano@linaro.org> @ 2022-10-03 12:50 ` Marek Szyprowski 2022-10-03 13:29 ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano 2022-10-03 13:31 ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Daniel Lezcano 0 siblings, 2 replies; 14+ messages in thread From: Marek Szyprowski @ 2022-10-03 12:50 UTC (permalink / raw) To: Daniel Lezcano, rafael Cc: linux-kernel, linux-pm, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, linux-arm-kernel, linux-samsung-soc, linux-tegra, linux-omap Hi Daniel, On 03.10.2022 11:25, Daniel Lezcano wrote: > The generic version of of_thermal_get_crit_temp() can be used. Let's > remove this ops which is pointless. > > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> This patch breaks Exynos thermal driver as it introduces a NULL pointer dereference in exynos_tmu_initialize(): 8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000000 [00000000] *pgd=00000000 Internal error: Oops: 80000005 [#1] PREEMPT SMP ARM Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00072-ge521efddb107 #12941 Hardware name: Samsung Exynos (Flattened Device Tree) dwc2 12480000.hsotg: new address 125 PC is at 0x0 LR is at exynos_tmu_initialize+0x4c/0x1e8 ... Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) Stack: (0xf082dd78 to 0xf082e000) ... exynos_tmu_initialize from exynos_tmu_probe+0x2b0/0x728 exynos_tmu_probe from platform_probe+0x5c/0xb8 platform_probe from really_probe+0xe0/0x414 really_probe from __driver_probe_device+0xa0/0x208 __driver_probe_device from driver_probe_device+0x30/0xc0 driver_probe_device from __driver_attach+0xf0/0x1f0 __driver_attach from bus_for_each_dev+0x70/0xb0 bus_for_each_dev from bus_add_driver+0x174/0x218 bus_add_driver from driver_register+0x88/0x11c driver_register from do_one_initcall+0x64/0x380 do_one_initcall from kernel_init_freeable+0x1c0/0x224 kernel_init_freeable from kernel_init+0x18/0x12c kernel_init from ret_from_fork+0x14/0x2c Exception stack(0xf082dfb0 to 0xf082dff8) ... Code: bad PC value ---[ end trace 0000000000000000 ]--- If there is no replacement for tzd->ops->get_crit_temp(tzd, &temp), then please simply remove that call in exynos_tmu_initialize() to avoid breaking the initialization. > --- > drivers/thermal/thermal_of.c | 15 --------------- > 1 file changed, 15 deletions(-) > > diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c > index 494e9c319541..bd872183e521 100644 > --- a/drivers/thermal/thermal_of.c > +++ b/drivers/thermal/thermal_of.c > @@ -19,20 +19,6 @@ > #include "thermal_core.h" > -static int of_thermal_get_crit_temp(struct thermal_zone_device *tz, > - int *temp) > -{ > - int i; > - > - for (i = 0; i < tz->num_trips; i++) > - if (tz->trips[i].type == THERMAL_TRIP_CRITICAL) { > - *temp = tz->trips[i].temperature; > - return 0; > - } > - > - return -EINVAL; > -} > - > /*** functions parsing device tree nodes ***/ > static int of_find_trip_id(struct device_node *np, struct device_node > *trip) > @@ -529,7 +515,6 @@ struct thermal_zone_device > *thermal_of_zone_register(struct device_node *sensor, > goto out_kfree_trips; > } > - of_ops->get_crit_temp = of_ops->get_crit_temp ? : > of_thermal_get_crit_temp; > of_ops->bind = thermal_of_bind; > of_ops->unbind = thermal_of_unbind; Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp 2022-10-03 12:50 ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Marek Szyprowski @ 2022-10-03 13:29 ` Daniel Lezcano 2022-10-03 13:40 ` Krzysztof Kozlowski ` (2 more replies) 2022-10-03 13:31 ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Daniel Lezcano 1 sibling, 3 replies; 14+ messages in thread From: Daniel Lezcano @ 2022-10-03 13:29 UTC (permalink / raw) To: daniel.lezcano, rafael, m.szyprowski Cc: Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Amit Kucheria, Zhang Rui, Alim Akhtar, open list:SAMSUNG THERMAL DRIVER, open list:SAMSUNG THERMAL DRIVER, moderated list:ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES, open list The driver is assuming the get_critical temperature exists as it is inherited by the thermal of ops. But this one has been removed in favor of the generic one. Use the generic thermal_zone_get_crit_temp() function instead Fixes: 13bea86623b ("thermal/of: Remove of_thermal_get_crit_temp(") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> --- drivers/thermal/samsung/exynos_tmu.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/thermal/samsung/exynos_tmu.c b/drivers/thermal/samsung/exynos_tmu.c index 5a1ffe2f3134..37465af59262 100644 --- a/drivers/thermal/samsung/exynos_tmu.c +++ b/drivers/thermal/samsung/exynos_tmu.c @@ -264,9 +264,8 @@ static int exynos_tmu_initialize(struct platform_device *pdev) unsigned int status; int ret = 0, temp; - if (data->soc != SOC_ARCH_EXYNOS5433) /* FIXME */ - ret = tzd->ops->get_crit_temp(tzd, &temp); - if (ret) { + ret = thermal_zone_get_crit_temp(tzd, &temp); + if (ret && data->soc != SOC_ARCH_EXYNOS5433) { /* FIXME */ dev_err(&pdev->dev, "No CRITICAL trip point defined in device tree!\n"); goto out; -- 2.34.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp 2022-10-03 13:29 ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano @ 2022-10-03 13:40 ` Krzysztof Kozlowski 2022-10-03 13:50 ` Marek Szyprowski 2022-10-17 13:48 ` Marek Szyprowski 2 siblings, 0 replies; 14+ messages in thread From: Krzysztof Kozlowski @ 2022-10-03 13:40 UTC (permalink / raw) To: Daniel Lezcano, rafael, m.szyprowski Cc: Bartlomiej Zolnierkiewicz, Amit Kucheria, Zhang Rui, Alim Akhtar, open list:SAMSUNG THERMAL DRIVER, open list:SAMSUNG THERMAL DRIVER, moderated list:ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES, open list On 03/10/2022 15:29, Daniel Lezcano wrote: > The driver is assuming the get_critical temperature exists as it is > inherited by the thermal of ops. But this one has been removed in > favor of the generic one. > > Use the generic thermal_zone_get_crit_temp() function instead > > Fixes: 13bea86623b ("thermal/of: Remove of_thermal_get_crit_temp(") > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Looks good. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Best regards, Krzysztof _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp 2022-10-03 13:29 ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano 2022-10-03 13:40 ` Krzysztof Kozlowski @ 2022-10-03 13:50 ` Marek Szyprowski 2022-10-17 13:48 ` Marek Szyprowski 2 siblings, 0 replies; 14+ messages in thread From: Marek Szyprowski @ 2022-10-03 13:50 UTC (permalink / raw) To: Daniel Lezcano, rafael Cc: Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Amit Kucheria, Zhang Rui, Alim Akhtar, open list:SAMSUNG THERMAL DRIVER, open list:SAMSUNG THERMAL DRIVER, moderated list:ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES, open list On 03.10.2022 15:29, Daniel Lezcano wrote: > The driver is assuming the get_critical temperature exists as it is > inherited by the thermal of ops. But this one has been removed in > favor of the generic one. > > Use the generic thermal_zone_get_crit_temp() function instead > > Fixes: 13bea86623b ("thermal/of: Remove of_thermal_get_crit_temp(") > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > drivers/thermal/samsung/exynos_tmu.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/thermal/samsung/exynos_tmu.c b/drivers/thermal/samsung/exynos_tmu.c > index 5a1ffe2f3134..37465af59262 100644 > --- a/drivers/thermal/samsung/exynos_tmu.c > +++ b/drivers/thermal/samsung/exynos_tmu.c > @@ -264,9 +264,8 @@ static int exynos_tmu_initialize(struct platform_device *pdev) > unsigned int status; > int ret = 0, temp; > > - if (data->soc != SOC_ARCH_EXYNOS5433) /* FIXME */ > - ret = tzd->ops->get_crit_temp(tzd, &temp); > - if (ret) { > + ret = thermal_zone_get_crit_temp(tzd, &temp); > + if (ret && data->soc != SOC_ARCH_EXYNOS5433) { /* FIXME */ > dev_err(&pdev->dev, > "No CRITICAL trip point defined in device tree!\n"); > goto out; Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp 2022-10-03 13:29 ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano 2022-10-03 13:40 ` Krzysztof Kozlowski 2022-10-03 13:50 ` Marek Szyprowski @ 2022-10-17 13:48 ` Marek Szyprowski 2022-10-17 14:14 ` Daniel Lezcano 2 siblings, 1 reply; 14+ messages in thread From: Marek Szyprowski @ 2022-10-17 13:48 UTC (permalink / raw) To: Daniel Lezcano, rafael Cc: Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Amit Kucheria, Zhang Rui, Alim Akhtar, open list:SAMSUNG THERMAL DRIVER, open list:SAMSUNG THERMAL DRIVER, moderated list:ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES, open list Hi Daniel, On 03.10.2022 15:29, Daniel Lezcano wrote: > The driver is assuming the get_critical temperature exists as it is > inherited by the thermal of ops. But this one has been removed in > favor of the generic one. > > Use the generic thermal_zone_get_crit_temp() function instead > > Fixes: 13bea86623b ("thermal/of: Remove of_thermal_get_crit_temp(") > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> This patch has been dropped from -next, is there are reason for that? > --- > drivers/thermal/samsung/exynos_tmu.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/thermal/samsung/exynos_tmu.c b/drivers/thermal/samsung/exynos_tmu.c > index 5a1ffe2f3134..37465af59262 100644 > --- a/drivers/thermal/samsung/exynos_tmu.c > +++ b/drivers/thermal/samsung/exynos_tmu.c > @@ -264,9 +264,8 @@ static int exynos_tmu_initialize(struct platform_device *pdev) > unsigned int status; > int ret = 0, temp; > > - if (data->soc != SOC_ARCH_EXYNOS5433) /* FIXME */ > - ret = tzd->ops->get_crit_temp(tzd, &temp); > - if (ret) { > + ret = thermal_zone_get_crit_temp(tzd, &temp); > + if (ret && data->soc != SOC_ARCH_EXYNOS5433) { /* FIXME */ > dev_err(&pdev->dev, > "No CRITICAL trip point defined in device tree!\n"); > goto out; Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp 2022-10-17 13:48 ` Marek Szyprowski @ 2022-10-17 14:14 ` Daniel Lezcano 0 siblings, 0 replies; 14+ messages in thread From: Daniel Lezcano @ 2022-10-17 14:14 UTC (permalink / raw) To: Marek Szyprowski, rafael Cc: Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Amit Kucheria, Zhang Rui, Alim Akhtar, open list:SAMSUNG THERMAL DRIVER, open list:SAMSUNG THERMAL DRIVER, moderated list:ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES, open list On 17/10/2022 15:48, Marek Szyprowski wrote: > Hi Daniel, > > On 03.10.2022 15:29, Daniel Lezcano wrote: >> The driver is assuming the get_critical temperature exists as it is >> inherited by the thermal of ops. But this one has been removed in >> favor of the generic one. >> >> Use the generic thermal_zone_get_crit_temp() function instead >> >> Fixes: 13bea86623b ("thermal/of: Remove of_thermal_get_crit_temp(") >> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > > This patch has been dropped from -next, is there are reason for that? No, my mistake. I dropped it accidentally when I rebased the branch. Thanks for pointing it out. -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() 2022-10-03 12:50 ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Marek Szyprowski 2022-10-03 13:29 ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano @ 2022-10-03 13:31 ` Daniel Lezcano 1 sibling, 0 replies; 14+ messages in thread From: Daniel Lezcano @ 2022-10-03 13:31 UTC (permalink / raw) To: Marek Szyprowski, rafael Cc: linux-kernel, linux-pm, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, linux-arm-kernel, linux-samsung-soc, linux-tegra, linux-omap Hi Marek, On 03/10/2022 14:50, Marek Szyprowski wrote: > Hi Daniel, > > On 03.10.2022 11:25, Daniel Lezcano wrote: >> The generic version of of_thermal_get_crit_temp() can be used. Let's >> remove this ops which is pointless. >> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > > This patch breaks Exynos thermal driver as it introduces a NULL pointer > dereference in exynos_tmu_initialize(): > Thanks for testing and reporting, I've just sent a fix for that (unfortunately can not be tested from my side) -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation [not found] ` <20221003092602.1323944-1-daniel.lezcano@linaro.org> [not found] ` <CGME20221003093207eucas1p1d456288f35eadbc6fcda0bf24b58e678@eucas1p1.samsung.com> @ 2022-10-03 14:10 ` Marek Szyprowski 2022-10-03 15:36 ` Daniel Lezcano 2022-10-03 21:18 ` Daniel Lezcano 1 sibling, 2 replies; 14+ messages in thread From: Marek Szyprowski @ 2022-10-03 14:10 UTC (permalink / raw) To: Daniel Lezcano, rafael Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap Hi Daniel, On 03.10.2022 11:25, Daniel Lezcano wrote: > This work is the pre-requisite of handling correctly when the trip > point are crossed. For that we need to rework how the trip points are > declared and assigned to a thermal zone. > > Even if it appears to be a common sense to have the trip points being > ordered, this no guarantee neither documentation telling that is the > case. > > One solution could have been to create an ordered array of trips built > when registering the thermal zone by calling the different get_trip* > ops. However those ops receive a thermal zone pointer which is not > known as it is in the process of creating it. > > This cyclic dependency shows we have to rework how we manage the trip > points. > > Actually, all the trip points definition can be common to the backend > sensor drivers and we can factor out the thermal trip structure in all > of them. > > Then, as we register the thermal trips array, they will be available > in the thermal zone structure and a core function can return the trip > given its id. > > The get_trip_* ops won't be needed anymore and could be removed. The > resulting code will be another step forward to a self encapsulated > generic thermal framework. > > Most of the drivers can be converted more or less easily. This series > does a first round with most of the drivers. Some remain and will be > converted but with a smaller set of changes as the conversion is a bit > more complex. > > Changelog: > v8: > - Pretty oneline change and parenthesis removal (Rafael) > - Collected tags > v7: > - Added missing return 0 in the x86_pkg_temp driver > v6: > - Improved the code for the get_crit_temp() function as suggested by > Rafael > - Removed inner parenthesis in the set_trip_temp() function and invert the > conditions. Check the type of the trip point is unchanged > - Folded patch 4 with 1 > - Add per thermal zone info message in the bang-bang governor > - Folded the fix for an uninitialized variable in > int340x_thermal_zone_add() > v5: > - Fixed a deadlock when calling thermal_zone_get_trip() while > handling the thermal zone lock > - Remove an extra line in the sysfs change > - Collected tags > v4: > - Remove extra lines on exynos changes as reported by Krzysztof Kozlowski > - Collected tags > v3: > - Reorg the series to be git-bisect safe > - Added the set_trip generic function > - Added the get_crit_temp generic function > - Removed more dead code in the thermal-of > - Fixed the exynos changelog > - Fixed the error check for the exynos drivers > - Collected tags > v2: > - Added missing EXPORT_SYMBOL_GPL() for thermal_zone_get_trip() > - Removed tab whitespace in the acerhdf driver > - Collected tags > > Cc: Raju Rangoju <rajur@chelsio.com> > Cc: "David S. Miller" <davem@davemloft.net> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Jakub Kicinski <kuba@kernel.org> > Cc: Paolo Abeni <pabeni@redhat.com> > Cc: Peter Kaestle <peter@piie.net> > Cc: Hans de Goede <hdegoede@redhat.com> > Cc: Mark Gross <markgross@kernel.org> > Cc: Miquel Raynal <miquel.raynal@bootlin.com> > Cc: "Rafael J. Wysocki" <rafael@kernel.org> > Cc: Daniel Lezcano <daniel.lezcano@linaro.org> > Cc: Amit Kucheria <amitk@kernel.org> > Cc: Zhang Rui <rui.zhang@intel.com> > Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> > Cc: Broadcom Kernel Team <bcm-kernel-feedback-list@broadcom.com> > Cc: Florian Fainelli <f.fainelli@gmail.com> > Cc: Ray Jui <rjui@broadcom.com> > Cc: Scott Branden <sbranden@broadcom.com> > Cc: Support Opensource <support.opensource@diasemi.com> > Cc: Lukasz Luba <lukasz.luba@arm.com> > Cc: Shawn Guo <shawnguo@kernel.org> > Cc: Sascha Hauer <s.hauer@pengutronix.de> > Cc: Pengutronix Kernel Team <kernel@pengutronix.de> > Cc: Fabio Estevam <festevam@gmail.com> > Cc: NXP Linux Team <linux-imx@nxp.com> > Cc: Thara Gopinath <thara.gopinath@linaro.org> > Cc: Andy Gross <agross@kernel.org> > Cc: Bjorn Andersson <bjorn.andersson@linaro.org> > Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se> > Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> > Cc: Alim Akhtar <alim.akhtar@samsung.com> > Cc: Thierry Reding <thierry.reding@gmail.com> > Cc: Jonathan Hunter <jonathanh@nvidia.com> > Cc: Eduardo Valentin <edubezval@gmail.com> > Cc: Keerthy <j-keerthy@ti.com> > Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> > Cc: Masami Hiramatsu <mhiramat@kernel.org> > Cc: Antoine Tenart <atenart@kernel.org> > Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> > Cc: Dmitry Osipenko <digetx@gmail.com> > Cc: netdev@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: platform-driver-x86@vger.kernel.org > Cc: linux-pm@vger.kernel.org > Cc: linux-rpi-kernel@lists.infradead.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-arm-msm@vger.kernel.org > Cc: linux-renesas-soc@vger.kernel.org > Cc: linux-samsung-soc@vger.kernel.org > Cc: linux-tegra@vger.kernel.org > Cc: linux-omap@vger.kernel.org > > Daniel Lezcano (29): > thermal/core: Add a generic thermal_zone_get_trip() function > thermal/sysfs: Always expose hysteresis attributes > thermal/core: Add a generic thermal_zone_set_trip() function > thermal/core/governors: Use thermal_zone_get_trip() instead of ops > functions > thermal/of: Use generic thermal_zone_get_trip() function > thermal/of: Remove unused functions > thermal/drivers/exynos: Use generic thermal_zone_get_trip() function > thermal/drivers/exynos: of_thermal_get_ntrips() > thermal/drivers/exynos: Replace of_thermal_is_trip_valid() by > thermal_zone_get_trip() > thermal/drivers/tegra: Use generic thermal_zone_get_trip() function > thermal/drivers/uniphier: Use generic thermal_zone_get_trip() function > thermal/drivers/hisi: Use generic thermal_zone_get_trip() function > thermal/drivers/qcom: Use generic thermal_zone_get_trip() function > thermal/drivers/armada: Use generic thermal_zone_get_trip() function > thermal/drivers/rcar_gen3: Use the generic function to get the number > of trips > thermal/of: Remove of_thermal_get_ntrips() > thermal/of: Remove of_thermal_is_trip_valid() > thermal/of: Remove of_thermal_set_trip_hyst() > thermal/of: Remove of_thermal_get_crit_temp() > thermal/drivers/st: Use generic trip points > thermal/drivers/imx: Use generic thermal_zone_get_trip() function > thermal/drivers/rcar: Use generic thermal_zone_get_trip() function > thermal/drivers/broadcom: Use generic thermal_zone_get_trip() function > thermal/drivers/da9062: Use generic thermal_zone_get_trip() function > thermal/drivers/ti: Remove unused macros ti_thermal_get_trip_value() / > ti_thermal_trip_is_valid() > thermal/drivers/acerhdf: Use generic thermal_zone_get_trip() function > thermal/drivers/cxgb4: Use generic thermal_zone_get_trip() function > thermal/intel/int340x: Replace parameter to simplify > thermal/drivers/intel: Use generic thermal_zone_get_trip() function I've tested this v8 patchset after fixing the issue with Exynos TMU with https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ patch and I got the following lockdep warning on all Exynos-based boards: ====================================================== WARNING: possible circular locking dependency detected 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted ------------------------------------------------------ swapper/0/1 is trying to acquire lock: c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 but task is already holding lock: c2979b94 (&tz->lock){+.+.}-{3:3}, at: thermal_zone_device_update.part.0+0x3c/0x528 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&tz->lock){+.+.}-{3:3}: mutex_lock_nested+0x1c/0x24 thermal_zone_get_trip+0x20/0x44 exynos_tmu_initialize+0x144/0x1e0 exynos_tmu_probe+0x2b0/0x728 platform_probe+0x5c/0xb8 really_probe+0xe0/0x414 __driver_probe_device+0xa0/0x208 driver_probe_device+0x30/0xc0 __driver_attach+0xf0/0x1f0 bus_for_each_dev+0x70/0xb0 bus_add_driver+0x174/0x218 driver_register+0x88/0x11c do_one_initcall+0x64/0x380 kernel_init_freeable+0x1c0/0x224 kernel_init+0x18/0x12c ret_from_fork+0x14/0x2c 0x0 -> #0 (&data->lock#2){+.+.}-{3:3}: lock_acquire+0x124/0x3e4 __mutex_lock+0x90/0x948 mutex_lock_nested+0x1c/0x24 exynos_get_temp+0x3c/0xc8 __thermal_zone_get_temp+0x5c/0x12c thermal_zone_device_update.part.0+0x78/0x528 __thermal_cooling_device_register.part.0+0x298/0x354 __cpufreq_cooling_register.constprop.0+0x138/0x218 of_cpufreq_cooling_register+0x48/0x8c cpufreq_online+0x8d0/0xb2c cpufreq_add_dev+0xb0/0xec subsys_interface_register+0x108/0x118 cpufreq_register_driver+0x15c/0x380 dt_cpufreq_probe+0x2e4/0x434 platform_probe+0x5c/0xb8 really_probe+0xe0/0x414 __driver_probe_device+0xa0/0x208 driver_probe_device+0x30/0xc0 __driver_attach+0xf0/0x1f0 bus_for_each_dev+0x70/0xb0 bus_add_driver+0x174/0x218 driver_register+0x88/0x11c do_one_initcall+0x64/0x380 kernel_init_freeable+0x1c0/0x224 kernel_init+0x18/0x12c ret_from_fork+0x14/0x2c 0x0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&tz->lock); lock(&data->lock#2); lock(&tz->lock); lock(&data->lock#2); *** DEADLOCK *** 5 locks held by swapper/0/1: #0: c1c8648c (&dev->mutex){....}-{3:3}, at: __driver_attach+0xe4/0x1f0 #1: c1210434 (cpu_hotplug_lock){++++}-{0:0}, at: cpufreq_register_driver+0xc4/0x380 #2: c1ed8298 (subsys mutex#8){+.+.}-{3:3}, at: subsys_interface_register+0x4c/0x118 #3: c131f944 (thermal_list_lock){+.+.}-{3:3}, at: __thermal_cooling_device_register.part.0+0x238/0x354 #4: c2979b94 (&tz->lock){+.+.}-{3:3}, at: thermal_zone_device_update.part.0+0x3c/0x528 stack backtrace: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00083-ge5c9d117223e #12945 Hardware name: Samsung Exynos (Flattened Device Tree) unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from check_noncircular+0xf0/0x158 check_noncircular from __lock_acquire+0x15e8/0x2a7c __lock_acquire from lock_acquire+0x124/0x3e4 lock_acquire from __mutex_lock+0x90/0x948 __mutex_lock from mutex_lock_nested+0x1c/0x24 mutex_lock_nested from exynos_get_temp+0x3c/0xc8 exynos_get_temp from __thermal_zone_get_temp+0x5c/0x12c __thermal_zone_get_temp from thermal_zone_device_update.part.0+0x78/0x528 thermal_zone_device_update.part.0 from __thermal_cooling_device_register.part.0+0x298/0x354 __thermal_cooling_device_register.part.0 from __cpufreq_cooling_register.constprop.0+0x138/0x218 __cpufreq_cooling_register.constprop.0 from of_cpufreq_cooling_register+0x48/0x8c of_cpufreq_cooling_register from cpufreq_online+0x8d0/0xb2c cpufreq_online from cpufreq_add_dev+0xb0/0xec cpufreq_add_dev from subsys_interface_register+0x108/0x118 subsys_interface_register from cpufreq_register_driver+0x15c/0x380 cpufreq_register_driver from dt_cpufreq_probe+0x2e4/0x434 dt_cpufreq_probe from platform_probe+0x5c/0xb8 platform_probe from really_probe+0xe0/0x414 really_probe from __driver_probe_device+0xa0/0x208 __driver_probe_device from driver_probe_device+0x30/0xc0 driver_probe_device from __driver_attach+0xf0/0x1f0 __driver_attach from bus_for_each_dev+0x70/0xb0 bus_for_each_dev from bus_add_driver+0x174/0x218 bus_add_driver from driver_register+0x88/0x11c driver_register from do_one_initcall+0x64/0x380 do_one_initcall from kernel_init_freeable+0x1c0/0x224 kernel_init_freeable from kernel_init+0x18/0x12c kernel_init from ret_from_fork+0x14/0x2c Exception stack(0xf082dfb0 to 0xf082dff8) ... Let me know if You need anything more to test. > drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 2 - > .../ethernet/chelsio/cxgb4/cxgb4_thermal.c | 41 +---- > drivers/platform/x86/acerhdf.c | 73 +++----- > drivers/thermal/armada_thermal.c | 39 ++--- > drivers/thermal/broadcom/bcm2835_thermal.c | 8 +- > drivers/thermal/da9062-thermal.c | 52 +----- > drivers/thermal/gov_bang_bang.c | 39 +++-- > drivers/thermal/gov_fair_share.c | 18 +- > drivers/thermal/gov_power_allocator.c | 51 +++--- > drivers/thermal/gov_step_wise.c | 22 ++- > drivers/thermal/hisi_thermal.c | 11 +- > drivers/thermal/imx_thermal.c | 72 +++----- > .../int340x_thermal/int340x_thermal_zone.c | 33 ++-- > .../int340x_thermal/int340x_thermal_zone.h | 4 +- > .../processor_thermal_device.c | 10 +- > drivers/thermal/intel/x86_pkg_temp_thermal.c | 120 +++++++------ > drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 39 ++--- > drivers/thermal/rcar_gen3_thermal.c | 2 +- > drivers/thermal/rcar_thermal.c | 53 +----- > drivers/thermal/samsung/exynos_tmu.c | 57 +++---- > drivers/thermal/st/st_thermal.c | 47 +---- > drivers/thermal/tegra/soctherm.c | 33 ++-- > drivers/thermal/tegra/tegra30-tsensor.c | 17 +- > drivers/thermal/thermal_core.c | 160 +++++++++++++++--- > drivers/thermal/thermal_core.h | 24 +-- > drivers/thermal/thermal_helpers.c | 28 +-- > drivers/thermal/thermal_netlink.c | 21 +-- > drivers/thermal/thermal_of.c | 116 ------------- > drivers/thermal/thermal_sysfs.c | 133 +++++---------- > drivers/thermal/ti-soc-thermal/ti-thermal.h | 15 -- > drivers/thermal/uniphier_thermal.c | 27 ++- > include/linux/thermal.h | 10 ++ > 32 files changed, 559 insertions(+), 818 deletions(-) > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation 2022-10-03 14:10 ` [PATCH v8 00/29] Rework the trip points creation Marek Szyprowski @ 2022-10-03 15:36 ` Daniel Lezcano 2022-10-03 21:18 ` Daniel Lezcano 1 sibling, 0 replies; 14+ messages in thread From: Daniel Lezcano @ 2022-10-03 15:36 UTC (permalink / raw) To: Marek Szyprowski, rafael Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap On 03/10/2022 16:10, Marek Szyprowski wrote: > Hi Daniel, > > On 03.10.2022 11:25, Daniel Lezcano wrote: >> This work is the pre-requisite of handling correctly when the trip >> point are crossed. For that we need to rework how the trip points are >> declared and assigned to a thermal zone. >> >> Even if it appears to be a common sense to have the trip points being >> ordered, this no guarantee neither documentation telling that is the >> case. >> >> One solution could have been to create an ordered array of trips built >> when registering the thermal zone by calling the different get_trip* >> ops. However those ops receive a thermal zone pointer which is not >> known as it is in the process of creating it. >> >> This cyclic dependency shows we have to rework how we manage the trip >> points. >> >> Actually, all the trip points definition can be common to the backend >> sensor drivers and we can factor out the thermal trip structure in all >> of them. >> >> Then, as we register the thermal trips array, they will be available >> in the thermal zone structure and a core function can return the trip >> given its id. >> >> The get_trip_* ops won't be needed anymore and could be removed. The >> resulting code will be another step forward to a self encapsulated >> generic thermal framework. >> >> Most of the drivers can be converted more or less easily. This series >> does a first round with most of the drivers. Some remain and will be >> converted but with a smaller set of changes as the conversion is a bit >> more complex. >> >> Changelog: >> v8: >> - Pretty oneline change and parenthesis removal (Rafael) >> - Collected tags >> v7: >> - Added missing return 0 in the x86_pkg_temp driver >> v6: >> - Improved the code for the get_crit_temp() function as suggested by >> Rafael >> - Removed inner parenthesis in the set_trip_temp() function and invert the >> conditions. Check the type of the trip point is unchanged >> - Folded patch 4 with 1 >> - Add per thermal zone info message in the bang-bang governor >> - Folded the fix for an uninitialized variable in >> int340x_thermal_zone_add() >> v5: >> - Fixed a deadlock when calling thermal_zone_get_trip() while >> handling the thermal zone lock >> - Remove an extra line in the sysfs change >> - Collected tags >> v4: >> - Remove extra lines on exynos changes as reported by Krzysztof Kozlowski >> - Collected tags >> v3: >> - Reorg the series to be git-bisect safe >> - Added the set_trip generic function >> - Added the get_crit_temp generic function >> - Removed more dead code in the thermal-of >> - Fixed the exynos changelog >> - Fixed the error check for the exynos drivers >> - Collected tags >> v2: >> - Added missing EXPORT_SYMBOL_GPL() for thermal_zone_get_trip() >> - Removed tab whitespace in the acerhdf driver >> - Collected tags >> >> Cc: Raju Rangoju <rajur@chelsio.com> >> Cc: "David S. Miller" <davem@davemloft.net> >> Cc: Eric Dumazet <edumazet@google.com> >> Cc: Jakub Kicinski <kuba@kernel.org> >> Cc: Paolo Abeni <pabeni@redhat.com> >> Cc: Peter Kaestle <peter@piie.net> >> Cc: Hans de Goede <hdegoede@redhat.com> >> Cc: Mark Gross <markgross@kernel.org> >> Cc: Miquel Raynal <miquel.raynal@bootlin.com> >> Cc: "Rafael J. Wysocki" <rafael@kernel.org> >> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> >> Cc: Amit Kucheria <amitk@kernel.org> >> Cc: Zhang Rui <rui.zhang@intel.com> >> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> >> Cc: Broadcom Kernel Team <bcm-kernel-feedback-list@broadcom.com> >> Cc: Florian Fainelli <f.fainelli@gmail.com> >> Cc: Ray Jui <rjui@broadcom.com> >> Cc: Scott Branden <sbranden@broadcom.com> >> Cc: Support Opensource <support.opensource@diasemi.com> >> Cc: Lukasz Luba <lukasz.luba@arm.com> >> Cc: Shawn Guo <shawnguo@kernel.org> >> Cc: Sascha Hauer <s.hauer@pengutronix.de> >> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> >> Cc: Fabio Estevam <festevam@gmail.com> >> Cc: NXP Linux Team <linux-imx@nxp.com> >> Cc: Thara Gopinath <thara.gopinath@linaro.org> >> Cc: Andy Gross <agross@kernel.org> >> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> >> Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se> >> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> >> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> >> Cc: Alim Akhtar <alim.akhtar@samsung.com> >> Cc: Thierry Reding <thierry.reding@gmail.com> >> Cc: Jonathan Hunter <jonathanh@nvidia.com> >> Cc: Eduardo Valentin <edubezval@gmail.com> >> Cc: Keerthy <j-keerthy@ti.com> >> Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> >> Cc: Masami Hiramatsu <mhiramat@kernel.org> >> Cc: Antoine Tenart <atenart@kernel.org> >> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> >> Cc: Dmitry Osipenko <digetx@gmail.com> >> Cc: netdev@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Cc: platform-driver-x86@vger.kernel.org >> Cc: linux-pm@vger.kernel.org >> Cc: linux-rpi-kernel@lists.infradead.org >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-arm-msm@vger.kernel.org >> Cc: linux-renesas-soc@vger.kernel.org >> Cc: linux-samsung-soc@vger.kernel.org >> Cc: linux-tegra@vger.kernel.org >> Cc: linux-omap@vger.kernel.org >> >> Daniel Lezcano (29): >> thermal/core: Add a generic thermal_zone_get_trip() function >> thermal/sysfs: Always expose hysteresis attributes >> thermal/core: Add a generic thermal_zone_set_trip() function >> thermal/core/governors: Use thermal_zone_get_trip() instead of ops >> functions >> thermal/of: Use generic thermal_zone_get_trip() function >> thermal/of: Remove unused functions >> thermal/drivers/exynos: Use generic thermal_zone_get_trip() function >> thermal/drivers/exynos: of_thermal_get_ntrips() >> thermal/drivers/exynos: Replace of_thermal_is_trip_valid() by >> thermal_zone_get_trip() >> thermal/drivers/tegra: Use generic thermal_zone_get_trip() function >> thermal/drivers/uniphier: Use generic thermal_zone_get_trip() function >> thermal/drivers/hisi: Use generic thermal_zone_get_trip() function >> thermal/drivers/qcom: Use generic thermal_zone_get_trip() function >> thermal/drivers/armada: Use generic thermal_zone_get_trip() function >> thermal/drivers/rcar_gen3: Use the generic function to get the number >> of trips >> thermal/of: Remove of_thermal_get_ntrips() >> thermal/of: Remove of_thermal_is_trip_valid() >> thermal/of: Remove of_thermal_set_trip_hyst() >> thermal/of: Remove of_thermal_get_crit_temp() >> thermal/drivers/st: Use generic trip points >> thermal/drivers/imx: Use generic thermal_zone_get_trip() function >> thermal/drivers/rcar: Use generic thermal_zone_get_trip() function >> thermal/drivers/broadcom: Use generic thermal_zone_get_trip() function >> thermal/drivers/da9062: Use generic thermal_zone_get_trip() function >> thermal/drivers/ti: Remove unused macros ti_thermal_get_trip_value() / >> ti_thermal_trip_is_valid() >> thermal/drivers/acerhdf: Use generic thermal_zone_get_trip() function >> thermal/drivers/cxgb4: Use generic thermal_zone_get_trip() function >> thermal/intel/int340x: Replace parameter to simplify >> thermal/drivers/intel: Use generic thermal_zone_get_trip() function > > I've tested this v8 patchset after fixing the issue with Exynos TMU with > https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ > patch and I got the following lockdep warning on all Exynos-based boards: > > ====================================================== > WARNING: possible circular locking dependency detected > 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted > ------------------------------------------------------ > swapper/0/1 is trying to acquire lock: > c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 > > but task is already holding lock: > c2979b94 (&tz->lock){+.+.}-{3:3}, at: > thermal_zone_device_update.part.0+0x3c/0x528 > > which lock already depends on the new lock. Investigating ... -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation 2022-10-03 14:10 ` [PATCH v8 00/29] Rework the trip points creation Marek Szyprowski 2022-10-03 15:36 ` Daniel Lezcano @ 2022-10-03 21:18 ` Daniel Lezcano 2022-10-05 12:37 ` Daniel Lezcano 1 sibling, 1 reply; 14+ messages in thread From: Daniel Lezcano @ 2022-10-03 21:18 UTC (permalink / raw) To: Marek Szyprowski, rafael Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap On 03/10/2022 16:10, Marek Szyprowski wrote: > Hi Daniel, > > On 03.10.2022 11:25, Daniel Lezcano wrote: >> This work is the pre-requisite of handling correctly when the trip >> point are crossed. For that we need to rework how the trip points are >> declared and assigned to a thermal zone. >> >> Even if it appears to be a common sense to have the trip points being >> ordered, this no guarantee neither documentation telling that is the >> case. >> >> One solution could have been to create an ordered array of trips built >> when registering the thermal zone by calling the different get_trip* >> ops. However those ops receive a thermal zone pointer which is not >> known as it is in the process of creating it. >> >> This cyclic dependency shows we have to rework how we manage the trip >> points. >> >> Actually, all the trip points definition can be common to the backend >> sensor drivers and we can factor out the thermal trip structure in all >> of them. >> >> Then, as we register the thermal trips array, they will be available >> in the thermal zone structure and a core function can return the trip >> given its id. >> >> The get_trip_* ops won't be needed anymore and could be removed. The >> resulting code will be another step forward to a self encapsulated >> generic thermal framework. >> >> Most of the drivers can be converted more or less easily. This series >> does a first round with most of the drivers. Some remain and will be >> converted but with a smaller set of changes as the conversion is a bit >> more complex. >> >> Changelog: >> v8: >> - Pretty oneline change and parenthesis removal (Rafael) >> - Collected tags >> v7: >> - Added missing return 0 in the x86_pkg_temp driver >> v6: >> - Improved the code for the get_crit_temp() function as suggested by >> Rafael >> - Removed inner parenthesis in the set_trip_temp() function and invert the >> conditions. Check the type of the trip point is unchanged >> - Folded patch 4 with 1 >> - Add per thermal zone info message in the bang-bang governor >> - Folded the fix for an uninitialized variable in >> int340x_thermal_zone_add() >> v5: >> - Fixed a deadlock when calling thermal_zone_get_trip() while >> handling the thermal zone lock >> - Remove an extra line in the sysfs change >> - Collected tags >> v4: >> - Remove extra lines on exynos changes as reported by Krzysztof Kozlowski >> - Collected tags >> v3: >> - Reorg the series to be git-bisect safe >> - Added the set_trip generic function >> - Added the get_crit_temp generic function >> - Removed more dead code in the thermal-of >> - Fixed the exynos changelog >> - Fixed the error check for the exynos drivers >> - Collected tags >> v2: >> - Added missing EXPORT_SYMBOL_GPL() for thermal_zone_get_trip() >> - Removed tab whitespace in the acerhdf driver >> - Collected tags >> >> Cc: Raju Rangoju <rajur@chelsio.com> >> Cc: "David S. Miller" <davem@davemloft.net> >> Cc: Eric Dumazet <edumazet@google.com> >> Cc: Jakub Kicinski <kuba@kernel.org> >> Cc: Paolo Abeni <pabeni@redhat.com> >> Cc: Peter Kaestle <peter@piie.net> >> Cc: Hans de Goede <hdegoede@redhat.com> >> Cc: Mark Gross <markgross@kernel.org> >> Cc: Miquel Raynal <miquel.raynal@bootlin.com> >> Cc: "Rafael J. Wysocki" <rafael@kernel.org> >> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> >> Cc: Amit Kucheria <amitk@kernel.org> >> Cc: Zhang Rui <rui.zhang@intel.com> >> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> >> Cc: Broadcom Kernel Team <bcm-kernel-feedback-list@broadcom.com> >> Cc: Florian Fainelli <f.fainelli@gmail.com> >> Cc: Ray Jui <rjui@broadcom.com> >> Cc: Scott Branden <sbranden@broadcom.com> >> Cc: Support Opensource <support.opensource@diasemi.com> >> Cc: Lukasz Luba <lukasz.luba@arm.com> >> Cc: Shawn Guo <shawnguo@kernel.org> >> Cc: Sascha Hauer <s.hauer@pengutronix.de> >> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> >> Cc: Fabio Estevam <festevam@gmail.com> >> Cc: NXP Linux Team <linux-imx@nxp.com> >> Cc: Thara Gopinath <thara.gopinath@linaro.org> >> Cc: Andy Gross <agross@kernel.org> >> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> >> Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se> >> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> >> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> >> Cc: Alim Akhtar <alim.akhtar@samsung.com> >> Cc: Thierry Reding <thierry.reding@gmail.com> >> Cc: Jonathan Hunter <jonathanh@nvidia.com> >> Cc: Eduardo Valentin <edubezval@gmail.com> >> Cc: Keerthy <j-keerthy@ti.com> >> Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> >> Cc: Masami Hiramatsu <mhiramat@kernel.org> >> Cc: Antoine Tenart <atenart@kernel.org> >> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> >> Cc: Dmitry Osipenko <digetx@gmail.com> >> Cc: netdev@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Cc: platform-driver-x86@vger.kernel.org >> Cc: linux-pm@vger.kernel.org >> Cc: linux-rpi-kernel@lists.infradead.org >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-arm-msm@vger.kernel.org >> Cc: linux-renesas-soc@vger.kernel.org >> Cc: linux-samsung-soc@vger.kernel.org >> Cc: linux-tegra@vger.kernel.org >> Cc: linux-omap@vger.kernel.org >> >> Daniel Lezcano (29): >> thermal/core: Add a generic thermal_zone_get_trip() function >> thermal/sysfs: Always expose hysteresis attributes >> thermal/core: Add a generic thermal_zone_set_trip() function >> thermal/core/governors: Use thermal_zone_get_trip() instead of ops >> functions >> thermal/of: Use generic thermal_zone_get_trip() function >> thermal/of: Remove unused functions >> thermal/drivers/exynos: Use generic thermal_zone_get_trip() function >> thermal/drivers/exynos: of_thermal_get_ntrips() >> thermal/drivers/exynos: Replace of_thermal_is_trip_valid() by >> thermal_zone_get_trip() >> thermal/drivers/tegra: Use generic thermal_zone_get_trip() function >> thermal/drivers/uniphier: Use generic thermal_zone_get_trip() function >> thermal/drivers/hisi: Use generic thermal_zone_get_trip() function >> thermal/drivers/qcom: Use generic thermal_zone_get_trip() function >> thermal/drivers/armada: Use generic thermal_zone_get_trip() function >> thermal/drivers/rcar_gen3: Use the generic function to get the number >> of trips >> thermal/of: Remove of_thermal_get_ntrips() >> thermal/of: Remove of_thermal_is_trip_valid() >> thermal/of: Remove of_thermal_set_trip_hyst() >> thermal/of: Remove of_thermal_get_crit_temp() >> thermal/drivers/st: Use generic trip points >> thermal/drivers/imx: Use generic thermal_zone_get_trip() function >> thermal/drivers/rcar: Use generic thermal_zone_get_trip() function >> thermal/drivers/broadcom: Use generic thermal_zone_get_trip() function >> thermal/drivers/da9062: Use generic thermal_zone_get_trip() function >> thermal/drivers/ti: Remove unused macros ti_thermal_get_trip_value() / >> ti_thermal_trip_is_valid() >> thermal/drivers/acerhdf: Use generic thermal_zone_get_trip() function >> thermal/drivers/cxgb4: Use generic thermal_zone_get_trip() function >> thermal/intel/int340x: Replace parameter to simplify >> thermal/drivers/intel: Use generic thermal_zone_get_trip() function > > I've tested this v8 patchset after fixing the issue with Exynos TMU with > https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ > patch and I got the following lockdep warning on all Exynos-based boards: > > > ====================================================== > WARNING: possible circular locking dependency detected > 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted > ------------------------------------------------------ > swapper/0/1 is trying to acquire lock: > c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 > > but task is already holding lock: > c2979b94 (&tz->lock){+.+.}-{3:3}, at: > thermal_zone_device_update.part.0+0x3c/0x528 > > which lock already depends on the new lock. I'm wondering if the problem is not already there and related to data->lock ... Doesn't the thermal zone lock already prevent racy access to the data structure? Another question: if the sensor clock is disabled after reading it, how does the hardware update the temperature and detect the programed threshold is crossed? > the existing dependency chain (in reverse order) is: > > -> #1 (&tz->lock){+.+.}-{3:3}: > mutex_lock_nested+0x1c/0x24 > thermal_zone_get_trip+0x20/0x44 > exynos_tmu_initialize+0x144/0x1e0 > exynos_tmu_probe+0x2b0/0x728 > platform_probe+0x5c/0xb8 > really_probe+0xe0/0x414 > __driver_probe_device+0xa0/0x208 > driver_probe_device+0x30/0xc0 > __driver_attach+0xf0/0x1f0 > bus_for_each_dev+0x70/0xb0 > bus_add_driver+0x174/0x218 > driver_register+0x88/0x11c > do_one_initcall+0x64/0x380 > kernel_init_freeable+0x1c0/0x224 > kernel_init+0x18/0x12c > ret_from_fork+0x14/0x2c > 0x0 > > -> #0 (&data->lock#2){+.+.}-{3:3}: > lock_acquire+0x124/0x3e4 > __mutex_lock+0x90/0x948 > mutex_lock_nested+0x1c/0x24 > exynos_get_temp+0x3c/0xc8 > __thermal_zone_get_temp+0x5c/0x12c > thermal_zone_device_update.part.0+0x78/0x528 > __thermal_cooling_device_register.part.0+0x298/0x354 > __cpufreq_cooling_register.constprop.0+0x138/0x218 > of_cpufreq_cooling_register+0x48/0x8c > cpufreq_online+0x8d0/0xb2c > cpufreq_add_dev+0xb0/0xec > subsys_interface_register+0x108/0x118 > cpufreq_register_driver+0x15c/0x380 > dt_cpufreq_probe+0x2e4/0x434 > platform_probe+0x5c/0xb8 > really_probe+0xe0/0x414 > __driver_probe_device+0xa0/0x208 > driver_probe_device+0x30/0xc0 > __driver_attach+0xf0/0x1f0 > bus_for_each_dev+0x70/0xb0 > bus_add_driver+0x174/0x218 > driver_register+0x88/0x11c > do_one_initcall+0x64/0x380 > kernel_init_freeable+0x1c0/0x224 > kernel_init+0x18/0x12c > ret_from_fork+0x14/0x2c > 0x0 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&tz->lock); > lock(&data->lock#2); > lock(&tz->lock); > lock(&data->lock#2); > > *** DEADLOCK *** > > 5 locks held by swapper/0/1: > #0: c1c8648c (&dev->mutex){....}-{3:3}, at: __driver_attach+0xe4/0x1f0 > #1: c1210434 (cpu_hotplug_lock){++++}-{0:0}, at: > cpufreq_register_driver+0xc4/0x380 > #2: c1ed8298 (subsys mutex#8){+.+.}-{3:3}, at: > subsys_interface_register+0x4c/0x118 > #3: c131f944 (thermal_list_lock){+.+.}-{3:3}, at: > __thermal_cooling_device_register.part.0+0x238/0x354 > #4: c2979b94 (&tz->lock){+.+.}-{3:3}, at: > thermal_zone_device_update.part.0+0x3c/0x528 > > stack backtrace: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00083-ge5c9d117223e > #12945 > Hardware name: Samsung Exynos (Flattened Device Tree) > unwind_backtrace from show_stack+0x10/0x14 > show_stack from dump_stack_lvl+0x58/0x70 > dump_stack_lvl from check_noncircular+0xf0/0x158 > check_noncircular from __lock_acquire+0x15e8/0x2a7c > __lock_acquire from lock_acquire+0x124/0x3e4 > lock_acquire from __mutex_lock+0x90/0x948 > __mutex_lock from mutex_lock_nested+0x1c/0x24 > mutex_lock_nested from exynos_get_temp+0x3c/0xc8 > exynos_get_temp from __thermal_zone_get_temp+0x5c/0x12c > __thermal_zone_get_temp from thermal_zone_device_update.part.0+0x78/0x528 > thermal_zone_device_update.part.0 from > __thermal_cooling_device_register.part.0+0x298/0x354 > __thermal_cooling_device_register.part.0 from > __cpufreq_cooling_register.constprop.0+0x138/0x218 > __cpufreq_cooling_register.constprop.0 from > of_cpufreq_cooling_register+0x48/0x8c > of_cpufreq_cooling_register from cpufreq_online+0x8d0/0xb2c > cpufreq_online from cpufreq_add_dev+0xb0/0xec > cpufreq_add_dev from subsys_interface_register+0x108/0x118 > subsys_interface_register from cpufreq_register_driver+0x15c/0x380 > cpufreq_register_driver from dt_cpufreq_probe+0x2e4/0x434 > dt_cpufreq_probe from platform_probe+0x5c/0xb8 > platform_probe from really_probe+0xe0/0x414 > really_probe from __driver_probe_device+0xa0/0x208 > __driver_probe_device from driver_probe_device+0x30/0xc0 > driver_probe_device from __driver_attach+0xf0/0x1f0 > __driver_attach from bus_for_each_dev+0x70/0xb0 > bus_for_each_dev from bus_add_driver+0x174/0x218 > bus_add_driver from driver_register+0x88/0x11c > driver_register from do_one_initcall+0x64/0x380 > do_one_initcall from kernel_init_freeable+0x1c0/0x224 > kernel_init_freeable from kernel_init+0x18/0x12c > kernel_init from ret_from_fork+0x14/0x2c > Exception stack(0xf082dfb0 to 0xf082dff8) > ... > > Let me know if You need anything more to test. > > >> drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 2 - >> .../ethernet/chelsio/cxgb4/cxgb4_thermal.c | 41 +---- >> drivers/platform/x86/acerhdf.c | 73 +++----- >> drivers/thermal/armada_thermal.c | 39 ++--- >> drivers/thermal/broadcom/bcm2835_thermal.c | 8 +- >> drivers/thermal/da9062-thermal.c | 52 +----- >> drivers/thermal/gov_bang_bang.c | 39 +++-- >> drivers/thermal/gov_fair_share.c | 18 +- >> drivers/thermal/gov_power_allocator.c | 51 +++--- >> drivers/thermal/gov_step_wise.c | 22 ++- >> drivers/thermal/hisi_thermal.c | 11 +- >> drivers/thermal/imx_thermal.c | 72 +++----- >> .../int340x_thermal/int340x_thermal_zone.c | 33 ++-- >> .../int340x_thermal/int340x_thermal_zone.h | 4 +- >> .../processor_thermal_device.c | 10 +- >> drivers/thermal/intel/x86_pkg_temp_thermal.c | 120 +++++++------ >> drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 39 ++--- >> drivers/thermal/rcar_gen3_thermal.c | 2 +- >> drivers/thermal/rcar_thermal.c | 53 +----- >> drivers/thermal/samsung/exynos_tmu.c | 57 +++---- >> drivers/thermal/st/st_thermal.c | 47 +---- >> drivers/thermal/tegra/soctherm.c | 33 ++-- >> drivers/thermal/tegra/tegra30-tsensor.c | 17 +- >> drivers/thermal/thermal_core.c | 160 +++++++++++++++--- >> drivers/thermal/thermal_core.h | 24 +-- >> drivers/thermal/thermal_helpers.c | 28 +-- >> drivers/thermal/thermal_netlink.c | 21 +-- >> drivers/thermal/thermal_of.c | 116 ------------- >> drivers/thermal/thermal_sysfs.c | 133 +++++---------- >> drivers/thermal/ti-soc-thermal/ti-thermal.h | 15 -- >> drivers/thermal/uniphier_thermal.c | 27 ++- >> include/linux/thermal.h | 10 ++ >> 32 files changed, 559 insertions(+), 818 deletions(-) >> > Best regards > -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation 2022-10-03 21:18 ` Daniel Lezcano @ 2022-10-05 12:37 ` Daniel Lezcano 2022-10-05 13:05 ` Marek Szyprowski 0 siblings, 1 reply; 14+ messages in thread From: Daniel Lezcano @ 2022-10-05 12:37 UTC (permalink / raw) To: Marek Szyprowski Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap, rafael Hi Marek, On 03/10/2022 23:18, Daniel Lezcano wrote: [ ... ] >> I've tested this v8 patchset after fixing the issue with Exynos TMU with >> https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ >> >> patch and I got the following lockdep warning on all Exynos-based boards: >> >> >> ====================================================== >> WARNING: possible circular locking dependency detected >> 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted >> ------------------------------------------------------ >> swapper/0/1 is trying to acquire lock: >> c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 >> >> but task is already holding lock: >> c2979b94 (&tz->lock){+.+.}-{3:3}, at: >> thermal_zone_device_update.part.0+0x3c/0x528 >> >> which lock already depends on the new lock. > > I'm wondering if the problem is not already there and related to > data->lock ... > > Doesn't the thermal zone lock already prevent racy access to the data > structure? > > Another question: if the sensor clock is disabled after reading it, how > does the hardware update the temperature and detect the programed > threshold is crossed? just a gentle ping, as the fix will depend on your answer ;) Thanks -- D. [ ... ] -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation 2022-10-05 12:37 ` Daniel Lezcano @ 2022-10-05 13:05 ` Marek Szyprowski 2022-10-06 6:55 ` Daniel Lezcano 0 siblings, 1 reply; 14+ messages in thread From: Marek Szyprowski @ 2022-10-05 13:05 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap, rafael On 05.10.2022 14:37, Daniel Lezcano wrote: > > Hi Marek, > > On 03/10/2022 23:18, Daniel Lezcano wrote: > > [ ... ] > >>> I've tested this v8 patchset after fixing the issue with Exynos TMU >>> with >>> https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ >>> >>> patch and I got the following lockdep warning on all Exynos-based >>> boards: >>> >>> >>> ====================================================== >>> WARNING: possible circular locking dependency detected >>> 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted >>> ------------------------------------------------------ >>> swapper/0/1 is trying to acquire lock: >>> c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 >>> >>> but task is already holding lock: >>> c2979b94 (&tz->lock){+.+.}-{3:3}, at: >>> thermal_zone_device_update.part.0+0x3c/0x528 >>> >>> which lock already depends on the new lock. >> >> I'm wondering if the problem is not already there and related to >> data->lock ... >> >> Doesn't the thermal zone lock already prevent racy access to the data >> structure? >> >> Another question: if the sensor clock is disabled after reading it, >> how does the hardware update the temperature and detect the programed >> threshold is crossed? > > just a gentle ping, as the fix will depend on your answer ;) > Sorry, I've been busy with other stuff. I thought I will fix this once I find a bit of spare time. IMHO the clock management is a bit over-engineered, as there is little (if any) benefit from such fine grade clock management. That clock is needed only for the AHB related part of the TMU (reading/writing the registers). The IRQ generation and temperature measurement is clocked from so called 'sclk' (special clock). I also briefly looked at the code and the internal lock doesn't look to be really necessary assuming that the thermal core already serializes all the calls. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation 2022-10-05 13:05 ` Marek Szyprowski @ 2022-10-06 6:55 ` Daniel Lezcano 2022-10-06 16:25 ` Marek Szyprowski 0 siblings, 1 reply; 14+ messages in thread From: Daniel Lezcano @ 2022-10-06 6:55 UTC (permalink / raw) To: Marek Szyprowski Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap, rafael Hi Marek, On 05/10/2022 15:05, Marek Szyprowski wrote: > > On 05.10.2022 14:37, Daniel Lezcano wrote: >> >> Hi Marek, >> >> On 03/10/2022 23:18, Daniel Lezcano wrote: >> >> [ ... ] >> >>>> I've tested this v8 patchset after fixing the issue with Exynos TMU >>>> with >>>> https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ >>>> >>>> patch and I got the following lockdep warning on all Exynos-based >>>> boards: >>>> >>>> >>>> ====================================================== >>>> WARNING: possible circular locking dependency detected >>>> 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted >>>> ------------------------------------------------------ >>>> swapper/0/1 is trying to acquire lock: >>>> c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 >>>> >>>> but task is already holding lock: >>>> c2979b94 (&tz->lock){+.+.}-{3:3}, at: >>>> thermal_zone_device_update.part.0+0x3c/0x528 >>>> >>>> which lock already depends on the new lock. >>> >>> I'm wondering if the problem is not already there and related to >>> data->lock ... >>> >>> Doesn't the thermal zone lock already prevent racy access to the data >>> structure? >>> >>> Another question: if the sensor clock is disabled after reading it, >>> how does the hardware update the temperature and detect the programed >>> threshold is crossed? >> >> just a gentle ping, as the fix will depend on your answer ;) >> > Sorry, I've been busy with other stuff. I thought I will fix this once I > find a bit of spare time. Ok, that is great if you can find time to fix it up because I've other drivers to convert to the generic thermal trips. > IMHO the clock management is a bit over-engineered, as there is little > (if any) benefit from such fine grade clock management. That clock is > needed only for the AHB related part of the TMU (reading/writing the > registers). The IRQ generation and temperature measurement is clocked > from so called 'sclk' (special clock). > > I also briefly looked at the code and the internal lock doesn't look to > be really necessary assuming that the thermal core already serializes > all the calls. I looked at the code and I think the driver can be simplified (fixed?) even more. IIUC, the sensor has multiple trip point interrupts, so if the device tree is describing more trip points than the sensor supports, there is a warning and the number of trip point is capped. IMO that can be simplified by using two trip point interrupt because the thermal_zone_device_update() will call the set_trips callback with the new boundaries. IOW, the thermal framework sets a new trip point interrupt when one is crossed. That should result in the simplification of the tmu_control as well as the tmu_probe function. As well as removing the limitation of the number of trip points. In order to have that correctly working, the 'set_trips' ops must be used to call the tmu_control callback instead of calling it in tmu_probe. The intialization workflow should be: probe->... ->thermal_zone_device_register() ->thermal_zone_device_update() ->update_trip_points() ->ops->set_trips() ->tmu_control() Also, replace the workqueue by a threaded interrupt. Does it make sense? -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 00/29] Rework the trip points creation 2022-10-06 6:55 ` Daniel Lezcano @ 2022-10-06 16:25 ` Marek Szyprowski 0 siblings, 0 replies; 14+ messages in thread From: Marek Szyprowski @ 2022-10-06 16:25 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team, Support Opensource, Pengutronix Kernel Team, NXP Linux Team, Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski, Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel, linux-arm-kernel, linux-arm-msm, linux-renesas-soc, linux-samsung-soc, linux-tegra, linux-omap, rafael Hi Daniel, On 06.10.2022 08:55, Daniel Lezcano wrote: > > On 05/10/2022 15:05, Marek Szyprowski wrote: >> >> On 05.10.2022 14:37, Daniel Lezcano wrote: >>> >>> On 03/10/2022 23:18, Daniel Lezcano wrote: >>> >>> [ ... ] >>> >>>>> I've tested this v8 patchset after fixing the issue with Exynos TMU >>>>> with >>>>> https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ >>>>> >>>>> >>>>> patch and I got the following lockdep warning on all Exynos-based >>>>> boards: >>>>> >>>>> >>>>> ====================================================== >>>>> WARNING: possible circular locking dependency detected >>>>> 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted >>>>> ------------------------------------------------------ >>>>> swapper/0/1 is trying to acquire lock: >>>>> c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 >>>>> >>>>> but task is already holding lock: >>>>> c2979b94 (&tz->lock){+.+.}-{3:3}, at: >>>>> thermal_zone_device_update.part.0+0x3c/0x528 >>>>> >>>>> which lock already depends on the new lock. >>>> >>>> I'm wondering if the problem is not already there and related to >>>> data->lock ... >>>> >>>> Doesn't the thermal zone lock already prevent racy access to the data >>>> structure? >>>> >>>> Another question: if the sensor clock is disabled after reading it, >>>> how does the hardware update the temperature and detect the programed >>>> threshold is crossed? >>> >>> just a gentle ping, as the fix will depend on your answer ;) >>> >> Sorry, I've been busy with other stuff. I thought I will fix this once I >> find a bit of spare time. > > Ok, that is great if you can find time to fix it up because I've other > drivers to convert to the generic thermal trips. > > >> IMHO the clock management is a bit over-engineered, as there is little >> (if any) benefit from such fine grade clock management. That clock is >> needed only for the AHB related part of the TMU (reading/writing the >> registers). The IRQ generation and temperature measurement is clocked >> from so called 'sclk' (special clock). >> >> I also briefly looked at the code and the internal lock doesn't look to >> be really necessary assuming that the thermal core already serializes >> all the calls. > > I looked at the code and I think the driver can be simplified (fixed?) > even more. > > IIUC, the sensor has multiple trip point interrupts, so if the device > tree is describing more trip points than the sensor supports, there is > a warning and the number of trip point is capped. > > IMO that can be simplified by using two trip point interrupt because > the thermal_zone_device_update() will call the set_trips callback with > the new boundaries. IOW, the thermal framework sets a new trip point > interrupt when one is crossed. > > That should result in the simplification of the tmu_control as well as > the tmu_probe function. As well as removing the limitation of the > number of trip points. > > In order to have that correctly working, the 'set_trips' ops must be > used to call the tmu_control callback instead of calling it in tmu_probe. > > The intialization workflow should be: > > probe->... > ->thermal_zone_device_register() > ->thermal_zone_device_update() > ->update_trip_points() > ->ops->set_trips() > ->tmu_control() > > Also, replace the workqueue by a threaded interrupt. > > Does it make sense? Yes, definitely. Frankly speaking I've never looked into that code, so I was not aware that it needs some cleanup. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-10-17 14:15 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CGME20221003092704eucas1p2875c1f996dfd60a58f06cf986e02e8eb@eucas1p2.samsung.com> [not found] ` <20221003092602.1323944-1-daniel.lezcano@linaro.org> [not found] ` <CGME20221003093207eucas1p1d456288f35eadbc6fcda0bf24b58e678@eucas1p1.samsung.com> [not found] ` <20221003092602.1323944-20-daniel.lezcano@linaro.org> 2022-10-03 12:50 ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Marek Szyprowski 2022-10-03 13:29 ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano 2022-10-03 13:40 ` Krzysztof Kozlowski 2022-10-03 13:50 ` Marek Szyprowski 2022-10-17 13:48 ` Marek Szyprowski 2022-10-17 14:14 ` Daniel Lezcano 2022-10-03 13:31 ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Daniel Lezcano 2022-10-03 14:10 ` [PATCH v8 00/29] Rework the trip points creation Marek Szyprowski 2022-10-03 15:36 ` Daniel Lezcano 2022-10-03 21:18 ` Daniel Lezcano 2022-10-05 12:37 ` Daniel Lezcano 2022-10-05 13:05 ` Marek Szyprowski 2022-10-06 6:55 ` Daniel Lezcano 2022-10-06 16:25 ` Marek Szyprowski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).