From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Daniel Lezcano <daniel.lezcano@linaro.org>, rafael@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	rui.zhang@intel.com,
	Broadcom Kernel Team <bcm-kernel-feedback-list@broadcom.com>,
	Support Opensource <support.opensource@diasemi.com>,
	Pengutronix Kernel Team <kernel@pengutronix.de>,
	NXP Linux Team <linux-imx@nxp.com>,
	Andy Gross <agross@kernel.org>,
	Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>,
	Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>,
	Alim Akhtar <alim.akhtar@samsung.com>,
	netdev@vger.kernel.org, platform-driver-x86@vger.kernel.org,
	linux-rpi-kernel@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-arm-msm@vger.kernel.org, linux-renesas-soc@vger.kernel.org,
	linux-samsung-soc@vger.kernel.org, linux-tegra@vger.kernel.org,
	linux-omap@vger.kernel.org
Subject: Re: [PATCH v8 00/29] Rework the trip points creation
Date: Mon, 3 Oct 2022 16:10:33 +0200
Message-ID: <8cdd1927-da38-c23e-fa75-384694724b1c@samsung.com>
In-Reply-To: <20221003092602.1323944-1-daniel.lezcano@linaro.org>

Hi Daniel,

On 03.10.2022 11:25, Daniel Lezcano wrote:
> This work is the prerequisite for correctly handling the crossing of
> trip points. For that we need to rework how the trip points are
> declared and assigned to a thermal zone.
>
> Even if it appears to be common sense to have the trip points
> ordered, there is no guarantee, nor any documentation, saying that is
> the case.
>
> One solution could have been to create an ordered array of trips built
> when registering the thermal zone by calling the different get_trip*
> ops. However, those ops receive a thermal zone pointer, which is not
> yet known because the zone is still being created.
>
> This cyclic dependency shows we have to rework how we manage the trip
> points.
>
> Actually, all the trip point definitions can be common to the backend
> sensor drivers, and we can factor out the thermal trip structure in all
> of them.
>
> Then, as we register the thermal trips array, they will be available
> in the thermal zone structure and a core function can return the trip
> given its id.
>
> The get_trip_* ops won't be needed anymore and could be removed. The
> resulting code will be another step toward a self-encapsulated
> generic thermal framework.
>
> Most of the drivers can be converted more or less easily. This series
> does a first round with most of the drivers. Some remain and will be
> converted but with a smaller set of changes as the conversion is a bit
> more complex.
>
> Changelog:
> v8:
> - Pretty oneline change and parenthesis removal (Rafael)
> - Collected tags
> v7:
> - Added missing return 0 in the x86_pkg_temp driver
> v6:
> - Improved the code for the get_crit_temp() function as suggested by Rafael
> - Removed inner parenthesis in the set_trip_temp() function and invert the
> conditions. Check the type of the trip point is unchanged
> - Folded patch 4 with 1
> - Add per thermal zone info message in the bang-bang governor
> - Folded the fix for an uninitialized variable in int340x_thermal_zone_add()
> v5:
> - Fixed a deadlock when calling thermal_zone_get_trip() while
> holding the thermal zone lock
> - Remove an extra line in the sysfs change
> - Collected tags
> v4:
> - Remove extra lines on exynos changes as reported by Krzysztof Kozlowski
> - Collected tags
> v3:
> - Reorg the series to be git-bisect safe
> - Added the set_trip generic function
> - Added the get_crit_temp generic function
> - Removed more dead code in the thermal-of
> - Fixed the exynos changelog
> - Fixed the error check for the exynos drivers
> - Collected tags
> v2:
> - Added missing EXPORT_SYMBOL_GPL() for thermal_zone_get_trip()
> - Removed tab whitespace in the acerhdf driver
> - Collected tags
>
> Cc: Raju Rangoju <rajur@chelsio.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Peter Kaestle <peter@piie.net>
> Cc: Hans de Goede <hdegoede@redhat.com>
> Cc: Mark Gross <markgross@kernel.org>
> Cc: Miquel Raynal <miquel.raynal@bootlin.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Amit Kucheria <amitk@kernel.org>
> Cc: Zhang Rui <rui.zhang@intel.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Broadcom Kernel Team <bcm-kernel-feedback-list@broadcom.com>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: Ray Jui <rjui@broadcom.com>
> Cc: Scott Branden <sbranden@broadcom.com>
> Cc: Support Opensource <support.opensource@diasemi.com>
> Cc: Lukasz Luba <lukasz.luba@arm.com>
> Cc: Shawn Guo <shawnguo@kernel.org>
> Cc: Sascha Hauer <s.hauer@pengutronix.de>
> Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
> Cc: Fabio Estevam <festevam@gmail.com>
> Cc: NXP Linux Team <linux-imx@nxp.com>
> Cc: Thara Gopinath <thara.gopinath@linaro.org>
> Cc: Andy Gross <agross@kernel.org>
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> Cc: Alim Akhtar <alim.akhtar@samsung.com>
> Cc: Thierry Reding <thierry.reding@gmail.com>
> Cc: Jonathan Hunter <jonathanh@nvidia.com>
> Cc: Eduardo Valentin <edubezval@gmail.com>
> Cc: Keerthy <j-keerthy@ti.com>
> Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Antoine Tenart <atenart@kernel.org>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Cc: Dmitry Osipenko <digetx@gmail.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: platform-driver-x86@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
> Cc: linux-rpi-kernel@lists.infradead.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-arm-msm@vger.kernel.org
> Cc: linux-renesas-soc@vger.kernel.org
> Cc: linux-samsung-soc@vger.kernel.org
> Cc: linux-tegra@vger.kernel.org
> Cc: linux-omap@vger.kernel.org
>
> Daniel Lezcano (29):
> thermal/core: Add a generic thermal_zone_get_trip() function
> thermal/sysfs: Always expose hysteresis attributes
> thermal/core: Add a generic thermal_zone_set_trip() function
> thermal/core/governors: Use thermal_zone_get_trip() instead of ops functions
> thermal/of: Use generic thermal_zone_get_trip() function
> thermal/of: Remove unused functions
> thermal/drivers/exynos: Use generic thermal_zone_get_trip() function
> thermal/drivers/exynos: of_thermal_get_ntrips()
> thermal/drivers/exynos: Replace of_thermal_is_trip_valid() by thermal_zone_get_trip()
> thermal/drivers/tegra: Use generic thermal_zone_get_trip() function
> thermal/drivers/uniphier: Use generic thermal_zone_get_trip() function
> thermal/drivers/hisi: Use generic thermal_zone_get_trip() function
> thermal/drivers/qcom: Use generic thermal_zone_get_trip() function
> thermal/drivers/armada: Use generic thermal_zone_get_trip() function
> thermal/drivers/rcar_gen3: Use the generic function to get the number of trips
> thermal/of: Remove of_thermal_get_ntrips()
> thermal/of: Remove of_thermal_is_trip_valid()
> thermal/of: Remove of_thermal_set_trip_hyst()
> thermal/of: Remove of_thermal_get_crit_temp()
> thermal/drivers/st: Use generic trip points
> thermal/drivers/imx: Use generic thermal_zone_get_trip() function
> thermal/drivers/rcar: Use generic thermal_zone_get_trip() function
> thermal/drivers/broadcom: Use generic thermal_zone_get_trip() function
> thermal/drivers/da9062: Use generic thermal_zone_get_trip() function
> thermal/drivers/ti: Remove unused macros ti_thermal_get_trip_value() / ti_thermal_trip_is_valid()
> thermal/drivers/acerhdf: Use generic thermal_zone_get_trip() function
> thermal/drivers/cxgb4: Use generic thermal_zone_get_trip() function
> thermal/intel/int340x: Replace parameter to simplify
> thermal/drivers/intel: Use generic thermal_zone_get_trip() function

I've tested this v8 patchset after fixing the Exynos TMU issue with the
patch from
https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/
and I got the following lockdep warning on all Exynos-based boards:


======================================================
WARNING: possible circular locking dependency detected
6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8

but task is already holding lock:
c2979b94 (&tz->lock){+.+.}-{3:3}, at: thermal_zone_device_update.part.0+0x3c/0x528

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&tz->lock){+.+.}-{3:3}:
        mutex_lock_nested+0x1c/0x24
        thermal_zone_get_trip+0x20/0x44
        exynos_tmu_initialize+0x144/0x1e0
        exynos_tmu_probe+0x2b0/0x728
        platform_probe+0x5c/0xb8
        really_probe+0xe0/0x414
        __driver_probe_device+0xa0/0x208
        driver_probe_device+0x30/0xc0
        __driver_attach+0xf0/0x1f0
        bus_for_each_dev+0x70/0xb0
        bus_add_driver+0x174/0x218
        driver_register+0x88/0x11c
        do_one_initcall+0x64/0x380
        kernel_init_freeable+0x1c0/0x224
        kernel_init+0x18/0x12c
        ret_from_fork+0x14/0x2c
        0x0

-> #0 (&data->lock#2){+.+.}-{3:3}:
        lock_acquire+0x124/0x3e4
        __mutex_lock+0x90/0x948
        mutex_lock_nested+0x1c/0x24
        exynos_get_temp+0x3c/0xc8
        __thermal_zone_get_temp+0x5c/0x12c
        thermal_zone_device_update.part.0+0x78/0x528
        __thermal_cooling_device_register.part.0+0x298/0x354
        __cpufreq_cooling_register.constprop.0+0x138/0x218
        of_cpufreq_cooling_register+0x48/0x8c
        cpufreq_online+0x8d0/0xb2c
        cpufreq_add_dev+0xb0/0xec
        subsys_interface_register+0x108/0x118
        cpufreq_register_driver+0x15c/0x380
        dt_cpufreq_probe+0x2e4/0x434
        platform_probe+0x5c/0xb8
        really_probe+0xe0/0x414
        __driver_probe_device+0xa0/0x208
        driver_probe_device+0x30/0xc0
        __driver_attach+0xf0/0x1f0
        bus_for_each_dev+0x70/0xb0
        bus_add_driver+0x174/0x218
        driver_register+0x88/0x11c
        do_one_initcall+0x64/0x380
        kernel_init_freeable+0x1c0/0x224
        kernel_init+0x18/0x12c
        ret_from_fork+0x14/0x2c
        0x0

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&tz->lock);
                                lock(&data->lock#2);
                                lock(&tz->lock);
   lock(&data->lock#2);

  *** DEADLOCK ***

5 locks held by swapper/0/1:
  #0: c1c8648c (&dev->mutex){....}-{3:3}, at: __driver_attach+0xe4/0x1f0
  #1: c1210434 (cpu_hotplug_lock){++++}-{0:0}, at: cpufreq_register_driver+0xc4/0x380
  #2: c1ed8298 (subsys mutex#8){+.+.}-{3:3}, at: subsys_interface_register+0x4c/0x118
  #3: c131f944 (thermal_list_lock){+.+.}-{3:3}, at: __thermal_cooling_device_register.part.0+0x238/0x354
  #4: c2979b94 (&tz->lock){+.+.}-{3:3}, at: thermal_zone_device_update.part.0+0x3c/0x528

stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00083-ge5c9d117223e #12945
Hardware name: Samsung Exynos (Flattened Device Tree)
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from check_noncircular+0xf0/0x158
  check_noncircular from __lock_acquire+0x15e8/0x2a7c
  __lock_acquire from lock_acquire+0x124/0x3e4
  lock_acquire from __mutex_lock+0x90/0x948
  __mutex_lock from mutex_lock_nested+0x1c/0x24
  mutex_lock_nested from exynos_get_temp+0x3c/0xc8
  exynos_get_temp from __thermal_zone_get_temp+0x5c/0x12c
  __thermal_zone_get_temp from thermal_zone_device_update.part.0+0x78/0x528
  thermal_zone_device_update.part.0 from __thermal_cooling_device_register.part.0+0x298/0x354
  __thermal_cooling_device_register.part.0 from __cpufreq_cooling_register.constprop.0+0x138/0x218
  __cpufreq_cooling_register.constprop.0 from of_cpufreq_cooling_register+0x48/0x8c
  of_cpufreq_cooling_register from cpufreq_online+0x8d0/0xb2c
  cpufreq_online from cpufreq_add_dev+0xb0/0xec
  cpufreq_add_dev from subsys_interface_register+0x108/0x118
  subsys_interface_register from cpufreq_register_driver+0x15c/0x380
  cpufreq_register_driver from dt_cpufreq_probe+0x2e4/0x434
  dt_cpufreq_probe from platform_probe+0x5c/0xb8
  platform_probe from really_probe+0xe0/0x414
  really_probe from __driver_probe_device+0xa0/0x208
  __driver_probe_device from driver_probe_device+0x30/0xc0
  driver_probe_device from __driver_attach+0xf0/0x1f0
  __driver_attach from bus_for_each_dev+0x70/0xb0
  bus_for_each_dev from bus_add_driver+0x174/0x218
  bus_add_driver from driver_register+0x88/0x11c
  driver_register from do_one_initcall+0x64/0x380
  do_one_initcall from kernel_init_freeable+0x1c0/0x224
  kernel_init_freeable from kernel_init+0x18/0x12c
  kernel_init from ret_from_fork+0x14/0x2c
Exception stack(0xf082dfb0 to 0xf082dff8)
...
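
For readers less familiar with lockdep output: according to the
"Possible unsafe locking scenario" table above, one path takes
&data->lock and then &tz->lock (the chain through
exynos_tmu_initialize() and thermal_zone_get_trip()), while the other
takes &tz->lock and then &data->lock (thermal_zone_device_update()
calling exynos_get_temp()). The sketch below is a small userspace C
model of that ordering check. It is not kernel code and not how lockdep
is implemented; lock_id, record_path(), probe_path() and update_path()
are hypothetical names made up for illustration, and the model only
catches direct two-lock inversions, not longer cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two lock classes named in the report. */
enum lock_id { TZ_LOCK, DATA_LOCK, NUM_LOCKS };

/* dep[a][b] is true once some path has taken lock b while holding a. */
bool dep[NUM_LOCKS][NUM_LOCKS];

/*
 * Record the order in which one code path acquires its locks.
 * Returns true if the path takes two locks in the opposite order
 * to a path recorded earlier (an ABBA inversion).
 */
bool record_path(const enum lock_id *order, size_t n)
{
	bool inversion = false;

	assert(n <= NUM_LOCKS);
	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			/* Was order[i] ever taken while holding order[j]? */
			if (dep[order[j]][order[i]])
				inversion = true;
			dep[order[i]][order[j]] = true;
		}
	}
	return inversion;
}

/* Probe-time path as read from the scenario table: data lock, then tz lock. */
bool probe_path(void)
{
	static const enum lock_id order[] = { DATA_LOCK, TZ_LOCK };

	return record_path(order, sizeof(order) / sizeof(order[0]));
}

/* Update path as read from the scenario table: tz lock, then data lock. */
bool update_path(void)
{
	static const enum lock_id order[] = { TZ_LOCK, DATA_LOCK };

	return record_path(order, sizeof(order) / sizeof(order[0]));
}
```

Calling probe_path() first and update_path() second mirrors the boot
order in the log: the first call only records an ordering, and the
second one reports the inversion.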

Let me know if you need anything more to test.


> drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 2 -
> .../ethernet/chelsio/cxgb4/cxgb4_thermal.c | 41 +----
> drivers/platform/x86/acerhdf.c | 73 +++-----
> drivers/thermal/armada_thermal.c | 39 ++---
> drivers/thermal/broadcom/bcm2835_thermal.c | 8 +-
> drivers/thermal/da9062-thermal.c | 52 +-----
> drivers/thermal/gov_bang_bang.c | 39 +++--
> drivers/thermal/gov_fair_share.c | 18 +-
> drivers/thermal/gov_power_allocator.c | 51 +++---
> drivers/thermal/gov_step_wise.c | 22 ++-
> drivers/thermal/hisi_thermal.c | 11 +-
> drivers/thermal/imx_thermal.c | 72 +++-----
> .../int340x_thermal/int340x_thermal_zone.c | 33 ++--
> .../int340x_thermal/int340x_thermal_zone.h | 4 +-
> .../processor_thermal_device.c | 10 +-
> drivers/thermal/intel/x86_pkg_temp_thermal.c | 120 +++++++------
> drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 39 ++---
> drivers/thermal/rcar_gen3_thermal.c | 2 +-
> drivers/thermal/rcar_thermal.c | 53 +-----
> drivers/thermal/samsung/exynos_tmu.c | 57 +++----
> drivers/thermal/st/st_thermal.c | 47 +----
> drivers/thermal/tegra/soctherm.c | 33 ++--
> drivers/thermal/tegra/tegra30-tsensor.c | 17 +-
> drivers/thermal/thermal_core.c | 160 +++++++++++++++---
> drivers/thermal/thermal_core.h | 24 +--
> drivers/thermal/thermal_helpers.c | 28 +--
> drivers/thermal/thermal_netlink.c | 21 +--
> drivers/thermal/thermal_of.c | 116 -------------
> drivers/thermal/thermal_sysfs.c | 133 +++++----------
> drivers/thermal/ti-soc-thermal/ti-thermal.h | 15 --
> drivers/thermal/uniphier_thermal.c | 27 ++-
> include/linux/thermal.h | 10 ++
> 32 files changed, 559 insertions(+), 818 deletions(-)
>
Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


2022-10-03 12:50       ` Marek Szyprowski
2022-10-03 13:29       ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano
2022-10-03 13:29         ` Daniel Lezcano
2022-10-03 13:40         ` Krzysztof Kozlowski
2022-10-03 13:40           ` Krzysztof Kozlowski
2022-10-03 13:50         ` Marek Szyprowski
2022-10-03 13:50           ` Marek Szyprowski
2022-10-17 13:48         ` Marek Szyprowski
2022-10-17 13:48           ` Marek Szyprowski
2022-10-17 14:14           ` Daniel Lezcano
2022-10-17 14:14             ` Daniel Lezcano
2022-10-03 13:31       ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Daniel Lezcano
2022-10-03 13:31         ` Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 20/29] thermal/drivers/st: Use generic trip points Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 21/29] thermal/drivers/imx: Use generic thermal_zone_get_trip() function Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 22/29] thermal/drivers/rcar: " Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 23/29] thermal/drivers/broadcom: " Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 24/29] thermal/drivers/da9062: " Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 25/29] thermal/drivers/ti: Remove unused macros ti_thermal_get_trip_value() / ti_thermal_trip_is_valid() Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:25   ` [PATCH v8 26/29] thermal/drivers/acerhdf: Use generic thermal_zone_get_trip() function Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:26   ` [PATCH v8 27/29] thermal/drivers/cxgb4: " Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:26   ` [PATCH v8 28/29] thermal/intel/int340x: Replace parameter to simplify Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03  9:26   ` [PATCH v8 29/29] thermal/drivers/intel: Use generic thermal_zone_get_trip() function Daniel Lezcano
2022-12-09 15:26     ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-03 14:10   ` Marek Szyprowski [this message]
2022-10-03 14:10     ` [PATCH v8 00/29] Rework the trip points creation Marek Szyprowski
2022-10-03 15:36     ` Daniel Lezcano
2022-10-03 15:36       ` Daniel Lezcano
2022-10-03 21:18     ` Daniel Lezcano
2022-10-03 21:18       ` Daniel Lezcano
2022-10-05 12:37       ` Daniel Lezcano
2022-10-05 12:37         ` Daniel Lezcano
2022-10-05 13:05         ` Marek Szyprowski
2022-10-05 13:05           ` Marek Szyprowski
2022-10-06  6:55           ` Daniel Lezcano
2022-10-06  6:55             ` Daniel Lezcano
2022-10-06 16:25             ` Marek Szyprowski
2022-10-06 16:25               ` Marek Szyprowski
