linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
  • * Re: [PATCH v8 00/29] Rework the trip points creation
           [not found] ` <20221003092602.1323944-1-daniel.lezcano@linaro.org>
           [not found]   ` <CGME20221003093207eucas1p1d456288f35eadbc6fcda0bf24b58e678@eucas1p1.samsung.com>
    @ 2022-10-03 14:10   ` Marek Szyprowski
      2022-10-03 15:36     ` Daniel Lezcano
      2022-10-03 21:18     ` Daniel Lezcano
      1 sibling, 2 replies; 14+ messages in thread
    From: Marek Szyprowski @ 2022-10-03 14:10 UTC (permalink / raw)
      To: Daniel Lezcano, rafael
      Cc: linux-kernel, linux-pm, rui.zhang, Broadcom Kernel Team,
    	Support Opensource, Pengutronix Kernel Team, NXP Linux Team,
    	Andy Gross, Bartlomiej Zolnierkiewicz, Krzysztof Kozlowski,
    	Alim Akhtar, netdev, platform-driver-x86, linux-rpi-kernel,
    	linux-arm-kernel, linux-arm-msm, linux-renesas-soc,
    	linux-samsung-soc, linux-tegra, linux-omap
    
    Hi Daniel,
    
    On 03.10.2022 11:25, Daniel Lezcano wrote:
    > This work is the pre-requisite of handling correctly when the trip
    > point are crossed. For that we need to rework how the trip points are
    > declared and assigned to a thermal zone.
    >
    > Even if it appears to be a common sense to have the trip points being
    > ordered, this no guarantee neither documentation telling that is the
    > case.
    >
    > One solution could have been to create an ordered array of trips built
    > when registering the thermal zone by calling the different get_trip*
    > ops. However those ops receive a thermal zone pointer which is not
    > known as it is in the process of creating it.
    >
    > This cyclic dependency shows we have to rework how we manage the trip
    > points.
    >
    > Actually, all the trip points definition can be common to the backend
    > sensor drivers and we can factor out the thermal trip structure in all
    > of them.
    >
    > Then, as we register the thermal trips array, they will be available
    > in the thermal zone structure and a core function can return the trip
    > given its id.
    >
    > The get_trip_* ops won't be needed anymore and could be removed. The
    > resulting code will be another step forward to a self encapsulated
    > generic thermal framework.
    >
    > Most of the drivers can be converted more or less easily. This series
    > does a first round with most of the drivers. Some remain and will be
    > converted but with a smaller set of changes as the conversion is a bit
    > more complex.
    >
    > Changelog:
    > v8:
    > - Pretty oneline change and parenthesis removal (Rafael)
    > - Collected tags
    > v7:
    > - Added missing return 0 in the x86_pkg_temp driver
    > v6:
    > - Improved the code for the get_crit_temp() function as suggested by 
    > Rafael
    > - Removed inner parenthesis in the set_trip_temp() function and invert the
    > conditions. Check the type of the trip point is unchanged
    > - Folded patch 4 with 1
    > - Add per thermal zone info message in the bang-bang governor
    > - Folded the fix for an uninitialized variable in 
    > int340x_thermal_zone_add()
    > v5:
    > - Fixed a deadlock when calling thermal_zone_get_trip() while
    > handling the thermal zone lock
    > - Remove an extra line in the sysfs change
    > - Collected tags
    > v4:
    > - Remove extra lines on exynos changes as reported by Krzysztof Kozlowski
    > - Collected tags
    > v3:
    > - Reorg the series to be git-bisect safe
    > - Added the set_trip generic function
    > - Added the get_crit_temp generic function
    > - Removed more dead code in the thermal-of
    > - Fixed the exynos changelog
    > - Fixed the error check for the exynos drivers
    > - Collected tags
    > v2:
    > - Added missing EXPORT_SYMBOL_GPL() for thermal_zone_get_trip()
    > - Removed tab whitespace in the acerhdf driver
    > - Collected tags
    >
    > Cc: Raju Rangoju <rajur@chelsio.com>
    > Cc: "David S. Miller" <davem@davemloft.net>
    > Cc: Eric Dumazet <edumazet@google.com>
    > Cc: Jakub Kicinski <kuba@kernel.org>
    > Cc: Paolo Abeni <pabeni@redhat.com>
    > Cc: Peter Kaestle <peter@piie.net>
    > Cc: Hans de Goede <hdegoede@redhat.com>
    > Cc: Mark Gross <markgross@kernel.org>
    > Cc: Miquel Raynal <miquel.raynal@bootlin.com>
    > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
    > Cc: Amit Kucheria <amitk@kernel.org>
    > Cc: Zhang Rui <rui.zhang@intel.com>
    > Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
    > Cc: Broadcom Kernel Team <bcm-kernel-feedback-list@broadcom.com>
    > Cc: Florian Fainelli <f.fainelli@gmail.com>
    > Cc: Ray Jui <rjui@broadcom.com>
    > Cc: Scott Branden <sbranden@broadcom.com>
    > Cc: Support Opensource <support.opensource@diasemi.com>
    > Cc: Lukasz Luba <lukasz.luba@arm.com>
    > Cc: Shawn Guo <shawnguo@kernel.org>
    > Cc: Sascha Hauer <s.hauer@pengutronix.de>
    > Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
    > Cc: Fabio Estevam <festevam@gmail.com>
    > Cc: NXP Linux Team <linux-imx@nxp.com>
    > Cc: Thara Gopinath <thara.gopinath@linaro.org>
    > Cc: Andy Gross <agross@kernel.org>
    > Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
    > Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
    > Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
    > Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    > Cc: Alim Akhtar <alim.akhtar@samsung.com>
    > Cc: Thierry Reding <thierry.reding@gmail.com>
    > Cc: Jonathan Hunter <jonathanh@nvidia.com>
    > Cc: Eduardo Valentin <edubezval@gmail.com>
    > Cc: Keerthy <j-keerthy@ti.com>
    > Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
    > Cc: Masami Hiramatsu <mhiramat@kernel.org>
    > Cc: Antoine Tenart <atenart@kernel.org>
    > Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    > Cc: Dmitry Osipenko <digetx@gmail.com>
    > Cc: netdev@vger.kernel.org
    > Cc: linux-kernel@vger.kernel.org
    > Cc: platform-driver-x86@vger.kernel.org
    > Cc: linux-pm@vger.kernel.org
    > Cc: linux-rpi-kernel@lists.infradead.org
    > Cc: linux-arm-kernel@lists.infradead.org
    > Cc: linux-arm-msm@vger.kernel.org
    > Cc: linux-renesas-soc@vger.kernel.org
    > Cc: linux-samsung-soc@vger.kernel.org
    > Cc: linux-tegra@vger.kernel.org
    > Cc: linux-omap@vger.kernel.org
    >
    > Daniel Lezcano (29):
    > thermal/core: Add a generic thermal_zone_get_trip() function
    > thermal/sysfs: Always expose hysteresis attributes
    > thermal/core: Add a generic thermal_zone_set_trip() function
    > thermal/core/governors: Use thermal_zone_get_trip() instead of ops
    > functions
    > thermal/of: Use generic thermal_zone_get_trip() function
    > thermal/of: Remove unused functions
    > thermal/drivers/exynos: Use generic thermal_zone_get_trip() function
    > thermal/drivers/exynos: of_thermal_get_ntrips()
    > thermal/drivers/exynos: Replace of_thermal_is_trip_valid() by
    > thermal_zone_get_trip()
    > thermal/drivers/tegra: Use generic thermal_zone_get_trip() function
    > thermal/drivers/uniphier: Use generic thermal_zone_get_trip() function
    > thermal/drivers/hisi: Use generic thermal_zone_get_trip() function
    > thermal/drivers/qcom: Use generic thermal_zone_get_trip() function
    > thermal/drivers/armada: Use generic thermal_zone_get_trip() function
    > thermal/drivers/rcar_gen3: Use the generic function to get the number
    > of trips
    > thermal/of: Remove of_thermal_get_ntrips()
    > thermal/of: Remove of_thermal_is_trip_valid()
    > thermal/of: Remove of_thermal_set_trip_hyst()
    > thermal/of: Remove of_thermal_get_crit_temp()
    > thermal/drivers/st: Use generic trip points
    > thermal/drivers/imx: Use generic thermal_zone_get_trip() function
    > thermal/drivers/rcar: Use generic thermal_zone_get_trip() function
    > thermal/drivers/broadcom: Use generic thermal_zone_get_trip() function
    > thermal/drivers/da9062: Use generic thermal_zone_get_trip() function
    > thermal/drivers/ti: Remove unused macros ti_thermal_get_trip_value() /
    > ti_thermal_trip_is_valid()
    > thermal/drivers/acerhdf: Use generic thermal_zone_get_trip() function
    > thermal/drivers/cxgb4: Use generic thermal_zone_get_trip() function
    > thermal/intel/int340x: Replace parameter to simplify
    > thermal/drivers/intel: Use generic thermal_zone_get_trip() function
    
    I've tested this v8 patchset after fixing the issue with Exynos TMU with 
    https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@linaro.org/ 
    patch and I got the following lockdep warning on all Exynos-based boards:
    
    
    ======================================================
    WARNING: possible circular locking dependency detected
    6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted
    ------------------------------------------------------
    swapper/0/1 is trying to acquire lock:
    c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8
    
    but task is already holding lock:
    c2979b94 (&tz->lock){+.+.}-{3:3}, at: 
    thermal_zone_device_update.part.0+0x3c/0x528
    
    which lock already depends on the new lock.
    
    
    the existing dependency chain (in reverse order) is:
    
    -> #1 (&tz->lock){+.+.}-{3:3}:
            mutex_lock_nested+0x1c/0x24
            thermal_zone_get_trip+0x20/0x44
            exynos_tmu_initialize+0x144/0x1e0
            exynos_tmu_probe+0x2b0/0x728
            platform_probe+0x5c/0xb8
            really_probe+0xe0/0x414
            __driver_probe_device+0xa0/0x208
            driver_probe_device+0x30/0xc0
            __driver_attach+0xf0/0x1f0
            bus_for_each_dev+0x70/0xb0
            bus_add_driver+0x174/0x218
            driver_register+0x88/0x11c
            do_one_initcall+0x64/0x380
            kernel_init_freeable+0x1c0/0x224
            kernel_init+0x18/0x12c
            ret_from_fork+0x14/0x2c
            0x0
    
    -> #0 (&data->lock#2){+.+.}-{3:3}:
            lock_acquire+0x124/0x3e4
            __mutex_lock+0x90/0x948
            mutex_lock_nested+0x1c/0x24
            exynos_get_temp+0x3c/0xc8
            __thermal_zone_get_temp+0x5c/0x12c
            thermal_zone_device_update.part.0+0x78/0x528
            __thermal_cooling_device_register.part.0+0x298/0x354
            __cpufreq_cooling_register.constprop.0+0x138/0x218
            of_cpufreq_cooling_register+0x48/0x8c
            cpufreq_online+0x8d0/0xb2c
            cpufreq_add_dev+0xb0/0xec
            subsys_interface_register+0x108/0x118
            cpufreq_register_driver+0x15c/0x380
            dt_cpufreq_probe+0x2e4/0x434
            platform_probe+0x5c/0xb8
            really_probe+0xe0/0x414
            __driver_probe_device+0xa0/0x208
            driver_probe_device+0x30/0xc0
            __driver_attach+0xf0/0x1f0
            bus_for_each_dev+0x70/0xb0
            bus_add_driver+0x174/0x218
            driver_register+0x88/0x11c
            do_one_initcall+0x64/0x380
            kernel_init_freeable+0x1c0/0x224
            kernel_init+0x18/0x12c
            ret_from_fork+0x14/0x2c
            0x0
    
    other info that might help us debug this:
    
      Possible unsafe locking scenario:
    
            CPU0                    CPU1
            ----                    ----
       lock(&tz->lock);
                                    lock(&data->lock#2);
                                    lock(&tz->lock);
       lock(&data->lock#2);
    
      *** DEADLOCK ***
    
    5 locks held by swapper/0/1:
      #0: c1c8648c (&dev->mutex){....}-{3:3}, at: __driver_attach+0xe4/0x1f0
      #1: c1210434 (cpu_hotplug_lock){++++}-{0:0}, at: 
    cpufreq_register_driver+0xc4/0x380
      #2: c1ed8298 (subsys mutex#8){+.+.}-{3:3}, at: 
    subsys_interface_register+0x4c/0x118
      #3: c131f944 (thermal_list_lock){+.+.}-{3:3}, at: 
    __thermal_cooling_device_register.part.0+0x238/0x354
      #4: c2979b94 (&tz->lock){+.+.}-{3:3}, at: 
    thermal_zone_device_update.part.0+0x3c/0x528
    
    stack backtrace:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00083-ge5c9d117223e 
    #12945
    Hardware name: Samsung Exynos (Flattened Device Tree)
      unwind_backtrace from show_stack+0x10/0x14
      show_stack from dump_stack_lvl+0x58/0x70
      dump_stack_lvl from check_noncircular+0xf0/0x158
      check_noncircular from __lock_acquire+0x15e8/0x2a7c
      __lock_acquire from lock_acquire+0x124/0x3e4
      lock_acquire from __mutex_lock+0x90/0x948
      __mutex_lock from mutex_lock_nested+0x1c/0x24
      mutex_lock_nested from exynos_get_temp+0x3c/0xc8
      exynos_get_temp from __thermal_zone_get_temp+0x5c/0x12c
      __thermal_zone_get_temp from thermal_zone_device_update.part.0+0x78/0x528
      thermal_zone_device_update.part.0 from 
    __thermal_cooling_device_register.part.0+0x298/0x354
      __thermal_cooling_device_register.part.0 from 
    __cpufreq_cooling_register.constprop.0+0x138/0x218
      __cpufreq_cooling_register.constprop.0 from 
    of_cpufreq_cooling_register+0x48/0x8c
      of_cpufreq_cooling_register from cpufreq_online+0x8d0/0xb2c
      cpufreq_online from cpufreq_add_dev+0xb0/0xec
      cpufreq_add_dev from subsys_interface_register+0x108/0x118
      subsys_interface_register from cpufreq_register_driver+0x15c/0x380
      cpufreq_register_driver from dt_cpufreq_probe+0x2e4/0x434
      dt_cpufreq_probe from platform_probe+0x5c/0xb8
      platform_probe from really_probe+0xe0/0x414
      really_probe from __driver_probe_device+0xa0/0x208
      __driver_probe_device from driver_probe_device+0x30/0xc0
      driver_probe_device from __driver_attach+0xf0/0x1f0
      __driver_attach from bus_for_each_dev+0x70/0xb0
      bus_for_each_dev from bus_add_driver+0x174/0x218
      bus_add_driver from driver_register+0x88/0x11c
      driver_register from do_one_initcall+0x64/0x380
      do_one_initcall from kernel_init_freeable+0x1c0/0x224
      kernel_init_freeable from kernel_init+0x18/0x12c
      kernel_init from ret_from_fork+0x14/0x2c
    Exception stack(0xf082dfb0 to 0xf082dff8)
    ...
    
    Let me know if You need anything more to test.
    
    
    > drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 2 -
    > .../ethernet/chelsio/cxgb4/cxgb4_thermal.c | 41 +----
    > drivers/platform/x86/acerhdf.c | 73 +++-----
    > drivers/thermal/armada_thermal.c | 39 ++---
    > drivers/thermal/broadcom/bcm2835_thermal.c | 8 +-
    > drivers/thermal/da9062-thermal.c | 52 +-----
    > drivers/thermal/gov_bang_bang.c | 39 +++--
    > drivers/thermal/gov_fair_share.c | 18 +-
    > drivers/thermal/gov_power_allocator.c | 51 +++---
    > drivers/thermal/gov_step_wise.c | 22 ++-
    > drivers/thermal/hisi_thermal.c | 11 +-
    > drivers/thermal/imx_thermal.c | 72 +++-----
    > .../int340x_thermal/int340x_thermal_zone.c | 33 ++--
    > .../int340x_thermal/int340x_thermal_zone.h | 4 +-
    > .../processor_thermal_device.c | 10 +-
    > drivers/thermal/intel/x86_pkg_temp_thermal.c | 120 +++++++------
    > drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 39 ++---
    > drivers/thermal/rcar_gen3_thermal.c | 2 +-
    > drivers/thermal/rcar_thermal.c | 53 +-----
    > drivers/thermal/samsung/exynos_tmu.c | 57 +++----
    > drivers/thermal/st/st_thermal.c | 47 +----
    > drivers/thermal/tegra/soctherm.c | 33 ++--
    > drivers/thermal/tegra/tegra30-tsensor.c | 17 +-
    > drivers/thermal/thermal_core.c | 160 +++++++++++++++---
    > drivers/thermal/thermal_core.h | 24 +--
    > drivers/thermal/thermal_helpers.c | 28 +--
    > drivers/thermal/thermal_netlink.c | 21 +--
    > drivers/thermal/thermal_of.c | 116 -------------
    > drivers/thermal/thermal_sysfs.c | 133 +++++----------
    > drivers/thermal/ti-soc-thermal/ti-thermal.h | 15 --
    > drivers/thermal/uniphier_thermal.c | 27 ++-
    > include/linux/thermal.h | 10 ++
    > 32 files changed, 559 insertions(+), 818 deletions(-)
    >
    Best regards
    
    -- 
    Marek Szyprowski, PhD
    Samsung R&D Institute Poland
    
    
    _______________________________________________
    linux-arm-kernel mailing list
    linux-arm-kernel@lists.infradead.org
    http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
    
    ^ permalink raw reply	[flat|nested] 14+ messages in thread

  • end of thread, other threads:[~2022-10-17 14:15 UTC | newest]
    
    Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
    -- links below jump to the message on this page --
         [not found] <CGME20221003092704eucas1p2875c1f996dfd60a58f06cf986e02e8eb@eucas1p2.samsung.com>
         [not found] ` <20221003092602.1323944-1-daniel.lezcano@linaro.org>
         [not found]   ` <CGME20221003093207eucas1p1d456288f35eadbc6fcda0bf24b58e678@eucas1p1.samsung.com>
         [not found]     ` <20221003092602.1323944-20-daniel.lezcano@linaro.org>
    2022-10-03 12:50       ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Marek Szyprowski
    2022-10-03 13:29         ` [PATCH] thermal/drivers/exynos: Fix NULL pointer dereference when getting the critical temp Daniel Lezcano
    2022-10-03 13:40           ` Krzysztof Kozlowski
    2022-10-03 13:50           ` Marek Szyprowski
    2022-10-17 13:48           ` Marek Szyprowski
    2022-10-17 14:14             ` Daniel Lezcano
    2022-10-03 13:31         ` [PATCH v8 19/29] thermal/of: Remove of_thermal_get_crit_temp() Daniel Lezcano
    2022-10-03 14:10   ` [PATCH v8 00/29] Rework the trip points creation Marek Szyprowski
    2022-10-03 15:36     ` Daniel Lezcano
    2022-10-03 21:18     ` Daniel Lezcano
    2022-10-05 12:37       ` Daniel Lezcano
    2022-10-05 13:05         ` Marek Szyprowski
    2022-10-06  6:55           ` Daniel Lezcano
    2022-10-06 16:25             ` Marek Szyprowski
    

    This is a public inbox, see mirroring instructions
    for how to clone and mirror all data and code used for this inbox;
    as well as URLs for NNTP newsgroup(s).