From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
To: Guenter Roeck <linux@roeck-us.net>,
	Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Amit Kucheria <amitk@kernel.org>, Zhang Rui <rui.zhang@intel.com>,
	Thierry Reding <thierry.reding@gmail.com>,
	Jonathan Hunter <jonathanh@nvidia.com>,
	Dmitry Osipenko <digetx@gmail.com>,
	"open list:TEGRA ARCHITECTURE SUPPORT"
	<linux-tegra@vger.kernel.org>,
	Hardware Monitoring <linux-hwmon@vger.kernel.org>,
	rafael@kernel.org
Subject: Re: [PATCH 2/3] thermal/drivers/tegra: Remove get_trend function
Date: Wed, 29 Jun 2022 12:35:42 +0300
Message-ID: <0f6cc7d3-5537-cd8f-d234-a61420e1cbc8@collabora.com>
In-Reply-To: <20220628184332.GA3624671@roeck-us.net>

On 6/28/22 21:43, Guenter Roeck wrote:
> On Tue, Jun 28, 2022 at 08:10:30AM -0700, Guenter Roeck wrote:
>> On Tue, Jun 28, 2022 at 02:44:31PM +0300, Dmitry Osipenko wrote:
>>> On 6/28/22 11:41, Daniel Lezcano wrote:
>>>>
>>>> Thierry, Dmitry,
>>>>
>>>> are you fine with this patch?
>>>
>>> Seems like it should be good. I couldn't test it using the recent linux-next
>>> because of a lockup in the LM90 driver. There were quite a lot of changes in
>>> LM90 recently, adding Guenter.
>>>
>>
>> Weird, I tested those changes to death with real hardware, and I don't
>> see a code path where the mutex can be left in blocked state unless the
>> underlying i2c driver locks up for some reason. What is the platform,
>> and can you point me to the devicetree file ? Also, is there anything
>> else lm90 or i2c related in the kernel log ?
>>
> 
> Follow-up question: I see that various Tegra systems use lm90 compatible
> chips, and the interrupt line is in general wired up. Can you check if
> you get lots of interrupts on that interrupt line ? Also, can you check
> what happens if you read hwmon attributes directly ?

The number of interrupts fired is okay. The device I'm using for testing is
a Nexus 7 Tegra30 tablet.

Today I enabled lockdep and it immediately showed where the problem is:

======================================================
WARNING: possible circular locking dependency detected
5.19.0-rc4-next-20220628-00002-g94e5dbbe1c58-dirty #24 Not tainted
------------------------------------------------------
irq/91-lm90/130 is trying to acquire lock:
c27ba380 (&tz->lock){+.+.}-{3:3}, at: thermal_zone_device_update+0x2c/0x64

               but task is already holding lock:
c27b42c8 (&data->update_lock){+.+.}-{3:3}, at: lm90_irq_thread+0x2c/0x68

               which lock already depends on the new lock.


               the existing dependency chain (in reverse order) is:

               -> #1 (&data->update_lock){+.+.}-{3:3}:
       __mutex_lock+0x9c/0x984
       mutex_lock_nested+0x2c/0x34
       lm90_read+0x44/0x3e8
       hwmon_thermal_get_temp+0x58/0x8c
       of_thermal_get_temp+0x38/0x44
       thermal_zone_get_temp+0x5c/0x7c
       thermal_zone_device_update.part.0+0x48/0x5fc
       thermal_zone_device_set_mode+0xa0/0xe4
       thermal_zone_device_enable+0x1c/0x20
       thermal_zone_of_sensor_register+0x18c/0x19c
       devm_thermal_zone_of_sensor_register+0x68/0xa4
       __hwmon_device_register+0x704/0x91c
       hwmon_device_register_with_info+0x6c/0x80
       devm_hwmon_device_register_with_info+0x78/0xb4
       lm90_probe+0x618/0x8c0
       i2c_device_probe+0x170/0x2e0
       really_probe+0xd8/0x300
       __driver_probe_device+0x94/0xf4
       driver_probe_device+0x40/0x118
       __device_attach_driver+0xc8/0x10c
       bus_for_each_drv+0x90/0xdc
       __device_attach+0xbc/0x1d4
       device_initial_probe+0x1c/0x20
       bus_probe_device+0x98/0xa0
       deferred_probe_work_func+0x8c/0xbc
       process_one_work+0x2b8/0x774
       worker_thread+0x17c/0x56c
       kthread+0x108/0x13c
       ret_from_fork+0x14/0x28
       0x0

               -> #0 (&tz->lock){+.+.}-{3:3}:
       __lock_acquire+0x173c/0x3198
       lock_acquire+0x128/0x3f0
       __mutex_lock+0x9c/0x984
       mutex_lock_nested+0x2c/0x34
       thermal_zone_device_update+0x2c/0x64
       hwmon_notify_event+0x128/0x138
       lm90_update_alarms_locked+0x35c/0x3b8
       lm90_irq_thread+0x38/0x68
       irq_thread_fn+0x2c/0x8c
       irq_thread+0x190/0x29c
       kthread+0x108/0x13c
       ret_from_fork+0x14/0x28
       0x0

               other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&data->update_lock);
                               lock(&tz->lock);
                               lock(&data->update_lock);
  lock(&tz->lock);

                *** DEADLOCK ***

1 lock held by irq/91-lm90/130:
 #0: c27b42c8 (&data->update_lock){+.+.}-{3:3}, at: lm90_irq_thread+0x2c/0x68

               stack backtrace:
CPU: 1 PID: 130 Comm: irq/91-lm90 Not tainted 5.19.0-rc4-next-20220628-00002-g94e5dbbe1c58-dirty #24
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
Backtrace:
 dump_backtrace from show_stack+0x20/0x24
 r7:c33d1b60 r6:00000080 r5:60000093 r4:c168c6a4
 show_stack from dump_stack_lvl+0x68/0x98
 dump_stack_lvl from dump_stack+0x18/0x1c
 r7:c33d1b60 r6:c20328cc r5:c203c700 r4:c20328cc
 dump_stack from print_circular_bug+0x2ec/0x33c
 print_circular_bug from check_noncircular+0x104/0x168
 r10:c1a14cc8 r9:c33d1240 r8:00000001 r7:00000000 r6:dfc3dcc0 r5:c33d1b60
 r4:c33d1b80
 check_noncircular from __lock_acquire+0x173c/0x3198
 r7:c33d1b80 r6:c202bc98 r5:c33d1b60 r4:c21d92ac
 __lock_acquire from lock_acquire+0x128/0x3f0
 r10:60000013 r9:00000000 r8:00000000 r7:00000000 r6:dfc3dd40 r5:c19ac688
 r4:c19ac688
 lock_acquire from __mutex_lock+0x9c/0x984
 r10:c27ba380 r9:00000000 r8:c21d92ac r7:c33d1240 r6:00000000 r5:00000000
 r4:c27ba348
 __mutex_lock from mutex_lock_nested+0x2c/0x34
 r10:c27b4000 r9:00000000 r8:dfc3de87 r7:00000000 r6:c27ba348 r5:00000000
 r4:c27ba000
 mutex_lock_nested from thermal_zone_device_update+0x2c/0x64
 thermal_zone_device_update from hwmon_notify_event+0x128/0x138
 r7:00000000 r6:00000000 r5:c2d23ea4 r4:c33fd040
 hwmon_notify_event from lm90_update_alarms_locked+0x35c/0x3b8
 r8:c27b4378 r7:c2d23c08 r6:00000020 r5:00000000 r4:c27b4240
 lm90_update_alarms_locked from lm90_irq_thread+0x38/0x68
 r9:c01c2814 r8:00000001 r7:c33d2240 r6:c27b4290 r5:c27b4240 r4:c33fc200
 lm90_irq_thread from irq_thread_fn+0x2c/0x8c
 r7:c33d2240 r6:c27b4000 r5:c33d1240 r4:c33fc200
 irq_thread_fn from irq_thread+0x190/0x29c
 r7:c33d2240 r6:c33fc224 r5:c33d1240 r4:00000000
 irq_thread from kthread+0x108/0x13c
 r10:00000000 r9:df9ddbf4 r8:c31d2200 r7:c33fc200 r6:c01c2710 r5:c33d1240
 r4:c33fc240
 kthread from ret_from_fork+0x14/0x28
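
For illustration, the inversion in the report above can be modeled in a few
lines of user-space C. This is only a rough sketch of the idea behind the
check, not lockdep's actual implementation and not kernel code: the names
lock_class, dep, reachable and record_acquire are hypothetical. It records
"held -> wanted" ordering edges and refuses an acquisition that would close
a cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { UPDATE_LOCK, TZ_LOCK, NR_CLASSES };

/* dep[a][b] is true once some path acquired b while already holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is "to" reachable from "from" via recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record that "wanted" is acquired while "held" is held.
 * Returns false, without recording the edge, if "held" is already
 * reachable from "wanted": the new edge would create a circular
 * dependency.
 */
static bool record_acquire(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES] = { false };

	assert(held < NR_CLASSES && wanted < NR_CLASSES);
	if (reachable(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

With this model, the probe path from chain #1 corresponds to
record_acquire(TZ_LOCK, UPDATE_LOCK), which succeeds and records the edge.
The IRQ thread path from chain #0 corresponds to
record_acquire(UPDATE_LOCK, TZ_LOCK), which is rejected because tz->lock
already leads to update_lock.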

-- 
Best regards,
Dmitry
