All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] timekeeping: Fix change_clocksource() for PM and sh_cmt
@ 2020-12-05  2:19 Niklas Söderlund
  2020-12-05  2:19 ` [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM Niklas Söderlund
  2020-12-05  2:19 ` [PATCH 2/2] timekeeping: Allow runtime PM from change_clocksource() Niklas Söderlund
  0 siblings, 2 replies; 5+ messages in thread
From: Niklas Söderlund @ 2020-12-05  2:19 UTC (permalink / raw)
  To: Daniel Lezcano, Thomas Gleixner, John Stultz, linux-kernel,
	linux-renesas-soc
  Cc: Niklas Söderlund

Hello,

This series is an attempt to fix two potential deadlock situations and 
allowing the tools/testing/selftests/timers/clocksource-switch test to 
pass for the sh_cmt driver.

The two patches are not directly related, but patch 1/2 fixes a local 
problem in the sh_cmt driver that hides the issue in the timekeeping 
core addressed in patch 2/2. In trying to give as an complete view of my 
problem so I have opted to keep the two patches together.

I'm in no way an expert on the timekeeping core but with these two 
patches applied I can pass the clocksource-switch selftest on R-Car Gen2 
and Gen3 boards. The selftest rapidly switches clock sources and without 
these patches I lockup somewhere and the test fails, with this applied 
it passes. Please se end of cover-letter for logs of the selftest runs, 
and each patch for possible deadlock information.

For each patch the easiest way demonstrate the each of the deadlock 
situations all that is needed is to select the CMT clocksource,

  # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource

I have tested on-top of v5.10-rc6 on R-Car M3-N (CMT0 and CMT1) and 
Koelsch (CMT0).

* R-Car M3-N without this series applied

  # ./clocksource-switch 
  Validating clocksource arch_sys_counter
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Validating clocksource e60f0000.timer
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Validating clocksource e6130000.timer
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Running Asynchronous Switching Tests...
  Consistent CLOCK_REALTIME                 Sat Dec  5 01:52:28 2020
  
  1607133200:750182652
  1607133200:750182652
  1607133200:750182772
  1607133200:750182892
  1607133200:750182892
  1607133200:750183012
  1607133200:750183133
  1607133200:750183133
  1607133200:750183253
  1607133200:750183373
  1607133200:750183373
  1607133200:750183493
  1607133200:750183493
  1607133200:750183613
  1607133200:750183734
  1607133200:750183734
  1607133200:750183854
  1607133200:750183974
  1607133200:750183974
  1607133200:750184094
  1607133200:750184214
  1607133200:750184214
  1607133200:750184335
  1607133200:750184455
  1607133200:750184455
  1607133200:750184575
  1607133200:750184695
  1607133200:750184695
  1607133200:750184815
  1607133200:750184936
  1607133200:750184936
  1607133200:750185056
  1607133200:750185056
  1607133200:750185176
  1607133200:750185296
  1607133200:750185296
  1607133200:750185416
  1607133200:750185536
  1607133200:750185536
  1607133200:750185657
  1607133200:750185777
  1607133200:750185777
  1607133200:750185897
  1607133200:750186017
  1607133200:750295633
  1607133200:750295633
  1607133200:750541817
  1607133200:750541817
  1607133200:750541817
  1607133200:750541817
  1607133200:750541817
  1607133200:750541817
  1607133200:750541817
  --------------------
  1607133200:750541817
  1607133200:750295633
  --------------------
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  1607133200:750295633
  Delta: 246184 ns
  Sat Dec  5 01:53:20 2020
  
  [FAILED]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0

* R-Car M3-N with this series applied

  # ./clocksource-switch 
  Validating clocksource arch_sys_counter
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Validating clocksource e60f0000.timer
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Validating clocksource e6130000.timer
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Running Asynchronous Switching Tests...
  Consistent CLOCK_REALTIME                 [OK]
  Consistent CLOCK_MONOTONIC                [OK]
  Consistent CLOCK_PROCESS_CPUTIME_ID       [OK]
  Consistent CLOCK_THREAD_CPUTIME_ID        [OK]
  Consistent CLOCK_MONOTONIC_RAW            [OK]
  Consistent CLOCK_REALTIME_COARSE          [OK]
  Consistent CLOCK_MONOTONIC_COARSE         [OK]
  Consistent CLOCK_BOOTTIME                 [OK]
  Consistent CLOCK_TAI                      [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  Nanosleep CLOCK_REALTIME                  [OK]
  Nanosleep CLOCK_MONOTONIC                 [OK]
  Nanosleep CLOCK_MONOTONIC_RAW             [UNSUPPORTED]
  Nanosleep CLOCK_REALTIME_COARSE           [UNSUPPORTED]
  Nanosleep CLOCK_MONOTONIC_COARSE          [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME                  [OK]
  Nanosleep CLOCK_REALTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_BOOTTIME_ALARM            [UNSUPPORTED]
  Nanosleep CLOCK_TAI                       [OK]
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0

Niklas Söderlund (2):
  clocksource/drivers/sh_cmt: Fix potential deadlock when calling
    runtime PM
  timekeeping: Allow runtime PM from change_clocksource()

 drivers/clocksource/sh_cmt.c | 18 ++++++++++++++----
 kernel/time/timekeeping.c    | 36 +++++++++++++++++++++++-------------
 2 files changed, 37 insertions(+), 17 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM
  2020-12-05  2:19 [PATCH 0/2] timekeeping: Fix change_clocksource() for PM and sh_cmt Niklas Söderlund
@ 2020-12-05  2:19 ` Niklas Söderlund
  2020-12-07 15:57   ` Geert Uytterhoeven
  2020-12-12 12:58   ` [tip: timers/core] " tip-bot2 for Niklas Söderlund
  2020-12-05  2:19 ` [PATCH 2/2] timekeeping: Allow runtime PM from change_clocksource() Niklas Söderlund
  1 sibling, 2 replies; 5+ messages in thread
From: Niklas Söderlund @ 2020-12-05  2:19 UTC (permalink / raw)
  To: Daniel Lezcano, Thomas Gleixner, John Stultz, linux-kernel,
	linux-renesas-soc
  Cc: Niklas Söderlund

The ch->lock is used to protect the whole enable() and read() of
sh_cmt's implementation of struct clocksource. The enable()
implementation calls pm_runtime_get_sync() which may result in the clock
source to be read() triggering a cyclic lockdep warning for the
ch->lock.

The sh_cmt driver implement its own balancing of calls to
sh_cmt_{enable,disable}() with flags in sh_cmt_{start,stop}(). It does
this to deal with that start and stop are shared between the clock
source and clock event providers. While this could be improved on
verifying corner cases based on any substantial rework on all devices
this driver supports might prove hard.

As a first step separate the PM handling for clock event and clock
source. Always put/get the device when enabling/disabling the clock
source but keep the clock event logic unchanged. This allows the sh_cmt
implementation of struct clocksource to call PM without holding the
ch->lock and avoiding the deadlock.

Triggering and log of the deadlock warning,

  # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource
  [   46.948370] ======================================================
  [   46.954730] WARNING: possible circular locking dependency detected
  [   46.961094] 5.10.0-rc6-arm64-renesas-00001-g0e5fd7414e8b #36 Not tainted
  [   46.967985] ------------------------------------------------------
  [   46.974342] migration/0/11 is trying to acquire lock:
  [   46.979543] ffff0000403ed220 (&dev->power.lock){-...}-{2:2}, at: __pm_runtime_resume+0x40/0x74
  [   46.988445]
  [   46.988445] but task is already holding lock:
  [   46.994441] ffff000040ad0298 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x28/0x210
  [   47.002173]
  [   47.002173] which lock already depends on the new lock.
  [   47.002173]
  [   47.010573]
  [   47.010573] the existing dependency chain (in reverse order) is:
  [   47.018262]
  [   47.018262] -> #3 (&ch->lock){....}-{2:2}:
  [   47.024033]        lock_acquire.part.0+0x120/0x330
  [   47.028970]        lock_acquire+0x64/0x80
  [   47.033105]        _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.038130]        sh_cmt_start+0x28/0x210
  [   47.042352]        sh_cmt_clocksource_enable+0x28/0x50
  [   47.047644]        change_clocksource+0x9c/0x160
  [   47.052402]        multi_cpu_stop+0xa4/0x190
  [   47.056799]        cpu_stopper_thread+0x90/0x154
  [   47.061557]        smpboot_thread_fn+0x244/0x270
  [   47.066310]        kthread+0x154/0x160
  [   47.070175]        ret_from_fork+0x10/0x20
  [   47.074390]
  [   47.074390] -> #2 (tk_core.seq.seqcount){----}-{0:0}:
  [   47.081136]        lock_acquire.part.0+0x120/0x330
  [   47.086070]        lock_acquire+0x64/0x80
  [   47.090203]        seqcount_lockdep_reader_access.constprop.0+0x74/0x100
  [   47.097096]        ktime_get+0x28/0xa0
  [   47.100960]        hrtimer_start_range_ns+0x210/0x2dc
  [   47.106164]        generic_sched_clock_init+0x70/0x88
  [   47.111364]        sched_clock_init+0x40/0x64
  [   47.115853]        start_kernel+0x494/0x524
  [   47.120156]
  [   47.120156] -> #1 (hrtimer_bases.lock){-.-.}-{2:2}:
  [   47.126721]        lock_acquire.part.0+0x120/0x330
  [   47.136042]        lock_acquire+0x64/0x80
  [   47.144461]        _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.153721]        hrtimer_start_range_ns+0x68/0x2dc
  [   47.163054]        rpm_suspend+0x308/0x5dc
  [   47.171473]        rpm_idle+0xc4/0x2a4
  [   47.179550]        pm_runtime_work+0x98/0xc0
  [   47.188209]        process_one_work+0x294/0x6f0
  [   47.197142]        worker_thread+0x70/0x45c
  [   47.205661]        kthread+0x154/0x160
  [   47.213673]        ret_from_fork+0x10/0x20
  [   47.221957]
  [   47.221957] -> #0 (&dev->power.lock){-...}-{2:2}:
  [   47.236292]        check_noncircular+0x128/0x140
  [   47.244907]        __lock_acquire+0x13b0/0x204c
  [   47.253332]        lock_acquire.part.0+0x120/0x330
  [   47.262033]        lock_acquire+0x64/0x80
  [   47.269826]        _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.278430]        __pm_runtime_resume+0x40/0x74
  [   47.286758]        sh_cmt_start+0x84/0x210
  [   47.294537]        sh_cmt_clocksource_enable+0x28/0x50
  [   47.303449]        change_clocksource+0x9c/0x160
  [   47.311783]        multi_cpu_stop+0xa4/0x190
  [   47.319720]        cpu_stopper_thread+0x90/0x154
  [   47.328022]        smpboot_thread_fn+0x244/0x270
  [   47.336298]        kthread+0x154/0x160
  [   47.343708]        ret_from_fork+0x10/0x20
  [   47.351445]
  [   47.351445] other info that might help us debug this:
  [   47.351445]
  [   47.370225] Chain exists of:
  [   47.370225]   &dev->power.lock --> tk_core.seq.seqcount --> &ch->lock
  [   47.370225]
  [   47.392003]  Possible unsafe locking scenario:
  [   47.392003]
  [   47.405314]        CPU0                    CPU1
  [   47.413569]        ----                    ----
  [   47.421768]   lock(&ch->lock);
  [   47.428425]                                lock(tk_core.seq.seqcount);
  [   47.438701]                                lock(&ch->lock);
  [   47.447930]   lock(&dev->power.lock);
  [   47.455172]
  [   47.455172]  *** DEADLOCK ***
  [   47.455172]
  [   47.471433] 3 locks held by migration/0/11:
  [   47.479099]  #0: ffff8000113c9278 (timekeeper_lock){-.-.}-{2:2}, at: change_clocksource+0x2c/0x160
  [   47.491834]  #1: ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
  [   47.504727]  #2: ffff000040ad0298 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x28/0x210
  [   47.516541]
  [   47.516541] stack backtrace:
  [   47.528480] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.10.0-rc6-arm64-renesas-00001-g0e5fd7414e8b #36
  [   47.542147] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
  [   47.554241] Call trace:
  [   47.560832]  dump_backtrace+0x0/0x190
  [   47.568670]  show_stack+0x14/0x30
  [   47.576144]  dump_stack+0xe8/0x130
  [   47.583670]  print_circular_bug+0x1f0/0x200
  [   47.592015]  check_noncircular+0x128/0x140
  [   47.600289]  __lock_acquire+0x13b0/0x204c
  [   47.608486]  lock_acquire.part.0+0x120/0x330
  [   47.616953]  lock_acquire+0x64/0x80
  [   47.624582]  _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.633114]  __pm_runtime_resume+0x40/0x74
  [   47.641371]  sh_cmt_start+0x84/0x210
  [   47.649115]  sh_cmt_clocksource_enable+0x28/0x50
  [   47.657916]  change_clocksource+0x9c/0x160
  [   47.666165]  multi_cpu_stop+0xa4/0x190
  [   47.674056]  cpu_stopper_thread+0x90/0x154
  [   47.682308]  smpboot_thread_fn+0x244/0x270
  [   47.690560]  kthread+0x154/0x160
  [   47.697927]  ret_from_fork+0x10/0x20
  [   47.708447] clocksource: Switched to clocksource e60f0000.timer

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 drivers/clocksource/sh_cmt.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/clocksource/sh_cmt.c b/drivers/clocksource/sh_cmt.c
index 760777458a9091cd..19fa3ef75e3bd62b 100644
--- a/drivers/clocksource/sh_cmt.c
+++ b/drivers/clocksource/sh_cmt.c
@@ -319,7 +319,6 @@ static int sh_cmt_enable(struct sh_cmt_channel *ch)
 {
 	int k, ret;
 
-	pm_runtime_get_sync(&ch->cmt->pdev->dev);
 	dev_pm_syscore_device(&ch->cmt->pdev->dev, true);
 
 	/* enable clock */
@@ -394,7 +393,6 @@ static void sh_cmt_disable(struct sh_cmt_channel *ch)
 	clk_disable(ch->cmt->clk);
 
 	dev_pm_syscore_device(&ch->cmt->pdev->dev, false);
-	pm_runtime_put(&ch->cmt->pdev->dev);
 }
 
 /* private flags */
@@ -562,10 +560,16 @@ static int sh_cmt_start(struct sh_cmt_channel *ch, unsigned long flag)
 	int ret = 0;
 	unsigned long flags;
 
+	if (flag & FLAG_CLOCKSOURCE)
+		pm_runtime_get_sync(&ch->cmt->pdev->dev);
+
 	raw_spin_lock_irqsave(&ch->lock, flags);
 
-	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
+	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
+		if (flag & FLAG_CLOCKEVENT)
+			pm_runtime_get_sync(&ch->cmt->pdev->dev);
 		ret = sh_cmt_enable(ch);
+	}
 
 	if (ret)
 		goto out;
@@ -590,14 +594,20 @@ static void sh_cmt_stop(struct sh_cmt_channel *ch, unsigned long flag)
 	f = ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE);
 	ch->flags &= ~flag;
 
-	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
+	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
 		sh_cmt_disable(ch);
+		if (flag & FLAG_CLOCKEVENT)
+			pm_runtime_put(&ch->cmt->pdev->dev);
+	}
 
 	/* adjust the timeout to maximum if only clocksource left */
 	if ((flag == FLAG_CLOCKEVENT) && (ch->flags & FLAG_CLOCKSOURCE))
 		__sh_cmt_set_next(ch, ch->max_match_value);
 
 	raw_spin_unlock_irqrestore(&ch->lock, flags);
+
+	if (flag & FLAG_CLOCKSOURCE)
+		pm_runtime_put(&ch->cmt->pdev->dev);
 }
 
 static struct sh_cmt_channel *cs_to_sh_cmt(struct clocksource *cs)
-- 
2.29.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/2] timekeeping: Allow runtime PM from change_clocksource()
  2020-12-05  2:19 [PATCH 0/2] timekeeping: Fix change_clocksource() for PM and sh_cmt Niklas Söderlund
  2020-12-05  2:19 ` [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM Niklas Söderlund
@ 2020-12-05  2:19 ` Niklas Söderlund
  1 sibling, 0 replies; 5+ messages in thread
From: Niklas Söderlund @ 2020-12-05  2:19 UTC (permalink / raw)
  To: Daniel Lezcano, Thomas Gleixner, John Stultz, linux-kernel,
	linux-renesas-soc
  Cc: Niklas Söderlund

The struct clocksource callbacks enable() and disable() are described as
a way to allow clock sources to enter a power save mode [1]. But using
runtime PM from these callbacks triggers a cyclic lockdep warning when
switching clock source using change_clocksource().

This change allows the new clocksource to be enabled() and the old one
to be disabled() without holding the timekeeper_lock. This solution is
modeled on timekeeping_resume() and timekeeping_suspend() where the
struct clocksource resume() and suspend() callbacks are called without
holding the timekeeper_lock.

Triggering and log of the deadlock warning,

  # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource
  [  118.081095] ======================================================
  [  118.087455] WARNING: possible circular locking dependency detected
  [  118.093821] 5.10.0-rc6-arm64-renesas-00002-ga0cf8106e584 #37 Not tainted
  [  118.100712] ------------------------------------------------------
  [  118.107069] migration/0/11 is trying to acquire lock:
  [  118.112269] ffff0000403ed220 (&dev->power.lock){-...}-{2:2}, at: __pm_runtime_resume+0x40/0x74
  [  118.121172]
  [  118.121172] but task is already holding lock:
  [  118.127170] ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
  [  118.136061]
  [  118.136061] which lock already depends on the new lock.
  [  118.136061]
  [  118.144461]
  [  118.144461] the existing dependency chain (in reverse order) is:
  [  118.152149]
  [  118.152149] -> #2 (tk_core.seq.seqcount){----}-{0:0}:
  [  118.158900]        lock_acquire.part.0+0x120/0x330
  [  118.163834]        lock_acquire+0x64/0x80
  [  118.167971]        seqcount_lockdep_reader_access.constprop.0+0x74/0x100
  [  118.174865]        ktime_get+0x28/0xa0
  [  118.178729]        hrtimer_start_range_ns+0x210/0x2dc
  [  118.183934]        generic_sched_clock_init+0x70/0x88
  [  118.189134]        sched_clock_init+0x40/0x64
  [  118.193622]        start_kernel+0x494/0x524
  [  118.197926]
  [  118.197926] -> #1 (hrtimer_bases.lock){-.-.}-{2:2}:
  [  118.204491]        lock_acquire.part.0+0x120/0x330
  [  118.209424]        lock_acquire+0x64/0x80
  [  118.213557]        _raw_spin_lock_irqsave+0x7c/0xc4
  [  118.218579]        hrtimer_start_range_ns+0x68/0x2dc
  [  118.223690]        rpm_suspend+0x308/0x5dc
  [  118.227909]        rpm_idle+0xc4/0x2a4
  [  118.231771]        pm_runtime_work+0x98/0xc0
  [  118.236171]        process_one_work+0x294/0x6f0
  [  118.240836]        worker_thread+0x70/0x45c
  [  118.245143]        kthread+0x154/0x160
  [  118.249007]        ret_from_fork+0x10/0x20
  [  118.253222]
  [  118.253222] -> #0 (&dev->power.lock){-...}-{2:2}:
  [  118.259607]        check_noncircular+0x128/0x140
  [  118.268774]        __lock_acquire+0x13b0/0x204c
  [  118.277780]        lock_acquire.part.0+0x120/0x330
  [  118.287001]        lock_acquire+0x64/0x80
  [  118.295375]        _raw_spin_lock_irqsave+0x7c/0xc4
  [  118.304623]        __pm_runtime_resume+0x40/0x74
  [  118.313644]        sh_cmt_start+0x1c4/0x260
  [  118.322275]        sh_cmt_clocksource_enable+0x28/0x50
  [  118.331891]        change_clocksource+0x9c/0x160
  [  118.340910]        multi_cpu_stop+0xa4/0x190
  [  118.349522]        cpu_stopper_thread+0x90/0x154
  [  118.358429]        smpboot_thread_fn+0x244/0x270
  [  118.367265]        kthread+0x154/0x160
  [  118.375158]        ret_from_fork+0x10/0x20
  [  118.383284]
  [  118.383284] other info that might help us debug this:
  [  118.383284]
  [  118.402810] Chain exists of:
  [  118.402810]   &dev->power.lock --> hrtimer_bases.lock --> tk_core.seq.seqcount
  [  118.402810]
  [  118.425597]  Possible unsafe locking scenario:
  [  118.425597]
  [  118.439130]        CPU0                    CPU1
  [  118.447413]        ----                    ----
  [  118.455641]   lock(tk_core.seq.seqcount);
  [  118.463335]                                lock(hrtimer_bases.lock);
  [  118.473507]                                lock(tk_core.seq.seqcount);
  [  118.483787]   lock(&dev->power.lock);
  [  118.491120]
  [  118.491120]  *** DEADLOCK ***
  [  118.491120]
  [  118.507666] 2 locks held by migration/0/11:
  [  118.515424]  #0: ffff8000113c9278 (timekeeper_lock){-.-.}-{2:2}, at: change_clocksource+0x2c/0x160
  [  118.528257]  #1: ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
  [  118.541248]
  [  118.541248] stack backtrace:
  [  118.553226] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.10.0-rc6-arm64-renesas-00002-ga0cf8106e584 #37
  [  118.566923] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
  [  118.579051] Call trace:
  [  118.585649]  dump_backtrace+0x0/0x190
  [  118.593505]  show_stack+0x14/0x30
  [  118.601001]  dump_stack+0xe8/0x130
  [  118.608567]  print_circular_bug+0x1f0/0x200
  [  118.616930]  check_noncircular+0x128/0x140
  [  118.625231]  __lock_acquire+0x13b0/0x204c
  [  118.633451]  lock_acquire.part.0+0x120/0x330
  [  118.641958]  lock_acquire+0x64/0x80
  [  118.649630]  _raw_spin_lock_irqsave+0x7c/0xc4
  [  118.658186]  __pm_runtime_resume+0x40/0x74
  [  118.666483]  sh_cmt_start+0x1c4/0x260
  [  118.674327]  sh_cmt_clocksource_enable+0x28/0x50
  [  118.683170]  change_clocksource+0x9c/0x160
  [  118.691460]  multi_cpu_stop+0xa4/0x190
  [  118.699384]  cpu_stopper_thread+0x90/0x154
  [  118.707672]  smpboot_thread_fn+0x244/0x270
  [  118.715961]  kthread+0x154/0x160
  [  118.723360]  ret_from_fork+0x10/0x20
  [  118.731465] clocksource: Switched to clocksource e60f0000.timer

1. commit 4614e6adafa2c5e6 ("clocksource: add enable() and disable() callbacks")

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 kernel/time/timekeeping.c | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6858a31364b6456f..41f587359ca8db0e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1433,35 +1433,45 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
 static int change_clocksource(void *data)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
-	struct clocksource *new, *old;
+	struct clocksource *new, *old = NULL;
 	unsigned long flags;
+	bool change = false;
 
 	new = (struct clocksource *) data;
 
-	raw_spin_lock_irqsave(&timekeeper_lock, flags);
-	write_seqcount_begin(&tk_core.seq);
-
-	timekeeping_forward_now(tk);
 	/*
 	 * If the cs is in module, get a module reference. Succeeds
 	 * for built-in code (owner == NULL) as well.
 	 */
 	if (try_module_get(new->owner)) {
-		if (!new->enable || new->enable(new) == 0) {
-			old = tk->tkr_mono.clock;
-			tk_setup_internals(tk, new);
-			if (old->disable)
-				old->disable(old);
-			module_put(old->owner);
-		} else {
+		if (!new->enable || new->enable(new) == 0)
+			change = true;
+		else
 			module_put(new->owner);
-		}
 	}
+
+	raw_spin_lock_irqsave(&timekeeper_lock, flags);
+	write_seqcount_begin(&tk_core.seq);
+
+	timekeeping_forward_now(tk);
+
+	if (change) {
+		old = tk->tkr_mono.clock;
+		tk_setup_internals(tk, new);
+	}
+
 	timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
 
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 
+	if (old) {
+		if (old->disable)
+			old->disable(old);
+
+		module_put(old->owner);
+	}
+
 	return 0;
 }
 
-- 
2.29.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM
  2020-12-05  2:19 ` [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM Niklas Söderlund
@ 2020-12-07 15:57   ` Geert Uytterhoeven
  2020-12-12 12:58   ` [tip: timers/core] " tip-bot2 for Niklas Söderlund
  1 sibling, 0 replies; 5+ messages in thread
From: Geert Uytterhoeven @ 2020-12-07 15:57 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Daniel Lezcano, Thomas Gleixner, John Stultz,
	Linux Kernel Mailing List, Linux-Renesas

Hi Niklas,

On Sat, Dec 5, 2020 at 3:20 AM Niklas Söderlund
<niklas.soderlund+renesas@ragnatech.se> wrote:
> The ch->lock is used to protect the whole enable() and read() of
> sh_cmt's implementation of struct clocksource. The enable()
> implementation calls pm_runtime_get_sync() which may result in the clock
> source to be read() triggering a cyclic lockdep warning for the
> ch->lock.
>
> The sh_cmt driver implement its own balancing of calls to
> sh_cmt_{enable,disable}() with flags in sh_cmt_{start,stop}(). It does
> this to deal with that start and stop are shared between the clock
> source and clock event providers. While this could be improved on
> verifying corner cases based on any substantial rework on all devices
> this driver supports might prove hard.
>
> As a first step separate the PM handling for clock event and clock
> source. Always put/get the device when enabling/disabling the clock
> source but keep the clock event logic unchanged. This allows the sh_cmt
> implementation of struct clocksource to call PM without holding the
> ch->lock and avoiding the deadlock.
>
> Triggering and log of the deadlock warning,
>
>   # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource
>   [   46.948370] ======================================================
>   [   46.954730] WARNING: possible circular locking dependency detected

[...]

> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

Thanks for looking into this!

> --- a/drivers/clocksource/sh_cmt.c
> +++ b/drivers/clocksource/sh_cmt.c
> @@ -319,7 +319,6 @@ static int sh_cmt_enable(struct sh_cmt_channel *ch)
>  {
>         int k, ret;
>
> -       pm_runtime_get_sync(&ch->cmt->pdev->dev);
>         dev_pm_syscore_device(&ch->cmt->pdev->dev, true);
>
>         /* enable clock */
> @@ -394,7 +393,6 @@ static void sh_cmt_disable(struct sh_cmt_channel *ch)
>         clk_disable(ch->cmt->clk);
>
>         dev_pm_syscore_device(&ch->cmt->pdev->dev, false);
> -       pm_runtime_put(&ch->cmt->pdev->dev);
>  }
>
>  /* private flags */
> @@ -562,10 +560,16 @@ static int sh_cmt_start(struct sh_cmt_channel *ch, unsigned long flag)
>         int ret = 0;
>         unsigned long flags;
>
> +       if (flag & FLAG_CLOCKSOURCE)
> +               pm_runtime_get_sync(&ch->cmt->pdev->dev);
> +
>         raw_spin_lock_irqsave(&ch->lock, flags);
>
> -       if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
> +       if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
> +               if (flag & FLAG_CLOCKEVENT)
> +                       pm_runtime_get_sync(&ch->cmt->pdev->dev);

This change emphasizes the (pre-existing) issue with clock events:
pm_runtime_get_sync() is called while holding a spinlock, leading to the
following splat on r8a7740/armadillo:

    sh_cmt e6138000.timer: ch0: used for periodic clock events
    =============================
    [ BUG: Invalid wait context ]
    5.10.0-rc5-armadillo-00566-g8eaa6103691d-dirty #225 Not tainted
    -----------------------------
    swapper/1 is trying to lock:
    c254cd2c (&dev->power.lock){....}-{3:3}, at: __pm_runtime_resume+0x54/0x80
    other info that might help us debug this:
    context-{5:5}
    3 locks held by swapper/1:
     #0: c254ccd0 (&dev->mutex){....}-{4:4}, at: device_driver_attach+0x18/0x5c
     #1: c0bed230 (clockevents_lock){....}-{2:2}, at:
clockevents_register_device+0x5c/0x10c
     #2: c26a5038 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x18/0x1b0

As this is a pre-existing issue:
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

>                 ret = sh_cmt_enable(ch);
> +       }
>
>         if (ret)
>                 goto out;

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: timers/core] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM
  2020-12-05  2:19 ` [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM Niklas Söderlund
  2020-12-07 15:57   ` Geert Uytterhoeven
@ 2020-12-12 12:58   ` tip-bot2 for Niklas Söderlund
  1 sibling, 0 replies; 5+ messages in thread
From: tip-bot2 for Niklas Söderlund @ 2020-12-12 12:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: niklas.soderlund+renesas, Geert Uytterhoeven, Daniel Lezcano,
	x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     8ae954caf49ac403c177d117fb8e05cbc866aa3c
Gitweb:        https://git.kernel.org/tip/8ae954caf49ac403c177d117fb8e05cbc866aa3c
Author:        Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
AuthorDate:    Sat, 05 Dec 2020 03:19:20 +01:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Mon, 07 Dec 2020 20:10:05 +01:00

clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM

The ch->lock is used to protect the whole enable() and read() of
sh_cmt's implementation of struct clocksource. The enable()
implementation calls pm_runtime_get_sync() which may result in the clock
source to be read() triggering a cyclic lockdep warning for the
ch->lock.

The sh_cmt driver implement its own balancing of calls to
sh_cmt_{enable,disable}() with flags in sh_cmt_{start,stop}(). It does
this to deal with that start and stop are shared between the clock
source and clock event providers. While this could be improved on
verifying corner cases based on any substantial rework on all devices
this driver supports might prove hard.

As a first step separate the PM handling for clock event and clock
source. Always put/get the device when enabling/disabling the clock
source but keep the clock event logic unchanged. This allows the sh_cmt
implementation of struct clocksource to call PM without holding the
ch->lock and avoiding the deadlock.

Triggering and log of the deadlock warning,

  # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource
  [   46.948370] ======================================================
  [   46.954730] WARNING: possible circular locking dependency detected
  [   46.961094] 5.10.0-rc6-arm64-renesas-00001-g0e5fd7414e8b #36 Not tainted
  [   46.967985] ------------------------------------------------------
  [   46.974342] migration/0/11 is trying to acquire lock:
  [   46.979543] ffff0000403ed220 (&dev->power.lock){-...}-{2:2}, at: __pm_runtime_resume+0x40/0x74
  [   46.988445]
  [   46.988445] but task is already holding lock:
  [   46.994441] ffff000040ad0298 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x28/0x210
  [   47.002173]
  [   47.002173] which lock already depends on the new lock.
  [   47.002173]
  [   47.010573]
  [   47.010573] the existing dependency chain (in reverse order) is:
  [   47.018262]
  [   47.018262] -> #3 (&ch->lock){....}-{2:2}:
  [   47.024033]        lock_acquire.part.0+0x120/0x330
  [   47.028970]        lock_acquire+0x64/0x80
  [   47.033105]        _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.038130]        sh_cmt_start+0x28/0x210
  [   47.042352]        sh_cmt_clocksource_enable+0x28/0x50
  [   47.047644]        change_clocksource+0x9c/0x160
  [   47.052402]        multi_cpu_stop+0xa4/0x190
  [   47.056799]        cpu_stopper_thread+0x90/0x154
  [   47.061557]        smpboot_thread_fn+0x244/0x270
  [   47.066310]        kthread+0x154/0x160
  [   47.070175]        ret_from_fork+0x10/0x20
  [   47.074390]
  [   47.074390] -> #2 (tk_core.seq.seqcount){----}-{0:0}:
  [   47.081136]        lock_acquire.part.0+0x120/0x330
  [   47.086070]        lock_acquire+0x64/0x80
  [   47.090203]        seqcount_lockdep_reader_access.constprop.0+0x74/0x100
  [   47.097096]        ktime_get+0x28/0xa0
  [   47.100960]        hrtimer_start_range_ns+0x210/0x2dc
  [   47.106164]        generic_sched_clock_init+0x70/0x88
  [   47.111364]        sched_clock_init+0x40/0x64
  [   47.115853]        start_kernel+0x494/0x524
  [   47.120156]
  [   47.120156] -> #1 (hrtimer_bases.lock){-.-.}-{2:2}:
  [   47.126721]        lock_acquire.part.0+0x120/0x330
  [   47.136042]        lock_acquire+0x64/0x80
  [   47.144461]        _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.153721]        hrtimer_start_range_ns+0x68/0x2dc
  [   47.163054]        rpm_suspend+0x308/0x5dc
  [   47.171473]        rpm_idle+0xc4/0x2a4
  [   47.179550]        pm_runtime_work+0x98/0xc0
  [   47.188209]        process_one_work+0x294/0x6f0
  [   47.197142]        worker_thread+0x70/0x45c
  [   47.205661]        kthread+0x154/0x160
  [   47.213673]        ret_from_fork+0x10/0x20
  [   47.221957]
  [   47.221957] -> #0 (&dev->power.lock){-...}-{2:2}:
  [   47.236292]        check_noncircular+0x128/0x140
  [   47.244907]        __lock_acquire+0x13b0/0x204c
  [   47.253332]        lock_acquire.part.0+0x120/0x330
  [   47.262033]        lock_acquire+0x64/0x80
  [   47.269826]        _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.278430]        __pm_runtime_resume+0x40/0x74
  [   47.286758]        sh_cmt_start+0x84/0x210
  [   47.294537]        sh_cmt_clocksource_enable+0x28/0x50
  [   47.303449]        change_clocksource+0x9c/0x160
  [   47.311783]        multi_cpu_stop+0xa4/0x190
  [   47.319720]        cpu_stopper_thread+0x90/0x154
  [   47.328022]        smpboot_thread_fn+0x244/0x270
  [   47.336298]        kthread+0x154/0x160
  [   47.343708]        ret_from_fork+0x10/0x20
  [   47.351445]
  [   47.351445] other info that might help us debug this:
  [   47.351445]
  [   47.370225] Chain exists of:
  [   47.370225]   &dev->power.lock --> tk_core.seq.seqcount --> &ch->lock
  [   47.370225]
  [   47.392003]  Possible unsafe locking scenario:
  [   47.392003]
  [   47.405314]        CPU0                    CPU1
  [   47.413569]        ----                    ----
  [   47.421768]   lock(&ch->lock);
  [   47.428425]                                lock(tk_core.seq.seqcount);
  [   47.438701]                                lock(&ch->lock);
  [   47.447930]   lock(&dev->power.lock);
  [   47.455172]
  [   47.455172]  *** DEADLOCK ***
  [   47.455172]
  [   47.471433] 3 locks held by migration/0/11:
  [   47.479099]  #0: ffff8000113c9278 (timekeeper_lock){-.-.}-{2:2}, at: change_clocksource+0x2c/0x160
  [   47.491834]  #1: ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
  [   47.504727]  #2: ffff000040ad0298 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x28/0x210
  [   47.516541]
  [   47.516541] stack backtrace:
  [   47.528480] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.10.0-rc6-arm64-renesas-00001-g0e5fd7414e8b #36
  [   47.542147] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
  [   47.554241] Call trace:
  [   47.560832]  dump_backtrace+0x0/0x190
  [   47.568670]  show_stack+0x14/0x30
  [   47.576144]  dump_stack+0xe8/0x130
  [   47.583670]  print_circular_bug+0x1f0/0x200
  [   47.592015]  check_noncircular+0x128/0x140
  [   47.600289]  __lock_acquire+0x13b0/0x204c
  [   47.608486]  lock_acquire.part.0+0x120/0x330
  [   47.616953]  lock_acquire+0x64/0x80
  [   47.624582]  _raw_spin_lock_irqsave+0x7c/0xc4
  [   47.633114]  __pm_runtime_resume+0x40/0x74
  [   47.641371]  sh_cmt_start+0x84/0x210
  [   47.649115]  sh_cmt_clocksource_enable+0x28/0x50
  [   47.657916]  change_clocksource+0x9c/0x160
  [   47.666165]  multi_cpu_stop+0xa4/0x190
  [   47.674056]  cpu_stopper_thread+0x90/0x154
  [   47.682308]  smpboot_thread_fn+0x244/0x270
  [   47.690560]  kthread+0x154/0x160
  [   47.697927]  ret_from_fork+0x10/0x20
  [   47.708447] clocksource: Switched to clocksource e60f0000.timer

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201205021921.1456190-2-niklas.soderlund+renesas@ragnatech.se
---
 drivers/clocksource/sh_cmt.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/clocksource/sh_cmt.c b/drivers/clocksource/sh_cmt.c
index 7607774..19fa3ef 100644
--- a/drivers/clocksource/sh_cmt.c
+++ b/drivers/clocksource/sh_cmt.c
@@ -319,7 +319,6 @@ static int sh_cmt_enable(struct sh_cmt_channel *ch)
 {
 	int k, ret;
 
-	pm_runtime_get_sync(&ch->cmt->pdev->dev);
 	dev_pm_syscore_device(&ch->cmt->pdev->dev, true);
 
 	/* enable clock */
@@ -394,7 +393,6 @@ static void sh_cmt_disable(struct sh_cmt_channel *ch)
 	clk_disable(ch->cmt->clk);
 
 	dev_pm_syscore_device(&ch->cmt->pdev->dev, false);
-	pm_runtime_put(&ch->cmt->pdev->dev);
 }
 
 /* private flags */
@@ -562,10 +560,16 @@ static int sh_cmt_start(struct sh_cmt_channel *ch, unsigned long flag)
 	int ret = 0;
 	unsigned long flags;
 
+	if (flag & FLAG_CLOCKSOURCE)
+		pm_runtime_get_sync(&ch->cmt->pdev->dev);
+
 	raw_spin_lock_irqsave(&ch->lock, flags);
 
-	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
+	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
+		if (flag & FLAG_CLOCKEVENT)
+			pm_runtime_get_sync(&ch->cmt->pdev->dev);
 		ret = sh_cmt_enable(ch);
+	}
 
 	if (ret)
 		goto out;
@@ -590,14 +594,20 @@ static void sh_cmt_stop(struct sh_cmt_channel *ch, unsigned long flag)
 	f = ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE);
 	ch->flags &= ~flag;
 
-	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
+	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
 		sh_cmt_disable(ch);
+		if (flag & FLAG_CLOCKEVENT)
+			pm_runtime_put(&ch->cmt->pdev->dev);
+	}
 
 	/* adjust the timeout to maximum if only clocksource left */
 	if ((flag == FLAG_CLOCKEVENT) && (ch->flags & FLAG_CLOCKSOURCE))
 		__sh_cmt_set_next(ch, ch->max_match_value);
 
 	raw_spin_unlock_irqrestore(&ch->lock, flags);
+
+	if (flag & FLAG_CLOCKSOURCE)
+		pm_runtime_put(&ch->cmt->pdev->dev);
 }
 
 static struct sh_cmt_channel *cs_to_sh_cmt(struct clocksource *cs)

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-12-12 13:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-05  2:19 [PATCH 0/2] timekeeping: Fix change_clocksource() for PM and sh_cmt Niklas Söderlund
2020-12-05  2:19 ` [PATCH 1/2] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM Niklas Söderlund
2020-12-07 15:57   ` Geert Uytterhoeven
2020-12-12 12:58   ` [tip: timers/core] " tip-bot2 for Niklas Söderlund
2020-12-05  2:19 ` [PATCH 2/2] timekeeping: Allow runtime PM from change_clocksource() Niklas Söderlund

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.