linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* timers: Move clearing of base::timer_running under base::lock
@ 2020-12-06 21:40 Thomas Gleixner
  2020-12-07  1:10 ` Frederic Weisbecker
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Thomas Gleixner @ 2020-12-06 21:40 UTC (permalink / raw)
  To: LKML
  Cc: Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
	Anna-Maria Behnsen, Sebastian Andrzej Siewior

syzbot reported KCSAN data races vs. timer_base::timer_running being set to
NULL without holding base::lock in expire_timers().

This looks innocent and most reads are clearly not problematic but for a
non-RT kernel it's completely irrelevant whether the store happens before
or after taking the lock. For an RT kernel moving the store under the lock
requires an extra unlock/lock pair in the case that there is a waiter for
the timer. But that's not the end of the world and definitely not worth the
trouble of adding boatloads of comments and annotations to the code. Famous
last words...

Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1263,8 +1263,10 @@ static inline void timer_base_unlock_exp
 static void timer_sync_wait_running(struct timer_base *base)
 {
 	if (atomic_read(&base->timer_waiters)) {
+		raw_spin_unlock_irq(&base->lock);
 		spin_unlock(&base->expiry_lock);
 		spin_lock(&base->expiry_lock);
+		raw_spin_lock_irq(&base->lock);
 	}
 }
 
@@ -1448,14 +1450,14 @@ static void expire_timers(struct timer_b
 		if (timer->flags & TIMER_IRQSAFE) {
 			raw_spin_unlock(&base->lock);
 			call_timer_fn(timer, fn, baseclk);
-			base->running_timer = NULL;
 			raw_spin_lock(&base->lock);
+			base->running_timer = NULL;
 		} else {
 			raw_spin_unlock_irq(&base->lock);
 			call_timer_fn(timer, fn, baseclk);
+			raw_spin_lock_irq(&base->lock);
 			base->running_timer = NULL;
 			timer_sync_wait_running(base);
-			raw_spin_lock_irq(&base->lock);
 		}
 	}
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
@ 2020-12-07  1:10 ` Frederic Weisbecker
  2020-12-07 12:25   ` Peter Zijlstra
  2020-12-07 13:07 ` Sebastian Andrzej Siewior
  2021-07-27 19:00 ` [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock tip-bot2 for Thomas Gleixner
  2 siblings, 1 reply; 13+ messages in thread
From: Frederic Weisbecker @ 2020-12-07  1:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
	Anna-Maria Behnsen, Sebastian Andrzej Siewior

On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> NULL without holding base::lock in expire_timers().
> 
> This looks innocent and most reads are clearly not problematic but for a
> non-RT kernel it's completely irrelevant whether the store happens before
> or after taking the lock. For an RT kernel moving the store under the lock
> requires an extra unlock/lock pair in the case that there is a waiter for
> the timer. But that's not the end of the world and definitely not worth the
> trouble of adding boatloads of comments and annotations to the code. Famous
> last words...

There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:


    int data = 0;

    void timer_func(struct timer_list *t)
    {
        data = 1;
    }

                 CPU 0                                           CPU 1
    ------------------------------                             --------------------------
    base = lock_timer_base(timer, &flags);                     raw_spin_unlock(&base->lock);
    if (base->running_timer != timer)                          call_timer_fn(timer, fn, baseclk);
        ret = detach_if_pending(timer, base, true);            base->running_timer = NULL;
    raw_spin_unlock_irqrestore(&base->lock, flags);            raw_spin_lock(&base->lock);

    x = data;
    

Here if the timer has previously executed on CPU 1 and then CPU 0 sees base->running_timer == NULL,
it will return, assuming the timer has completed. But there is nothing to enforce the fact that x
will be equal to 1. Enforcing that is a behaviour I would expect in this case since this is a kind
of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...

But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
base->running_timer.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-07  1:10 ` Frederic Weisbecker
@ 2020-12-07 12:25   ` Peter Zijlstra
  2020-12-07 12:49     ` Frederic Weisbecker
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2020-12-07 12:25 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Paul E. McKenney,
	Anna-Maria Behnsen, Sebastian Andrzej Siewior

On Mon, Dec 07, 2020 at 02:10:13AM +0100, Frederic Weisbecker wrote:
> On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> > syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> > NULL without holding base::lock in expire_timers().
> > 
> > This looks innocent and most reads are clearly not problematic but for a
> > non-RT kernel it's completely irrelevant whether the store happens before
> > or after taking the lock. For an RT kernel moving the store under the lock
> > requires an extra unlock/lock pair in the case that there is a waiter for
> > the timer. But that's not the end of the world and definitely not worth the
> > trouble of adding boatloads of comments and annotations to the code. Famous
> > last words...
> 
> There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:

> Here if the timer has previously executed on CPU 1 and then CPU 0 sees base->running_timer == NULL,
> it will return, assuming the timer has completed. But there is nothing to enforce the fact that x
> will be equal to 1. Enforcing that is a behaviour I would expect in this case since this is a kind
> of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...
> 
> But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
> base->running_timer.

Yeah, I think you're right. del_timer_sync() explicitly states it waits
for completion of the handler, so it isn't weird to then also expect to
be able to observe the results of the handler.

Thomas' patch fixes this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-07 12:25   ` Peter Zijlstra
@ 2020-12-07 12:49     ` Frederic Weisbecker
  0 siblings, 0 replies; 13+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 12:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Paul E. McKenney,
	Anna-Maria Behnsen, Sebastian Andrzej Siewior

On Mon, Dec 07, 2020 at 01:25:13PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 07, 2020 at 02:10:13AM +0100, Frederic Weisbecker wrote:
> > On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> > > syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> > > NULL without holding base::lock in expire_timers().
> > > 
> > > This looks innocent and most reads are clearly not problematic but for a
> > > non-RT kernel it's completely irrelevant whether the store happens before
> > > or after taking the lock. For an RT kernel moving the store under the lock
> > > requires an extra unlock/lock pair in the case that there is a waiter for
> > > the timer. But that's not the end of the world and definitely not worth the
> > > trouble of adding boatloads of comments and annotations to the code. Famous
> > > last words...
> > 
> > There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:
> 
> > Here if the timer has previously executed on CPU 1 and then CPU 0 sees base->running_timer == NULL,
> > it will return, assuming the timer has completed. But there is nothing to enforce the fact that x
> > will be equal to 1. Enforcing that is a behaviour I would expect in this case since this is a kind
> > of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...
> > 
> > But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
> > base->running_timer.
> 
> Yeah, I think you're right. del_timer_sync() explicitly states it waits
> for completion of the handler, so it isn't weird to then also expect to
> be able to observe the results of the handler.
> 
> Thomas' patch fixes this.

Indeed!

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
  2020-12-07  1:10 ` Frederic Weisbecker
@ 2020-12-07 13:07 ` Sebastian Andrzej Siewior
  2020-12-07 14:29   ` Thomas Gleixner
  2021-07-27 19:00 ` [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock tip-bot2 for Thomas Gleixner
  2 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-07 13:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
	Anna-Maria Behnsen

On 2020-12-06 22:40:07 [+0100], Thomas Gleixner wrote:
> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> NULL without holding base::lock in expire_timers().
> 
> This looks innocent and most reads are clearly not problematic but for a
> non-RT kernel it's completely irrelevant whether the store happens before
> or after taking the lock. For an RT kernel moving the store under the lock
> requires an extra unlock/lock pair in the case that there is a waiter for
> the timer. But that's not the end of the world and definitely not worth the
> trouble of adding boatloads of comments and annotations to the code. Famous
> last words...
> 
> Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
> Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

One thing I noticed while testing it is that the "corner" case in
timer_sync_wait_running() is quite reliably hit by rcu_preempt
rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-07 13:07 ` Sebastian Andrzej Siewior
@ 2020-12-07 14:29   ` Thomas Gleixner
  2020-12-07 15:25     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-12-07 14:29 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
	Anna-Maria Behnsen

On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> On 2020-12-06 22:40:07 [+0100], Thomas Gleixner wrote:
>> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
>> NULL without holding base::lock in expire_timers().
>> 
>> This looks innocent and most reads are clearly not problematic but for a
>> non-RT kernel it's completely irrelevant whether the store happens before
>> or after taking the lock. For an RT kernel moving the store under the lock
>> requires an extra unlock/lock pair in the case that there is a waiter for
>> the timer. But that's not the end of the world and definitely not worth the
>> trouble of adding boatloads of comments and annotations to the code. Famous
>> last words...
>> 
>> Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
>> Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> One thing I noticed while testing it is that the "corner" case in
> timer_sync_wait_running() is quite reliably hit by rcu_preempt
> rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.

I assume it's something like this:

     timeout -> wakeup

->preemption
        del_timer_sync()
                .....

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-07 14:29   ` Thomas Gleixner
@ 2020-12-07 15:25     ` Sebastian Andrzej Siewior
  2020-12-07 16:06       ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-07 15:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Paul E. McKenney,
	Anna-Maria Behnsen

On 2020-12-07 15:29:50 [+0100], Thomas Gleixner wrote:
> On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> > One thing I noticed while testing it is that the "corner" case in
> > timer_sync_wait_running() is quite reliably hit by rcu_preempt
> > rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.
> 
> I assume it's something like this:
> 
>      timeout -> wakeup
> 
> ->preemption
>         del_timer_sync()
>                 .....

Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
the timeout.

|          <idle>-0       [007] dN.h4..    46.299705: sched_wakeup: comm=rcuc/7 pid=53 prio=98 target_cpu=007
|          <idle>-0       [007] d...2..    46.299728: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/7 next_pid=53 next_prio=98
|          rcuc/7-53      [007] d...2..    46.299742: sched_switch: prev_comm=rcuc/7 prev_pid=53 prev_prio=98 prev_state=S ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=120
|     ksoftirqd/7-54      [007] .....13    46.299750: timer_expire_entry: timer=000000003bd1e045 function=process_timeout now=4294903802 baseclk=4294903802
|     ksoftirqd/7-54      [007] d...213    46.299750: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
|     ksoftirqd/7-54      [007] dN..313    46.299754: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
|     ksoftirqd/7-54      [007] dN..213    46.299756: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=13265 [ns] vruntime=3012610540 [ns]
|     ksoftirqd/7-54      [007] d...213    46.299760: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=R+ ==> next_comm=rcu_preempt next_pid=11 next_prio=98
|     rcu_preempt-11      [007] d...311    46.299766: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=120 newprio=98
del_timer_sync()
|     rcu_preempt-11      [007] d...211    46.299773: sched_switch: prev_comm=rcu_preempt prev_pid=11 prev_prio=98 prev_state=R+ ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=98
|     ksoftirqd/7-54      [007] .....13    46.299774: timer_expire_exit: timer=000000003bd1e045
|     ksoftirqd/7-54      [007] dN..311    46.299784: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=98 newprio=120
|     ksoftirqd/7-54      [007] dN..311    46.299788: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
|     ksoftirqd/7-54      [007] dN..411    46.299790: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
|     ksoftirqd/7-54      [007] dN..311    46.299792: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=7404 [ns] vruntime=3012617944 [ns]
|     ksoftirqd/7-54      [007] d...2..    46.299797: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=S ==> next_comm=rcu_preempt next_pid=11 next_prio=98


> Thanks,
> 
>         tglx

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-07 15:25     ` Sebastian Andrzej Siewior
@ 2020-12-07 16:06       ` Paul E. McKenney
  2020-12-08  8:50         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2020-12-07 16:06 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
	Anna-Maria Behnsen

On Mon, Dec 07, 2020 at 04:25:33PM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 15:29:50 [+0100], Thomas Gleixner wrote:
> > On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> > > One thing I noticed while testing it is that the "corner" case in
> > > timer_sync_wait_running() is quite reliably hit by rcu_preempt
> > > rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.
> > 
> > I assume it's something like this:
> > 
> >      timeout -> wakeup
> > 
> > ->preemption
> >         del_timer_sync()
> >                 .....
> 
> Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
> the timeout.

Given that a lot of RCU processing is event-driven based on timers,
and given that the scheduling-clock interrupts are synchronized for
energy-efficiency reasons on many configs, maybe this alignment is
expected behavior?

							Thanx, Paul

> |          <idle>-0       [007] dN.h4..    46.299705: sched_wakeup: comm=rcuc/7 pid=53 prio=98 target_cpu=007
> |          <idle>-0       [007] d...2..    46.299728: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/7 next_pid=53 next_prio=98
> |          rcuc/7-53      [007] d...2..    46.299742: sched_switch: prev_comm=rcuc/7 prev_pid=53 prev_prio=98 prev_state=S ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=120
> |     ksoftirqd/7-54      [007] .....13    46.299750: timer_expire_entry: timer=000000003bd1e045 function=process_timeout now=4294903802 baseclk=4294903802
> |     ksoftirqd/7-54      [007] d...213    46.299750: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> |     ksoftirqd/7-54      [007] dN..313    46.299754: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> |     ksoftirqd/7-54      [007] dN..213    46.299756: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=13265 [ns] vruntime=3012610540 [ns]
> |     ksoftirqd/7-54      [007] d...213    46.299760: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=R+ ==> next_comm=rcu_preempt next_pid=11 next_prio=98
> |     rcu_preempt-11      [007] d...311    46.299766: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=120 newprio=98
> del_timer_sync()
> |     rcu_preempt-11      [007] d...211    46.299773: sched_switch: prev_comm=rcu_preempt prev_pid=11 prev_prio=98 prev_state=R+ ==> next_comm=ksoftirqd/7 next_pid=54 next_prio=98
> |     ksoftirqd/7-54      [007] .....13    46.299774: timer_expire_exit: timer=000000003bd1e045
> |     ksoftirqd/7-54      [007] dN..311    46.299784: sched_pi_setprio: comm=ksoftirqd/7 pid=54 oldprio=98 newprio=120
> |     ksoftirqd/7-54      [007] dN..311    46.299788: sched_waking: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> |     ksoftirqd/7-54      [007] dN..411    46.299790: sched_wakeup: comm=rcu_preempt pid=11 prio=98 target_cpu=007
> |     ksoftirqd/7-54      [007] dN..311    46.299792: sched_stat_runtime: comm=ksoftirqd/7 pid=54 runtime=7404 [ns] vruntime=3012617944 [ns]
> |     ksoftirqd/7-54      [007] d...2..    46.299797: sched_switch: prev_comm=ksoftirqd/7 prev_pid=54 prev_prio=120 prev_state=S ==> next_comm=rcu_preempt next_pid=11 next_prio=98
> 
> 
> > Thanks,
> > 
> >         tglx
> 
> Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-07 16:06       ` Paul E. McKenney
@ 2020-12-08  8:50         ` Sebastian Andrzej Siewior
  2020-12-08 15:04           ` Paul E. McKenney
  2020-12-11 14:36           ` Thomas Gleixner
  0 siblings, 2 replies; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-08  8:50 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
	Anna-Maria Behnsen

On 2020-12-07 08:06:48 [-0800], Paul E. McKenney wrote:
> > Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
> > the timeout.
> 
> Given that a lot of RCU processing is event-driven based on timers,
> and given that the scheduling-clock interrupts are synchronized for
> energy-efficiency reasons on many configs, maybe this alignment is
> expected behavior?

No, it is the fact that rcu_preempt has a higher priority than
ksoftirqd. So immediately after the wakeup (of rcu_preempt) there is a
context switch and expire_timers() has this:

|   raw_spin_unlock_irq(&base->lock);
|   call_timer_fn(timer, fn, baseclk);
|   raw_spin_lock_irq(&base->lock);
|   base->running_timer = NULL;
|   timer_sync_wait_running(base);

So ->running_timer isn't reset and try_to_del_timer_sync() (that
del_timer_sync() from schedule_timeout()) returns -1 and then the corner
case is handled where `expiry_lock' is acquired. So everything goes as
expected.

> 							Thanx, Paul

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-08  8:50         ` Sebastian Andrzej Siewior
@ 2020-12-08 15:04           ` Paul E. McKenney
  2020-12-11 14:36           ` Thomas Gleixner
  1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2020-12-08 15:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
	Anna-Maria Behnsen

On Tue, Dec 08, 2020 at 09:50:49AM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 08:06:48 [-0800], Paul E. McKenney wrote:
> > > Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
> > > the timeout.
> > 
> > Given that a lot of RCU processing is event-driven based on timers,
> > and given that the scheduling-clock interrupts are synchronized for
> > energy-efficiency reasons on many configs, maybe this alignment is
> > expected behavior?
> 
> No, it is the fact that rcu_preempt has a higher priority than
> ksoftirqd. So immediately after the wakeup (of rcu_preempt) there is a
> context switch and expire_timers() has this:
> 
> |   raw_spin_unlock_irq(&base->lock);
> |   call_timer_fn(timer, fn, baseclk);
> |   raw_spin_lock_irq(&base->lock);
> |   base->running_timer = NULL;
> |   timer_sync_wait_running(base);
> 
> So ->running_timer isn't reset and try_to_del_timer_sync() (that
> del_timer_sync() from schedule_timeout()) returns -1 and then the corner
> case is handled where `expiry_lock' is acquired. So everything goes as
> expected.

Makes sense!  Thank you for the explanation!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-08  8:50         ` Sebastian Andrzej Siewior
  2020-12-08 15:04           ` Paul E. McKenney
@ 2020-12-11 14:36           ` Thomas Gleixner
  2020-12-11 15:04             ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-12-11 14:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Paul E. McKenney
  Cc: LKML, Marco Elver, kasan-dev, Peter Zijlstra, Anna-Maria Behnsen

On Tue, Dec 08 2020 at 09:50, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 08:06:48 [-0800], Paul E. McKenney wrote:
>> > Yes, but it triggers frequently. Like `rcuc' is somehow is aligned with
>> > the timeout.
>> 
>> Given that a lot of RCU processing is event-driven based on timers,
>> and given that the scheduling-clock interrupts are synchronized for
>> energy-efficiency reasons on many configs, maybe this alignment is
>> expected behavior?
>
> No, it is the fact that rcu_preempt has a higher priority than
> ksoftirqd. So immediately after the wakeup (of rcu_preempt) there is a
> context switch and expire_timers() has this:
>
> |   raw_spin_unlock_irq(&base->lock);
> |   call_timer_fn(timer, fn, baseclk);
> |   raw_spin_lock_irq(&base->lock);
> |   base->running_timer = NULL;
> |   timer_sync_wait_running(base);
>
> So ->running_timer isn't reset and try_to_del_timer_sync() (that
> del_timer_sync() from schedule_timeout()) returns -1 and then the corner
> case is handled where `expiry_lock' is acquired. So everything goes as
> expected.

Well, but even without that change you have the same situation:

      timer_fn()
        wakeup()
          -->preemption
                        del_timer_sync()
                          if (running)
                             wait_for_running()
                               lock(expiry)

     running = NULL
     sync_wait_running()
       unlock(expiry)
         wakeup_lock()
          -->preemption
                             ...

    lock(base)
     
So the change at hand does not make things worse, right?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: timers: Move clearing of base::timer_running under base::lock
  2020-12-11 14:36           ` Thomas Gleixner
@ 2020-12-11 15:04             ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-11 15:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, LKML, Marco Elver, kasan-dev, Peter Zijlstra,
	Anna-Maria Behnsen

On 2020-12-11 15:36:27 [+0100], Thomas Gleixner wrote:
> So the change at hand does not make things worse, right?

Correct. This *ping*-*pong* was there before this patch because
->running_timer is always cleared after the callback finishes.

I was just curious why out of a sudden there are *that* many users of
the corner case. 

> Thanks,
> 
>         tglx

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock
  2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
  2020-12-07  1:10 ` Frederic Weisbecker
  2020-12-07 13:07 ` Sebastian Andrzej Siewior
@ 2021-07-27 19:00 ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 13+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-07-27 19:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: syzbot+aa7c2385d46c5eba0b89, syzbot+abea4558531bae1ba9fe,
	Thomas Gleixner, Sebastian Andrzej Siewior, stable, x86,
	linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     bb7262b295472eb6858b5c49893954794027cd84
Gitweb:        https://git.kernel.org/tip/bb7262b295472eb6858b5c49893954794027cd84
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 06 Dec 2020 22:40:07 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 27 Jul 2021 20:57:44 +02:00

timers: Move clearing of base::timer_running under base:: Lock

syzbot reported KCSAN data races vs. timer_base::timer_running being set to
NULL without holding base::lock in expire_timers().

This looks innocent and most reads are clearly not problematic, but
Frederic identified an issue which is:

 int data = 0;

 void timer_func(struct timer_list *t)
 {
    data = 1;
 }

 CPU 0                                            CPU 1
 ------------------------------                   --------------------------
 base = lock_timer_base(timer, &flags);           raw_spin_unlock(&base->lock);
 if (base->running_timer != timer)                call_timer_fn(timer, fn, baseclk);
   ret = detach_if_pending(timer, base, true);    base->running_timer = NULL;
 raw_spin_unlock_irqrestore(&base->lock, flags);  raw_spin_lock(&base->lock);

 x = data;

If the timer has previously executed on CPU 1 and then CPU 0 can observe
base->running_timer == NULL and returns, assuming the timer has completed,
but it's not guaranteed on all architectures. The comment for
del_timer_sync() makes that guarantee. Moving the assignment under
base->lock prevents this.

For non-RT kernel it's performance wise completely irrelevant whether the
store happens before or after taking the lock. For an RT kernel moving the
store under the lock requires an extra unlock/lock pair in the case that
there is a waiter for the timer, but that's not the end of the world.

Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com
Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com
Fixes: 030dcdd197d7 ("timers: Prepare support for PREEMPT_RT")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/87lfea7gw8.fsf@nanos.tec.linutronix.de
Cc: stable@vger.kernel.org
---
 kernel/time/timer.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 9eb11c2..e3d2c23 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1265,8 +1265,10 @@ static inline void timer_base_unlock_expiry(struct timer_base *base)
 static void timer_sync_wait_running(struct timer_base *base)
 {
 	if (atomic_read(&base->timer_waiters)) {
+		raw_spin_unlock_irq(&base->lock);
 		spin_unlock(&base->expiry_lock);
 		spin_lock(&base->expiry_lock);
+		raw_spin_lock_irq(&base->lock);
 	}
 }
 
@@ -1457,14 +1459,14 @@ static void expire_timers(struct timer_base *base, struct hlist_head *head)
 		if (timer->flags & TIMER_IRQSAFE) {
 			raw_spin_unlock(&base->lock);
 			call_timer_fn(timer, fn, baseclk);
-			base->running_timer = NULL;
 			raw_spin_lock(&base->lock);
+			base->running_timer = NULL;
 		} else {
 			raw_spin_unlock_irq(&base->lock);
 			call_timer_fn(timer, fn, baseclk);
+			raw_spin_lock_irq(&base->lock);
 			base->running_timer = NULL;
 			timer_sync_wait_running(base);
-			raw_spin_lock_irq(&base->lock);
 		}
 	}
 }

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-07-27 19:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-06 21:40 timers: Move clearing of base::timer_running under base::lock Thomas Gleixner
2020-12-07  1:10 ` Frederic Weisbecker
2020-12-07 12:25   ` Peter Zijlstra
2020-12-07 12:49     ` Frederic Weisbecker
2020-12-07 13:07 ` Sebastian Andrzej Siewior
2020-12-07 14:29   ` Thomas Gleixner
2020-12-07 15:25     ` Sebastian Andrzej Siewior
2020-12-07 16:06       ` Paul E. McKenney
2020-12-08  8:50         ` Sebastian Andrzej Siewior
2020-12-08 15:04           ` Paul E. McKenney
2020-12-11 14:36           ` Thomas Gleixner
2020-12-11 15:04             ` Sebastian Andrzej Siewior
2021-07-27 19:00 ` [tip: timers/urgent] timers: Move clearing of base::timer_running under base:: Lock tip-bot2 for Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).