linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fix hrtimer_enqueue_reprogram race
@ 2013-02-04 12:33 Izik Eidus
  2013-02-05 10:44 ` Thomas Gleixner
  2013-02-05 11:02 ` [tip:timers/core] hrtimer: Prevent " tip-bot for Leonid Shatz
  0 siblings, 2 replies; 6+ messages in thread
From: Izik Eidus @ 2013-02-04 12:33 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Andrea Arcangeli, linux-kernel, leonid Shatz, Izik Eidus

From: leonid Shatz <leonid.shatz@ravellosystems.com>

it seems like hrtimer_enqueue_reprogram contain a race which could result in
timer.base switch during unlock/lock sequence.

See the code at __hrtimer_start_range_ns where it calls
hrtimer_enqueue_reprogram. The later is releasing lock protecting the timer
base for a short time and timer base switch can occur from a different CPU
thread. Later when __hrtimer_start_range_ns calls unlock_hrtimer_base, a base
switch could have happened and this causes the bug

Try to start the same hrtimer from two different threads in kernel running
each one on a different CPU. Eventually one of the calls will cause timer base
switch while another thread is not expecting it.

This can happen in virtualized environment where one thread can be delayed by
lower hypervisor, and due to time delay a different CPU is taking care of
missed timer start and runs the timer start logic on its own.

Signed-off-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Izik Eidus <izik.eidus@ravellosystems.com>
---
 kernel/hrtimer.c |   32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 6db7a5e..0c8c6cd 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -640,21 +640,9 @@ static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
  * and expiry check is done in the hrtimer_interrupt or in the softirq.
  */
 static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
-					    struct hrtimer_clock_base *base,
-					    int wakeup)
+					    struct hrtimer_clock_base *base)
 {
-	if (base->cpu_base->hres_active && hrtimer_reprogram(timer, base)) {
-		if (wakeup) {
-			raw_spin_unlock(&base->cpu_base->lock);
-			raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-			raw_spin_lock(&base->cpu_base->lock);
-		} else
-			__raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-
-		return 1;
-	}
-
-	return 0;
+	return base->cpu_base->hres_active && hrtimer_reprogram(timer, base);
 }
 
 static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
@@ -735,8 +723,7 @@ static inline int hrtimer_switch_to_hres(void) { return 0; }
 static inline void
 hrtimer_force_reprogram(struct hrtimer_cpu_base *base, int skip_equal) { }
 static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
-					    struct hrtimer_clock_base *base,
-					    int wakeup)
+					    struct hrtimer_clock_base *base)
 {
 	return 0;
 }
@@ -995,8 +982,17 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	 *
 	 * XXX send_remote_softirq() ?
 	 */
-	if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases))
-		hrtimer_enqueue_reprogram(timer, new_base, wakeup);
+	if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases)
+		&& hrtimer_enqueue_reprogram(timer, new_base)) {
+		if (wakeup) {
+			raw_spin_unlock(&new_base->cpu_base->lock);
+			raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+			local_irq_restore(flags);
+			return ret;
+		} else {
+			__raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+		}
+	}
 
 	unlock_hrtimer_base(timer, &flags);
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] fix hrtimer_enqueue_reprogram race
  2013-02-04 12:33 [PATCH] fix hrtimer_enqueue_reprogram race Izik Eidus
@ 2013-02-05 10:44 ` Thomas Gleixner
  2013-02-05 10:55   ` Leonid Shatz
  2013-02-05 11:02 ` [tip:timers/core] hrtimer: Prevent " tip-bot for Leonid Shatz
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2013-02-05 10:44 UTC (permalink / raw)
  To: Izik Eidus; +Cc: Andrea Arcangeli, linux-kernel, leonid Shatz

On Mon, 4 Feb 2013, Izik Eidus wrote:

> From: leonid Shatz <leonid.shatz@ravellosystems.com>
> 
> it seems like hrtimer_enqueue_reprogram contain a race which could result in
> timer.base switch during unlock/lock sequence.
> 
> See the code at __hrtimer_start_range_ns where it calls
> hrtimer_enqueue_reprogram. The later is releasing lock protecting the timer
> base for a short time and timer base switch can occur from a different CPU
> thread. Later when __hrtimer_start_range_ns calls unlock_hrtimer_base, a base
> switch could have happened and this causes the bug
> 
> Try to start the same hrtimer from two different threads in kernel running
> each one on a different CPU. Eventually one of the calls will cause timer base
> switch while another thread is not expecting it.

Aside of the bug in the hrtimer code being a real one, writing code
which fiddles with the same resource (hrtimer) unserialized is broken
on its own.
 
> This can happen in virtualized environment where one thread can be delayed by
> lower hypervisor, and due to time delay a different CPU is taking care of
> missed timer start and runs the timer start logic on its own.

Without noticing that something else already takes care of it? So
you're saying that the code in question relies on magic serialization
in the hrtimer code. Doesn't look like a brilliant design.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH] fix hrtimer_enqueue_reprogram race
  2013-02-05 10:44 ` Thomas Gleixner
@ 2013-02-05 10:55   ` Leonid Shatz
  2013-02-05 11:15     ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Leonid Shatz @ 2013-02-05 10:55 UTC (permalink / raw)
  To: 'Thomas Gleixner', 'Izik Eidus'
  Cc: 'Andrea Arcangeli', linux-kernel

The explanation were submitted as possible scenario which could explain how
the bug in kernel could happen and it does not mean that serious designer
could do exactly that. As I said before, it's also possible that a race
between hrtimer_cancel and hrtimer_start can trigger the bug. The idea is to
have kernel more robust. There are already locks used inside hrtimer code,
so why should users of the hrtimer add another layer of locks and get
involved in the intricacy of which cases are protected by internal hrtimer
lock and which are not?

Leonid Shatz

> -----Original Message-----
> From: Thomas Gleixner [mailto:tglx@linutronix.de]
> Sent: Tuesday, February 05, 2013 12:44 PM
> To: Izik Eidus
> Cc: Andrea Arcangeli; linux-kernel@vger.kernel.org; leonid Shatz
> Subject: Re: [PATCH] fix hrtimer_enqueue_reprogram race
> 
> On Mon, 4 Feb 2013, Izik Eidus wrote:
> 
> > From: leonid Shatz <leonid.shatz@ravellosystems.com>
> >
> > it seems like hrtimer_enqueue_reprogram contain a race which could
> > result in timer.base switch during unlock/lock sequence.
> >
> > See the code at __hrtimer_start_range_ns where it calls
> > hrtimer_enqueue_reprogram. The later is releasing lock protecting the
> > timer base for a short time and timer base switch can occur from a
> > different CPU thread. Later when __hrtimer_start_range_ns calls
> > unlock_hrtimer_base, a base switch could have happened and this causes
> > the bug
> >
> > Try to start the same hrtimer from two different threads in kernel
> > running each one on a different CPU. Eventually one of the calls will
> > cause timer base switch while another thread is not expecting it.
> 
> Aside of the bug in the hrtimer code being a real one, writing code which
> fiddles with the same resource (hrtimer) unserialized is broken on its
own.
> 
> > This can happen in virtualized environment where one thread can be
> > delayed by lower hypervisor, and due to time delay a different CPU is
> > taking care of missed timer start and runs the timer start logic on its
own.
> 
> Without noticing that something else already takes care of it? So you're
> saying that the code in question relies on magic serialization in the
hrtimer
> code. Doesn't look like a brilliant design.
> 
> Thanks,
> 
> 	tglx


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip:timers/core] hrtimer: Prevent hrtimer_enqueue_reprogram race
  2013-02-04 12:33 [PATCH] fix hrtimer_enqueue_reprogram race Izik Eidus
  2013-02-05 10:44 ` Thomas Gleixner
@ 2013-02-05 11:02 ` tip-bot for Leonid Shatz
  1 sibling, 0 replies; 6+ messages in thread
From: tip-bot for Leonid Shatz @ 2013-02-05 11:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, izik.eidus, aarcange, leonid.shatz, tglx

Commit-ID:  b22affe0aef429d657bc6505aacb1c569340ddd2
Gitweb:     http://git.kernel.org/tip/b22affe0aef429d657bc6505aacb1c569340ddd2
Author:     Leonid Shatz <leonid.shatz@ravellosystems.com>
AuthorDate: Mon, 4 Feb 2013 14:33:37 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 5 Feb 2013 11:52:41 +0100

hrtimer: Prevent hrtimer_enqueue_reprogram race

hrtimer_enqueue_reprogram contains a race which could result in
timer.base switch during unlock/lock sequence.

hrtimer_enqueue_reprogram is releasing the lock protecting the timer
base for calling raise_softirq_irqsoff() due to a lock ordering issue
versus rq->lock.

If during that time another CPU calls __hrtimer_start_range_ns() on
the same hrtimer, the timer base might switch, before the current CPU
can lock base->lock again and therefor the unlock_timer_base() call
will unlock the wrong lock.

[ tglx: Added comment and massaged changelog ]

Signed-off-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Izik Eidus <izik.eidus@ravellosystems.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1359981217-389-1-git-send-email-izik.eidus@ravellosystems.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/hrtimer.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 6db7a5e..cdd5607 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -640,21 +640,9 @@ static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
  * and expiry check is done in the hrtimer_interrupt or in the softirq.
  */
 static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
-					    struct hrtimer_clock_base *base,
-					    int wakeup)
+					    struct hrtimer_clock_base *base)
 {
-	if (base->cpu_base->hres_active && hrtimer_reprogram(timer, base)) {
-		if (wakeup) {
-			raw_spin_unlock(&base->cpu_base->lock);
-			raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-			raw_spin_lock(&base->cpu_base->lock);
-		} else
-			__raise_softirq_irqoff(HRTIMER_SOFTIRQ);
-
-		return 1;
-	}
-
-	return 0;
+	return base->cpu_base->hres_active && hrtimer_reprogram(timer, base);
 }
 
 static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
@@ -735,8 +723,7 @@ static inline int hrtimer_switch_to_hres(void) { return 0; }
 static inline void
 hrtimer_force_reprogram(struct hrtimer_cpu_base *base, int skip_equal) { }
 static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
-					    struct hrtimer_clock_base *base,
-					    int wakeup)
+					    struct hrtimer_clock_base *base)
 {
 	return 0;
 }
@@ -995,8 +982,21 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	 *
 	 * XXX send_remote_softirq() ?
 	 */
-	if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases))
-		hrtimer_enqueue_reprogram(timer, new_base, wakeup);
+	if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases)
+		&& hrtimer_enqueue_reprogram(timer, new_base)) {
+		if (wakeup) {
+			/*
+			 * We need to drop cpu_base->lock to avoid a
+			 * lock ordering issue vs. rq->lock.
+			 */
+			raw_spin_unlock(&new_base->cpu_base->lock);
+			raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+			local_irq_restore(flags);
+			return ret;
+		} else {
+			__raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+		}
+	}
 
 	unlock_hrtimer_base(timer, &flags);
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [PATCH] fix hrtimer_enqueue_reprogram race
  2013-02-05 10:55   ` Leonid Shatz
@ 2013-02-05 11:15     ` Thomas Gleixner
  2013-02-05 12:36       ` Leonid Shatz
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2013-02-05 11:15 UTC (permalink / raw)
  To: Leonid Shatz
  Cc: 'Izik Eidus', 'Andrea Arcangeli', linux-kernel

Leonid,

On Tue, 5 Feb 2013, Leonid Shatz wrote:

Please stop top posting!

> The explanation were submitted as possible scenario which could explain how
> the bug in kernel could happen and it does not mean that serious designer
> could do exactly that. As I said before, it's also possible that a race
> between hrtimer_cancel and hrtimer_start can trigger the bug. The idea is to
> have kernel more robust.

I'm not against making the kernel more robust and I already applied
the patch.

> There are already locks used inside hrtimer code, so why should
> users of the hrtimer add another layer of locks and get involved in
> the intricacy of which cases are protected by internal hrtimer lock
> and which are not?

Groan. The hrtimer locks are there to protect the internal data
structures of the hrtimer code and to ensure that hrtimer functions
are proper protected against concurrent running callbacks. But that
does not give you any kind of protection versus multiple users of your
hrtimer resource.

Look at the following scenario:

CPU0	     	    CPU1

hrtimer_cancel()
		    hrtimer_start()
teardown_crap()
		    hrtimer_callback() runs

That's probably not what you want and magic serialization in the
hrtimer code does not help at all.

There is also no protection against:

CPU0  	      	    CPU1

hrtimer_cancel()
		    hrtimer_start()
hrtimer_forward()

Which leaves the hrtimer enqueued on CPU1 with a wrong expiry value.

So while concurrent hrtimer_start() is protected, other things are
not. So do we need to create a list of functions which can be abused
by a programmer without proper protection of the resource and which
not?

If you want to use any kind of resource (including hrtimers)
concurrently you better have proper serialization in that
code. Everything else is voodoo programming of the worst kind.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH] fix hrtimer_enqueue_reprogram race
  2013-02-05 11:15     ` Thomas Gleixner
@ 2013-02-05 12:36       ` Leonid Shatz
  0 siblings, 0 replies; 6+ messages in thread
From: Leonid Shatz @ 2013-02-05 12:36 UTC (permalink / raw)
  To: 'Thomas Gleixner'
  Cc: 'Izik Eidus', 'Andrea Arcangeli', linux-kernel

> > There are already locks used inside hrtimer code, so why should users
> > of the hrtimer add another layer of locks and get involved in the
> > intricacy of which cases are protected by internal hrtimer lock and
> > which are not?
> 
> Groan. The hrtimer locks are there to protect the internal data structures
of
> the hrtimer code and to ensure that hrtimer functions are proper protected
> against concurrent running callbacks. But that does not give you any kind
of
> protection versus multiple users of your hrtimer resource.
> 

I was not raising the issue of protecting the logic of hrtimer use cases -
this was not the scope of the entire issue. The issue was that internal
hrtimer locks were getting screwed and it was not reasonable to add external
layer of locks to protect internal hrtimer locks (I mean before this patch).
The fix should provide protection for internal locks and I think we both
agree on that.

Leonid


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-02-05 12:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-04 12:33 [PATCH] fix hrtimer_enqueue_reprogram race Izik Eidus
2013-02-05 10:44 ` Thomas Gleixner
2013-02-05 10:55   ` Leonid Shatz
2013-02-05 11:15     ` Thomas Gleixner
2013-02-05 12:36       ` Leonid Shatz
2013-02-05 11:02 ` [tip:timers/core] hrtimer: Prevent " tip-bot for Leonid Shatz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).