linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@osdl.org>
Cc: linas@austin.ibm.com, <linux-kernel@vger.kernel.org>
Subject: Re: PATCH: Race in 2.6.0-test2 timer code
Date: Wed, 30 Jul 2003 07:57:32 +0200 (CEST)	[thread overview]
Message-ID: <Pine.LNX.4.44.0307300733200.25010-100000@localhost.localdomain> (raw)
In-Reply-To: <20030729135643.2e9b74bc.akpm@osdl.org>


On Tue, 29 Jul 2003, Andrew Morton wrote:

> Andrea says that we need to take the per-timer lock in add_timer() and
> del_timer(), but I haven't yet got around to working out why.

this makes no sense - in 2.6 (and in 2.4) there's no safe add_timer() /
del_timer() use without using external SMP synchronization. (There's one
special timer use variant involving del_timer_sync() that was safe in 2.4
but is unsafe in 2.6, see below.)

and i dont think Linas' patch is correct either - how can the timer base
change under us? We are holding the timer spinlock.

What i'd propose is the attached (tested, against 2.6.0-test2) patch
instead. It unifies the functionality of add_timer() and mod_timer(), and
makes any combination of the timer API calls completely SMP-safe.  
del_timer() is still not using the timer lock.

this patch fixes the only timer bug in 2.6 i'm aware of: the
del_timer_sync() + add_timer() combination in kernel/itimer.c is buggy.
This was correct code in 2.4, because there it was safe to do an
add_timer() from the timer handler itself, parallel to a del_timer_sync().  
If we want to make this safe in 2.6 too (which i think we want to) then we
have to make add_timer() almost equivalent to mod_timer(), locking-wise.
And once we are at this point i think it's much cleaner to actually make
add_timer() a variant of mod_timer(). (There's no locking cost for
add_timer(), only the cost of an extra branch. And we've removed another
commonly used function from the icache.)

Linas, could you please give this patch a go, does it make a difference to
your timer list corruption problem? I've booted it on SMP and UP as well.

	Ingo

--- linux/include/linux/timer.h.orig	
+++ linux/include/linux/timer.h	
@@ -60,11 +60,30 @@ static inline int timer_pending(const st
 	return timer->base != NULL;
 }
 
-extern void add_timer(struct timer_list * timer);
 extern void add_timer_on(struct timer_list *timer, int cpu);
 extern int del_timer(struct timer_list * timer);
+extern int __mod_timer(struct timer_list *timer, unsigned long expires);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
-  
+
+/***
+ * add_timer - start a timer
+ * @timer: the timer to be added
+ *
+ * The kernel will do a ->function(->data) callback from the
+ * timer interrupt at the ->expired point in the future. The
+ * current time is 'jiffies'.
+ *
+ * The timer's ->expired, ->function (and if the handler uses it, ->data)
+ * fields must be set prior calling this function.
+ *
+ * Timers with an ->expired field in the past will be executed in the next
+ * timer tick.
+ */
+static inline void add_timer(struct timer_list * timer)
+{
+	__mod_timer(timer, timer->expires);
+}
+ 
 #ifdef CONFIG_SMP
   extern int del_timer_sync(struct timer_list * timer);
 #else
--- linux/kernel/timer.c.orig	
+++ linux/kernel/timer.c	
@@ -144,34 +144,62 @@ static void internal_add_timer(tvec_base
 	list_add_tail(&timer->entry, vec);
 }
 
-/***
- * add_timer - start a timer
- * @timer: the timer to be added
- *
- * The kernel will do a ->function(->data) callback from the
- * timer interrupt at the ->expired point in the future. The
- * current time is 'jiffies'.
- *
- * The timer's ->expired, ->function (and if the handler uses it, ->data)
- * fields must be set prior calling this function.
- *
- * Timers with an ->expired field in the past will be executed in the next
- * timer tick. It's illegal to add an already pending timer.
- */
-void add_timer(struct timer_list *timer)
+int __mod_timer(struct timer_list *timer, unsigned long expires)
 {
-	tvec_base_t *base = &get_cpu_var(tvec_bases);
-  	unsigned long flags;
-  
-  	BUG_ON(timer_pending(timer) || !timer->function);
+	tvec_base_t *old_base, *new_base;
+	unsigned long flags;
+	int ret = 0;
+
+	BUG_ON(!timer->function);
 
 	check_timer(timer);
 
-	spin_lock_irqsave(&base->lock, flags);
-	internal_add_timer(base, timer);
-	timer->base = base;
-	spin_unlock_irqrestore(&base->lock, flags);
-	put_cpu_var(tvec_bases);
+	spin_lock_irqsave(&timer->lock, flags);
+	new_base = &__get_cpu_var(tvec_bases);
+repeat:
+	old_base = timer->base;
+
+	/*
+	 * Prevent deadlocks via ordering by old_base < new_base.
+	 */
+	if (old_base && (new_base != old_base)) {
+		if (old_base < new_base) {
+			spin_lock(&new_base->lock);
+			spin_lock(&old_base->lock);
+		} else {
+			spin_lock(&old_base->lock);
+			spin_lock(&new_base->lock);
+		}
+		/*
+		 * The timer base might have been cancelled while we were
+		 * trying to take the lock(s):
+		 */
+		if (timer->base != old_base) {
+			spin_unlock(&new_base->lock);
+			spin_unlock(&old_base->lock);
+			goto repeat;
+		}
+	} else
+		spin_lock(&new_base->lock);
+
+	/*
+	 * Delete the previous timeout (if there was any), and install
+	 * the new one:
+	 */
+	if (old_base) {
+		list_del(&timer->entry);
+		ret = 1;
+	}
+	timer->expires = expires;
+	internal_add_timer(new_base, timer);
+	timer->base = new_base;
+
+	if (old_base && (new_base != old_base))
+		spin_unlock(&old_base->lock);
+	spin_unlock(&new_base->lock);
+	spin_unlock_irqrestore(&timer->lock, flags);
+
+	return ret;
 }
 
 /***
@@ -179,7 +207,7 @@ void add_timer(struct timer_list *timer)
  * @timer: the timer to be added
  * @cpu: the CPU to start it on
  *
- * This is not very scalable on SMP.
+ * This is not very scalable on SMP. Double adds are not possible.
  */
 void add_timer_on(struct timer_list *timer, int cpu)
 {
@@ -217,10 +245,6 @@ void add_timer_on(struct timer_list *tim
  */
 int mod_timer(struct timer_list *timer, unsigned long expires)
 {
-	tvec_base_t *old_base, *new_base;
-	unsigned long flags;
-	int ret = 0;
-
 	BUG_ON(!timer->function);
 
 	check_timer(timer);
@@ -233,52 +257,7 @@ int mod_timer(struct timer_list *timer, 
 	if (timer->expires == expires && timer_pending(timer))
 		return 1;
 
-	spin_lock_irqsave(&timer->lock, flags);
-	new_base = &__get_cpu_var(tvec_bases);
-repeat:
-	old_base = timer->base;
-
-	/*
-	 * Prevent deadlocks via ordering by old_base < new_base.
-	 */
-	if (old_base && (new_base != old_base)) {
-		if (old_base < new_base) {
-			spin_lock(&new_base->lock);
-			spin_lock(&old_base->lock);
-		} else {
-			spin_lock(&old_base->lock);
-			spin_lock(&new_base->lock);
-		}
-		/*
-		 * The timer base might have been cancelled while we were
-		 * trying to take the lock(s):
-		 */
-		if (timer->base != old_base) {
-			spin_unlock(&new_base->lock);
-			spin_unlock(&old_base->lock);
-			goto repeat;
-		}
-	} else
-		spin_lock(&new_base->lock);
-
-	/*
-	 * Delete the previous timeout (if there was any), and install
-	 * the new one:
-	 */
-	if (old_base) {
-		list_del(&timer->entry);
-		ret = 1;
-	}
-	timer->expires = expires;
-	internal_add_timer(new_base, timer);
-	timer->base = new_base;
-
-	if (old_base && (new_base != old_base))
-		spin_unlock(&old_base->lock);
-	spin_unlock(&new_base->lock);
-	spin_unlock_irqrestore(&timer->lock, flags);
-
-	return ret;
+	return __mod_timer(timer, expires);
 }
 
 /***



  reply	other threads:[~2003-07-30  5:58 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-07-29 15:41 PATCH: Race in 2.6.0-test2 timer code linas
2003-07-29 20:56 ` Andrew Morton
2003-07-30  5:57   ` Ingo Molnar [this message]
2003-07-30  6:36     ` Andrew Morton
2003-07-30  7:07       ` Ingo Molnar
2003-07-30  7:34         ` Andrea Arcangeli
2003-07-30  7:34           ` Ingo Molnar
2003-07-30  8:28             ` Andrea Arcangeli
2003-07-30 10:31               ` Ingo Molnar
2003-07-30 11:16                 ` Andrea Arcangeli
2003-07-30 11:49                   ` Ingo Molnar
2003-07-30 12:34                     ` Andrea Arcangeli
2003-07-30 21:18                       ` linas
2003-07-30 22:06                         ` Andrea Arcangeli
2003-07-30 22:17                           ` Andrea Arcangeli
2003-07-31  7:04                             ` Ingo Molnar
2003-07-30 21:19                       ` Andrea Arcangeli
2003-07-30 23:43                 ` linas
2003-07-30 23:56                   ` Andrea Arcangeli
2003-07-30 23:54                     ` Andrew Morton
2003-07-31  0:16                       ` Andrea Arcangeli
2003-07-31 17:23                     ` linas
2003-08-01  6:27                       ` Ingo Molnar
2003-07-30  7:40           ` Ingo Molnar
2003-07-30  8:37             ` Andrea Arcangeli
2003-07-30 10:34               ` Ingo Molnar
2003-07-30 10:51                 ` Andrew Morton
2003-07-30 11:28                   ` Andrea Arcangeli
2003-07-30 11:22                 ` Andrea Arcangeli
2003-07-30 20:05     ` linas
2003-07-31  6:50       ` Ingo Molnar
2003-07-31 22:56     ` linas
2003-08-01  6:23       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.44.0307300733200.25010-100000@localhost.localdomain \
    --to=mingo@elte.hu \
    --cc=akpm@osdl.org \
    --cc=linas@austin.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).