linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: RCU-protect __set_task_cpu() in set_task_cpu()
Date: Tue, 07 Jun 2011 11:31:47 +0200
Message-ID: <1307439107.2322.229.camel@twins>
In-Reply-To: <20110606164657.GA20752@redhat.com>

On Mon, 2011-06-06 at 18:46 +0200, Oleg Nesterov wrote:
> On 06/06, Peter Zijlstra wrote:
> >
> > You're right, p->pi_lock for wakeups, rq->lock for runnable tasks.
> 
> Good, thanks.
> 
> Help! I have another question.
> 
> 	try_to_wake_up:
> 
> 		raw_spin_lock_irqsave(&p->pi_lock, flags);
> 		if (!(p->state & state))
> 			goto out;
> 
> 		cpu = task_cpu(p);
> 
> 		if (p->on_rq && ttwu_remote(p, wake_flags))
> 			goto stat;
> 
> This does look a bit confusing, we can't trust "cpu = task_cpu" before
> we check ->on_rq. OK, not a problem, this cpu number can only be used in
> ttwu_stat(cpu).
> 
> But ttwu_stat(cpu) in turn does
> 
> 	if (cpu != task_cpu(p))
> 		schedstat_inc(p, se.statistics.nr_wakeups_migrate);
> 
> Ignoring the theoretical races with pull_task/etc, how is it possible
> that cpu != task_cpu(p) ? Another caller is try_to_wake_up_local(), it
> obviously can't trigger this case.
> 
> This looks broken to me. Looking at its name, I guess nr_wakeups_migrate
> should be incremented if ttwu does set_task_cpu(), correct?
> 
> IOW. Don't we need something like the (untested/uncompiled) patch below?
> _If_ I am right, I can resend it with the changelog/etc but please feel
> free to make another fix.

You're right, I spotted the same a few days ago which resulted in:

---
commit f339b9dc1f03591761d5d930800db24bc0eda1e1
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Tue May 31 10:49:20 2011 +0200

    sched: Fix schedstat.nr_wakeups_migrate
    
    While looking over the code I found that with the ttwu rework the
    nr_wakeups_migrate test broke since we now switch cpus prior to
    calling ttwu_stat(), hence the test is always false.
    
    Cure this by passing the migration state in wake_flags. Also move the
    whole test under CONFIG_SMP, it's hard to migrate tasks on UP :-)
    
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/n/tip-pwwxl7gdqs5676f1d4cx6pj7@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8da84b7..483c1ed 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1063,6 +1063,7 @@ struct sched_domain;
  */
 #define WF_SYNC		0x01		/* waker goes to sleep after wakup */
 #define WF_FORK		0x02		/* child wakeup after fork */
+#define WF_MIGRATED	0x04		/* internal use, task got migrated */
 
 #define ENQUEUE_WAKEUP		1
 #define ENQUEUE_HEAD		2
diff --git a/kernel/sched.c b/kernel/sched.c
index 49cc70b..2fe98ed 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2447,6 +2447,10 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 		}
 		rcu_read_unlock();
 	}
+
+	if (wake_flags & WF_MIGRATED)
+		schedstat_inc(p, se.statistics.nr_wakeups_migrate);
+
 #endif /* CONFIG_SMP */
 
 	schedstat_inc(rq, ttwu_count);
@@ -2455,9 +2459,6 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 	if (wake_flags & WF_SYNC)
 		schedstat_inc(p, se.statistics.nr_wakeups_sync);
 
-	if (cpu != task_cpu(p))
-		schedstat_inc(p, se.statistics.nr_wakeups_migrate);
-
 #endif /* CONFIG_SCHEDSTATS */
 }
 
@@ -2675,8 +2676,10 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 		p->sched_class->task_waking(p);
 
 	cpu = select_task_rq(p, SD_BALANCE_WAKE, wake_flags);
-	if (task_cpu(p) != cpu)
+	if (task_cpu(p) != cpu) {
+		wake_flags |= WF_MIGRATED;
 		set_task_cpu(p, cpu);
+	}
 #endif /* CONFIG_SMP */
 
 	ttwu_queue(p, cpu);
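
To make the ordering problem concrete, here is a small userspace model of the two variants. It is only a sketch, not kernel code: struct task, wake_old(), wake_new() and the two ttwu_stat_*() helpers are hypothetical stand-ins, and the target_cpu argument takes the place of whatever select_task_rq() would return. Only the WF_* values are taken from the patch above.

```c
#include <assert.h>

#define WF_SYNC		0x01	/* value as in include/linux/sched.h */
#define WF_MIGRATED	0x04	/* flag added by the patch above */

/* Hypothetical stand-in for the two task_struct fields that matter here. */
struct task {
	int cpu;				/* models task_cpu(p) */
	unsigned long nr_wakeups_migrate;	/* models the schedstat counter */
};

/* Old check: compares against the task's cpu *after* it was updated. */
static void ttwu_stat_old(struct task *p, int cpu)
{
	if (cpu != p->cpu)
		p->nr_wakeups_migrate++;
}

/* New check: the caller records the migration in wake_flags. */
static void ttwu_stat_new(struct task *p, int wake_flags)
{
	if (wake_flags & WF_MIGRATED)
		p->nr_wakeups_migrate++;
}

/* Models the wakeup path before the fix. */
static void wake_old(struct task *p, int target_cpu)
{
	assert(p);
	if (p->cpu != target_cpu)
		p->cpu = target_cpu;		/* models set_task_cpu() */
	ttwu_stat_old(p, target_cpu);		/* cpu == p->cpu by now */
}

/* Models the wakeup path after the fix. */
static void wake_new(struct task *p, int target_cpu, int wake_flags)
{
	assert(p);
	if (p->cpu != target_cpu) {
		wake_flags |= WF_MIGRATED;	/* remember before switching */
		p->cpu = target_cpu;
	}
	ttwu_stat_new(p, wake_flags);
}
```

In this model, waking a task that sits on cpu 0 with target_cpu 1 leaves nr_wakeups_migrate at 0 through wake_old() and bumps it to 1 through wake_new(); a second wake_new() to the same cpu does not count again.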


