* [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting
@ 2011-01-20 21:31 Rik van Riel
  2011-01-20 21:32 ` [RFC -v6 PATCH 1/8] sched: check the right ->nr_running in yield_task_fair Rik van Riel
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:31 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

When running SMP virtual machines, it is possible for one VCPU to be
spinning on a spinlock, while the VCPU that holds the spinlock is not
currently running, because the host scheduler preempted it to run
something else.

Both Intel and AMD CPUs have a feature that detects when a virtual
CPU is spinning on a lock and will trap to the host.

The current KVM code sleeps for a bit whenever that happens, which
results in e.g. a 64 VCPU Windows guest taking forever and a bit to
boot up.  This is because the VCPU holding the lock is actually
running and not sleeping, so the pause is counter-productive.

In other workloads a pause can also be counter-productive, with
spinlock detection resulting in one guest giving up its CPU time
to the others.  Instead of spinning, it ends up simply not running
much at all.

This patch series aims to fix that, by having a VCPU that spins
give the remainder of its timeslice to another VCPU in the same
guest before yielding the CPU: one that is runnable but got
preempted, hopefully the lock holder.

v6:
- implement yield_task_fair in a way that works with task groups,
  this allows me to actually get a performance improvement!
- fix another race Avi pointed out, the code should be good now
v5:
- fix the race condition Avi pointed out, by tracking vcpu->pid
- also allows us to yield to vcpu tasks that got preempted while in qemu
  userspace
v4:
- change to newer version of Mike Galbraith's yield_to implementation
- chainsaw out some code from Mike that looked like a great idea, but
  turned out to give weird interactions in practice
v3:
- more cleanups
- change to Mike Galbraith's yield_to implementation
- yield to spinning VCPUs, this seems to work better in some
  situations and has little downside potential
v2:
- make lots of cleanups and improvements suggested
- do not implement timeslice scheduling or fairness stuff
  yet, since it is not entirely clear how to do that right
  (suggestions welcome)


Benchmark "results":

Two 4-CPU KVM guests are pinned to the same 4 physical CPUs.

One guest runs the AMQP performance test, the other guest runs
0, 2 or 4 infinite loops, for CPU overcommit factors of 1 (no
overcommit), 1.5 and 2.

The AMQP perftest is run 30 times, with 8 and 16 threads.

8thr    no overcommit   1.5x overcommit   2x overcommit

no PLE  223801          135137            104951
PLE     224135          141105            118744

16thr   no overcommit   1.5x overcommit   2x overcommit

no PLE  222424          126175            105299
PLE     222534          138082            132945

Note: this is with the KVM guests NOT running inside cgroups.  There
seems to be a CPU load balancing issue with cgroup fair group scheduling,
which often results in one guest getting only 80% CPU time and the other
guest 320%.  That will have to be fixed to get meaningful results with
cgroups.

CPU time division between the AMQP guest and the infinite loop guest
was not exactly fair, but the guests got close to the same amount
of CPU time in each test run.

There is a substantial amount of randomness in CPU time division between
guests, but the performance improvement is consistent between multiple
runs.

-- 
All rights reversed.



* [RFC -v6 PATCH 1/8] sched: check the right ->nr_running in yield_task_fair
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
@ 2011-01-20 21:32 ` Rik van Riel
  2011-01-20 21:33 ` [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies Rik van Riel
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:32 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

With CONFIG_FAIR_GROUP_SCHED, each task_group has its own cfs_rq.
Yielding to a task from another cfs_rq may be worthwhile, since
a process calling yield typically cannot use the CPU right now.

Therefore, we want to check the per-cpu nr_running, not the
cgroup local one.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/sched_fair.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index c62ebae..7b338ac 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1304,7 +1304,7 @@ static void yield_task_fair(struct rq *rq)
 	/*
 	 * Are we the only task in the tree?
 	 */
-	if (unlikely(cfs_rq->nr_running == 1))
+	if (unlikely(rq->nr_running == 1))
 		return;
 
 	clear_buddies(cfs_rq, se);



* [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
  2011-01-20 21:32 ` [RFC -v6 PATCH 1/8] sched: check the right ->nr_running in yield_task_fair Rik van Riel
@ 2011-01-20 21:33 ` Rik van Riel
  2011-01-24 17:57   ` Peter Zijlstra
  2011-01-20 21:33 ` [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair Rik van Riel
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:33 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

The clear_buddies function does not seem to play well with the concept
of hierarchical runqueues.  In the following tree, task groups are
represented by 'G', tasks by 'T', next by 'n' and last by 'l'.

     (nl)
    /    \
   G(nl)  G
   / \     \
 T(l) T(n)  T

This situation can arise when a task T(n) is woken up, and the previously
running task T(l) is marked last.

When clear_buddies is called from either T(l) or T(n), the next and last
buddies of the group G(nl) will be cleared.  This is not the desired
result, since we would like to be able to find the other type of buddy
in many cases.

This is especially a worry when implementing yield_task_fair through the
buddy system.

The fix is simple: only clear the buddy type that the task itself
is indicated to be.  As an added bonus, we stop walking up the tree
when the buddy has already been cleared or pointed elsewhere.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/sched_fair.c |   30 +++++++++++++++++++++++-------
 1 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index f4ee445..0321473 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -784,19 +784,35 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		__enqueue_entity(cfs_rq, se);
 }
 
-static void __clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
+static void __clear_buddies_last(struct sched_entity *se)
 {
-	if (!se || cfs_rq->last == se)
-		cfs_rq->last = NULL;
+	for_each_sched_entity(se) {
+		struct cfs_rq *cfs_rq = cfs_rq_of(se);
+		if (cfs_rq->last == se)
+			cfs_rq->last = NULL;
+		else
+			break;
+	}
+}
 
-	if (!se || cfs_rq->next == se)
-		cfs_rq->next = NULL;
+static void __clear_buddies_next(struct sched_entity *se)
+{
+	for_each_sched_entity(se) {
+		struct cfs_rq *cfs_rq = cfs_rq_of(se);
+		if (cfs_rq->next == se)
+			cfs_rq->next = NULL;
+		else
+			break;
+	}
 }
 
 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	for_each_sched_entity(se)
-		__clear_buddies(cfs_rq_of(se), se);
+	if (cfs_rq->last == se)
+		__clear_buddies_last(se);
+
+	if (cfs_rq->next == se)
+		__clear_buddies_next(se);
 }
 
 static void



* [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
  2011-01-20 21:32 ` [RFC -v6 PATCH 1/8] sched: check the right ->nr_running in yield_task_fair Rik van Riel
  2011-01-20 21:33 ` [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies Rik van Riel
@ 2011-01-20 21:33 ` Rik van Riel
  2011-01-24 18:04   ` Peter Zijlstra
  2011-01-20 21:34 ` [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality Rik van Riel
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:33 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

Use the buddy mechanism to implement yield_task_fair.  This
allows us to skip onto the next highest priority se at every
level in the CFS tree, unless doing so would introduce gross
unfairness in CPU time distribution.

We order the buddy selection in pick_next_entity to check
yield first, then last, then next.  We need next to be able
to override yield, because it is possible for the "next" and
"yield" tasks to be different processes in the same sub-tree
of the CFS tree.  When they are, we need to go into that
sub-tree regardless of the "yield" hint, and pick the correct
entity once we get to the right level.

Signed-off-by: Rik van Riel <riel@redhat.com>

diff --git a/kernel/sched.c b/kernel/sched.c
index dc91a4d..e4e57ff 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -327,7 +327,7 @@ struct cfs_rq {
 	 * 'curr' points to currently running entity on this cfs_rq.
 	 * It is set to NULL otherwise (i.e when none are currently running).
 	 */
-	struct sched_entity *curr, *next, *last;
+	struct sched_entity *curr, *next, *last, *yield;
 
 	unsigned int nr_spread_over;
 
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index ad946fd..f701a51 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -384,6 +384,22 @@ static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
 	return rb_entry(left, struct sched_entity, run_node);
 }
 
+static struct sched_entity *__pick_second_entity(struct cfs_rq *cfs_rq)
+{
+	struct rb_node *left = cfs_rq->rb_leftmost;
+	struct rb_node *second;
+
+	if (!left)
+		return NULL;
+
+	second = rb_next(left);
+
+	if (!second)
+		second = left;
+
+	return rb_entry(second, struct sched_entity, run_node);
+}
+
 static struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq)
 {
 	struct rb_node *last = rb_last(&cfs_rq->tasks_timeline);
@@ -806,6 +822,17 @@ static void __clear_buddies_next(struct sched_entity *se)
 	}
 }
 
+static void __clear_buddies_yield(struct sched_entity *se)
+{
+	for_each_sched_entity(se) {
+		struct cfs_rq *cfs_rq = cfs_rq_of(se);
+		if (cfs_rq->yield == se)
+			cfs_rq->yield = NULL;
+		else
+			break;
+	}
+}
+
 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	if (cfs_rq->last == se)
@@ -813,6 +840,9 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
 
 	if (cfs_rq->next == se)
 		__clear_buddies_next(se);
+
+	if (cfs_rq->yield == se)
+		__clear_buddies_yield(se);
 }
 
 static void
@@ -926,13 +956,27 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 static int
 wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
 
+/*
+ * Pick the next process, keeping these things in mind, in this order:
+ * 1) keep things fair between processes/task groups
+ * 2) pick the "next" process, since someone really wants that to run
+ * 3) pick the "last" process, for cache locality
+ * 4) do not run the "yield" process, if something else is available
+ */
 static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *se = __pick_next_entity(cfs_rq);
 	struct sched_entity *left = se;
 
-	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
-		se = cfs_rq->next;
+	/*
+	 * Avoid running the yield buddy, if running something else can
+	 * be done without getting too unfair.
+	 */
+	if (cfs_rq->yield == se) {
+		struct sched_entity *second = __pick_second_entity(cfs_rq);
+		if (wakeup_preempt_entity(second, left) < 1)
+			se = second;
+	}
 
 	/*
 	 * Prefer last buddy, try to return the CPU to a preempted task.
@@ -940,6 +984,12 @@ static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
 	if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1)
 		se = cfs_rq->last;
 
+	/*
+	 * Someone really wants this to run. If it's not unfair, run it.
+	 */
+	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
+		se = cfs_rq->next;
+
 	clear_buddies(cfs_rq, se);
 
 	return se;
@@ -1096,52 +1146,6 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	hrtick_update(rq);
 }
 
-/*
- * sched_yield() support is very simple - we dequeue and enqueue.
- *
- * If compat_yield is turned on then we requeue to the end of the tree.
- */
-static void yield_task_fair(struct rq *rq)
-{
-	struct task_struct *curr = rq->curr;
-	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
-	struct sched_entity *rightmost, *se = &curr->se;
-
-	/*
-	 * Are we the only task in the tree?
-	 */
-	if (unlikely(rq->nr_running == 1))
-		return;
-
-	clear_buddies(cfs_rq, se);
-
-	if (likely(!sysctl_sched_compat_yield) && curr->policy != SCHED_BATCH) {
-		update_rq_clock(rq);
-		/*
-		 * Update run-time statistics of the 'current'.
-		 */
-		update_curr(cfs_rq);
-
-		return;
-	}
-	/*
-	 * Find the rightmost entry in the rbtree:
-	 */
-	rightmost = __pick_last_entity(cfs_rq);
-	/*
-	 * Already in the rightmost position?
-	 */
-	if (unlikely(!rightmost || entity_before(rightmost, se)))
-		return;
-
-	/*
-	 * Minimally necessary key value to be last in the tree:
-	 * Upon rescheduling, sched_class::put_prev_task() will place
-	 * 'current' within the tree based on its new key value.
-	 */
-	se->vruntime = rightmost->vruntime + 1;
-}
-
 #ifdef CONFIG_SMP
 
 static void task_waking_fair(struct rq *rq, struct task_struct *p)
@@ -1660,6 +1664,14 @@ static void set_next_buddy(struct sched_entity *se)
 	}
 }
 
+static void set_yield_buddy(struct sched_entity *se)
+{
+	if (likely(task_of(se)->policy != SCHED_IDLE)) {
+		for_each_sched_entity(se)
+			cfs_rq_of(se)->yield = se;
+	}
+}
+
 /*
  * Preempt the current task with a newly woken task if needed:
  */
@@ -1758,6 +1770,36 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev)
 	}
 }
 
+/*
+ * sched_yield() is very simple
+ *
+ * The magic of dealing with the ->yield buddy is in pick_next_entity.
+ */
+static void yield_task_fair(struct rq *rq)
+{
+	struct task_struct *curr = rq->curr;
+	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
+	struct sched_entity *se = &curr->se;
+
+	/*
+	 * Are we the only task in the tree?
+	 */
+	if (unlikely(rq->nr_running == 1))
+		return;
+
+	clear_buddies(cfs_rq, se);
+
+	if (curr->policy != SCHED_BATCH) {
+		update_rq_clock(rq);
+		/*
+		 * Update run-time statistics of the 'current'.
+		 */
+		update_curr(cfs_rq);
+	}
+
+	set_yield_buddy(se);
+}
+
 #ifdef CONFIG_SMP
 /**************************************************
  * Fair scheduling class load-balancing methods:



* [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
                   ` (2 preceding siblings ...)
  2011-01-20 21:33 ` [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair Rik van Riel
@ 2011-01-20 21:34 ` Rik van Riel
  2011-01-24 18:12   ` Peter Zijlstra
  2011-01-20 21:36 ` [RFC -v6 PATCH 6/8] export pid symbols needed for kvm_vcpu_on_spin Rik van Riel
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:34 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

From: Mike Galbraith <efault@gmx.de>

Currently only implemented for fair class tasks.

Add a yield_to_task() method to the fair scheduling class, allowing the
caller of yield_to() to accelerate another thread in its thread group
or task group.

Implemented via a scheduler hint, using cfs_rq->next to encourage the
target being selected.  We can rely on pick_next_entity to keep things
fair, so no one can accelerate a thread that has already used its fair
share of CPU time.

This also means callers should only call yield_to when they really
mean it.  Calling it too often can result in the scheduler just
ignoring the hint.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2c79e92..6c43fc4 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1047,6 +1047,7 @@ struct sched_class {
 	void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
 	void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
 	void (*yield_task) (struct rq *rq);
+	bool (*yield_to_task) (struct rq *rq, struct task_struct *p, bool preempt);
 
 	void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);
 
@@ -1943,6 +1944,7 @@ static inline int rt_mutex_getprio(struct task_struct *p)
 # define rt_mutex_adjust_pi(p)		do { } while (0)
 #endif
 
+extern bool yield_to(struct task_struct *p, bool preempt);
 extern void set_user_nice(struct task_struct *p, long nice);
 extern int task_prio(const struct task_struct *p);
 extern int task_nice(const struct task_struct *p);
diff --git a/kernel/sched.c b/kernel/sched.c
index e4e57ff..1f38ed2 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5270,6 +5270,64 @@ void __sched yield(void)
 }
 EXPORT_SYMBOL(yield);
 
+/**
+ * yield_to - yield the current processor to another thread in
+ * your thread group, or accelerate that thread toward the
+ * processor it's on.
+ *
+ * It's the caller's job to ensure that the target task struct
+ * can't go away on us before we can do any checks.
+ *
+ * Returns true if we indeed boosted the target task.
+ */
+bool __sched yield_to(struct task_struct *p, bool preempt)
+{
+	struct task_struct *curr = current;
+	struct rq *rq, *p_rq;
+	unsigned long flags;
+	bool yielded = 0;
+
+	local_irq_save(flags);
+	rq = this_rq();
+
+again:
+	p_rq = task_rq(p);
+	double_rq_lock(rq, p_rq);
+	while (task_rq(p) != p_rq) {
+		double_rq_unlock(rq, p_rq);
+		goto again;
+	}
+
+	if (!curr->sched_class->yield_to_task)
+		goto out;
+
+	if (curr->sched_class != p->sched_class)
+		goto out;
+
+	if (task_running(p_rq, p) || p->state)
+		goto out;
+
+	if (!same_thread_group(p, curr))
+		goto out;
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	if (task_group(p) != task_group(curr))
+		goto out;
+#endif
+
+	yielded = curr->sched_class->yield_to_task(rq, p, preempt);
+
+out:
+	double_rq_unlock(rq, p_rq);
+	local_irq_restore(flags);
+
+	if (yielded)
+		yield();
+
+	return yielded;
+}
+EXPORT_SYMBOL_GPL(yield_to);
+
 /*
  * This task is about to go to sleep on IO. Increment rq->nr_iowait so
  * that process accounting knows that this is a task in IO wait state.
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index f701a51..097e936 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1800,6 +1800,23 @@ static void yield_task_fair(struct rq *rq)
 	set_yield_buddy(se);
 }
 
+static bool yield_to_task_fair(struct rq *rq, struct task_struct *p, bool preempt)
+{
+	struct sched_entity *se = &p->se;
+
+	if (!se->on_rq)
+		return false;
+
+	/* Tell the scheduler that we'd really like p to run next. */
+	set_next_buddy(se);
+
+	/* Make p's CPU reschedule; pick_next_entity takes care of fairness. */
+	if (preempt)
+		resched_task(rq->curr);
+
+	return true;
+}
+
 #ifdef CONFIG_SMP
 /**************************************************
  * Fair scheduling class load-balancing methods:
@@ -3993,6 +4010,7 @@ static const struct sched_class fair_sched_class = {
 	.enqueue_task		= enqueue_task_fair,
 	.dequeue_task		= dequeue_task_fair,
 	.yield_task		= yield_task_fair,
+	.yield_to_task		= yield_to_task_fair,
 
 	.check_preempt_curr	= check_preempt_wakeup,
 



* [RFC -v6 PATCH 6/8] export pid symbols needed for kvm_vcpu_on_spin
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
                   ` (3 preceding siblings ...)
  2011-01-20 21:34 ` [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality Rik van Riel
@ 2011-01-20 21:36 ` Rik van Riel
  2011-01-20 21:36 ` [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu Rik van Riel
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:36 UTC (permalink / raw)
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Peter Zijlstra, Mike Galbraith, Chris Wright, ttracy, dshaks,
	Nakajima, Jun

Export the symbols required for a race-free kvm_vcpu_on_spin.

Signed-off-by: Rik van Riel <riel@redhat.com>

diff --git a/kernel/fork.c b/kernel/fork.c
index 3b159c5..adc8f47 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -191,6 +191,7 @@ void __put_task_struct(struct task_struct *tsk)
 	if (!profile_handoff_task(tsk))
 		free_task(tsk);
 }
+EXPORT_SYMBOL_GPL(__put_task_struct);
 
 /*
  * macro override instead of weak attribute alias, to workaround
diff --git a/kernel/pid.c b/kernel/pid.c
index 39b65b6..02f2212 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -435,6 +435,7 @@ struct pid *get_task_pid(struct task_struct *task, enum pid_type type)
 	rcu_read_unlock();
 	return pid;
 }
+EXPORT_SYMBOL_GPL(get_task_pid);
 
 struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
 {
@@ -446,6 +447,7 @@ struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
 	rcu_read_unlock();
 	return result;
 }
+EXPORT_SYMBOL_GPL(get_pid_task);
 
 struct pid *find_get_pid(pid_t nr)
 {



* [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
                   ` (4 preceding siblings ...)
  2011-01-20 21:36 ` [RFC -v6 PATCH 6/8] export pid symbols needed for kvm_vcpu_on_spin Rik van Riel
@ 2011-01-20 21:36 ` Rik van Riel
  2011-01-26 13:01   ` Avi Kivity
  2011-01-20 21:37 ` [RFC -v6 PATCH 5/8] sched: drop superfluous tests from yield_to Rik van Riel
  2011-01-20 21:38 ` [RFC -v6 PATCH 8/8] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin Rik van Riel
  7 siblings, 1 reply; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:36 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

Keep track of which task is running a KVM vcpu.  This helps us
figure out later what task to wake up if we want to boost a
vcpu that got preempted.

Unfortunately there are no guarantees that the same task
always keeps the same vcpu, so we can only track the task
across a single "run" of the vcpu.

Signed-off-by: Rik van Riel <riel@redhat.com>

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a055742..9d56ed5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -81,6 +81,7 @@ struct kvm_vcpu {
 #endif
 	int vcpu_id;
 	struct mutex mutex;
+	struct pid *pid;
 	int   cpu;
 	atomic_t guest_mode;
 	struct kvm_run *run;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5225052..86c4905 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -185,6 +185,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	vcpu->cpu = -1;
 	vcpu->kvm = kvm;
 	vcpu->vcpu_id = id;
+	vcpu->pid = NULL;
 	init_waitqueue_head(&vcpu->wq);
 
 	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
@@ -208,6 +209,8 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_init);
 
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu)
 {
+	if (vcpu->pid)
+		put_pid(vcpu->pid);
 	kvm_arch_vcpu_uninit(vcpu);
 	free_page((unsigned long)vcpu->run);
 }
@@ -1456,6 +1459,14 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = -EINVAL;
 		if (arg)
 			goto out;
+		if (unlikely(vcpu->pid != current->pids[PIDTYPE_PID].pid)) {
+			/* The thread running this VCPU changed. */
+			struct pid *oldpid = vcpu->pid;
+			struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
+			rcu_assign_pointer(vcpu->pid, newpid);
+			synchronize_rcu();
+			put_pid(oldpid);
+		}
 		r = kvm_arch_vcpu_ioctl_run(vcpu, vcpu->run);
 		break;
 	case KVM_GET_REGS: {



* [RFC -v6 PATCH 5/8] sched: drop superfluous tests from yield_to
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
                   ` (5 preceding siblings ...)
  2011-01-20 21:36 ` [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu Rik van Riel
@ 2011-01-20 21:37 ` Rik van Riel
  2011-01-20 21:38 ` [RFC -v6 PATCH 8/8] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin Rik van Riel
  7 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:37 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

Fairness is enforced by pick_next_entity, so we can drop some
superfluous tests from yield_to.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/sched.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 1f38ed2..398eedf 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5307,14 +5307,6 @@ again:
 	if (task_running(p_rq, p) || p->state)
 		goto out;
 
-	if (!same_thread_group(p, curr))
-		goto out;
-
-#ifdef CONFIG_FAIR_GROUP_SCHED
-	if (task_group(p) != task_group(curr))
-		goto out;
-#endif
-
 	yielded = curr->sched_class->yield_to_task(rq, p, preempt);
 
 out:


* [RFC -v6 PATCH 8/8] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin
  2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
                   ` (6 preceding siblings ...)
  2011-01-20 21:37 ` [RFC -v6 PATCH 5/8] sched: drop superfluous tests from yield_to Rik van Riel
@ 2011-01-20 21:38 ` Rik van Riel
  7 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-20 21:38 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Avi Kiviti, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic
slowdowns of certain workloads, we instead use yield_to to get
another VCPU in the same KVM guest to run sooner.

This seems to give a 10-15% speedup in certain workloads, versus
not having PLE at all.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9d56ed5..fab2250 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -187,6 +187,7 @@ struct kvm {
 #endif
 	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
 	atomic_t online_vcpus;
+	int last_boosted_vcpu;
 	struct list_head vm_list;
 	struct mutex lock;
 	struct kvm_io_bus *buses[KVM_NR_BUSES];
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 86c4905..8b761ba 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1292,18 +1292,55 @@ void kvm_resched(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_resched);
 
-void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
+void kvm_vcpu_on_spin(struct kvm_vcpu *me)
 {
-	ktime_t expires;
-	DEFINE_WAIT(wait);
-
-	prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
-
-	/* Sleep for 100 us, and hope lock-holder got scheduled */
-	expires = ktime_add_ns(ktime_get(), 100000UL);
-	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+	struct kvm *kvm = me->kvm;
+	struct kvm_vcpu *vcpu;
+	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
+	int yielded = 0;
+	int pass;
+	int i;
 
-	finish_wait(&vcpu->wq, &wait);
+	/*
+	 * We boost the priority of a VCPU that is runnable but not
+	 * currently running, because it got preempted by something
+	 * else and called schedule in __vcpu_run.  Hopefully that
+	 * VCPU is holding the lock that we need and will release it.
+	 * We approximate round-robin by starting at the last boosted VCPU.
+	 */
+	for (pass = 0; pass < 2 && !yielded; pass++) {
+		kvm_for_each_vcpu(i, vcpu, kvm) {
+			struct task_struct *task = NULL;
+			struct pid *pid;
+			if (!pass && i < last_boosted_vcpu) {
+				i = last_boosted_vcpu;
+				continue;
+			} else if (pass && i > last_boosted_vcpu)
+				break;
+			if (vcpu == me)
+				continue;
+			if (waitqueue_active(&vcpu->wq))
+				continue;
+			rcu_read_lock();
+			pid = rcu_dereference(vcpu->pid);
+			if (pid)
+				task = get_pid_task(vcpu->pid, PIDTYPE_PID);
+			rcu_read_unlock();
+			if (!task)
+				continue;
+			if (task->flags & PF_VCPU) {
+				put_task_struct(task);
+				continue;
+			}
+			if (yield_to(task, 1)) {
+				put_task_struct(task);
+				kvm->last_boosted_vcpu = i;
+				yielded = 1;
+				break;
+			}
+			put_task_struct(task);
+		}
+	}
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
 



* Re: [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies
  2011-01-20 21:33 ` [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies Rik van Riel
@ 2011-01-24 17:57   ` Peter Zijlstra
  2011-01-24 18:04     ` Rik van Riel
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2011-01-24 17:57 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun,
	Paul Turner

On Thu, 2011-01-20 at 16:33 -0500, Rik van Riel wrote:
> The clear_buddies function does not seem to play well with the concept
> of hierarchical runqueues.  In the following tree, task groups are
> represented by 'G', tasks by 'T', next by 'n' and last by 'l'.
> 
>      (nl)
>     /    \
>    G(nl)  G
>    / \     \
>  T(l) T(n)  T
> 
> This situation can arise when a task T(n) is woken up, and the previously
> running task T(l) is marked last.
> 
> When clear_buddies is called from either T(l) or T(n), the next and last
> buddies of the group G(nl) will be cleared.  This is not the desired
> result, since we would like to be able to find the other type of buddy
> in many cases.
> 
> This is especially a worry when implementing yield_task_fair through the
> buddy system.
> 
> The fix is simple: only clear the buddy type that the task itself
> is indicated to be.  As an added bonus, we stop walking up the tree
> when the buddy has already been cleared or pointed elsewhere.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  kernel/sched_fair.c |   30 +++++++++++++++++++++++-------
>  1 files changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index f4ee445..0321473 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -784,19 +784,35 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>  		__enqueue_entity(cfs_rq, se);
>  }
>  
> -static void __clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +static void __clear_buddies_last(struct sched_entity *se)
>  {
> -	if (!se || cfs_rq->last == se)
> -		cfs_rq->last = NULL;
> +	for_each_sched_entity(se) {
> +		struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +		if (cfs_rq->last == se)
> +			cfs_rq->last = NULL;
> +		else
> +			break;
> +	}
> +}
>  
> -	if (!se || cfs_rq->next == se)
> -		cfs_rq->next = NULL;
> +static void __clear_buddies_next(struct sched_entity *se)
> +{
> +	for_each_sched_entity(se) {
> +		struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +		if (cfs_rq->next == se)
> +			cfs_rq->next = NULL;
> +		else
> +			break;
> +	}
>  }
>  
>  static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> -	for_each_sched_entity(se)
> -		__clear_buddies(cfs_rq_of(se), se);
> +	if (cfs_rq->last == se)
> +		__clear_buddies_last(se);
> +
> +	if (cfs_rq->next == se)
> +		__clear_buddies_next(se);
>  }
>  

Right, I think this sorta matches with something the Google guys talked
about, they wanted to change pick_next_task() to not always start from the
top but only go up one level when the current level ran out.

It looks ok, just sad that we can now have two hierarchy traversals (and
3 with the next patch).




* Re: [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies
  2011-01-24 17:57   ` Peter Zijlstra
@ 2011-01-24 18:04     ` Rik van Riel
  0 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-24 18:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun,
	Paul Turner

On 01/24/2011 12:57 PM, Peter Zijlstra wrote:
> On Thu, 2011-01-20 at 16:33 -0500, Rik van Riel wrote:
>> The clear_buddies function does not seem to play well with the concept
>> of hierarchical runqueues.  In the following tree, task groups are
>> represented by 'G', tasks by 'T', next by 'n' and last by 'l'.
>>
>>       (nl)
>>      /    \
>>     G(nl)  G
>>     / \     \
>>   T(l) T(n)  T
>>
>> This situation can arise when a task is woken up T(n), and the previously
>> running task T(l) is marked last.
>>
>> When clear_buddies is called from either T(l) or T(n), the next and last
>> buddies of the group G(nl) will be cleared.  This is not the desired
>> result, since we would like to be able to find the other type of buddy
>> in many cases.
>>
>> This is especially a worry when implementing yield_task_fair through the
>> buddy system.
>>
>> The fix is simple: only clear the buddy type that the task itself
>> is indicated to be.  As an added bonus, we stop walking up the tree
>> when the buddy has already been cleared or pointed elsewhere.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>> ---
>>   kernel/sched_fair.c |   30 +++++++++++++++++++++++-------
>>   1 files changed, 23 insertions(+), 7 deletions(-)
>>
>> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
>> index f4ee445..0321473 100644
>> --- a/kernel/sched_fair.c
>> +++ b/kernel/sched_fair.c
>> @@ -784,19 +784,35 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>>   		__enqueue_entity(cfs_rq, se);
>>   }
>>
>> -static void __clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>> +static void __clear_buddies_last(struct sched_entity *se)
>>   {
>> -	if (!se || cfs_rq->last == se)
>> -		cfs_rq->last = NULL;
>> +	for_each_sched_entity(se) {
>> +		struct cfs_rq *cfs_rq = cfs_rq_of(se);
>> +		if (cfs_rq->last == se)
>> +			cfs_rq->last = NULL;
>> +		else
>> +			break;
>> +	}
>> +}
>>
>> -	if (!se || cfs_rq->next == se)
>> -		cfs_rq->next = NULL;
>> +static void __clear_buddies_next(struct sched_entity *se)
>> +{
>> +	for_each_sched_entity(se) {
>> +		struct cfs_rq *cfs_rq = cfs_rq_of(se);
>> +		if (cfs_rq->next == se)
>> +			cfs_rq->next = NULL;
>> +		else
>> +			break;
>> +	}
>>   }
>>
>>   static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>   {
>> -	for_each_sched_entity(se)
>> -		__clear_buddies(cfs_rq_of(se), se);
>> +	if (cfs_rq->last == se)
>> +		__clear_buddies_last(se);
>> +
>> +	if (cfs_rq->next == se)
>> +		__clear_buddies_next(se);
>>   }
>>
>
> Right, I think this sorta matches with something the Google guys talked
> about, they wanted to change pick_next_task() to not always start from the
> top but only go up one level when the current level ran out.
>
> It looks ok, just sad that we can now have two hierarchy traversals (and
> 3 with the next patch).

On the other hand, I don't think we'll actually _do_ the
hierarchy traversal most of the time, since pick_next_entity
calls clear_buddies, every step of the way down the tree.

A hierarchy traversal will only be done if a task already
has one type of buddy set, and then gets another type of
buddy set, before it is rescheduled.

Eg. a task can have ->last set and then call yield, causing
the ->yield buddy to get pointed at itself.  When doing that,
it will walk up the tree, clearing ->last.

I suspect that with this patch, we'll end up doing less
tree traversal than before.
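
The effect of patch 2/8 on the tree from the patch description can be
replayed in userspace.  What follows is a minimal sketch, not kernel code:
the toy_* types, build_tree() and both clear functions are hypothetical
stand-ins, and the second group with its single task is left out because it
holds no buddies.

```c
#include <stddef.h>

struct toy_cfs_rq;

/* Toy stand-in for a sched_entity: a task or a group entity. */
struct toy_se {
	struct toy_se *parent;      /* group entity one level up, NULL at the top */
	struct toy_cfs_rq *cfs_rq;  /* runqueue this entity is queued on */
};

/* Toy stand-in for a cfs_rq with only the two buddy pointers. */
struct toy_cfs_rq {
	struct toy_se *next, *last;
};

/* The left half of the tree in the patch description: G(nl) with T(l), T(n). */
struct toy_tree {
	struct toy_cfs_rq root_rq, group_rq;
	struct toy_se g, tl, tn;
};

void build_tree(struct toy_tree *t)
{
	t->g.parent = NULL;
	t->g.cfs_rq = &t->root_rq;
	t->tl.parent = &t->g;
	t->tl.cfs_rq = &t->group_rq;
	t->tn.parent = &t->g;
	t->tn.cfs_rq = &t->group_rq;
	t->root_rq.next = t->root_rq.last = &t->g;
	t->group_rq.last = &t->tl;
	t->group_rq.next = &t->tn;
}

/* Old behaviour: walk to the top, clearing every buddy that points at se. */
void old_clear_buddies(struct toy_se *se)
{
	for (; se; se = se->parent) {
		if (se->cfs_rq->last == se)
			se->cfs_rq->last = NULL;
		if (se->cfs_rq->next == se)
			se->cfs_rq->next = NULL;
	}
}

/* New behaviour, "last" half: stop as soon as a level does not match. */
void clear_last(struct toy_se *se)
{
	for (; se; se = se->parent) {
		if (se->cfs_rq->last != se)
			break;
		se->cfs_rq->last = NULL;
	}
}

/* New behaviour, "next" half. */
void clear_next(struct toy_se *se)
{
	for (; se; se = se->parent) {
		if (se->cfs_rq->next != se)
			break;
		se->cfs_rq->next = NULL;
	}
}

/* New behaviour: only clear the buddy type the entity itself is marked as. */
void new_clear_buddies(struct toy_se *se)
{
	if (se->cfs_rq->last == se)
		clear_last(se);
	if (se->cfs_rq->next == se)
		clear_next(se);
}
```

Calling old_clear_buddies() on T(l) also wipes the top-level "next" pointer
to G, so T(n) can no longer be found from the top.  new_clear_buddies() on
T(l) leaves the whole "next" chain in place.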


* Re: [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair
  2011-01-20 21:33 ` [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair Rik van Riel
@ 2011-01-24 18:04   ` Peter Zijlstra
  2011-01-24 18:16     ` Rik van Riel
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2011-01-24 18:04 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

On Thu, 2011-01-20 at 16:33 -0500, Rik van Riel wrote:
> Use the buddy mechanism to implement yield_task_fair.  This
> allows us to skip onto the next highest priority se at every
> level in the CFS tree, unless doing so would introduce gross
> unfairness in CPU time distribution.
> 
> We order the buddy selection in pick_next_entity to check
> yield first, then last, then next.  We need next to be able
> to override yield, because it is possible for the "next" and
> "yield" task to be different processes in the same sub-tree
> of the CFS tree.  When they are, we need to go into that
> sub-tree regardless of the "yield" hint, and pick the correct
> entity once we get to the right level.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index dc91a4d..e4e57ff 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -327,7 +327,7 @@ struct cfs_rq {
>  	 * 'curr' points to currently running entity on this cfs_rq.
>  	 * It is set to NULL otherwise (i.e when none are currently running).
>  	 */
> -	struct sched_entity *curr, *next, *last;
> +	struct sched_entity *curr, *next, *last, *yield;

I'd prefer it be called: skip or somesuch..

>  	unsigned int nr_spread_over;
>  
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index ad946fd..f701a51 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -384,6 +384,22 @@ static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
>  	return rb_entry(left, struct sched_entity, run_node);
>  }
>  
> +static struct sched_entity *__pick_second_entity(struct cfs_rq *cfs_rq)
> +{
> +	struct rb_node *left = cfs_rq->rb_leftmost;
> +	struct rb_node *second;
> +
> +	if (!left)
> +		return NULL;
> +
> +	second = rb_next(left);
> +
> +	if (!second)
> +		second = left;
> +
> +	return rb_entry(second, struct sched_entity, run_node);
> +}

So this works because you only ever skip the leftmost, should we perhaps
write this as something like the below?

static struct sched_entity *__pick_next_entity(struct sched_entity *se)
{
	struct rb_node *next = rb_next(&se->run_node);
	if (!next)
		return NULL;
	return rb_entry(next, struct sched_entity, run_node);
}

>  static struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq)
>  {
>  	struct rb_node *last = rb_last(&cfs_rq->tasks_timeline);
> @@ -806,6 +822,17 @@ static void __clear_buddies_next(struct sched_entity *se)
>  	}
>  }
>  
> +static void __clear_buddies_yield(struct sched_entity *se)
> +{
> +	for_each_sched_entity(se) {
> +		struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +		if (cfs_rq->yield == se)
> +			cfs_rq->yield = NULL;
> +		else
> +			break;
> +	}
> +}
> +
>  static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
>  	if (cfs_rq->last == se)
> @@ -813,6 +840,9 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  
>  	if (cfs_rq->next == se)
>  		__clear_buddies_next(se);
> +
> +	if (cfs_rq->yield == se)
> +		__clear_buddies_yield(se);
>  }

The 3rd hierarchy iteration.. :/

>  static void
> @@ -926,13 +956,27 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  static int
>  wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
>  
> +/*
> + * Pick the next process, keeping these things in mind, in this order:
> + * 1) keep things fair between processes/task groups
> + * 2) pick the "next" process, since someone really wants that to run
> + * 3) pick the "last" process, for cache locality
> + * 4) do not run the "yield" process, if something else is available
> + */
>  static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
>  {
>  	struct sched_entity *se = __pick_next_entity(cfs_rq);
>  	struct sched_entity *left = se;
>  
> -	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
> -		se = cfs_rq->next;
> +	/*
> +	 * Avoid running the yield buddy, if running something else can
> +	 * be done without getting too unfair.
> +	 */
> +	if (cfs_rq->yield == se) {
> +		struct sched_entity *second = __pick_second_entity(cfs_rq);
> +		if (wakeup_preempt_entity(second, left) < 1)
> +			se = second;
> +	}
>  
>  	/*
>  	 * Prefer last buddy, try to return the CPU to a preempted task.
> @@ -940,6 +984,12 @@ static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
>  	if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1)
>  		se = cfs_rq->last;
>  
> +	/*
> +	 * Someone really wants this to run. If it's not unfair, run it.
> +	 */
> +	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
> +		se = cfs_rq->next;
> +
>  	clear_buddies(cfs_rq, se);
>  
>  	return se;

This seems to assume ->yield cannot be ->next nor ->last, but I'm not
quite sure that will actually be true.

> @@ -1096,52 +1146,6 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>  	hrtick_update(rq);
>  }
>  
> -/*
> - * sched_yield() support is very simple - we dequeue and enqueue.
> - *
> - * If compat_yield is turned on then we requeue to the end of the tree.
> - */
> -static void yield_task_fair(struct rq *rq)
> -{
> -	struct task_struct *curr = rq->curr;
> -	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
> -	struct sched_entity *rightmost, *se = &curr->se;
> -
> -	/*
> -	 * Are we the only task in the tree?
> -	 */
> -	if (unlikely(rq->nr_running == 1))
> -		return;
> -
> -	clear_buddies(cfs_rq, se);
> -
> -	if (likely(!sysctl_sched_compat_yield) && curr->policy != SCHED_BATCH) {
> -		update_rq_clock(rq);
> -		/*
> -		 * Update run-time statistics of the 'current'.
> -		 */
> -		update_curr(cfs_rq);
> -
> -		return;
> -	}
> -	/*
> -	 * Find the rightmost entry in the rbtree:
> -	 */
> -	rightmost = __pick_last_entity(cfs_rq);
> -	/*
> -	 * Already in the rightmost position?
> -	 */
> -	if (unlikely(!rightmost || entity_before(rightmost, se)))
> -		return;
> -
> -	/*
> -	 * Minimally necessary key value to be last in the tree:
> -	 * Upon rescheduling, sched_class::put_prev_task() will place
> -	 * 'current' within the tree based on its new key value.
> -	 */
> -	se->vruntime = rightmost->vruntime + 1;
> -}
> -
>  #ifdef CONFIG_SMP
>  
>  static void task_waking_fair(struct rq *rq, struct task_struct *p)
> @@ -1660,6 +1664,14 @@ static void set_next_buddy(struct sched_entity *se)
>  	}
>  }
>  
> +static void set_yield_buddy(struct sched_entity *se)
> +{
> +	if (likely(task_of(se)->policy != SCHED_IDLE)) {
> +		for_each_sched_entity(se)
> +			cfs_rq_of(se)->yield = se;
> +	}
> +}
> +
>  /*
>   * Preempt the current task with a newly woken task if needed:
>   */
> @@ -1758,6 +1770,36 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev)
>  	}
>  }
>  
> +/*
> + * sched_yield() is very simple
> + *
> + * The magic of dealing with the ->yield buddy is in pick_next_entity.
> + */
> +static void yield_task_fair(struct rq *rq)
> +{
> +	struct task_struct *curr = rq->curr;
> +	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
> +	struct sched_entity *se = &curr->se;
> +
> +	/*
> +	 * Are we the only task in the tree?
> +	 */
> +	if (unlikely(rq->nr_running == 1))
> +		return;
> +
> +	clear_buddies(cfs_rq, se);
> +
> +	if (curr->policy != SCHED_BATCH) {
> +		update_rq_clock(rq);
> +		/*
> +		 * Update run-time statistics of the 'current'.
> +		 */
> +		update_curr(cfs_rq);
> +	}
> +
> +	set_yield_buddy(se);
> +}

You just lost sysctl_sched_compat_yield, someone might be upset (I
really can't be bothered much with people using sys_yield :-), but if
you're going down that road you want a hunk in kernel/sysctl.c as well I
think.



* Re: [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality
  2011-01-20 21:34 ` [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality Rik van Riel
@ 2011-01-24 18:12   ` Peter Zijlstra
  2011-01-24 18:19     ` Rik van Riel
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2011-01-24 18:12 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

On Thu, 2011-01-20 at 16:34 -0500, Rik van Riel wrote:
> From: Mike Galbraith <efault@gmx.de>
> 
> Currently only implemented for fair class tasks.
> 
> Add a yield_to_task() method to the fair scheduling class, allowing the
> caller of yield_to() to accelerate another thread in its thread group,
> task group.
> 
> Implemented via a scheduler hint, using cfs_rq->next to encourage the
> target being selected.  We can rely on pick_next_entity to keep things
> fair, so no one can accelerate a thread that has already used its fair
> share of CPU time.
> 
> This also means callers should only call yield_to when they really
> mean it.  Calling it too often can result in the scheduler just
> ignoring the hint.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Mike Galbraith <efault@gmx.de>

Patch 5 wants to be merged back in here I think..

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 2c79e92..6c43fc4 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1047,6 +1047,7 @@ struct sched_class {
>  	void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
>  	void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
>  	void (*yield_task) (struct rq *rq);
> +	bool (*yield_to_task) (struct rq *rq, struct task_struct *p, bool preempt);
>  
>  	void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);
>  
> @@ -1943,6 +1944,7 @@ static inline int rt_mutex_getprio(struct task_struct *p)
>  # define rt_mutex_adjust_pi(p)		do { } while (0)
>  #endif
>  
> +extern bool yield_to(struct task_struct *p, bool preempt);
>  extern void set_user_nice(struct task_struct *p, long nice);
>  extern int task_prio(const struct task_struct *p);
>  extern int task_nice(const struct task_struct *p);
> diff --git a/kernel/sched.c b/kernel/sched.c
> index e4e57ff..1f38ed2 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -5270,6 +5270,64 @@ void __sched yield(void)
>  }
>  EXPORT_SYMBOL(yield);
>  
> +/**
> + * yield_to - yield the current processor to another thread in
> + * your thread group, or accelerate that thread toward the
> + * processor it's on.
> + *
> + * It's the caller's job to ensure that the target task struct
> + * can't go away on us before we can do any checks.
> + *
> + * Returns true if we indeed boosted the target task.
> + */
> +bool __sched yield_to(struct task_struct *p, bool preempt)
> +{
> +	struct task_struct *curr = current;
> +	struct rq *rq, *p_rq;
> +	unsigned long flags;
> +	bool yielded = 0;
> +
> +	local_irq_save(flags);
> +	rq = this_rq();
> +
> +again:
> +	p_rq = task_rq(p);
> +	double_rq_lock(rq, p_rq);
> +	while (task_rq(p) != p_rq) {
> +		double_rq_unlock(rq, p_rq);
> +		goto again;
> +	}
> +
> +	if (!curr->sched_class->yield_to_task)
> +		goto out;
> +
> +	if (curr->sched_class != p->sched_class)
> +		goto out;
> +
> +	if (task_running(p_rq, p) || p->state)
> +		goto out;
> +
> +	if (!same_thread_group(p, curr))
> +		goto out;
> +
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> +	if (task_group(p) != task_group(curr))
> +		goto out;
> +#endif
> +
> +	yielded = curr->sched_class->yield_to_task(rq, p, preempt);
> +
> +out:
> +	double_rq_unlock(rq, p_rq);
> +	local_irq_restore(flags);
> +
> +	if (yielded)
> +		yield();

Calling yield() here is funny, you just had all the locks to actually do
it..

> +	return yielded;
> +}
> +EXPORT_SYMBOL_GPL(yield_to);



> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index f701a51..097e936 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1800,6 +1800,23 @@ static void yield_task_fair(struct rq *rq)
>  	set_yield_buddy(se);
>  }
>  
> +static bool yield_to_task_fair(struct rq *rq, struct task_struct *p, bool preempt)
> +{
> +	struct sched_entity *se = &p->se;
> +
> +	if (!se->on_rq)
> +		return false;
> +
> +	/* Tell the scheduler that we'd really like se to run next. */
> +	set_next_buddy(se);
> +
> +	/* Make p's CPU reschedule; pick_next_entity takes care of fairness. */
> +	if (preempt)
> +		resched_task(rq->curr);
> +
> +	return true;
> +}

So here we set ->next, we could be ->last, and after this we'll set
->yield to curr by calling yield().

So if you do this cyclically I can see ->yield == {->next,->last}
happening.


* Re: [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair
  2011-01-24 18:04   ` Peter Zijlstra
@ 2011-01-24 18:16     ` Rik van Riel
  0 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-24 18:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

On 01/24/2011 01:04 PM, Peter Zijlstra wrote:

>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index dc91a4d..e4e57ff 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -327,7 +327,7 @@ struct cfs_rq {
>>   	 * 'curr' points to currently running entity on this cfs_rq.
>>   	 * It is set to NULL otherwise (i.e when none are currently running).
>>   	 */
>> -	struct sched_entity *curr, *next, *last;
>> +	struct sched_entity *curr, *next, *last, *yield;
>
> I'd prefer it be called: skip or somesuch..

I could do that.  Do any of the other scheduler people have
a preference?

>> +static struct sched_entity *__pick_second_entity(struct cfs_rq *cfs_rq)
>> +{
>> +	struct rb_node *left = cfs_rq->rb_leftmost;
>> +	struct rb_node *second;
>> +
>> +	if (!left)
>> +		return NULL;
>> +
>> +	second = rb_next(left);
>> +
>> +	if (!second)
>> +		second = left;
>> +
>> +	return rb_entry(second, struct sched_entity, run_node);
>> +}
>
> So this works because you only ever skip the leftmost, should we perhaps
> write this as something like the below?

Well, pick_next_entity only ever *picks* the leftmost entity,
so there's no reason to skip others.

>> @@ -813,6 +840,9 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>
>>   	if (cfs_rq->next == se)
>>   		__clear_buddies_next(se);
>> +
>> +	if (cfs_rq->yield == se)
>> +		__clear_buddies_yield(se);
>>   }
>
> The 3rd hierarchy iteration.. :/

Except it won't actually walk up the tree above the level
where the buddy actually points at the se.  I suspect the
new code will do less tree walking than the old code.

>> +	/*
>> +	 * Someone really wants this to run. If it's not unfair, run it.
>> +	 */
>> +	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
>> +		se = cfs_rq->next;
>> +
>>   	clear_buddies(cfs_rq, se);
>>
>>   	return se;
>
> This seems to assume ->yield cannot be ->next nor ->last, but I'm not
> quite sure that will actually be true.

On the contrary, I specifically want ->next to be able to
override ->yield, for the reason that the _tasks_ that
have ->next and ->yield set could be inside the same _group_.

What I am assuming is that ->yield and ->last are not the
same task.  This is achieved by yield_task_fair calling
clear_buddies.
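
The ordering described above (yield first, then last, then next, so that
later hints override earlier ones) can be sketched in userspace.  This is a
simplified model under stated assumptions, not the kernel implementation:
the toy_* types are hypothetical, a sorted array stands in for the rbtree,
and fair_enough() replaces wakeup_preempt_entity(...) < 1 with a plain
vruntime-difference check against a made-up granularity.

```c
#include <stddef.h>

/* Toy entity: only the virtual runtime matters here. */
struct toy_se {
	long vruntime;
};

/* Toy runqueue: queue[] is sorted by vruntime, queue[0] is the leftmost. */
struct toy_rq {
	struct toy_se **queue;
	size_t nr;
	struct toy_se *next, *last, *yield;
	long gran;  /* how far behind the leftmost a buddy may be (assumed) */
};

/* Assumed fairness test: the candidate is not too far behind the leftmost. */
int fair_enough(const struct toy_rq *rq, const struct toy_se *cand,
		const struct toy_se *left)
{
	return cand->vruntime - left->vruntime <= rq->gran;
}

struct toy_se *toy_pick_next_entity(struct toy_rq *rq)
{
	struct toy_se *left, *se;

	if (rq->nr == 0)
		return NULL;

	left = se = rq->queue[0];

	/* Avoid the yield buddy if the second entity is a fair alternative. */
	if (rq->yield == se && rq->nr > 1) {
		struct toy_se *second = rq->queue[1];

		if (fair_enough(rq, second, left))
			se = second;
	}

	/* Prefer the last buddy, for cache locality. */
	if (rq->last && fair_enough(rq, rq->last, left))
		se = rq->last;

	/* The next buddy is checked last, so it overrides both hints above. */
	if (rq->next && fair_enough(rq, rq->next, left))
		se = rq->next;

	return se;
}
```

With "yield" and "next" both pointing at the leftmost entity, the function
still returns that entity.  This is the override behaviour that lets the
scheduler descend into a group whose sub-tree contains both the yielding
task and the task that should run next.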

>> +/*
>> + * sched_yield() is very simple
>> + *
>> + * The magic of dealing with the ->yield buddy is in pick_next_entity.
>> + */
>> +static void yield_task_fair(struct rq *rq)
>> +{
>> +	struct task_struct *curr = rq->curr;
>> +	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
>> +	struct sched_entity *se = &curr->se;
>> +
>> +	/*
>> +	 * Are we the only task in the tree?
>> +	 */
>> +	if (unlikely(rq->nr_running == 1))
>> +		return;
>> +
>> +	clear_buddies(cfs_rq, se);
>> +
>> +	if (curr->policy != SCHED_BATCH) {
>> +		update_rq_clock(rq);
>> +		/*
>> +		 * Update run-time statistics of the 'current'.
>> +		 */
>> +		update_curr(cfs_rq);
>> +	}
>> +
>> +	set_yield_buddy(se);
>> +}
>
> You just lost sysctl_sched_compat_yield, someone might be upset (I
> really can't be bothered much with people using sys_yield :-), but if
> you're going down that road you want a hunk in kernel/sysctl.c as well I
> think.

I lost sysctl_sched_compat_yield, because with my code
yield is no longer a noop.

I'd be glad to remove the sysctl.c bits if you want :)


* Re: [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality
  2011-01-24 18:12   ` Peter Zijlstra
@ 2011-01-24 18:19     ` Rik van Riel
  0 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-24 18:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kvm, linux-kernel, Avi Kiviti, Srivatsa Vaddagiri,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

On 01/24/2011 01:12 PM, Peter Zijlstra wrote:
> On Thu, 2011-01-20 at 16:34 -0500, Rik van Riel wrote:
>> From: Mike Galbraith <efault@gmx.de>
>>
>> Currently only implemented for fair class tasks.
>>
>> Add a yield_to_task() method to the fair scheduling class, allowing the
>> caller of yield_to() to accelerate another thread in its thread group,
>> task group.
>>
>> Implemented via a scheduler hint, using cfs_rq->next to encourage the
>> target being selected.  We can rely on pick_next_entity to keep things
>> fair, so no one can accelerate a thread that has already used its fair
>> share of CPU time.
>>
>> This also means callers should only call yield_to when they really
>> mean it.  Calling it too often can result in the scheduler just
>> ignoring the hint.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>> Signed-off-by: Mike Galbraith <efault@gmx.de>
>
> Patch 5 wants to be merged back in here I think..

Agreed, but I wanted Mike's comments first  :)


>> +/**
>> + * yield_to - yield the current processor to another thread in
>> + * your thread group, or accelerate that thread toward the
>> + * processor it's on.
>> + *
>> + * It's the caller's job to ensure that the target task struct
>> + * can't go away on us before we can do any checks.
>> + *
>> + * Returns true if we indeed boosted the target task.
>> + */
>> +bool __sched yield_to(struct task_struct *p, bool preempt)
>> +{
>> +	struct task_struct *curr = current;
>> +	struct rq *rq, *p_rq;
>> +	unsigned long flags;
>> +	bool yielded = 0;
>> +
>> +	local_irq_save(flags);
>> +	rq = this_rq();
>> +
>> +again:
>> +	p_rq = task_rq(p);
>> +	double_rq_lock(rq, p_rq);
>> +	while (task_rq(p) != p_rq) {
>> +		double_rq_unlock(rq, p_rq);
>> +		goto again;
>> +	}
>> +
>> +	if (!curr->sched_class->yield_to_task)
>> +		goto out;
>> +
>> +	if (curr->sched_class != p->sched_class)
>> +		goto out;
>> +
>> +	if (task_running(p_rq, p) || p->state)
>> +		goto out;
>> +
>> +	if (!same_thread_group(p, curr))
>> +		goto out;
>> +
>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>> +	if (task_group(p) != task_group(curr))
>> +		goto out;
>> +#endif
>> +
>> +	yielded = curr->sched_class->yield_to_task(rq, p, preempt);
>> +
>> +out:
>> +	double_rq_unlock(rq, p_rq);
>> +	local_irq_restore(flags);
>> +
>> +	if (yielded)
>> +		yield();
>
> Calling yield() here is funny, you just had all the locks to actually do
> it..

This is us giving up the CPU, which requires not holding locks.

A different thing than us giving the CPU away to someone else.

>> +static bool yield_to_task_fair(struct rq *rq, struct task_struct *p, bool preempt)
>> +{
>> +	struct sched_entity *se = &p->se;
>> +
>> +	if (!se->on_rq)
>> +		return false;
>> +
>> +	/* Tell the scheduler that we'd really like se to run next. */
>> +	set_next_buddy(se);
>> +
>> +	/* Make p's CPU reschedule; pick_next_entity takes care of fairness. */
>> +	if (preempt)
>> +		resched_task(rq->curr);
>> +
>> +	return true;
>> +}
>
> So here we set ->next, we could be ->last, and after this we'll set
> ->yield to curr by calling yield().
>
> So if you do this cyclically I can see ->yield == {->next,->last}
> happening.

That would only happen if we called yield_to with ourselves
as the argument!

There is no caller in the tree that does that - task p is
another task, not ourselves.


* Re: [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu
  2011-01-20 21:36 ` [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu Rik van Riel
@ 2011-01-26 13:01   ` Avi Kivity
  2011-01-26 15:20     ` Rik van Riel
  0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2011-01-26 13:01 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, linux-kernel, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

On 01/20/2011 11:36 PM, Rik van Riel wrote:
> Keep track of which task is running a KVM vcpu.  This helps us
> figure out later what task to wake up if we want to boost a
> vcpu that got preempted.
>
> Unfortunately there are no guarantees that the same task
> always keeps the same vcpu, so we can only track the task
> across a single "run" of the vcpu.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index a055742..9d56ed5 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -81,6 +81,7 @@ struct kvm_vcpu {
>   #endif
>   	int vcpu_id;
>   	struct mutex mutex;
> +	struct pid *pid;
>   	int   cpu;
>   	atomic_t guest_mode;
>   	struct kvm_run *run;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 5225052..86c4905 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -185,6 +185,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>   	vcpu->cpu = -1;
>   	vcpu->kvm = kvm;
>   	vcpu->vcpu_id = id;
> +	vcpu->pid = NULL;
>   	init_waitqueue_head(&vcpu->wq);
>
>   	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> @@ -208,6 +209,8 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_init);
>
>   void kvm_vcpu_uninit(struct kvm_vcpu *vcpu)
>   {
> +	if (vcpu->pid)
> +		put_pid(vcpu->pid);

Unconditional put_pid() suffices.

>   	kvm_arch_vcpu_uninit(vcpu);
>   	free_page((unsigned long)vcpu->run);
>   }
> @@ -1456,6 +1459,14 @@ static long kvm_vcpu_ioctl(struct file *filp,
>   		r = -EINVAL;
>   		if (arg)
>   			goto out;
> +		if (unlikely(vcpu->pid != current->pids[PIDTYPE_PID].pid)) {
> +			/* The thread running this VCPU changed. */
> +			struct pid *oldpid = vcpu->pid;
> +			struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
> +			rcu_assign_pointer(vcpu->pid, newpid);
> +			synchronize_rcu();
> +			put_pid(oldpid);
> +		}

This is executed without any lock held, so two concurrent KVM_RUNs can 
race and cause a double put_pid().

Suggest moving the code to vcpu_load(), where it can execute under the 
protection of vcpu->mutex.
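
The double put can be replayed deterministically in userspace by splitting
the update into two phases and interleaving them by hand.  This is only a
sketch: toy_pid, toy_vcpu, toy_update and the helper names are hypothetical,
a plain int stands in for the real reference count, and the RCU steps are
left out.  toy_put_pid() tolerates NULL, as the kernel's put_pid() does,
which is why no NULL check is needed before calling it.

```c
#include <stddef.h>

/* Toy stand-in for struct pid: just a reference count. */
struct toy_pid {
	int refcount;
};

/* Take a reference. */
struct toy_pid *toy_get_pid(struct toy_pid *pid)
{
	pid->refcount++;
	return pid;
}

/* Drop a reference; NULL is accepted so callers need no check. */
void toy_put_pid(struct toy_pid *pid)
{
	if (!pid)
		return;
	pid->refcount--;
}

/* Toy stand-in for struct kvm_vcpu. */
struct toy_vcpu {
	struct toy_pid *pid;
};

/* Per-caller state of one in-flight pid update. */
struct toy_update {
	struct toy_pid *oldpid;
	struct toy_pid *newpid;
};

/* Phase 1: snapshot the old pid and take a reference on the new one. */
void update_begin(struct toy_vcpu *vcpu, struct toy_update *u,
		  struct toy_pid *mine)
{
	u->oldpid = vcpu->pid;
	u->newpid = toy_get_pid(mine);
}

/* Phase 2: publish the new pid and drop the reference on the snapshot. */
void update_finish(struct toy_vcpu *vcpu, struct toy_update *u)
{
	vcpu->pid = u->newpid;
	toy_put_pid(u->oldpid);
}
```

If two callers both run update_begin() before either runs update_finish(),
they snapshot the same old pid and each drops a reference on it, while the
first caller's new reference is never dropped.  Running both phases for one
caller before the other starts, which is what holding vcpu->mutex would
enforce, keeps every count balanced.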

-- 
error compiling committee.c: too many arguments to function



* Re: [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu
  2011-01-26 13:01   ` Avi Kivity
@ 2011-01-26 15:20     ` Rik van Riel
  0 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2011-01-26 15:20 UTC (permalink / raw)
  To: Avi Kivity
  Cc: kvm, linux-kernel, Srivatsa Vaddagiri, Peter Zijlstra,
	Mike Galbraith, Chris Wright, ttracy, dshaks, Nakajima, Jun

On 01/26/2011 08:01 AM, Avi Kivity wrote:

> Suggest moving the code to vcpu_load(), where it can execute under the
> protection of vcpu->mutex.

I've made the changes suggested by you and Peter, and
will re-post the patch series in a bit...

-- 
All rights reversed


Thread overview: 17+ messages
2011-01-20 21:31 [RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting Rik van Riel
2011-01-20 21:32 ` [RFC -v6 PATCH 1/8] sched: check the right ->nr_running in yield_task_fair Rik van Riel
2011-01-20 21:33 ` [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies Rik van Riel
2011-01-24 17:57   ` Peter Zijlstra
2011-01-24 18:04     ` Rik van Riel
2011-01-20 21:33 ` [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair Rik van Riel
2011-01-24 18:04   ` Peter Zijlstra
2011-01-24 18:16     ` Rik van Riel
2011-01-20 21:34 ` [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality Rik van Riel
2011-01-24 18:12   ` Peter Zijlstra
2011-01-24 18:19     ` Rik van Riel
2011-01-20 21:36 ` [RFC -v6 PATCH 6/8] export pid symbols needed for kvm_vcpu_on_spin Rik van Riel
2011-01-20 21:36 ` [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu Rik van Riel
2011-01-26 13:01   ` Avi Kivity
2011-01-26 15:20     ` Rik van Riel
2011-01-20 21:37 ` [RFC -v6 PATCH 5/8] sched: drop superfluous tests from yield_to Rik van Riel
2011-01-20 21:38 ` [RFC -v6 PATCH 8/8] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin Rik van Riel
