* [PATCH 1/2] KVM: Fix lock holder candidate yield
@ 2018-06-11  7:38 Wanpeng Li
  2018-06-11  7:38 ` [PATCH 2/2] sched/core: Consider affinity constraint when yielding to a task Wanpeng Li
From: Wanpeng Li @ 2018-06-11  7:38 UTC
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Peter Zijlstra, Ingo Molnar

From: Wanpeng Li <wanpengli@tencent.com>

After detecting a pause loop executed by a Lock Waiter in the guest,
the pCPU is yielded to a Lock Holder candidate. However, the candidate
may have its own task affinity constraint, and the current yield logic
yields to it unconditionally, setting its task as the next buddy of CFS
without checking that constraint; this breaks the scheduler. This patch
fixes it by skipping the candidate vCPU if the current pCPU does not
satisfy its affinity constraint.
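
For context, the directed yield path in kvm_vcpu_on_spin() looks
roughly like this (a condensed sketch only; the two-pass candidate
scanning heuristics and wait-queue checks are elided, so this is not
the verbatim code). The diff below adds an affinity check to this loop:

	/* Pause-loop exit: pick a candidate vCPU and yield this pCPU
	 * to it. Condensed sketch, details elided. */
	void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
	{
		struct kvm *kvm = me->kvm;
		struct kvm_vcpu *vcpu;
		int i;

		kvm_for_each_vcpu(i, vcpu, kvm) {
			if (vcpu == me)
				continue;
			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
				continue;
			/* new: skip candidates whose task may not run on
			 * this pCPU (see kvm_vcpu_allow_yield below) */
			if (kvm_vcpu_yield_to(vcpu) > 0)
				break;
		}
	}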

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 virt/kvm/kvm_main.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aa7da1d8e..ccf8907 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2239,17 +2239,40 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
 #endif /* !CONFIG_S390 */
 
-int kvm_vcpu_yield_to(struct kvm_vcpu *target)
+struct task_struct *vcpu_to_task(struct kvm_vcpu *target)
 {
 	struct pid *pid;
 	struct task_struct *task = NULL;
-	int ret = 0;
 
 	rcu_read_lock();
 	pid = rcu_dereference(target->pid);
 	if (pid)
 		task = get_pid_task(pid, PIDTYPE_PID);
 	rcu_read_unlock();
+	return task;
+}
+
+bool kvm_vcpu_allow_yield(struct kvm_vcpu *target)
+{
+	struct task_struct *task = NULL;
+	bool ret = false;
+
+	task = vcpu_to_task(target);
+	if (!task)
+		return ret;
+	if (cpumask_test_cpu(raw_smp_processor_id(), &task->cpus_allowed))
+		ret = true;
+	put_task_struct(task);
+
+	return ret;
+}
+
+int kvm_vcpu_yield_to(struct kvm_vcpu *target)
+{
+	struct task_struct *task = NULL;
+	int ret = 0;
+
+	task = vcpu_to_task(target);
 	if (!task)
 		return ret;
 	ret = yield_to(task, 1);
@@ -2333,6 +2356,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 				continue;
 			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
 				continue;
+			if (!kvm_vcpu_allow_yield(vcpu))
+				continue;
 
 			yielded = kvm_vcpu_yield_to(vcpu);
 			if (yielded > 0) {
-- 
2.7.4


* [PATCH 2/2] sched/core: Consider affinity constraint when yielding to a task
  2018-06-11  7:38 [PATCH 1/2] KVM: Fix lock holder candidate yield Wanpeng Li
@ 2018-06-11  7:38 ` Wanpeng Li
  2018-06-11  8:38   ` Peter Zijlstra
From: Wanpeng Li @ 2018-06-11  7:38 UTC
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Peter Zijlstra, Ingo Molnar

From: Wanpeng Li <wanpengli@tencent.com>

Consider the task's affinity constraint when yielding to a task.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 092f7c4..11001ff 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5102,6 +5102,9 @@ int __sched yield_to(struct task_struct *p, bool preempt)
 	if (task_running(p_rq, p) || p->state)
 		goto out_unlock;
 
+	if (!cpumask_test_cpu(smp_processor_id(), &p->cpus_allowed))
+		goto out_unlock;
+
 	yielded = curr->sched_class->yield_to_task(rq, p, preempt);
 	if (yielded) {
 		schedstat_inc(rq->yld_count);
-- 
2.7.4


* Re: [PATCH 2/2] sched/core: Consider affinity constraint when yielding to a task
  2018-06-11  7:38 ` [PATCH 2/2] sched/core: Consider affinity constraint when yielding to a task Wanpeng Li
@ 2018-06-11  8:38   ` Peter Zijlstra
  2018-06-11  9:03     ` Wanpeng Li
From: Peter Zijlstra @ 2018-06-11  8:38 UTC
  To: Wanpeng Li
  Cc: linux-kernel, kvm, Paolo Bonzini, Radim Krčmář,
	Ingo Molnar

On Mon, Jun 11, 2018 at 03:38:50PM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> Consider the task's affinity constraint when yielding to a task.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 092f7c4..11001ff 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5102,6 +5102,9 @@ int __sched yield_to(struct task_struct *p, bool preempt)
>  	if (task_running(p_rq, p) || p->state)
>  		goto out_unlock;
>  
> +	if (!cpumask_test_cpu(smp_processor_id(), &p->cpus_allowed))
> +		goto out_unlock;
> +
>  	yielded = curr->sched_class->yield_to_task(rq, p, preempt);
>  	if (yielded) {
>  		schedstat_inc(rq->yld_count);

I'm confused... why?

So yield_to(@p) is documented as yielding @curr and getting @p to run
'sooner'. If they're on the same rq, yay, that means we'll likely switch
from @curr to @p; however, if they're not on the same rq, it should
still work, except we'll reschedule 2 CPUs.

Look at the code: yield_to() will lock 2 rqs, rq and p_rq.

First it verifies p_rq == task_rq(p) (@p could've been migrated while
we were waiting to acquire the lock); if this is not so, we unlock and
try again.

Then we check trivial things like actually having a ->yield_to and @curr
and @p being of the same class.

Then we check if @p is in fact already running or not runnable at all,
if either, obviously nothing to do.

So now we have rq and p_rq locked (they need not be the same) and we
call ->yield_to(); on success we force a reschedule of p_rq, unlock the
rqs and reschedule ourselves.

So far, nothing requires @p to be runnable on the current CPU.
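
Condensed, the whole thing looks like this (a sketch; the irq-flags
handling, schedstats and a few other details are elided, so not the
verbatim source):

	int __sched yield_to(struct task_struct *p, bool preempt)
	{
		struct task_struct *curr = current;
		struct rq *rq, *p_rq;
		int yielded = 0;

		rq = this_rq();
	again:
		p_rq = task_rq(p);
		double_rq_lock(rq, p_rq);
		if (task_rq(p) != p_rq) {	/* @p migrated, retry */
			double_rq_unlock(rq, p_rq);
			goto again;
		}

		if (!curr->sched_class->yield_to_task)
			goto out_unlock;
		if (curr->sched_class != p->sched_class)
			goto out_unlock;
		if (task_running(p_rq, p) || p->state)
			goto out_unlock;

		yielded = curr->sched_class->yield_to_task(rq, p, preempt);
		if (yielded && preempt && rq != p_rq)
			resched_curr(p_rq);	/* kick @p's CPU, wherever it is */

	out_unlock:
		double_rq_unlock(rq, p_rq);
		if (yielded)
			schedule();		/* reschedule ourselves */

		return yielded;
	}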

So let us look at yield_to_task_fair(), the only yield_to implementation:

That again checks if @p is in fact runnable, if not, nothing to do.

Then it calls set_next_buddy(&p->se), which will mark @p more likely to
get picked on its rq (p_rq), _not_ our rq. Note how set_next_buddy()
uses cfs_rq_of().

Then it yields @curr on the current cpu.
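
That is, give or take, the entire function (condensed from
kernel/sched/fair.c of this era, not verbatim):

	static bool yield_to_task_fair(struct rq *rq, struct task_struct *p,
				       bool preempt)
	{
		struct sched_entity *se = &p->se;
		/* @p's cfs_rq lives on p_rq, not necessarily this CPU's rq */
		struct cfs_rq *cfs_rq = cfs_rq_of(se);

		if (!se->on_rq || throttled_hierarchy(cfs_rq))
			return false;

		/* Tell the scheduler we'd really like @p to run next. */
		set_next_buddy(se);

		/* Yield @curr on _this_ cpu. */
		yield_task_fair(rq);

		return true;
	}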


Again, nothing cares if @curr and @p are on the same CPU and if @curr is
allowed to run on the current CPU -- there are no migrations.


So.. why?!


* Re: [PATCH 2/2] sched/core: Consider affinity constraint when yielding to a task
  2018-06-11  8:38   ` Peter Zijlstra
@ 2018-06-11  9:03     ` Wanpeng Li
From: Wanpeng Li @ 2018-06-11  9:03 UTC
  To: Peter Zijlstra; +Cc: LKML, kvm, Paolo Bonzini, Radim Krcmar, Ingo Molnar

On Mon, 11 Jun 2018 at 16:39, Peter Zijlstra <peterz@infradead.org> wrote:
[.../...]
>
> Then it calls set_next_buddy(&p->se), which will mark @p more likely to
> get picked on its rq (p_rq) _not_ our rq. Note how set_next_buddy() uses
> cfs_rq_of().

I missed it, thanks for the great explanation. I will drop the two patches.

Regards,
Wanpeng Li

