kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking
@ 2020-04-24  5:48 Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 1/5] rcuwait: Fix stale wake call name in comment Davidlohr Bueso
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-24  5:48 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, dave

Hi,

The following is an updated (and hopefully final) revision of the kvm
vcpu to rcuwait conversion[0], following the work on completions using
simple waitqueues.

Patches 1-4 level up the rcuwait api with waitqueues.
Patch 5 converts kvm to use rcuwait.

Changes from v3:
  - picked up maz and peterz's acks for routing via kvm tree.
  - added new patch 4/5 which introduces rcuwait_active. This is to avoid
    directly calling rcu_dereference() to peek at the wait->task.
  - fixed breakage for arm in patch 4/5.
  - removed previous patch 5/5 which updates swait doc as peterz will
    keep it.

Changes from v2:
  - new patch 3 which adds prepare_to_rcuwait and finish_rcuwait helpers.
  - fixed broken sleep and tracepoint semantics in patch 4. (Marc and Paolo)
  
This has only been run tested on x86 but compile tested on mips, powerpc
and arm. It passes all tests from kvm-unit-tests[1].

This series applies on top of current kvm and tip trees.
Please consider for v5.8.

[0] https://lore.kernel.org/lkml/20200320085527.23861-3-dave@stgolabs.net/
[1] git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git

Thanks!

Davidlohr Bueso (5):
  rcuwait: Fix stale wake call name in comment
  rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken
  rcuwait: Introduce prepare_to and finish_rcuwait
  rcuwait: Introduce rcuwait_active()
  kvm: Replace vcpu->swait with rcuwait

 arch/mips/kvm/mips.c                  |  6 ++----
 arch/powerpc/include/asm/kvm_book3s.h |  2 +-
 arch/powerpc/include/asm/kvm_host.h   |  2 +-
 arch/powerpc/kvm/book3s_hv.c          | 22 ++++++++--------------
 arch/powerpc/kvm/powerpc.c            |  2 +-
 arch/x86/kvm/lapic.c                  |  2 +-
 include/linux/kvm_host.h              | 10 +++++-----
 include/linux/rcuwait.h               | 32 ++++++++++++++++++++++++++------
 kernel/exit.c                         |  9 ++++++---
 virt/kvm/arm/arch_timer.c             |  3 ++-
 virt/kvm/arm/arm.c                    |  9 +++++----
 virt/kvm/async_pf.c                   |  3 +--
 virt/kvm/kvm_main.c                   | 19 +++++++++----------
 13 files changed, 68 insertions(+), 53 deletions(-)

--
2.16.4


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] rcuwait: Fix stale wake call name in comment
  2020-04-24  5:48 [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
@ 2020-04-24  5:48 ` Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 2/5] rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken Davidlohr Bueso
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-24  5:48 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, dave, Davidlohr Bueso

The 'trywake' name was renamed to simply 'wake', update the comment.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 389a88cb3081..9f9015f3f6b0 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -236,7 +236,7 @@ void rcuwait_wake_up(struct rcuwait *w)
 	/*
 	 * Order condition vs @task, such that everything prior to the load
 	 * of @task is visible. This is the condition as to why the user called
-	 * rcuwait_trywake() in the first place. Pairs with set_current_state()
+	 * rcuwait_wake() in the first place. Pairs with set_current_state()
 	 * barrier (A) in rcuwait_wait_event().
 	 *
 	 *    WAIT                WAKE
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken
  2020-04-24  5:48 [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 1/5] rcuwait: Fix stale wake call name in comment Davidlohr Bueso
@ 2020-04-24  5:48 ` Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 3/5] rcuwait: Introduce prepare_to and finish_rcuwait Davidlohr Bueso
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-24  5:48 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, dave, Davidlohr Bueso

Propagating the return value of wake_up_process() back to the caller
can come in handy for future users, such as for statistics or
accounting purposes.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/rcuwait.h | 2 +-
 kernel/exit.c           | 7 +++++--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
index 2ffe1ee6d482..6ebb23258a27 100644
--- a/include/linux/rcuwait.h
+++ b/include/linux/rcuwait.h
@@ -25,7 +25,7 @@ static inline void rcuwait_init(struct rcuwait *w)
 	w->task = NULL;
 }
 
-extern void rcuwait_wake_up(struct rcuwait *w);
+extern int rcuwait_wake_up(struct rcuwait *w);
 
 /*
  * The caller is responsible for locking around rcuwait_wait_event(),
diff --git a/kernel/exit.c b/kernel/exit.c
index 9f9015f3f6b0..f3beb637acf7 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -227,8 +227,9 @@ void release_task(struct task_struct *p)
 		goto repeat;
 }
 
-void rcuwait_wake_up(struct rcuwait *w)
+int rcuwait_wake_up(struct rcuwait *w)
 {
+	int ret = 0;
 	struct task_struct *task;
 
 	rcu_read_lock();
@@ -248,8 +249,10 @@ void rcuwait_wake_up(struct rcuwait *w)
 
 	task = rcu_dereference(w->task);
 	if (task)
-		wake_up_process(task);
+		ret = wake_up_process(task);
 	rcu_read_unlock();
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(rcuwait_wake_up);
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] rcuwait: Introduce prepare_to and finish_rcuwait
  2020-04-24  5:48 [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 1/5] rcuwait: Fix stale wake call name in comment Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 2/5] rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken Davidlohr Bueso
@ 2020-04-24  5:48 ` Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 4/5] rcuwait: Introduce rcuwait_active() Davidlohr Bueso
  2020-04-24  5:48 ` [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait Davidlohr Bueso
  4 siblings, 0 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-24  5:48 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, dave, Davidlohr Bueso

This allows further flexibility for some callers to implement
ad-hoc versions of the generic rcuwait_wait_event(). For example,
kvm will need this to maintain tracing semantics. The naming
is of course similar to what waitqueue apis offer.

Also go ahead and make use of rcu_assign_pointer() for both task
writes as it will make the __rcu sparse people happy - this will
be the special nil case, thus no added serialization.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/rcuwait.h | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
index 6ebb23258a27..45bc6604e9b1 100644
--- a/include/linux/rcuwait.h
+++ b/include/linux/rcuwait.h
@@ -29,12 +29,25 @@ extern int rcuwait_wake_up(struct rcuwait *w);
 
 /*
  * The caller is responsible for locking around rcuwait_wait_event(),
- * such that writes to @task are properly serialized.
+ * and [prepare_to/finish]_rcuwait() such that writes to @task are
+ * properly serialized.
  */
+
+static inline void prepare_to_rcuwait(struct rcuwait *w)
+{
+	rcu_assign_pointer(w->task, current);
+}
+
+static inline void finish_rcuwait(struct rcuwait *w)
+{
+        rcu_assign_pointer(w->task, NULL);
+	__set_current_state(TASK_RUNNING);
+}
+
 #define rcuwait_wait_event(w, condition, state)				\
 ({									\
 	int __ret = 0;							\
-	rcu_assign_pointer((w)->task, current);				\
+	prepare_to_rcuwait(w);						\
 	for (;;) {							\
 		/*							\
 		 * Implicit barrier (A) pairs with (B) in		\
@@ -51,9 +64,7 @@ extern int rcuwait_wake_up(struct rcuwait *w);
 									\
 		schedule();						\
 	}								\
-									\
-	WRITE_ONCE((w)->task, NULL);					\
-	__set_current_state(TASK_RUNNING);				\
+	finish_rcuwait(w);						\
 	__ret;								\
 })
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] rcuwait: Introduce rcuwait_active()
  2020-04-24  5:48 [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
                   ` (2 preceding siblings ...)
  2020-04-24  5:48 ` [PATCH 3/5] rcuwait: Introduce prepare_to and finish_rcuwait Davidlohr Bueso
@ 2020-04-24  5:48 ` Davidlohr Bueso
  2020-04-24  9:55   ` Peter Zijlstra
  2020-05-18 10:33   ` Paolo Bonzini
  2020-04-24  5:48 ` [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait Davidlohr Bueso
  4 siblings, 2 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-24  5:48 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, dave, Davidlohr Bueso

This call is lockless and thus should not be trustedblindly,
ie: for wakeup purposes, which is already provided correctly
by rcuwait_wakeup().

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/rcuwait.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
index 45bc6604e9b1..c1414ce44abc 100644
--- a/include/linux/rcuwait.h
+++ b/include/linux/rcuwait.h
@@ -25,6 +25,15 @@ static inline void rcuwait_init(struct rcuwait *w)
 	w->task = NULL;
 }
 
+/*
+ * Note: this provides no serialization and, just as with waitqueues,
+ * requires care to estimate as to whether or not the wait is active.
+ */
+static inline int rcuwait_active(struct rcuwait *w)
+{
+	return !!rcu_dereference(w->task);
+}
+
 extern int rcuwait_wake_up(struct rcuwait *w);
 
 /*
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait
  2020-04-24  5:48 [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
                   ` (3 preceding siblings ...)
  2020-04-24  5:48 ` [PATCH 4/5] rcuwait: Introduce rcuwait_active() Davidlohr Bueso
@ 2020-04-24  5:48 ` Davidlohr Bueso
  2020-04-25 10:24   ` Paolo Bonzini
  2020-05-18  8:51   ` Wanpeng Li
  4 siblings, 2 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-24  5:48 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, dave, Paul Mackerras, kvmarm, linux-mips,
	Davidlohr Bueso

The use of any sort of waitqueue (simple or regular) for
wait/waking vcpus has always been an overkill and semantically
wrong. Because this is per-vcpu (which is blocked) there is
only ever a single waiting vcpu, thus no need for any sort of
queue.

As such, make use of the rcuwait primitive, with the following
considerations:

  - rcuwait already provides the proper barriers that serialize
  concurrent waiter and waker.

  - Task wakeup is done in rcu read critical region, with a
  stable task pointer.

  - Because there is no concurrency among waiters, we need
  not worry about rcuwait_wait_event() calls corrupting
  the wait->task. As a consequence, this saves the locking
  done in swait when modifying the queue. This also applies
  to per-vcore wait for powerpc kvm-hv.

The x86 tscdeadline_latency test mentioned in 8577370fb0cb
("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
latency is reduced by around 15-20% with this change.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-mips@vger.kernel.org
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 arch/mips/kvm/mips.c                  |  6 ++----
 arch/powerpc/include/asm/kvm_book3s.h |  2 +-
 arch/powerpc/include/asm/kvm_host.h   |  2 +-
 arch/powerpc/kvm/book3s_hv.c          | 22 ++++++++--------------
 arch/powerpc/kvm/powerpc.c            |  2 +-
 arch/x86/kvm/lapic.c                  |  2 +-
 include/linux/kvm_host.h              | 10 +++++-----
 virt/kvm/arm/arch_timer.c             |  3 ++-
 virt/kvm/arm/arm.c                    |  9 +++++----
 virt/kvm/async_pf.c                   |  3 +--
 virt/kvm/kvm_main.c                   | 19 +++++++++----------
 11 files changed, 36 insertions(+), 44 deletions(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 8f05dd0a0f4e..fad6acce46e4 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -284,8 +284,7 @@ static enum hrtimer_restart kvm_mips_comparecount_wakeup(struct hrtimer *timer)
 	kvm_mips_callbacks->queue_timer_int(vcpu);
 
 	vcpu->arch.wait = 0;
-	if (swq_has_sleeper(&vcpu->wq))
-		swake_up_one(&vcpu->wq);
+	rcuwait_wake_up(&vcpu->wait);
 
 	return kvm_mips_count_timeout(vcpu);
 }
@@ -511,8 +510,7 @@ int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
 
 	dvcpu->arch.wait = 0;
 
-	if (swq_has_sleeper(&dvcpu->wq))
-		swake_up_one(&dvcpu->wq);
+	rcuwait_wake_up(&dvcpu->wait);
 
 	return 0;
 }
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 506e4df2d730..6e5d85ba588d 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -78,7 +78,7 @@ struct kvmppc_vcore {
 	struct kvm_vcpu *runnable_threads[MAX_SMT_THREADS];
 	struct list_head preempt_list;
 	spinlock_t lock;
-	struct swait_queue_head wq;
+	struct rcuwait wait;
 	spinlock_t stoltb_lock;	/* protects stolen_tb and preempt_tb */
 	u64 stolen_tb;
 	u64 preempt_tb;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 1dc63101ffe1..337047ba4a56 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -751,7 +751,7 @@ struct kvm_vcpu_arch {
 	u8 irq_pending; /* Used by XIVE to signal pending guest irqs */
 	u32 last_inst;
 
-	struct swait_queue_head *wqp;
+	struct rcuwait *waitp;
 	struct kvmppc_vcore *vcore;
 	int ret;
 	int trap;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 93493f0cbfe8..b8d42f523ca7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -230,13 +230,11 @@ static bool kvmppc_ipi_thread(int cpu)
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
 	int cpu;
-	struct swait_queue_head *wqp;
+	struct rcuwait *wait;
 
-	wqp = kvm_arch_vcpu_wq(vcpu);
-	if (swq_has_sleeper(wqp)) {
-		swake_up_one(wqp);
+	wait = kvm_arch_vcpu_get_wait(vcpu);
+	if (rcuwait_wake_up(wait))
 		++vcpu->stat.halt_wakeup;
-	}
 
 	cpu = READ_ONCE(vcpu->arch.thread_cpu);
 	if (cpu >= 0 && kvmppc_ipi_thread(cpu))
@@ -2125,7 +2123,7 @@ static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int id)
 
 	spin_lock_init(&vcore->lock);
 	spin_lock_init(&vcore->stoltb_lock);
-	init_swait_queue_head(&vcore->wq);
+	rcuwait_init(&vcore->wait);
 	vcore->preempt_tb = TB_NIL;
 	vcore->lpcr = kvm->arch.lpcr;
 	vcore->first_vcpuid = id;
@@ -3784,7 +3782,6 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	ktime_t cur, start_poll, start_wait;
 	int do_sleep = 1;
 	u64 block_ns;
-	DECLARE_SWAITQUEUE(wait);
 
 	/* Poll for pending exceptions and ceded state */
 	cur = start_poll = ktime_get();
@@ -3812,10 +3809,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 		}
 	}
 
-	prepare_to_swait_exclusive(&vc->wq, &wait, TASK_INTERRUPTIBLE);
-
 	if (kvmppc_vcore_check_block(vc)) {
-		finish_swait(&vc->wq, &wait);
 		do_sleep = 0;
 		/* If we polled, count this as a successful poll */
 		if (vc->halt_poll_ns)
@@ -3828,8 +3822,8 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	vc->vcore_state = VCORE_SLEEPING;
 	trace_kvmppc_vcore_blocked(vc, 0);
 	spin_unlock(&vc->lock);
-	schedule();
-	finish_swait(&vc->wq, &wait);
+	rcuwait_wait_event(&vc->wait,
+			   kvmppc_vcore_check_block(vc), TASK_INTERRUPTIBLE);
 	spin_lock(&vc->lock);
 	vc->vcore_state = VCORE_INACTIVE;
 	trace_kvmppc_vcore_blocked(vc, 1);
@@ -3940,7 +3934,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 			kvmppc_start_thread(vcpu, vc);
 			trace_kvm_guest_enter(vcpu);
 		} else if (vc->vcore_state == VCORE_SLEEPING) {
-			swake_up_one(&vc->wq);
+		        rcuwait_wake_up(&vc->wait);
 		}
 
 	}
@@ -4279,7 +4273,7 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu)
 	}
 	user_vrsave = mfspr(SPRN_VRSAVE);
 
-	vcpu->arch.wqp = &vcpu->arch.vcore->wq;
+	vcpu->arch.waitp = &vcpu->arch.vcore->wait;
 	vcpu->arch.pgdir = kvm->mm->pgd;
 	vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index e15166b0a16d..4a074b587520 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -751,7 +751,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	if (err)
 		goto out_vcpu_uninit;
 
-	vcpu->arch.wqp = &vcpu->wq;
+	vcpu->arch.waitp = &vcpu->wait;
 	kvmppc_create_vcpu_debugfs(vcpu, vcpu->vcpu_id);
 	return 0;
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9af25c97612a..54345dc645ba 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1833,7 +1833,7 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
 	/* If the preempt notifier has already run, it also called apic_timer_expired */
 	if (!apic->lapic_timer.hv_timer_in_use)
 		goto out;
-	WARN_ON(swait_active(&vcpu->wq));
+	WARN_ON(rcuwait_active(&vcpu->wait));
 	cancel_hv_timer(apic);
 	apic_timer_expired(apic);
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6d58beb65454..fc34021546bd 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -23,7 +23,7 @@
 #include <linux/irqflags.h>
 #include <linux/context_tracking.h>
 #include <linux/irqbypass.h>
-#include <linux/swait.h>
+#include <linux/rcuwait.h>
 #include <linux/refcount.h>
 #include <linux/nospec.h>
 #include <asm/signal.h>
@@ -277,7 +277,7 @@ struct kvm_vcpu {
 	struct mutex mutex;
 	struct kvm_run *run;
 
-	struct swait_queue_head wq;
+	struct rcuwait wait;
 	struct pid __rcu *pid;
 	int sigset_active;
 	sigset_t sigset;
@@ -956,12 +956,12 @@ static inline bool kvm_arch_has_assigned_device(struct kvm *kvm)
 }
 #endif
 
-static inline struct swait_queue_head *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
+static inline struct rcuwait *kvm_arch_vcpu_get_wait(struct kvm_vcpu *vcpu)
 {
 #ifdef __KVM_HAVE_ARCH_WQP
-	return vcpu->arch.wqp;
+	return vcpu->arch.waitp;
 #else
-	return &vcpu->wq;
+	return &vcpu->wait;
 #endif
 }
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 93bd59b46848..d5024416e722 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -571,6 +571,7 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
 	struct timer_map map;
+	struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
 
 	if (unlikely(!timer->enabled))
 		return;
@@ -593,7 +594,7 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 	if (map.emul_ptimer)
 		soft_timer_cancel(&map.emul_ptimer->hrtimer);
 
-	if (swait_active(kvm_arch_vcpu_wq(vcpu)))
+	if (rcuwait_active(wait))
 		kvm_timer_blocking(vcpu);
 
 	/*
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 48d0ec44ad77..479f36d02418 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -579,16 +579,17 @@ void kvm_arm_resume_guest(struct kvm *kvm)
 
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		vcpu->arch.pause = false;
-		swake_up_one(kvm_arch_vcpu_wq(vcpu));
+		rcuwait_wake_up(kvm_arch_vcpu_get_wait(vcpu));
 	}
 }
 
 static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
 {
-	struct swait_queue_head *wq = kvm_arch_vcpu_wq(vcpu);
+	struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
 
-	swait_event_interruptible_exclusive(*wq, ((!vcpu->arch.power_off) &&
-				       (!vcpu->arch.pause)));
+	rcuwait_wait_event(wait,
+			   (!vcpu->arch.power_off) &&(!vcpu->arch.pause),
+			   TASK_INTERRUPTIBLE);
 
 	if (vcpu->arch.power_off || vcpu->arch.pause) {
 		/* Awaken to handle a signal, request we sleep again later. */
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 15e5b037f92d..10b533f641a6 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -80,8 +80,7 @@ static void async_pf_execute(struct work_struct *work)
 
 	trace_kvm_async_pf_completed(addr, cr2_or_gpa);
 
-	if (swq_has_sleeper(&vcpu->wq))
-		swake_up_one(&vcpu->wq);
+	rcuwait_wake_up(&vcpu->wait);
 
 	mmput(mm);
 	kvm_put_kvm(vcpu->kvm);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 74bdb7bf3295..f027ae3598e8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -341,7 +341,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	vcpu->kvm = kvm;
 	vcpu->vcpu_id = id;
 	vcpu->pid = NULL;
-	init_swait_queue_head(&vcpu->wq);
+	rcuwait_init(&vcpu->wait);
 	kvm_async_pf_vcpu_init(vcpu);
 
 	vcpu->pre_pcpu = -1;
@@ -2671,7 +2671,6 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
 	ktime_t start, cur;
-	DECLARE_SWAITQUEUE(wait);
 	bool waited = false;
 	u64 block_ns;
 
@@ -2697,8 +2696,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		} while (single_task_running() && ktime_before(cur, stop));
 	}
 
+	prepare_to_rcuwait(&vcpu->wait);
 	for (;;) {
-		prepare_to_swait_exclusive(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+		set_current_state(TASK_INTERRUPTIBLE);
 
 		if (kvm_vcpu_check_block(vcpu) < 0)
 			break;
@@ -2706,8 +2706,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		waited = true;
 		schedule();
 	}
-
-	finish_swait(&vcpu->wq, &wait);
+	finish_rcuwait(&vcpu->wait);
 	cur = ktime_get();
 out:
 	kvm_arch_vcpu_unblocking(vcpu);
@@ -2738,11 +2737,10 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_block);
 
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 {
-	struct swait_queue_head *wqp;
+	struct rcuwait *wait;
 
-	wqp = kvm_arch_vcpu_wq(vcpu);
-	if (swq_has_sleeper(wqp)) {
-		swake_up_one(wqp);
+	wait = kvm_arch_vcpu_get_wait(vcpu);
+	if (rcuwait_wake_up(wait)) {
 		WRITE_ONCE(vcpu->ready, true);
 		++vcpu->stat.halt_wakeup;
 		return true;
@@ -2884,7 +2882,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 				continue;
 			if (vcpu == me)
 				continue;
-			if (swait_active(&vcpu->wq) && !vcpu_dy_runnable(vcpu))
+			if (rcuwait_active(&vcpu->wait) &&
+			    !vcpu_dy_runnable(vcpu))
 				continue;
 			if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
 				!kvm_arch_vcpu_in_kernel(vcpu))
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] rcuwait: Introduce rcuwait_active()
  2020-04-24  5:48 ` [PATCH 4/5] rcuwait: Introduce rcuwait_active() Davidlohr Bueso
@ 2020-04-24  9:55   ` Peter Zijlstra
  2020-05-18 10:33   ` Paolo Bonzini
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2020-04-24  9:55 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: tglx, pbonzini, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, Davidlohr Bueso

On Thu, Apr 23, 2020 at 10:48:36PM -0700, Davidlohr Bueso wrote:
> This call is lockless and thus should not be trustedblindly,
> ie: for wakeup purposes, which is already provided correctly
> by rcuwait_wakeup().
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait
  2020-04-24  5:48 ` [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait Davidlohr Bueso
@ 2020-04-25 10:24   ` Paolo Bonzini
  2020-04-25 22:30     ` Davidlohr Bueso
  2020-05-18  8:51   ` Wanpeng Li
  1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2020-04-25 10:24 UTC (permalink / raw)
  To: Davidlohr Bueso, tglx
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, Paul Mackerras, kvmarm, linux-mips,
	Davidlohr Bueso

I'm squashing this in to keep the changes compared to the current code minimal,
and queuing the series.

Thanks,

Paolo

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index bbefa2a7f950..feca3118b17f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -230,10 +230,10 @@ static bool kvmppc_ipi_thread(int cpu)
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
 	int cpu;
-	struct rcuwait *wait;
+	struct rcuwait *waitp;
 
-	wait = kvm_arch_vcpu_get_wait(vcpu);
-	if (rcuwait_wake_up(wait))
+	waitp = kvm_arch_vcpu_get_wait(vcpu);
+	if (rcuwait_wake_up(waitp))
 		++vcpu->stat.halt_wakeup;
 
 	cpu = READ_ONCE(vcpu->arch.thread_cpu);
@@ -3814,7 +3814,10 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 		}
 	}
 
+	prepare_to_rcuwait(&vc->wait);
+	set_current_state(TASK_INTERRUPTIBLE);
 	if (kvmppc_vcore_check_block(vc)) {
+		finish_rcuwait(&vc->wait);
 		do_sleep = 0;
 		/* If we polled, count this as a successful poll */
 		if (vc->halt_poll_ns)
@@ -3827,8 +3830,8 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	vc->vcore_state = VCORE_SLEEPING;
 	trace_kvmppc_vcore_blocked(vc, 0);
 	spin_unlock(&vc->lock);
-	rcuwait_wait_event(&vc->wait,
-			   kvmppc_vcore_check_block(vc), TASK_INTERRUPTIBLE);
+	schedule();
+	finish_rcuwait(&vc->wait);
 	spin_lock(&vc->lock);
 	vc->vcore_state = VCORE_INACTIVE;
 	trace_kvmppc_vcore_blocked(vc, 1);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7c2d18c12d87..c671de51a753 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2737,10 +2737,10 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_block);
 
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 {
-	struct rcuwait *wait;
+	struct rcuwait *waitp;
 
-	wait = kvm_arch_vcpu_get_wait(vcpu);
-	if (rcuwait_wake_up(wait)) {
+	waitp = kvm_arch_vcpu_get_wait(vcpu);
+	if (rcuwait_wake_up(waitp)) {
 		WRITE_ONCE(vcpu->ready, true);
 		++vcpu->stat.halt_wakeup;
 		return true;


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait
  2020-04-25 10:24   ` Paolo Bonzini
@ 2020-04-25 22:30     ` Davidlohr Bueso
  0 siblings, 0 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-25 22:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: tglx, peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, Paul Mackerras, kvmarm, linux-mips,
	Davidlohr Bueso

On Sat, 25 Apr 2020, Paolo Bonzini wrote:

>I'm squashing this in to keep the changes compared to the current code minimal,
>and queuing the series.

Thanks!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait
  2020-04-24  5:48 ` [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait Davidlohr Bueso
  2020-04-25 10:24   ` Paolo Bonzini
@ 2020-05-18  8:51   ` Wanpeng Li
  1 sibling, 0 replies; 14+ messages in thread
From: Wanpeng Li @ 2020-05-18  8:51 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Thomas Gleixner, Paolo Bonzini, Peter Zijlstra, Marc Zyngier,
	Sebastian Sewior, Steven Rostedt, Linus Torvalds, will, joel,
	LKML, kvm, Paul Mackerras, kvmarm, linux-mips, Davidlohr Bueso

On Fri, 24 Apr 2020 at 13:53, Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> The use of any sort of waitqueue (simple or regular) for
> wait/waking vcpus has always been an overkill and semantically
> wrong. Because this is per-vcpu (which is blocked) there is
> only ever a single waiting vcpu, thus no need for any sort of
> queue.
>
> As such, make use of the rcuwait primitive, with the following
> considerations:
>
>   - rcuwait already provides the proper barriers that serialize
>   concurrent waiter and waker.
>
>   - Task wakeup is done in rcu read critical region, with a
>   stable task pointer.
>
>   - Because there is no concurrency among waiters, we need
>   not worry about rcuwait_wait_event() calls corrupting
>   the wait->task. As a consequence, this saves the locking
>   done in swait when modifying the queue. This also applies
>   to per-vcore wait for powerpc kvm-hv.
>
> The x86 tscdeadline_latency test mentioned in 8577370fb0cb
> ("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
> latency is reduced by around 15-20% with this change.
>

This is splatting when I run linux guest on latest kvm/queue.

[24726.009187] =============================
[24726.009193] WARNING: suspicious RCU usage
[24726.009201] 5.7.0-rc2+ #3 Not tainted
[24726.009207] -----------------------------
[24726.009215] ./include/linux/rcuwait.h:34 suspicious
rcu_dereference_check() usage!
[24726.009222]
other info that might help us debug this:

[24726.009229]
rcu_scheduler_active = 2, debug_locks = 1
[24726.009237] 2 locks held by qemu-system-x86/6094:
[24726.009244]  #0: ffff88837b6cb990 (&vcpu->mutex){+.+.}-{3:3}, at:
kvm_vcpu_ioctl+0x191/0xbb0 [kvm]
[24726.009347]  #1: ffffc900036c2c68 (&kvm->srcu){....}-{0:0}, at:
kvm_arch_vcpu_ioctl_run+0x17f1/0x5680 [kvm]
[24726.009386]
stack backtrace:
[24726.009394] CPU: 5 PID: 6094 Comm: qemu-system-x86 Not tainted 5.7.0-rc2+ #3
[24726.009400] Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY,
BIOS FBKTC1AUS 02/16/2016
[24726.009405] Call Trace:
[24726.009418]  dump_stack+0x98/0xd5
[24726.009432]  lockdep_rcu_suspicious+0x123/0x170
[24726.009465]  kvm_vcpu_on_spin+0x46f/0x5d0 [kvm]
[24726.009497]  handle_pause+0x7e/0x3e0 [kvm_intel]
[24726.009517]  vmx_handle_exit+0x1fe/0x1000 [kvm_intel]
[24726.009547]  ? kvm_arch_vcpu_ioctl_run+0x17c5/0x5680 [kvm]
[24726.009586]  kvm_arch_vcpu_ioctl_run+0x18f5/0x5680 [kvm]
[24726.009595]  ? check_chain_key+0x26e/0x670
[24726.009651]  ? kvm_arch_vcpu_runnable+0x540/0x540 [kvm]
[24726.009667]  ? tomoyo_execute_permission+0x4b0/0x4b0
[24726.009677]  ? sched_clock+0x31/0x40
[24726.009726]  kvm_vcpu_ioctl+0x5d2/0xbb0 [kvm]
[24726.009754]  ? kvm_vcpu_ioctl+0x5d2/0xbb0 [kvm]
[24726.009786]  ? kvm_set_memory_region+0x90/0x90 [kvm]
[24726.009819]  ? __fget_files+0x2d4/0x420
[24726.009837]  ? do_dup2+0x660/0x660
[24726.009847]  ? lock_acquire+0x1e5/0x660
[24726.009864]  ? tomoyo_file_ioctl+0x19/0x20
[24726.009880]  ksys_ioctl+0x3d2/0x1390
[24726.009900]  ? generic_block_fiemap+0x70/0x70
[24726.009911]  ? rcu_read_lock_sched_held+0xb4/0xe0
[24726.009920]  ? rcu_read_lock_any_held.part.9+0x20/0x20
[24726.009935]  ? __x64_sys_futex+0x1a1/0x400
[24726.009943]  ? __kasan_check_write+0x14/0x20
[24726.009951]  ? switch_fpu_return+0x181/0x3e0
[24726.009963]  ? do_futex+0x14e0/0x14e0
[24726.009970]  ? lockdep_hardirqs_off+0x1df/0x2d0
[24726.009977]  ? syscall_return_slowpath+0x66/0x9d0
[24726.009987]  ? do_syscall_64+0x8e/0xae0
[24726.009995]  ? entry_SYSCALL_64_after_hwframe+0x49/0xb3
[24726.010012]  __x64_sys_ioctl+0x73/0xb0
[24726.010023]  do_syscall_64+0x108/0xae0
[24726.010032]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[24726.010042]  ? syscall_return_slowpath+0x9d0/0x9d0
[24726.010048]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[24726.010059]  ? trace_hardirqs_off_caller+0x28/0x1b0
[24726.010074]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[24726.010092]  entry_SYSCALL_64_after_hwframe+0x49/0xb3


[25372.530417] =============================
[25372.530420] WARNING: suspicious RCU usage
[25372.530425] 5.7.0-rc2+ #3 Not tainted
[25372.530428] -----------------------------
[25372.530432] ./include/linux/rcuwait.h:34 suspicious
rcu_dereference_check() usage!
[25372.530436]
other info that might help us debug this:

[25372.530440]
rcu_scheduler_active = 2, debug_locks = 1
[25372.530443] 1 lock held by qemu-system-x86/6433:
[25372.530447]  #0: ffff88837b6cb990 (&vcpu->mutex){+.+.}-{3:3}, at:
kvm_vcpu_ioctl+0x191/0xbb0 [kvm]
[25372.530483]
stack backtrace:
[25372.530487] CPU: 1 PID: 6433 Comm: qemu-system-x86 Not tainted 5.7.0-rc2+ #3
[25372.530491] Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY,
BIOS FBKTC1AUS 02/16/2016
[25372.530495] Call Trace:
[25372.530504]  dump_stack+0x98/0xd5
[25372.530513]  lockdep_rcu_suspicious+0x123/0x170
[25372.530547]  kvm_lapic_expired_hv_timer+0x1ad/0x1f0 [kvm]
[25372.530559]  vmx_vcpu_run+0x1892/0x2c60 [kvm_intel]
[25372.530568]  ? rcu_preempt_deferred_qs_handler+0x20/0x40
[25372.530579]  ? clear_atomic_switch_msr+0x970/0x970 [kvm_intel]
[25372.530586]  ? rcu_read_lock_any_held.part.9+0x20/0x20
[25372.530613]  kvm_arch_vcpu_ioctl_run+0x1579/0x5680 [kvm]
[25372.530619]  ? check_chain_key+0x26e/0x670
[25372.530654]  ? kvm_arch_vcpu_runnable+0x540/0x540 [kvm]
[25372.530665]  ? tomoyo_execute_permission+0x4b0/0x4b0
[25372.530671]  ? sched_clock+0x31/0x40
[25372.530678]  ? sched_clock_cpu+0x1b/0x1b0
[25372.530706]  kvm_vcpu_ioctl+0x5d2/0xbb0 [kvm]
[25372.530724]  ? kvm_vcpu_ioctl+0x5d2/0xbb0 [kvm]
[25372.530744]  ? kvm_set_memory_region+0x90/0x90 [kvm]
[25372.530764]  ? __fget_files+0x2d4/0x420
[25372.530775]  ? do_dup2+0x660/0x660
[25372.530779]  ? vfs_iter_write+0xb0/0xb0
[25372.530785]  ? rcu_read_lock_held+0xb4/0xc0
[25372.530795]  ? tomoyo_file_ioctl+0x19/0x20
[25372.530806]  ksys_ioctl+0x3d2/0x1390
[25372.530814]  ? do_dup2+0x660/0x660
[25372.530821]  ? generic_block_fiemap+0x70/0x70
[25372.530833]  ? __kasan_check_write+0x14/0x20
[25372.530838]  ? fput_many+0x20/0x140
[25372.530844]  ? fput+0x13/0x20
[25372.530848]  ? do_writev+0x175/0x320
[25372.530856]  ? lockdep_hardirqs_off+0x1df/0x2d0
[25372.530861]  ? syscall_return_slowpath+0x66/0x9d0
[25372.530867]  ? do_syscall_64+0x8e/0xae0
[25372.530872]  ? entry_SYSCALL_64_after_hwframe+0x49/0xb3
[25372.530883]  __x64_sys_ioctl+0x73/0xb0
[25372.530890]  do_syscall_64+0x108/0xae0
[25372.530894]  ? switch_fpu_return+0x181/0x3e0
[25372.530899]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[25372.530905]  ? syscall_return_slowpath+0x9d0/0x9d0
[25372.530908]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[25372.530917]  ? trace_hardirqs_off_caller+0x28/0x1b0
[25372.530926]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[25372.530937]  entry_SYSCALL_64_after_hwframe+0x49/0xb3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] rcuwait: Introduce rcuwait_active()
  2020-04-24  5:48 ` [PATCH 4/5] rcuwait: Introduce rcuwait_active() Davidlohr Bueso
  2020-04-24  9:55   ` Peter Zijlstra
@ 2020-05-18 10:33   ` Paolo Bonzini
  2020-05-18 16:37     ` Davidlohr Bueso
  2020-05-19  0:34     ` Wanpeng Li
  1 sibling, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2020-05-18 10:33 UTC (permalink / raw)
  To: Davidlohr Bueso, tglx
  Cc: peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, Davidlohr Bueso

On 24/04/20 07:48, Davidlohr Bueso wrote:
> +/*
> + * Note: this provides no serialization and, just as with waitqueues,
> + * requires care to estimate as to whether or not the wait is active.
> + */
> +static inline int rcuwait_active(struct rcuwait *w)
> +{
> +	return !!rcu_dereference(w->task);
> +}

This needs to be changed to rcu_access_pointer:


--------------- 8< -----------------
From: Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH] rcuwait: avoid lockdep splats from rcuwait_active()

rcuwait_active only returns whether w->task is not NULL.  This is 
exactly one of the usecases that are mentioned in the documentation
for rcu_access_pointer() where it is correct to bypass lockdep checks.

This avoids a splat from kvm_vcpu_on_spin().

Reported-by: Wanpeng Li <kernellwp@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h
index c1414ce44abc..61c56cca95c4 100644
--- a/include/linux/rcuwait.h
+++ b/include/linux/rcuwait.h
@@ -31,7 +31,7 @@ static inline void rcuwait_init(struct rcuwait *w)
  */
 static inline int rcuwait_active(struct rcuwait *w)
 {
-	return !!rcu_dereference(w->task);
+	return !!rcu_access_pointer(w->task);
 }
 
 extern int rcuwait_wake_up(struct rcuwait *w);


Paolo


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] rcuwait: Introduce rcuwait_active()
  2020-05-18 10:33   ` Paolo Bonzini
@ 2020-05-18 16:37     ` Davidlohr Bueso
  2020-05-19  0:34     ` Wanpeng Li
  1 sibling, 0 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-05-18 16:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: tglx, peterz, maz, bigeasy, rostedt, torvalds, will, joel,
	linux-kernel, kvm, Davidlohr Bueso

On Mon, 18 May 2020, Paolo Bonzini wrote:

>On 24/04/20 07:48, Davidlohr Bueso wrote:
>> +/*
>> + * Note: this provides no serialization and, just as with waitqueues,
>> + * requires care to estimate as to whether or not the wait is active.
>> + */
>> +static inline int rcuwait_active(struct rcuwait *w)
>> +{
>> +	return !!rcu_dereference(w->task);
>> +}
>
>This needs to be changed to rcu_access_pointer:
>
>
>--------------- 8< -----------------
>From: Paolo Bonzini <pbonzini@redhat.com>
>Subject: [PATCH] rcuwait: avoid lockdep splats from rcuwait_active()
>
>rcuwait_active only returns whether w->task is not NULL.  This is
>exactly one of the usecases that are mentioned in the documentation
>for rcu_access_pointer() where it is correct to bypass lockdep checks.
>
>This avoids a splat from kvm_vcpu_on_spin().
>
>Reported-by: Wanpeng Li <kernellwp@gmail.com>
>Cc: Peter Zijlstra <peterz@infradead.org>
>Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Acked-by: Davidlohr Bueso <dbueso@suse.de>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] rcuwait: Introduce rcuwait_active()
  2020-05-18 10:33   ` Paolo Bonzini
  2020-05-18 16:37     ` Davidlohr Bueso
@ 2020-05-19  0:34     ` Wanpeng Li
  1 sibling, 0 replies; 14+ messages in thread
From: Wanpeng Li @ 2020-05-19  0:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Davidlohr Bueso, Thomas Gleixner, Peter Zijlstra, Marc Zyngier,
	Sebastian Sewior, Steven Rostedt, Linus Torvalds, will, joel,
	LKML, kvm, Davidlohr Bueso

On Mon, 18 May 2020 at 18:36, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 24/04/20 07:48, Davidlohr Bueso wrote:
> > +/*
> > + * Note: this provides no serialization and, just as with waitqueues,
> > + * requires care to estimate as to whether or not the wait is active.
> > + */
> > +static inline int rcuwait_active(struct rcuwait *w)
> > +{
> > +     return !!rcu_dereference(w->task);
> > +}
>
> This needs to be changed to rcu_access_pointer:
>
>
> --------------- 8< -----------------
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH] rcuwait: avoid lockdep splats from rcuwait_active()
>
> rcuwait_active only returns whether w->task is not NULL.  This is
> exactly one of the usecases that are mentioned in the documentation
> for rcu_access_pointer() where it is correct to bypass lockdep checks.
>
> This avoids a splat from kvm_vcpu_on_spin().
>
> Reported-by: Wanpeng Li <kernellwp@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Tested-by: Wanpeng Li <wanpengli@tencent.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] rcuwait: Fix stale wake call name in comment
  2020-04-22  4:07 [PATCH v3 -tip 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
@ 2020-04-22  4:07 ` Davidlohr Bueso
  0 siblings, 0 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2020-04-22  4:07 UTC (permalink / raw)
  To: tglx, pbonzini
  Cc: bigeasy, peterz, rostedt, torvalds, will, joel, linux-kernel,
	kvm, dave, Davidlohr Bueso

The 'trywake' name was renamed to simply 'wake', update the comment.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 389a88cb3081..9f9015f3f6b0 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -236,7 +236,7 @@ void rcuwait_wake_up(struct rcuwait *w)
 	/*
 	 * Order condition vs @task, such that everything prior to the load
 	 * of @task is visible. This is the condition as to why the user called
-	 * rcuwait_trywake() in the first place. Pairs with set_current_state()
+	 * rcuwait_wake() in the first place. Pairs with set_current_state()
 	 * barrier (A) in rcuwait_wait_event().
 	 *
 	 *    WAIT                WAKE
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-05-19  0:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-24  5:48 [PATCH v4 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
2020-04-24  5:48 ` [PATCH 1/5] rcuwait: Fix stale wake call name in comment Davidlohr Bueso
2020-04-24  5:48 ` [PATCH 2/5] rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken Davidlohr Bueso
2020-04-24  5:48 ` [PATCH 3/5] rcuwait: Introduce prepare_to and finish_rcuwait Davidlohr Bueso
2020-04-24  5:48 ` [PATCH 4/5] rcuwait: Introduce rcuwait_active() Davidlohr Bueso
2020-04-24  9:55   ` Peter Zijlstra
2020-05-18 10:33   ` Paolo Bonzini
2020-05-18 16:37     ` Davidlohr Bueso
2020-05-19  0:34     ` Wanpeng Li
2020-04-24  5:48 ` [PATCH 5/5] kvm: Replace vcpu->swait with rcuwait Davidlohr Bueso
2020-04-25 10:24   ` Paolo Bonzini
2020-04-25 22:30     ` Davidlohr Bueso
2020-05-18  8:51   ` Wanpeng Li
  -- strict thread matches above, loose matches on Subject: below --
2020-04-22  4:07 [PATCH v3 -tip 0/5] kvm: Use rcuwait for vcpu blocking Davidlohr Bueso
2020-04-22  4:07 ` [PATCH 1/5] rcuwait: Fix stale wake call name in comment Davidlohr Bueso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).