* [PATCH V2 1/5] kvm/ppc/book3s: Move struct kvmppc_vcore from kvm_host.h to kvm_book3s.h
@ 2016-07-11  7:08 Suraj Jitindar Singh
  2016-07-11  7:08 ` [PATCH V2 2/5] kvm/ppc/book3s_hv: Change vcore element runnable_threads from linked-list to array Suraj Jitindar Singh
                   ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-11  7:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: kvm-ppc, mpe, paulus, benh, kvm, pbonzini, agraf, rkrcmar,
	sjitindarsingh

The next commit will introduce a member to the kvmppc_vcore struct which
references MAX_SMT_THREADS, which is defined in kvm_book3s_asm.h; however,
that file isn't included in kvm_host.h directly. Thus compiling for
certain platforms, such as pmac32_defconfig and ppc64e_defconfig with KVM
enabled, fails because MAX_SMT_THREADS is not defined.

Move the struct kvmppc_vcore definition to kvm_book3s.h which explicitly
includes kvm_book3s_asm.h.
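
To illustrate, the include relationship which makes the new location safe
looks roughly like this (heavily simplified; the array member shown is the
one added by the next patch):

	/* arch/powerpc/include/asm/kvm_book3s_asm.h */
	#define MAX_SMT_THREADS	8

	/* arch/powerpc/include/asm/kvm_book3s.h */
	#include <asm/kvm_book3s_asm.h>

	struct kvmppc_vcore {
		...
		/* MAX_SMT_THREADS is visible at this point */
		struct kvm_vcpu *runnable_threads[MAX_SMT_THREADS];
	};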

---
Change Log:

V1 -> V2:
	- Added patch to series

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/include/asm/kvm_book3s.h | 35 +++++++++++++++++++++++++++++++++++
 arch/powerpc/include/asm/kvm_host.h   | 35 -----------------------------------
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 8f39796..a50c5fe 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -69,6 +69,41 @@ struct hpte_cache {
 	int pagesize;
 };
 
+/*
+ * Struct for a virtual core.
+ * Note: entry_exit_map combines a bitmap of threads that have entered
+ * in the bottom 8 bits and a bitmap of threads that have exited in the
+ * next 8 bits.  This is so that we can atomically set the entry bit
+ * iff the exit map is 0 without taking a lock.
+ */
+struct kvmppc_vcore {
+	int n_runnable;
+	int num_threads;
+	int entry_exit_map;
+	int napping_threads;
+	int first_vcpuid;
+	u16 pcpu;
+	u16 last_cpu;
+	u8 vcore_state;
+	u8 in_guest;
+	struct kvmppc_vcore *master_vcore;
+	struct list_head runnable_threads;
+	struct list_head preempt_list;
+	spinlock_t lock;
+	struct swait_queue_head wq;
+	spinlock_t stoltb_lock;	/* protects stolen_tb and preempt_tb */
+	u64 stolen_tb;
+	u64 preempt_tb;
+	struct kvm_vcpu *runner;
+	struct kvm *kvm;
+	u64 tb_offset;		/* guest timebase - host timebase */
+	ulong lpcr;
+	u32 arch_compat;
+	ulong pcr;
+	ulong dpdes;		/* doorbell state (POWER8) */
+	ulong conferring_threads;
+};
+
 struct kvmppc_vcpu_book3s {
 	struct kvmppc_sid_map sid_map[SID_MAP_NUM];
 	struct {
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index ec35af3..19c6731 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -275,41 +275,6 @@ struct kvm_arch {
 #endif
 };
 
-/*
- * Struct for a virtual core.
- * Note: entry_exit_map combines a bitmap of threads that have entered
- * in the bottom 8 bits and a bitmap of threads that have exited in the
- * next 8 bits.  This is so that we can atomically set the entry bit
- * iff the exit map is 0 without taking a lock.
- */
-struct kvmppc_vcore {
-	int n_runnable;
-	int num_threads;
-	int entry_exit_map;
-	int napping_threads;
-	int first_vcpuid;
-	u16 pcpu;
-	u16 last_cpu;
-	u8 vcore_state;
-	u8 in_guest;
-	struct kvmppc_vcore *master_vcore;
-	struct list_head runnable_threads;
-	struct list_head preempt_list;
-	spinlock_t lock;
-	struct swait_queue_head wq;
-	spinlock_t stoltb_lock;	/* protects stolen_tb and preempt_tb */
-	u64 stolen_tb;
-	u64 preempt_tb;
-	struct kvm_vcpu *runner;
-	struct kvm *kvm;
-	u64 tb_offset;		/* guest timebase - host timebase */
-	ulong lpcr;
-	u32 arch_compat;
-	ulong pcr;
-	ulong dpdes;		/* doorbell state (POWER8) */
-	ulong conferring_threads;
-};
-
 #define VCORE_ENTRY_MAP(vc)	((vc)->entry_exit_map & 0xff)
 #define VCORE_EXIT_MAP(vc)	((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)	(VCORE_EXIT_MAP(vc) != 0)
-- 
2.5.5



* [PATCH V2 2/5] kvm/ppc/book3s_hv: Change vcore element runnable_threads from linked-list to array
  2016-07-11  7:08 [PATCH V2 1/5] kvm/ppc/book3s: Move struct kvmppc_vcore from kvm_host.h to kvm_book3s.h Suraj Jitindar Singh
@ 2016-07-11  7:08 ` Suraj Jitindar Singh
  2016-07-11  7:08 ` [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module Suraj Jitindar Singh
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-11  7:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: kvm-ppc, mpe, paulus, benh, kvm, pbonzini, agraf, rkrcmar,
	sjitindarsingh

The struct kvmppc_vcore is a structure used to store various information
about a virtual core for a kvm guest. The runnable_threads element of the
struct provides a list of all of the currently runnable vcpus on the core
(those in the KVMPPC_VCPU_RUNNABLE state). The previous implementation of
this list was a linked list. The next patch requires that the list can be
iterated over without holding the vcore lock.

Reimplement the runnable_threads list in the kvmppc_vcore struct as an
array. Implement a function to iterate over the valid entries in the array
and update the access sites accordingly.
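
With the array representation, a typical access site (taken from the
conversion in this patch) becomes:

	int i;
	struct kvm_vcpu *vcpu;

	for_each_runnable_thread(i, vcpu, vc) {
		if (signal_pending(vcpu->arch.run_task))
			vcpu->arch.ret = -EINTR;
		...
	}

where for_each_runnable_thread() skips empty (NULL) slots using
READ_ONCE(), so the array can be traversed safely without holding the
vcore lock.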

---
Change Log:

V1 -> V2:
	- Nothing

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 +-
 arch/powerpc/include/asm/kvm_host.h   |  1 -
 arch/powerpc/kvm/book3s_hv.c          | 68 +++++++++++++++++++++--------------
 3 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index a50c5fe..151f817 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -87,7 +87,7 @@ struct kvmppc_vcore {
 	u8 vcore_state;
 	u8 in_guest;
 	struct kvmppc_vcore *master_vcore;
-	struct list_head runnable_threads;
+	struct kvm_vcpu *runnable_threads[MAX_SMT_THREADS];
 	struct list_head preempt_list;
 	spinlock_t lock;
 	struct swait_queue_head wq;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 19c6731..02d06e9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -633,7 +633,6 @@ struct kvm_vcpu_arch {
 	long pgfault_index;
 	unsigned long pgfault_hpte[2];
 
-	struct list_head run_list;
 	struct task_struct *run_task;
 	struct kvm_run *kvm_run;
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e20beae..3bcf9e6 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -57,6 +57,7 @@
 #include <linux/highmem.h>
 #include <linux/hugetlb.h>
 #include <linux/module.h>
+#include <linux/compiler.h>
 
 #include "book3s.h"
 
@@ -96,6 +97,26 @@ MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup to a free host core");
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
+static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc,
+		int *ip)
+{
+	int i = *ip;
+	struct kvm_vcpu *vcpu;
+
+	while (++i < MAX_SMT_THREADS) {
+		vcpu = READ_ONCE(vc->runnable_threads[i]);
+		if (vcpu) {
+			*ip = i;
+			return vcpu;
+		}
+	}
+	return NULL;
+}
+
+/* Used to traverse the list of runnable threads for a given vcore */
+#define for_each_runnable_thread(i, vcpu, vc) \
+	for (i = -1; (vcpu = next_runnable_thread(vc, &i)); )
+
 static bool kvmppc_ipi_thread(int cpu)
 {
 	/* On POWER8 for IPIs to threads in the same core, use msgsnd */
@@ -1492,7 +1513,6 @@ static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
 	if (vcore == NULL)
 		return NULL;
 
-	INIT_LIST_HEAD(&vcore->runnable_threads);
 	spin_lock_init(&vcore->lock);
 	spin_lock_init(&vcore->stoltb_lock);
 	init_swait_queue_head(&vcore->wq);
@@ -1801,7 +1821,7 @@ static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
 	vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
 	spin_unlock_irq(&vcpu->arch.tbacct_lock);
 	--vc->n_runnable;
-	list_del(&vcpu->arch.run_list);
+	WRITE_ONCE(vc->runnable_threads[vcpu->arch.ptid], NULL);
 }
 
 static int kvmppc_grab_hwthread(int cpu)
@@ -2208,10 +2228,10 @@ static bool can_piggyback(struct kvmppc_vcore *pvc, struct core_info *cip,
 
 static void prepare_threads(struct kvmppc_vcore *vc)
 {
-	struct kvm_vcpu *vcpu, *vnext;
+	int i;
+	struct kvm_vcpu *vcpu;
 
-	list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
-				 arch.run_list) {
+	for_each_runnable_thread(i, vcpu, vc) {
 		if (signal_pending(vcpu->arch.run_task))
 			vcpu->arch.ret = -EINTR;
 		else if (vcpu->arch.vpa.update_pending ||
@@ -2258,15 +2278,14 @@ static void collect_piggybacks(struct core_info *cip, int target_threads)
 
 static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 {
-	int still_running = 0;
+	int still_running = 0, i;
 	u64 now;
 	long ret;
-	struct kvm_vcpu *vcpu, *vnext;
+	struct kvm_vcpu *vcpu;
 
 	spin_lock(&vc->lock);
 	now = get_tb();
-	list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
-				 arch.run_list) {
+	for_each_runnable_thread(i, vcpu, vc) {
 		/* cancel pending dec exception if dec is positive */
 		if (now < vcpu->arch.dec_expires &&
 		    kvmppc_core_pending_dec(vcpu))
@@ -2306,8 +2325,8 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 		}
 		if (vc->n_runnable > 0 && vc->runner == NULL) {
 			/* make sure there's a candidate runner awake */
-			vcpu = list_first_entry(&vc->runnable_threads,
-						struct kvm_vcpu, arch.run_list);
+			i = -1;
+			vcpu = next_runnable_thread(vc, &i);
 			wake_up(&vcpu->arch.cpu_run);
 		}
 	}
@@ -2360,7 +2379,7 @@ static inline void kvmppc_set_host_core(int cpu)
  */
 static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 {
-	struct kvm_vcpu *vcpu, *vnext;
+	struct kvm_vcpu *vcpu;
 	int i;
 	int srcu_idx;
 	struct core_info core_info;
@@ -2396,8 +2415,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 	 */
 	if ((threads_per_core > 1) &&
 	    ((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
-		list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
-					 arch.run_list) {
+		for_each_runnable_thread(i, vcpu, vc) {
 			vcpu->arch.ret = -EBUSY;
 			kvmppc_remove_runnable(vc, vcpu);
 			wake_up(&vcpu->arch.cpu_run);
@@ -2476,8 +2494,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 		active |= 1 << thr;
 		list_for_each_entry(pvc, &core_info.vcs[sub], preempt_list) {
 			pvc->pcpu = pcpu + thr;
-			list_for_each_entry(vcpu, &pvc->runnable_threads,
-					    arch.run_list) {
+			for_each_runnable_thread(i, vcpu, pvc) {
 				kvmppc_start_thread(vcpu, pvc);
 				kvmppc_create_dtl_entry(vcpu, pvc);
 				trace_kvm_guest_enter(vcpu);
@@ -2610,7 +2627,7 @@ static void kvmppc_wait_for_exec(struct kvmppc_vcore *vc,
 static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 {
 	struct kvm_vcpu *vcpu;
-	int do_sleep = 1;
+	int do_sleep = 1, i;
 	DECLARE_SWAITQUEUE(wait);
 
 	prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
@@ -2619,7 +2636,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	 * Check one last time for pending exceptions and ceded state after
 	 * we put ourselves on the wait queue
 	 */
-	list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
+	for_each_runnable_thread(i, vcpu, vc) {
 		if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
 			do_sleep = 0;
 			break;
@@ -2643,9 +2660,9 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 
 static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
-	int n_ceded;
+	int n_ceded, i;
 	struct kvmppc_vcore *vc;
-	struct kvm_vcpu *v, *vn;
+	struct kvm_vcpu *v;
 
 	trace_kvmppc_run_vcpu_enter(vcpu);
 
@@ -2665,7 +2682,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	vcpu->arch.stolen_logged = vcore_stolen_time(vc, mftb());
 	vcpu->arch.state = KVMPPC_VCPU_RUNNABLE;
 	vcpu->arch.busy_preempt = TB_NIL;
-	list_add_tail(&vcpu->arch.run_list, &vc->runnable_threads);
+	WRITE_ONCE(vc->runnable_threads[vcpu->arch.ptid], vcpu);
 	++vc->n_runnable;
 
 	/*
@@ -2705,8 +2722,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 			kvmppc_wait_for_exec(vc, vcpu, TASK_INTERRUPTIBLE);
 			continue;
 		}
-		list_for_each_entry_safe(v, vn, &vc->runnable_threads,
-					 arch.run_list) {
+		for_each_runnable_thread(i, v, vc) {
 			kvmppc_core_prepare_to_enter(v);
 			if (signal_pending(v->arch.run_task)) {
 				kvmppc_remove_runnable(vc, v);
@@ -2719,7 +2735,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 		if (!vc->n_runnable || vcpu->arch.state != KVMPPC_VCPU_RUNNABLE)
 			break;
 		n_ceded = 0;
-		list_for_each_entry(v, &vc->runnable_threads, arch.run_list) {
+		for_each_runnable_thread(i, v, vc) {
 			if (!v->arch.pending_exceptions)
 				n_ceded += v->arch.ceded;
 			else
@@ -2758,8 +2774,8 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 
 	if (vc->n_runnable && vc->vcore_state == VCORE_INACTIVE) {
 		/* Wake up some vcpu to run the core */
-		v = list_first_entry(&vc->runnable_threads,
-				     struct kvm_vcpu, arch.run_list);
+		i = -1;
+		v = next_runnable_thread(vc, &i);
 		wake_up(&v->arch.cpu_run);
 	}
 
-- 
2.5.5



* [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module
  2016-07-11  7:08 [PATCH V2 1/5] kvm/ppc/book3s: Move struct kvmppc_vcore from kvm_host.h to kvm_book3s.h Suraj Jitindar Singh
  2016-07-11  7:08 ` [PATCH V2 2/5] kvm/ppc/book3s_hv: Change vcore element runnable_threads from linked-list to array Suraj Jitindar Singh
@ 2016-07-11  7:08 ` Suraj Jitindar Singh
  2016-07-11 16:57   ` David Matlack
  2016-07-11  7:08 ` [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics Suraj Jitindar Singh
  2016-07-11  7:08 ` [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats Suraj Jitindar Singh
  3 siblings, 1 reply; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-11  7:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: kvm-ppc, mpe, paulus, benh, kvm, pbonzini, agraf, rkrcmar,
	sjitindarsingh

This patch introduces new halt polling functionality into the kvm_hv kernel
module. When a vcore is idle it will poll for some period of time before
scheduling itself out.

When all of the runnable vcpus on a vcore have ceded (and thus the vcore is
idle) we schedule ourselves out to allow something else to run. In the
event that we need to wake up very quickly (for example, when an interrupt
arrives), we have to wait until we get scheduled again.

Implement halt polling so that when a vcore is idle, and before scheduling
ourselves out, we poll for vcpus in the runnable_threads list which have
pending exceptions or which leave the ceded state. If we poll successfully
then we can get back into the guest very quickly without ever scheduling
ourselves out; otherwise we schedule ourselves out as before.

Testing of this patch with a TCP round-robin test between two guests with
virtio network interfaces has found a decrease in round-trip time from
~140us to ~115us. A performance gain is only seen when going out of and
back into the guest often and quickly; otherwise there is no net benefit
from the polling. The polling interval is adjusted so that it is reduced
when we are often scheduled out for long periods of time and increased
when we often poll successfully. The rate at which the polling interval
grows or shrinks, and the maximum polling interval, can be set through
module parameters.

Based on the implementation in the generic kvm module by Wanpeng Li and
Paolo Bonzini, and on direction from Paul Mackerras.
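
With the default module parameters (halt_poll_ns_grow = 2,
halt_poll_ns_shrink = 0), the per-vcore polling interval evolves roughly
as follows:

	successful poll:  0 -> 10000 -> 20000 -> 40000 -> ... ns,
	                  capped at halt_poll_max_ns
	slept too long:   any value -> 0 (a shrink factor of 0 resets
	                  the interval rather than dividing it)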

---
Change Log:

V1 -> V2:
	- Nothing

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/include/asm/kvm_book3s.h |   1 +
 arch/powerpc/include/asm/kvm_host.h   |   1 +
 arch/powerpc/kvm/book3s_hv.c          | 115 +++++++++++++++++++++++++++++-----
 arch/powerpc/kvm/trace_hv.h           |  22 +++++++
 4 files changed, 125 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 151f817..c261f52 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -102,6 +102,7 @@ struct kvmppc_vcore {
 	ulong pcr;
 	ulong dpdes;		/* doorbell state (POWER8) */
 	ulong conferring_threads;
+	unsigned int halt_poll_ns;
 };
 
 struct kvmppc_vcpu_book3s {
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 02d06e9..610f393 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -294,6 +294,7 @@ struct kvm_arch {
 #define VCORE_SLEEPING	3
 #define VCORE_RUNNING	4
 #define VCORE_EXITING	5
+#define VCORE_POLLING	6
 
 /*
  * Struct used to manage memory for a virtual processor area
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3bcf9e6..0d8ce14 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -94,6 +94,23 @@ module_param_cb(h_ipi_redirect, &module_param_ops, &h_ipi_redirect,
 MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup to a free host core");
 #endif
 
+/* Maximum halt poll interval defaults to KVM_HALT_POLL_NS_DEFAULT */
+static unsigned int halt_poll_max_ns = KVM_HALT_POLL_NS_DEFAULT;
+module_param(halt_poll_max_ns, uint, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(halt_poll_max_ns, "Maximum halt poll time in ns");
+
+/* Factor by which the vcore halt poll interval is grown, default is to double
+ */
+static unsigned int halt_poll_ns_grow = 2;
+module_param(halt_poll_ns_grow, uint, S_IRUGO);
+MODULE_PARM_DESC(halt_poll_ns_grow, "Factor halt poll time is grown by");
+
+/* Factor by which the vcore halt poll interval is shrunk, default is to reset
+ */
+static unsigned int halt_poll_ns_shrink;
+module_param(halt_poll_ns_shrink, uint, S_IRUGO);
+MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll time is shrunk by");
+
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
@@ -2620,32 +2637,82 @@ static void kvmppc_wait_for_exec(struct kvmppc_vcore *vc,
 	finish_wait(&vcpu->arch.cpu_run, &wait);
 }
 
+static void grow_halt_poll_ns(struct kvmppc_vcore *vc)
+{
+	/* 10us base */
+	if (vc->halt_poll_ns == 0 && halt_poll_ns_grow)
+		vc->halt_poll_ns = 10000;
+	else
+		vc->halt_poll_ns *= halt_poll_ns_grow;
+
+	if (vc->halt_poll_ns > halt_poll_max_ns)
+		vc->halt_poll_ns = halt_poll_max_ns;
+}
+
+static void shrink_halt_poll_ns(struct kvmppc_vcore *vc)
+{
+	if (halt_poll_ns_shrink == 0)
+		vc->halt_poll_ns = 0;
+	else
+		vc->halt_poll_ns /= halt_poll_ns_shrink;
+}
+
+/* Check to see if any of the runnable vcpus on the vcore have pending
+ * exceptions or are no longer ceded
+ */
+static int kvmppc_vcore_check_block(struct kvmppc_vcore *vc)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	for_each_runnable_thread(i, vcpu, vc) {
+		if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded)
+			return 1;
+	}
+
+	return 0;
+}
+
 /*
  * All the vcpus in this vcore are idle, so wait for a decrementer
  * or external interrupt to one of the vcpus.  vc->lock is held.
  */
 static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 {
-	struct kvm_vcpu *vcpu;
-	int do_sleep = 1, i;
+	int do_sleep = 1;
+	ktime_t cur, start;
+	u64 block_ns;
 	DECLARE_SWAITQUEUE(wait);
 
-	prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
+	/* Poll for pending exceptions and ceded state */
+	cur = start = ktime_get();
+	if (vc->halt_poll_ns) {
+		ktime_t stop = ktime_add_ns(start, vc->halt_poll_ns);
 
-	/*
-	 * Check one last time for pending exceptions and ceded state after
-	 * we put ourselves on the wait queue
-	 */
-	for_each_runnable_thread(i, vcpu, vc) {
-		if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
-			do_sleep = 0;
-			break;
-		}
+		vc->vcore_state = VCORE_POLLING;
+		spin_unlock(&vc->lock);
+
+		do {
+			if (kvmppc_vcore_check_block(vc)) {
+				do_sleep = 0;
+				break;
+			}
+			cur = ktime_get();
+		} while (ktime_before(cur, stop));
+
+		spin_lock(&vc->lock);
+		vc->vcore_state = VCORE_INACTIVE;
+
+		if (!do_sleep)
+			goto out;
 	}
 
-	if (!do_sleep) {
+	prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
+
+	if (kvmppc_vcore_check_block(vc)) {
 		finish_swait(&vc->wq, &wait);
-		return;
+		do_sleep = 0;
+		goto out;
 	}
 
 	vc->vcore_state = VCORE_SLEEPING;
@@ -2656,6 +2723,26 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	spin_lock(&vc->lock);
 	vc->vcore_state = VCORE_INACTIVE;
 	trace_kvmppc_vcore_blocked(vc, 1);
+
+	cur = ktime_get();
+
+out:
+	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
+
+	if (halt_poll_max_ns) {
+		if (block_ns <= vc->halt_poll_ns)
+			;
+		/* We slept and blocked for longer than the max halt time */
+		else if (vc->halt_poll_ns && block_ns > halt_poll_max_ns)
+			shrink_halt_poll_ns(vc);
+		/* We slept and our poll time is too small */
+		else if (vc->halt_poll_ns < halt_poll_max_ns &&
+				block_ns < halt_poll_max_ns)
+			grow_halt_poll_ns(vc);
+	} else
+		vc->halt_poll_ns = 0;
+
+	trace_kvmppc_vcore_wakeup(do_sleep, block_ns);
 }
 
 static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/trace_hv.h b/arch/powerpc/kvm/trace_hv.h
index 33d9daf..fb21990 100644
--- a/arch/powerpc/kvm/trace_hv.h
+++ b/arch/powerpc/kvm/trace_hv.h
@@ -432,6 +432,28 @@ TRACE_EVENT(kvmppc_vcore_blocked,
 		   __entry->runner_vcpu, __entry->n_runnable, __entry->tgid)
 );
 
+TRACE_EVENT(kvmppc_vcore_wakeup,
+	TP_PROTO(int do_sleep, __u64 ns),
+
+	TP_ARGS(do_sleep, ns),
+
+	TP_STRUCT__entry(
+		__field(__u64,  ns)
+		__field(int,    waited)
+		__field(pid_t,  tgid)
+	),
+
+	TP_fast_assign(
+		__entry->ns     = ns;
+		__entry->waited = do_sleep;
+		__entry->tgid   = current->tgid;
+	),
+
+	TP_printk("%s time %lld ns, tgid=%d",
+		__entry->waited ? "wait" : "poll",
+		__entry->ns, __entry->tgid)
+);
+
 TRACE_EVENT(kvmppc_run_vcpu_enter,
 	TP_PROTO(struct kvm_vcpu *vcpu),
 
-- 
2.5.5



* [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11  7:08 [PATCH V2 1/5] kvm/ppc/book3s: Move struct kvmppc_vcore from kvm_host.h to kvm_book3s.h Suraj Jitindar Singh
  2016-07-11  7:08 ` [PATCH V2 2/5] kvm/ppc/book3s_hv: Change vcore element runnable_threads from linked-list to array Suraj Jitindar Singh
  2016-07-11  7:08 ` [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module Suraj Jitindar Singh
@ 2016-07-11  7:08 ` Suraj Jitindar Singh
  2016-07-11 16:51   ` David Matlack
  2016-07-11  7:08 ` [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats Suraj Jitindar Singh
  3 siblings, 1 reply; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-11  7:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: kvm-ppc, mpe, paulus, benh, kvm, pbonzini, agraf, rkrcmar,
	sjitindarsingh

vcpus have statistics associated with them which can be viewed in debugfs.
Currently the vcpu_stat_get() and vcpu_stat_get_per_vm() functions assume
that all of these statistics are represented as 32-bit numbers. The next
patch adds some 64-bit statistics, so add provisioning for the display of
64-bit vcpu statistics.
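
With this in place a 64-bit vcpu stat is declared and exported in the
same way as the existing u32 stats; for example, the next patch adds:

	{ "halt_poll_time_ns",	VCPU_STAT_U64(halt_poll_time) },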

---
Change Log:

V1 -> V2:
	- Nothing

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/kvm/book3s.c |  1 +
 include/linux/kvm_host.h  |  1 +
 virt/kvm/kvm_main.c       | 60 +++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 47018fc..ed9132b 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -40,6 +40,7 @@
 #include "trace.h"
 
 #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+#define VCPU_STAT_U64(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU_U64
 
 /* #define EXIT_DEBUG */
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1c9c973..667b30e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -991,6 +991,7 @@ static inline bool kvm_is_error_gpa(struct kvm *kvm, gpa_t gpa)
 enum kvm_stat_kind {
 	KVM_STAT_VM,
 	KVM_STAT_VCPU,
+	KVM_STAT_VCPU_U64,
 };
 
 struct kvm_stat_data {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 48bd520..7ab5901 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3566,6 +3566,20 @@ static int vcpu_stat_get_per_vm(void *data, u64 *val)
 	return 0;
 }
 
+static int vcpu_stat_u64_get_per_vm(void *data, u64 *val)
+{
+	int i;
+	struct kvm_stat_data *stat_data = (struct kvm_stat_data *)data;
+	struct kvm_vcpu *vcpu;
+
+	*val = 0;
+
+	kvm_for_each_vcpu(i, vcpu, stat_data->kvm)
+		*val += *(u64 *)((void *)vcpu + stat_data->offset);
+
+	return 0;
+}
+
 static int vcpu_stat_get_per_vm_open(struct inode *inode, struct file *file)
 {
 	__simple_attr_check_format("%llu\n", 0ull);
@@ -3573,6 +3587,13 @@ static int vcpu_stat_get_per_vm_open(struct inode *inode, struct file *file)
 				 NULL, "%llu\n");
 }
 
+static int vcpu_stat_u64_get_per_vm_open(struct inode *inode, struct file *file)
+{
+	__simple_attr_check_format("%llu\n", 0ull);
+	return kvm_debugfs_open(inode, file, vcpu_stat_u64_get_per_vm,
+				 NULL, "%llu\n");
+}
+
 static const struct file_operations vcpu_stat_get_per_vm_fops = {
 	.owner   = THIS_MODULE,
 	.open    = vcpu_stat_get_per_vm_open,
@@ -3582,9 +3603,19 @@ static const struct file_operations vcpu_stat_get_per_vm_fops = {
 	.llseek  = generic_file_llseek,
 };
 
+static const struct file_operations vcpu_stat_u64_get_per_vm_fops = {
+	.owner   = THIS_MODULE,
+	.open    = vcpu_stat_u64_get_per_vm_open,
+	.release = kvm_debugfs_release,
+	.read    = simple_attr_read,
+	.write   = simple_attr_write,
+	.llseek  = generic_file_llseek,
+};
+
 static const struct file_operations *stat_fops_per_vm[] = {
-	[KVM_STAT_VCPU] = &vcpu_stat_get_per_vm_fops,
-	[KVM_STAT_VM]   = &vm_stat_get_per_vm_fops,
+	[KVM_STAT_VCPU]		= &vcpu_stat_get_per_vm_fops,
+	[KVM_STAT_VCPU_U64]	= &vcpu_stat_u64_get_per_vm_fops,
+	[KVM_STAT_VM]		= &vm_stat_get_per_vm_fops,
 };
 
 static int vm_stat_get(void *_offset, u64 *val)
@@ -3627,9 +3658,30 @@ static int vcpu_stat_get(void *_offset, u64 *val)
 
 DEFINE_SIMPLE_ATTRIBUTE(vcpu_stat_fops, vcpu_stat_get, NULL, "%llu\n");
 
+static int vcpu_stat_u64_get(void *_offset, u64 *val)
+{
+	unsigned offset = (long)_offset;
+	struct kvm *kvm;
+	struct kvm_stat_data stat_tmp = {.offset = offset};
+	u64 tmp_val;
+
+	*val = 0;
+	spin_lock(&kvm_lock);
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		stat_tmp.kvm = kvm;
+		vcpu_stat_u64_get_per_vm((void *)&stat_tmp, &tmp_val);
+		*val += tmp_val;
+	}
+	spin_unlock(&kvm_lock);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(vcpu_stat_u64_fops, vcpu_stat_u64_get, NULL, "%llu\n");
+
 static const struct file_operations *stat_fops[] = {
-	[KVM_STAT_VCPU] = &vcpu_stat_fops,
-	[KVM_STAT_VM]   = &vm_stat_fops,
+	[KVM_STAT_VCPU]		= &vcpu_stat_fops,
+	[KVM_STAT_VCPU_U64]	= &vcpu_stat_u64_fops,
+	[KVM_STAT_VM]		= &vm_stat_fops,
 };
 
 static int kvm_init_debug(void)
-- 
2.5.5



* [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats
  2016-07-11  7:08 [PATCH V2 1/5] kvm/ppc/book3s: Move struct kvmppc_vcore from kvm_host.h to kvm_book3s.h Suraj Jitindar Singh
                   ` (2 preceding siblings ...)
  2016-07-11  7:08 ` [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics Suraj Jitindar Singh
@ 2016-07-11  7:08 ` Suraj Jitindar Singh
  2016-07-11 16:49   ` David Matlack
  3 siblings, 1 reply; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-11  7:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: kvm-ppc, mpe, paulus, benh, kvm, pbonzini, agraf, rkrcmar,
	sjitindarsingh

vcpu stats are used to collect information about a vcpu which can be viewed
in debugfs. For example, halt_attempted_poll and halt_successful_poll are
used to keep track of the number of times the vcpu attempts to poll and
polls successfully. These stats are currently not used on powerpc.

Increment the halt_attempted_poll and halt_successful_poll vcpu stats for
powerpc. Since these stats are summed over all the vcpus for all running
guests it doesn't matter which vcpu they are attributed to, thus we choose
the current runner vcpu of the vcore.

Also add new vcpu stats, halt_poll_time and halt_wait_time, to accumulate
the total time spent polling and waiting respectively, and
halt_successful_wait to accumulate the number of times the vcpu waits.
Given that halt_poll_time and halt_wait_time are expressed in nanoseconds
it is necessary to represent these as 64-bit quantities, otherwise they
would overflow after only about 4 seconds.

Given that the total time spent either polling or waiting will be known,
along with the number of times each was done, it will be possible to
determine the average poll and wait times. This gives the ability to tune
the kvm module parameters based on the calculated average wait and poll
times.
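
As a rough illustration of the tuning this enables (hypothetical
numbers):

	avg_poll_ns = halt_poll_time_ns / halt_successful_poll
	avg_wait_ns = halt_wait_time_ns / halt_successful_wait

e.g. halt_wait_time_ns = 5000000 over halt_successful_wait = 100 gives an
average wait of ~50us. (These are approximations: a halt which polls and
then sleeps accounts all of its time to halt_wait_time.)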

---
Change Log:

V1 -> V2:
	- Nothing

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/include/asm/kvm_host.h |  3 +++
 arch/powerpc/kvm/book3s.c           |  3 +++
 arch/powerpc/kvm/book3s_hv.c        | 14 +++++++++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 610f393..66a7198 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -114,8 +114,11 @@ struct kvm_vcpu_stat {
 	u32 emulated_inst_exits;
 	u32 dec_exits;
 	u32 ext_intr_exits;
+	u64 halt_poll_time;
+	u64 halt_wait_time;
 	u32 halt_successful_poll;
 	u32 halt_attempted_poll;
+	u32 halt_successful_wait;
 	u32 halt_poll_invalid;
 	u32 halt_wakeup;
 	u32 dbell_exits;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index ed9132b..6217bea 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -53,8 +53,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "dec",         VCPU_STAT(dec_exits) },
 	{ "ext_intr",    VCPU_STAT(ext_intr_exits) },
 	{ "queue_intr",  VCPU_STAT(queue_intr) },
+	{ "halt_poll_time_ns",		VCPU_STAT_U64(halt_poll_time) },
+	{ "halt_wait_time_ns",		VCPU_STAT_U64(halt_wait_time) },
 	{ "halt_successful_poll", VCPU_STAT(halt_successful_poll), },
 	{ "halt_attempted_poll", VCPU_STAT(halt_attempted_poll), },
+	{ "halt_successful_wait",	VCPU_STAT(halt_successful_wait) },
 	{ "halt_poll_invalid", VCPU_STAT(halt_poll_invalid) },
 	{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
 	{ "pf_storage",  VCPU_STAT(pf_storage) },
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 0d8ce14..a0dae63 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2688,6 +2688,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	cur = start = ktime_get();
 	if (vc->halt_poll_ns) {
 		ktime_t stop = ktime_add_ns(start, vc->halt_poll_ns);
+		++vc->runner->stat.halt_attempted_poll;
 
 		vc->vcore_state = VCORE_POLLING;
 		spin_unlock(&vc->lock);
@@ -2703,8 +2704,10 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 		spin_lock(&vc->lock);
 		vc->vcore_state = VCORE_INACTIVE;
 
-		if (!do_sleep)
+		if (!do_sleep) {
+			++vc->runner->stat.halt_successful_poll;
 			goto out;
+		}
 	}
 
 	prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
@@ -2712,6 +2715,9 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	if (kvmppc_vcore_check_block(vc)) {
 		finish_swait(&vc->wq, &wait);
 		do_sleep = 0;
+		/* If we polled, count this as a successful poll */
+		if (vc->halt_poll_ns)
+			++vc->runner->stat.halt_successful_poll;
 		goto out;
 	}
 
@@ -2723,12 +2729,18 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	spin_lock(&vc->lock);
 	vc->vcore_state = VCORE_INACTIVE;
 	trace_kvmppc_vcore_blocked(vc, 1);
+	++vc->runner->stat.halt_successful_wait;
 
 	cur = ktime_get();
 
 out:
 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
+	if (do_sleep)
+		vc->runner->stat.halt_wait_time += block_ns;
+	else if (vc->halt_poll_ns)
+		vc->runner->stat.halt_poll_time += block_ns;
+
 	if (halt_poll_max_ns) {
 		if (block_ns <= vc->halt_poll_ns)
 			;
-- 
2.5.5



* Re: [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats
  2016-07-11  7:08 ` [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats Suraj Jitindar Singh
@ 2016-07-11 16:49   ` David Matlack
  2016-07-12  6:17     ` Suraj Jitindar Singh
  0 siblings, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-11 16:49 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář

On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
<sjitindarsingh@gmail.com> wrote:
> vcpu stats are used to collect information about a vcpu which can be viewed
> in debugfs. For example, halt_attempted_poll and halt_successful_poll are
> used to keep track of the number of times the vcpu attempts to poll and
> polls successfully. These stats are currently not used on powerpc.
>
> Increment the halt_attempted_poll and halt_successful_poll vcpu stats for
> powerpc. Since these stats are summed over all the vcpus for all running
> guests it doesn't matter which vcpu they are attributed to, thus we choose
> the current runner vcpu of the vcore.
>
> Also add new vcpu stats, halt_poll_time and halt_wait_time, to accumulate
> the total time spent polling and waiting respectively, and
> halt_successful_wait to accumulate the number of times the vcpu waits.
> Given that halt_poll_time and halt_wait_time are expressed in nanoseconds
> it is necessary to represent these as 64-bit quantities, otherwise they
> would overflow after only about 4 seconds.
>
> Given that the total time spent either polling or waiting will be known,
> along with the number of times each was done, it will be possible to
> determine the average poll and wait times. This gives the ability to tune
> the kvm module parameters based on the calculated average wait and poll
> times.
>
> [snip]
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 0d8ce14..a0dae63 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -2688,6 +2688,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>         cur = start = ktime_get();
>         if (vc->halt_poll_ns) {
>                 ktime_t stop = ktime_add_ns(start, vc->halt_poll_ns);
> +               ++vc->runner->stat.halt_attempted_poll;
>
>                 vc->vcore_state = VCORE_POLLING;
>                 spin_unlock(&vc->lock);
> @@ -2703,8 +2704,10 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>                 spin_lock(&vc->lock);
>                 vc->vcore_state = VCORE_INACTIVE;
>
> -               if (!do_sleep)
> +               if (!do_sleep) {
> +                       ++vc->runner->stat.halt_successful_poll;
>                         goto out;
> +               }
>         }
>
>         prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
> @@ -2712,6 +2715,9 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>         if (kvmppc_vcore_check_block(vc)) {
>                 finish_swait(&vc->wq, &wait);
>                 do_sleep = 0;
> +               /* If we polled, count this as a successful poll */
> +               if (vc->halt_poll_ns)
> +                       ++vc->runner->stat.halt_successful_poll;
>                 goto out;
>         }
>
> @@ -2723,12 +2729,18 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>         spin_lock(&vc->lock);
>         vc->vcore_state = VCORE_INACTIVE;
>         trace_kvmppc_vcore_blocked(vc, 1);
> +       ++vc->runner->stat.halt_successful_wait;
>
>         cur = ktime_get();
>
>  out:
>         block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>
> +       if (do_sleep)
> +               vc->runner->stat.halt_wait_time += block_ns;

It's possible to poll and then wait in one halt, which conflates this stat
with polling time. Is it worth splitting out a third stat,
halt_poll_fail_ns, counting how long we polled in halts that ended up
sleeping? Then halt_wait_time only counts the time the VCPU spent on the
wait queue. The sum of all 3 is still the total time spent halted.
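
A minimal sketch of that split (hypothetical, not code from this series;
it assumes the poll end time is captured separately so that poll_ns and
wait_ns can be computed):

	if (do_sleep) {
		/* polled first, then slept anyway */
		vc->runner->stat.halt_poll_fail_ns += poll_ns;
		/* time actually spent on the wait queue */
		vc->runner->stat.halt_wait_time += wait_ns;
	} else if (vc->halt_poll_ns)
		/* successful poll, as before */
		vc->runner->stat.halt_poll_time += block_ns;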

> +       else if (vc->halt_poll_ns)
> +               vc->runner->stat.halt_poll_time += block_ns;
> +
>         if (halt_poll_max_ns) {
>                 if (block_ns <= vc->halt_poll_ns)
>                         ;


* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11  7:08 ` [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics Suraj Jitindar Singh
@ 2016-07-11 16:51   ` David Matlack
  2016-07-11 17:05     ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-11 16:51 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář

On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
<sjitindarsingh@gmail.com> wrote:
> vcpus have statistics associated with them which can be viewed in debugfs.
> Currently the vcpu_stat_get() and vcpu_stat_get_per_vm() functions assume
> that all of these statistics are represented as 32-bit numbers. The next
> patch adds some 64-bit statistics, so add provisioning for the display of
> 64-bit vcpu statistics.

Thanks, we need 64-bit stats in other places as well. Can we use this
opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
of this patch is duplicated code with "u32" swapped with "u64".



* Re: [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module
  2016-07-11  7:08 ` [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module Suraj Jitindar Singh
@ 2016-07-11 16:57   ` David Matlack
  2016-07-11 17:07     ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-11 16:57 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář

On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
<sjitindarsingh@gmail.com> wrote:
> This patch introduces new halt polling functionality into the kvm_hv kernel
> module. When a vcore is idle it will poll for some period of time before
> scheduling itself out.

Is there any way to reuse the existing halt-polling code? Having two
copies risks them diverging over time.



* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11 16:51   ` David Matlack
@ 2016-07-11 17:05     ` Paolo Bonzini
  2016-07-11 17:30       ` David Matlack
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2016-07-11 17:05 UTC (permalink / raw)
  To: David Matlack, Suraj Jitindar Singh
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 11/07/2016 18:51, David Matlack wrote:
>> > vcpus have statistics associated with them which can be viewed in debugfs.
>> > Currently the vcpu_stat_get() and vcpu_stat_get_per_vm() functions assume
>> > that all of these statistics are represented as 32-bit numbers. The next
>> > patch adds some 64-bit statistics, so add provisioning for the display of
>> > 64-bit vcpu statistics.
> Thanks, we need 64-bit stats in other places as well. Can we use this
> opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
> of this patch is duplicated code with "u32" swapped with "u64".
> 

I'm not sure what 32-bit architectures would do, but perhaps we could
upgrade them to unsigned long at least.

Paolo
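
For context on the duplication being discussed: the debugfs getter that sums
a per-vcpu counter across all VMs looks roughly like this (a sketch modeled
on the virt/kvm/kvm_main.c of this era, not the patch text), and the
proposed u64 variant repeats the whole function with only the cast changed:

    static int vcpu_stat_get(void *_offset, u64 *val)
    {
            unsigned offset = (long)_offset;
            struct kvm *kvm;
            struct kvm_vcpu *vcpu;
            int i;

            *val = 0;
            spin_lock(&kvm_lock);
            list_for_each_entry(kvm, &vm_list, vm_list)
                    kvm_for_each_vcpu(i, vcpu, kvm)
                            /* the one line a u64 twin would change */
                            *val += *(u32 *)((void *)vcpu + offset);
            spin_unlock(&kvm_lock);
            return 0;
    }

Widening the stat fields themselves, whether to u64 or unsigned long, would
let a single getter serve every entry in the debugfs table, which is the
consolidation being weighed here.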

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module
  2016-07-11 16:57   ` David Matlack
@ 2016-07-11 17:07     ` Paolo Bonzini
  2016-07-11 17:26       ` David Matlack
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2016-07-11 17:07 UTC (permalink / raw)
  To: David Matlack, Suraj Jitindar Singh
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 11/07/2016 18:57, David Matlack wrote:
> On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
> <sjitindarsingh@gmail.com> wrote:
> > This patch introduces new halt polling functionality into the kvm_hv kernel
> > module. When a vcore is idle it will poll for some period of time before
> > scheduling itself out.
> 
> Is there any way to reuse the existing halt-polling code? Having two
> copies risks them diverging over time.

s/risks/guarantees/ :(

Unfortunately, handling of the hardware threads in KVM PPC is a mess,
and I don't think it's possible to remove the duplication.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module
  2016-07-11 17:07     ` Paolo Bonzini
@ 2016-07-11 17:26       ` David Matlack
  2016-07-12  6:33         ` Suraj Jitindar Singh
  0 siblings, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-11 17:26 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Suraj Jitindar Singh, linuxppc-dev, kvm-ppc, mpe, paulus, benh,
	kvm list, agraf, Radim Krčmář

On Mon, Jul 11, 2016 at 10:07 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 11/07/2016 18:57, David Matlack wrote:
>> On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
>> <sjitindarsingh@gmail.com> wrote:
>> > This patch introduces new halt polling functionality into the kvm_hv kernel
>> > module. When a vcore is idle it will poll for some period of time before
>> > scheduling itself out.
>>
>> Is there any way to reuse the existing halt-polling code? Having two
>> copies risks them diverging over time.
>
> s/risks/guarantees/ :(
>
> Unfortunately, handling of the hardware threads in KVM PPC is a mess,
> and I don't think it's possible to remove the duplication.

Ah, ok. That's a shame.

>
> Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11 17:05     ` Paolo Bonzini
@ 2016-07-11 17:30       ` David Matlack
  2016-07-11 19:31         ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-11 17:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Suraj Jitindar Singh, linuxppc-dev, kvm-ppc, mpe, paulus, benh,
	kvm list, agraf, Radim Krčmář

On Mon, Jul 11, 2016 at 10:05 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 11/07/2016 18:51, David Matlack wrote:
>>> > vcpus have statistics associated with them which can be viewed within the
>>> > debugfs. Currently it is assumed within the vcpu_stat_get() and
>>> > vcpu_stat_get_per_vm() functions that all of these statistics are
>>> > represented as 32-bit numbers. The next patch adds some 64-bit statistics,
>>> > so add provisioning for the display of 64-bit vcpu statistics.
>> Thanks, we need 64-bit stats in other places as well. Can we use this
>> opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
>> of this patch is duplicated code with "u32" swapped with "u64".
>>
>
> I'm not sure of what 32-bit architectures would do, but perhaps we could
> upgrade them to unsigned long at least.

I thought u64 still existed on 32-bit architectures. unsigned long
would be fine but with the caveat that certain stats would overflow on
32-bit architectures.

>
> Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11 17:30       ` David Matlack
@ 2016-07-11 19:31         ` Paolo Bonzini
  2016-07-11 19:45           ` David Matlack
  2016-07-13 18:00           ` Christian Borntraeger
  0 siblings, 2 replies; 26+ messages in thread
From: Paolo Bonzini @ 2016-07-11 19:31 UTC (permalink / raw)
  To: David Matlack
  Cc: Suraj Jitindar Singh, linuxppc-dev, kvm-ppc, mpe, paulus, benh,
	kvm list, agraf, Radim Krčmář



On 11/07/2016 19:30, David Matlack wrote:
> On Mon, Jul 11, 2016 at 10:05 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>> On 11/07/2016 18:51, David Matlack wrote:
>>>>> vcpus have statistics associated with them which can be viewed within the
>>>>> debugfs. Currently it is assumed within the vcpu_stat_get() and
>>>>> vcpu_stat_get_per_vm() functions that all of these statistics are
>>>>> represented as 32-bit numbers. The next patch adds some 64-bit statistics,
>>>>> so add provisioning for the display of 64-bit vcpu statistics.
>>> Thanks, we need 64-bit stats in other places as well. Can we use this
>>> opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
>>> of this patch is duplicated code with "u32" swapped with "u64".
>>>
>>
>> I'm not sure of what 32-bit architectures would do, but perhaps we could
>> upgrade them to unsigned long at least.
> 
> I thought u64 still existed on 32-bit architectures. unsigned long
> would be fine but with the caveat that certain stats would overflow on
> 32-bit architectures.

Yes, but not all 32-bit architectures can do atomic read-modify-write
(e.g. add) operations on 64-bit values.

Paolo
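
To make the constraint concrete: on a 32-bit host a plain 64-bit add is two
word-sized memory operations, so racing writers can interleave and a reader
can observe a half-updated value. The portable alternative is the atomic64
API, which on some 32-bit architectures falls back to a spinlock-based
implementation (lib/atomic64.c) with real overhead. A minimal sketch,
illustrative rather than code from this series:

    #include <linux/atomic.h>

    static u64 plain_stat;                /* two 32-bit ops per update on 32-bit */
    static atomic64_t atomic_stat = ATOMIC64_INIT(0);

    static void bump_stats(u64 delta)
    {
            /* non-atomic on 32-bit: can tear, and racing adds can be lost */
            plain_stat += delta;
            /* atomic everywhere, but may take a spinlock on 32-bit */
            atomic64_add(delta, &atomic_stat);
    }

That overhead is why the thread ends up sizing the fields to avoid the
problem rather than converting the stats code to atomic64.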

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11 19:31         ` Paolo Bonzini
@ 2016-07-11 19:45           ` David Matlack
  2016-07-12  6:24             ` Suraj Jitindar Singh
  2016-07-13 18:00           ` Christian Borntraeger
  1 sibling, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-11 19:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Suraj Jitindar Singh, linuxppc-dev, kvm-ppc, mpe, paulus, benh,
	kvm list, agraf, Radim Krčmář

On Mon, Jul 11, 2016 at 12:31 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 11/07/2016 19:30, David Matlack wrote:
>> On Mon, Jul 11, 2016 at 10:05 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>>
>>> On 11/07/2016 18:51, David Matlack wrote:
>>>>>> vcpus have statistics associated with them which can be viewed within the
>>>>>> debugfs. Currently it is assumed within the vcpu_stat_get() and
>>>>>> vcpu_stat_get_per_vm() functions that all of these statistics are
>>>>>> represented as 32-bit numbers. The next patch adds some 64-bit statistics,
>>>>>> so add provisioning for the display of 64-bit vcpu statistics.
>>>> Thanks, we need 64-bit stats in other places as well. Can we use this
>>>> opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
>>>> of this patch is duplicated code with "u32" swapped with "u64".
>>>>
>>>
>>> I'm not sure of what 32-bit architectures would do, but perhaps we could
>>> upgrade them to unsigned long at least.
>>
>> I thought u64 still existed on 32-bit architectures. unsigned long
>> would be fine but with the caveat that certain stats would overflow on
>> 32-bit architectures.
>
> Yes, but not all 32-bit architectures can do atomic read-modify-write
> (e.g. add) operations on 64-bit values.

I think that's ok; none of the stats currently use atomic operations.

>
> Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats
  2016-07-11 16:49   ` David Matlack
@ 2016-07-12  6:17     ` Suraj Jitindar Singh
  2016-07-13  6:07       ` Suraj Jitindar Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-12  6:17 UTC (permalink / raw)
  To: David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář

On 12/07/16 02:49, David Matlack wrote:
> On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
> <sjitindarsingh@gmail.com> wrote:
>> vcpu stats are used to collect information about a vcpu which can be viewed
>> in the debugfs. For example halt_attempted_poll and halt_successful_poll
>> are used to keep track of the number of times the vcpu attempts to and
>> successfully polls. These stats are currently not used on powerpc.
>>
>> Implement incrementation of the halt_attempted_poll and
>> halt_successful_poll vcpu stats for powerpc. Since these stats are summed
>> over all the vcpus for all running guests, it doesn't matter which vcpu
>> they are attributed to, thus we choose the current runner vcpu of the
>> vcore.
>>
>> Also add new vcpu stats: halt_poll_time and halt_wait_time to be used to
>> accumulate the total time spent polling and waiting respectively, and
>> halt_successful_wait to accumulate the number of times the vcpu waits.
>> Given that halt_poll_time and halt_wait_time are expressed in nanoseconds
>> it is necessary to represent these as 64-bit quantities, otherwise they
>> would overflow after only about 4 seconds.
>>
>> Given that the total time spent either polling or waiting will be known and
>> the number of times that each was done, it will be possible to determine
>> the average poll and wait times. This will give the ability to tune the kvm
>> module parameters based on the calculated average wait and poll times.
>>
>> ---
>> Change Log:
>>
>> V1 -> V2:
>>         - Nothing
>>
>> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
>> ---
>>  arch/powerpc/include/asm/kvm_host.h |  3 +++
>>  arch/powerpc/kvm/book3s.c           |  3 +++
>>  arch/powerpc/kvm/book3s_hv.c        | 14 +++++++++++++-
>>  3 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>> index 610f393..66a7198 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -114,8 +114,11 @@ struct kvm_vcpu_stat {
>>         u32 emulated_inst_exits;
>>         u32 dec_exits;
>>         u32 ext_intr_exits;
>> +       u64 halt_poll_time;
>> +       u64 halt_wait_time;
>>         u32 halt_successful_poll;
>>         u32 halt_attempted_poll;
>> +       u32 halt_successful_wait;
>>         u32 halt_poll_invalid;
>>         u32 halt_wakeup;
>>         u32 dbell_exits;
>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>> index ed9132b..6217bea 100644
>> --- a/arch/powerpc/kvm/book3s.c
>> +++ b/arch/powerpc/kvm/book3s.c
>> @@ -53,8 +53,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>>         { "dec",         VCPU_STAT(dec_exits) },
>>         { "ext_intr",    VCPU_STAT(ext_intr_exits) },
>>         { "queue_intr",  VCPU_STAT(queue_intr) },
>> +       { "halt_poll_time_ns",          VCPU_STAT_U64(halt_poll_time) },
>> +       { "halt_wait_time_ns",          VCPU_STAT_U64(halt_wait_time) },
>>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll), },
>>         { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll), },
>> +       { "halt_successful_wait",       VCPU_STAT(halt_successful_wait) },
>>         { "halt_poll_invalid", VCPU_STAT(halt_poll_invalid) },
>>         { "halt_wakeup", VCPU_STAT(halt_wakeup) },
>>         { "pf_storage",  VCPU_STAT(pf_storage) },
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 0d8ce14..a0dae63 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -2688,6 +2688,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>         cur = start = ktime_get();
>>         if (vc->halt_poll_ns) {
>>                 ktime_t stop = ktime_add_ns(start, vc->halt_poll_ns);
>> +               ++vc->runner->stat.halt_attempted_poll;
>>
>>                 vc->vcore_state = VCORE_POLLING;
>>                 spin_unlock(&vc->lock);
>> @@ -2703,8 +2704,10 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>                 spin_lock(&vc->lock);
>>                 vc->vcore_state = VCORE_INACTIVE;
>>
>> -               if (!do_sleep)
>> +               if (!do_sleep) {
>> +                       ++vc->runner->stat.halt_successful_poll;
>>                         goto out;
>> +               }
>>         }
>>
>>         prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
>> @@ -2712,6 +2715,9 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>         if (kvmppc_vcore_check_block(vc)) {
>>                 finish_swait(&vc->wq, &wait);
>>                 do_sleep = 0;
>> +               /* If we polled, count this as a successful poll */
>> +               if (vc->halt_poll_ns)
>> +                       ++vc->runner->stat.halt_successful_poll;
>>                 goto out;
>>         }
>>
>> @@ -2723,12 +2729,18 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>         spin_lock(&vc->lock);
>>         vc->vcore_state = VCORE_INACTIVE;
>>         trace_kvmppc_vcore_blocked(vc, 1);
>> +       ++vc->runner->stat.halt_successful_wait;
>>
>>         cur = ktime_get();
>>
>>  out:
>>         block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>>
>> +       if (do_sleep)
>> +               vc->runner->stat.halt_wait_time += block_ns;
> It's possible to poll and wait in one halt, conflating this stat with
> polling time. Is it useful to split out a third stat,
> halt_poll_fail_ns, which counts how long we polled before ending up
> sleeping? Then halt_wait_time only counts the time the VCPU spent on
> the wait queue. The sum of all 3 is still the total time spent halted.
>
I see what you're saying. I would say that in the event that you do wait,
the most useful number is going to be the total block time (the sum of the
wait and poll time), as this is the minimum value to which you would have to
set the halt_poll_max_ns module parameter in order to ensure you poll for
long enough (in most circumstances) to avoid waiting, which is the main use
case I envision for this statistic. That being said, this is definitely a
source of ambiguity, and splitting this into two statistics would make the
distinction clearer without any loss of data; you could simply sum the two
stats to get the same number.

Either way I don't think it really makes much of a difference, but in the
interest of clarity I think I'll split the statistic.

>> +       else if (vc->halt_poll_ns)
>> +               vc->runner->stat.halt_poll_time += block_ns;
>> +
>>         if (halt_poll_max_ns) {
>>                 if (block_ns <= vc->halt_poll_ns)
>>                         ;
>> --
>> 2.5.5
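
Two pieces of arithmetic behind the quoted commit message and this reply,
spelled out:

    2^32 ns = 4,294,967,296 ns ~= 4.3 seconds  (a u32 nanosecond counter wraps in seconds)
    2^64 ns ~= 1.8 * 10^19 ns ~= 584 years     (a u64 counter will not wrap in practice)

and, on one plausible reading of the stats as wired up in the quoted hunk:

    average successful-poll time = halt_poll_time / halt_successful_poll
    average wait time            = halt_wait_time / halt_successful_wait

Those averages are what an administrator would feed back into choosing
halt_poll_max_ns.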


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11 19:45           ` David Matlack
@ 2016-07-12  6:24             ` Suraj Jitindar Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-12  6:24 UTC (permalink / raw)
  To: David Matlack, Paolo Bonzini
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 12/07/16 05:45, David Matlack wrote:
> On Mon, Jul 11, 2016 at 12:31 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 11/07/2016 19:30, David Matlack wrote:
>>> On Mon, Jul 11, 2016 at 10:05 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>
>>>> On 11/07/2016 18:51, David Matlack wrote:
>>>>>>> vcpus have statistics associated with them which can be viewed within the
>>>>>>> debugfs. Currently it is assumed within the vcpu_stat_get() and
>>>>>>> vcpu_stat_get_per_vm() functions that all of these statistics are
>>>>>>> represented as 32-bit numbers. The next patch adds some 64-bit statistics,
>>>>>>> so add provisioning for the display of 64-bit vcpu statistics.
>>>>> Thanks, we need 64-bit stats in other places as well. Can we use this
>>>>> opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
>>>>> of this patch is duplicated code with "u32" swapped with "u64".
>>>>>
>>>> I'm not sure of what 32-bit architectures would do, but perhaps we could
>>>> upgrade them to unsigned long at least.
>>> I thought u64 still existed on 32-bit architectures. unsigned long
>>> would be fine but with the caveat that certain stats would overflow on
>>> 32-bit architectures.
>> Yes, but not all 32-bit architectures can do atomic read-modify-write
>> (e.g. add) operations on 64-bit values.
> I think that's ok; none of the stats currently use atomic operations.

Yeah, so this patch pretty much duplicates the 32-bit code.

So what you're saying is to just replace all of the 32-bit statistics with
longs, so that we get 32-bit on 32-bit machines and 64-bit on 64-bit
machines? Then we just accept that on 32-bit machines we will get overflow
on some stats.

Or do you think u64s would be better and we accept that on 32-bit machines
we might get update conflicts from non-atomic concurrent accesses, which
honestly I don't see being a huge issue in this use case?

>
>> Paolo


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module
  2016-07-11 17:26       ` David Matlack
@ 2016-07-12  6:33         ` Suraj Jitindar Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-12  6:33 UTC (permalink / raw)
  To: David Matlack, Paolo Bonzini
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 12/07/16 03:26, David Matlack wrote:
> On Mon, Jul 11, 2016 at 10:07 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 11/07/2016 18:57, David Matlack wrote:
>>> On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
>>> <sjitindarsingh@gmail.com> wrote:
>>>> This patch introduces new halt polling functionality into the kvm_hv kernel
>>>> module. When a vcore is idle it will poll for some period of time before
>>>> scheduling itself out.
>>> Is there any way to reuse the existing halt-polling code? Having two
>>> copies risks them diverging over time.
>> s/risks/guarantees/ :(
>>
>> Unfortunately, handling of the hardware threads in KVM PPC is a mess,
>> and I don't think it's possible to remove the duplication.
> Ah, ok. That's a shame.

It's definitely not ideal having this code duplicated, although we have
the issue that on PPC we only poll once all of the vcpus on a vcore have
ceded, and we need to retain a reference to that vcore.

Additionally, we only actually do this in HV code; on the KVM PR
version we call the generic halt-polling code, which doesn't know
about vcores.

I don't see an easy way to use the existing function.

>
>> Paolo
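
A sketch of the vcore-level gate described here; kvmppc_vcore_check_block()
is named in the quoted patch, but the body below, including the iterator and
the per-vcpu fields, is my reconstruction from the series' descriptions
rather than quoted code:

    /* Keep blocking (polling or sleeping) only while no vCPU in the
     * vcore has work: every runnable thread must have ceded and have
     * nothing pending. */
    static int kvmppc_vcore_check_block(struct kvmppc_vcore *vc)
    {
            struct kvm_vcpu *vcpu;
            int i;

            for_each_runnable_thread(i, vcpu, vc)
                    if (!vcpu->arch.ceded || vcpu->arch.prodded ||
                        vcpu->arch.pending_exceptions)
                            return 1;   /* someone needs to run: stop blocking */
            return 0;
    }

The generic halt-polling loop in virt/kvm/kvm_main.c checks a single vCPU
with no such aggregate condition, which is why the HV module cannot simply
call it.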


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats
  2016-07-12  6:17     ` Suraj Jitindar Singh
@ 2016-07-13  6:07       ` Suraj Jitindar Singh
  2016-07-13 17:20         ` David Matlack
  0 siblings, 1 reply; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-13  6:07 UTC (permalink / raw)
  To: David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář

On 12/07/16 16:17, Suraj Jitindar Singh wrote:
> On 12/07/16 02:49, David Matlack wrote:
>> On Mon, Jul 11, 2016 at 12:08 AM, Suraj Jitindar Singh
>> <sjitindarsingh@gmail.com> wrote:
>>> vcpu stats are used to collect information about a vcpu which can be viewed
>>> in the debugfs. For example halt_attempted_poll and halt_successful_poll
>>> are used to keep track of the number of times the vcpu attempts to and
>>> successfully polls. These stats are currently not used on powerpc.
>>>
>>> Implement incrementation of the halt_attempted_poll and
>>> halt_successful_poll vcpu stats for powerpc. Since these stats are summed
>>> over all the vcpus for all running guests, it doesn't matter which vcpu
>>> they are attributed to, thus we choose the current runner vcpu of the
>>> vcore.
>>>
>>> Also add new vcpu stats: halt_poll_time and halt_wait_time to be used to
>>> accumulate the total time spent polling and waiting respectively, and
>>> halt_successful_wait to accumulate the number of times the vcpu waits.
>>> Given that halt_poll_time and halt_wait_time are expressed in nanoseconds
>>> it is necessary to represent these as 64-bit quantities, otherwise they
>>> would overflow after only about 4 seconds.
>>>
>>> Given that the total time spent either polling or waiting will be known and
>>> the number of times that each was done, it will be possible to determine
>>> the average poll and wait times. This will give the ability to tune the kvm
>>> module parameters based on the calculated average wait and poll times.
>>>
>>> ---
>>> Change Log:
>>>
>>> V1 -> V2:
>>>         - Nothing
>>>
>>> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
>>> ---
>>>  arch/powerpc/include/asm/kvm_host.h |  3 +++
>>>  arch/powerpc/kvm/book3s.c           |  3 +++
>>>  arch/powerpc/kvm/book3s_hv.c        | 14 +++++++++++++-
>>>  3 files changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>>> index 610f393..66a7198 100644
>>> --- a/arch/powerpc/include/asm/kvm_host.h
>>> +++ b/arch/powerpc/include/asm/kvm_host.h
>>> @@ -114,8 +114,11 @@ struct kvm_vcpu_stat {
>>>         u32 emulated_inst_exits;
>>>         u32 dec_exits;
>>>         u32 ext_intr_exits;
>>> +       u64 halt_poll_time;
>>> +       u64 halt_wait_time;
>>>         u32 halt_successful_poll;
>>>         u32 halt_attempted_poll;
>>> +       u32 halt_successful_wait;
>>>         u32 halt_poll_invalid;
>>>         u32 halt_wakeup;
>>>         u32 dbell_exits;
>>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>>> index ed9132b..6217bea 100644
>>> --- a/arch/powerpc/kvm/book3s.c
>>> +++ b/arch/powerpc/kvm/book3s.c
>>> @@ -53,8 +53,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>>>         { "dec",         VCPU_STAT(dec_exits) },
>>>         { "ext_intr",    VCPU_STAT(ext_intr_exits) },
>>>         { "queue_intr",  VCPU_STAT(queue_intr) },
>>> +       { "halt_poll_time_ns",          VCPU_STAT_U64(halt_poll_time) },
>>> +       { "halt_wait_time_ns",          VCPU_STAT_U64(halt_wait_time) },
>>>         { "halt_successful_poll", VCPU_STAT(halt_successful_poll), },
>>>         { "halt_attempted_poll", VCPU_STAT(halt_attempted_poll), },
>>> +       { "halt_successful_wait",       VCPU_STAT(halt_successful_wait) },
>>>         { "halt_poll_invalid", VCPU_STAT(halt_poll_invalid) },
>>>         { "halt_wakeup", VCPU_STAT(halt_wakeup) },
>>>         { "pf_storage",  VCPU_STAT(pf_storage) },
>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>> index 0d8ce14..a0dae63 100644
>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>> @@ -2688,6 +2688,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>>         cur = start = ktime_get();
>>>         if (vc->halt_poll_ns) {
>>>                 ktime_t stop = ktime_add_ns(start, vc->halt_poll_ns);
>>> +               ++vc->runner->stat.halt_attempted_poll;
>>>
>>>                 vc->vcore_state = VCORE_POLLING;
>>>                 spin_unlock(&vc->lock);
>>> @@ -2703,8 +2704,10 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>>                 spin_lock(&vc->lock);
>>>                 vc->vcore_state = VCORE_INACTIVE;
>>>
>>> -               if (!do_sleep)
>>> +               if (!do_sleep) {
>>> +                       ++vc->runner->stat.halt_successful_poll;
>>>                         goto out;
>>> +               }
>>>         }
>>>
>>>         prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
>>> @@ -2712,6 +2715,9 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>>         if (kvmppc_vcore_check_block(vc)) {
>>>                 finish_swait(&vc->wq, &wait);
>>>                 do_sleep = 0;
>>> +               /* If we polled, count this as a successful poll */
>>> +               if (vc->halt_poll_ns)
>>> +                       ++vc->runner->stat.halt_successful_poll;
>>>                 goto out;
>>>         }
>>>
>>> @@ -2723,12 +2729,18 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>>         spin_lock(&vc->lock);
>>>         vc->vcore_state = VCORE_INACTIVE;
>>>         trace_kvmppc_vcore_blocked(vc, 1);
>>> +       ++vc->runner->stat.halt_successful_wait;
>>>
>>>         cur = ktime_get();
>>>
>>>  out:
>>>         block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>>>
>>> +       if (do_sleep)
>>> +               vc->runner->stat.halt_wait_time += block_ns;
>> It's possible to poll and wait in one halt, conflating this stat with
>> polling time. Is it useful to split out a third stat,
>> halt_poll_fail_ns, which counts how long we polled before ending up
>> sleeping? Then halt_wait_time only counts the time the VCPU spent on
>> the wait queue. The sum of all 3 is still the total time spent halted.
>>
> I see what you're saying. I would say that in the event that you do wait,
> the most useful number is going to be the total block time (the sum of the
> wait and poll time), as this is the minimum value to which you would have to
> set the halt_poll_max_ns module parameter in order to ensure you poll for
> long enough (in most circumstances) to avoid waiting, which is the main use
> case I envision for this statistic. That being said, this is definitely a
> source of ambiguity, and splitting this into two statistics would make the
> distinction clearer without any loss of data; you could simply sum the two
> stats to get the same number.
>
> Either way I don't think it really makes much of a difference, but in the
> interest of clarity I think I'll split the statistic.

On further thought, I really think that splitting this statistic is an
unnecessary source of ambiguity. In reality, the interesting piece of
information is going to be the average time that you blocked on
either an unsuccessful poll or a successful poll.

So instead of splitting the statistic, I'm going to rename them as:
halt_poll_time -> halt_block_time_successful_poll
halt_wait_time -> halt_block_time_waited

>
>>> +       else if (vc->halt_poll_ns)
>>> +               vc->runner->stat.halt_poll_time += block_ns;
>>> +
>>>         if (halt_poll_max_ns) {
>>>                 if (block_ns <= vc->halt_poll_ns)
>>>                         ;
>>> --
>>> 2.5.5


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats
  2016-07-13  6:07       ` Suraj Jitindar Singh
@ 2016-07-13 17:20         ` David Matlack
  2016-07-15  7:53           ` Suraj Jitindar Singh
  0 siblings, 1 reply; 26+ messages in thread
From: David Matlack @ 2016-07-13 17:20 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář

On Tue, Jul 12, 2016 at 11:07 PM, Suraj Jitindar Singh
<sjitindarsingh@gmail.com> wrote:
> On 12/07/16 16:17, Suraj Jitindar Singh wrote:
>> On 12/07/16 02:49, David Matlack wrote:
[snip]
>>> It's possible to poll and wait in one halt, conflating this stat with
>>> polling time. Is it useful to split out a third stat,
>>> halt_poll_fail_ns, which counts how long we polled before ending up
>>> sleeping? Then halt_wait_time only counts the time the VCPU spent on
>>> the wait queue. The sum of all 3 is still the total time spent halted.
>>>
>> I see what you're saying. I would say that in the event that you do wait,
>> the most useful number is going to be the total block time (the sum of the
>> wait and poll time), as this is the minimum value to which you would have to
>> set the halt_poll_max_ns module parameter in order to ensure you poll for
>> long enough (in most circumstances) to avoid waiting, which is the main use
>> case I envision for this statistic. That being said, this is definitely a
>> source of ambiguity, and splitting this into two statistics would make the
>> distinction clearer without any loss of data; you could simply sum the two
>> stats to get the same number.
>>
>> Either way I don't think it really makes much of a difference, but in the
>> interest of clarity I think I'll split the statistic.
>
> On further thought, I really think that splitting this statistic is an
> unnecessary source of ambiguity. In reality, the interesting piece of
> information is going to be the average time that you blocked on
> either an unsuccessful poll or a successful poll.
>
> So instead of splitting the statistic, I'm going to rename them as:
> halt_poll_time -> halt_block_time_successful_poll
> halt_wait_time -> halt_block_time_waited

The downside of having only these 2 stats is there is no way to see
the total time spent halt-polling. Halt-polling shows up as host
kernel CPU usage on the VCPU thread, despite it really being idle
cycles that could be reclaimed. It's useful to have the total amount
of time spent halt-polling (halt_poll_fail + halt_poll_success) to
feed into provisioning/monitoring systems that look at CPU usage.

FWIW, I have a very similar patch internally. It adds 2 stats,
halt_poll_success_ns and halt_poll_fail_ns, to the halt-polling code
in virt/kvm/kvm_main.c. So if you agree splitting the stats makes
sense, it would be helpful to us if we can adopt the same naming
convention.
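
A sketch of the split being converged on, using these names; note that a
separate poll-end timestamp (poll_end below) is my addition, since the
quoted patch records only start and cur:

    /* illustrative accounting at the end of kvmppc_vcore_blocked() */
    block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
    if (do_sleep) {
            u64 polled_ns = ktime_to_ns(poll_end) - ktime_to_ns(start);

            vc->runner->stat.halt_poll_fail_ns += polled_ns;        /* burned polling */
            vc->runner->stat.halt_wait_ns += block_ns - polled_ns;  /* on the wait queue */
    } else if (vc->halt_poll_ns) {
            vc->runner->stat.halt_poll_success_ns += block_ns;      /* poll paid off */
    }

halt_poll_success_ns + halt_poll_fail_ns is then exactly the reclaimable CPU
time a provisioning system wants, and adding halt_wait_ns back gives the
total time spent blocked.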

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-11 19:31         ` Paolo Bonzini
  2016-07-11 19:45           ` David Matlack
@ 2016-07-13 18:00           ` Christian Borntraeger
  2016-07-14  9:42             ` Paolo Bonzini
  1 sibling, 1 reply; 26+ messages in thread
From: Christian Borntraeger @ 2016-07-13 18:00 UTC (permalink / raw)
  To: Paolo Bonzini, David Matlack
  Cc: kvm list, Radim Krčmář,
	kvm-ppc, paulus, Suraj Jitindar Singh, linuxppc-dev, agraf

On 07/11/2016 09:31 PM, Paolo Bonzini wrote:
> 
> 
> On 11/07/2016 19:30, David Matlack wrote:
>> On Mon, Jul 11, 2016 at 10:05 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>>
>>> On 11/07/2016 18:51, David Matlack wrote:
>>>>>> vcpus have statistics associated with them which can be viewed within the
>>>>>> debugfs. Currently it is assumed within the vcpu_stat_get() and
>>>>>> vcpu_stat_get_per_vm() functions that all of these statistics are
>>>>>> represented as 32-bit numbers. The next patch adds some 64-bit statistics,
>>>>>> so add provisioning for the display of 64-bit vcpu statistics.
>>>> Thanks, we need 64-bit stats in other places as well. Can we use this
>>>> opportunity to wholesale upgrade all KVM stats from u32 to u64? Most
>>>> of this patch is duplicated code with "u32" swapped with "u64".
>>>>
>>>
>>> I'm not sure of what 32-bit architectures would do, but perhaps we could
>>> upgrade them to unsigned long at least.
>>
>> I thought u64 still existed on 32-bit architectures. unsigned long
>> would be fine but with the caveat that certain stats would overflow on
>> 32-bit architectures.
> 
> Yes, but not all 32-bit architectures can do atomic read-modify-write
> (e.g. add) operations on 64-bit values.

So what about only doing it for the VCPU events? Those should only be
modified by one CPU. We would have some odd values on 32-bit overflow, but
this will certainly be better than just starting from 0.
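
To illustrate the single-writer point with stand-in types (not the kernel's
stat structs):

    struct vcpu_stats { u64 halt_wait_ns; };            /* one writer: the vCPU thread */
    struct vm_stats { unsigned long remote_flushes; };  /* written from many threads */

    static void vcpu_halt_done(struct vcpu_stats *s, u64 ns)
    {
            /* Sole writer, so no update is ever lost; on 32-bit a
             * concurrent debugfs reader may at worst see a torn value. */
            s->halt_wait_ns += ns;
    }

    static void vm_event(struct vm_stats *s)
    {
            /* Word-sized, so each load and store stays single-copy atomic
             * on every architecture; racing increments can still be
             * dropped occasionally, which is tolerated for statistics. */
            s->remote_flushes++;
    }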



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-13 18:00           ` Christian Borntraeger
@ 2016-07-14  9:42             ` Paolo Bonzini
  2016-07-15  7:52               ` Suraj Jitindar Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2016-07-14  9:42 UTC (permalink / raw)
  To: Christian Borntraeger, David Matlack
  Cc: Suraj Jitindar Singh, linuxppc-dev, kvm-ppc, mpe, paulus, benh,
	kvm list, agraf, Radim Krčmář



On 13/07/2016 20:00, Christian Borntraeger wrote:
>>> >> I thought u64 still existed on 32-bit architectures. unsigned long
>>> >> would be fine but with the caveat that certain stats would overflow on
>>> >> 32-bit architectures.
>> > 
>> > Yes, but not all 32-bit architectures can do atomic read-modify-write
>> > (e.g. add) operations on 64-bit values.
> So what about only doing it for the VCPU events? Those should only be
> modified by one CPU. We would have some odd values on 32-bit overflow, but
> this will certainly be better than just starting from 0.

If that's good enough for PPC, that's fine.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-14  9:42             ` Paolo Bonzini
@ 2016-07-15  7:52               ` Suraj Jitindar Singh
  2016-07-18  7:17                 ` Christian Borntraeger
  0 siblings, 1 reply; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-15  7:52 UTC (permalink / raw)
  To: Paolo Bonzini, Christian Borntraeger, David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 14/07/16 19:42, Paolo Bonzini wrote:
>
> On 13/07/2016 20:00, Christian Borntraeger wrote:
>>>>>> I thought u64 still existed on 32-bit architectures. unsigned long
>>>>>> would be fine but with the caveat that certain stats would overflow on
>>>>>> 32-bit architectures.
>>>> Yes, but not all 32-bit architectures can do atomic read-modify-write
>>>> (e.g. add) operations on 64-bit values.
>> So what about only doing it for the VCPU events? Those should be only
>> modified by one CPU. We would have some odd values on 32bit overflow, but
>> this will be certainly better than just start with 0
> If that's good enough for PPC, that's fine.
>
> Paolo

I don't feel great about having vcpu_stats as u64 and vm_stats still as u32;
it's just a bit inconsistent.

That being said, it's only the vcpu_stats which I require to be u64 at this
stage, so it's possible to just upgrade those.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats
  2016-07-13 17:20         ` David Matlack
@ 2016-07-15  7:53           ` Suraj Jitindar Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-15  7:53 UTC (permalink / raw)
  To: David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list,
	Paolo Bonzini, agraf, Radim Krčmář



On 14/07/16 03:20, David Matlack wrote:
> On Tue, Jul 12, 2016 at 11:07 PM, Suraj Jitindar Singh
> <sjitindarsingh@gmail.com> wrote:
>> On 12/07/16 16:17, Suraj Jitindar Singh wrote:
>>> On 12/07/16 02:49, David Matlack wrote:
> [snip]
>>>> It's possible to poll and wait in one halt, conflating this stat with
>>>> polling time. Is it useful to split out a third stat,
>>>> halt_poll_fail_ns, which counts how long we polled before ending up
>>>> sleeping? Then halt_wait_time only counts the time the VCPU spent on
>>>> the wait queue. The sum of all 3 is still the total time spent halted.
>>>>
>>> I see what you're saying. I would say that in the event that you do wait,
>>> the most useful number is going to be the total block time (the sum of the
>>> wait and poll time), as this is the minimum value to which you would have to
>>> set the halt_poll_max_ns module parameter in order to ensure you poll for
>>> long enough (in most circumstances) to avoid waiting, which is the main use
>>> case I envision for this statistic. That being said, this is definitely a
>>> source of ambiguity, and splitting this into two statistics would make the
>>> distinction clearer without any loss of data; you could simply sum the two
>>> stats to get the same number.
>>>
>>> Either way I don't think it really makes much of a difference, but in the
>>> interest of clarity I think I'll split the statistic.
>> On further thought, I really think that splitting this statistic is an
>> unnecessary source of ambiguity. In reality, the interesting piece of
>> information is going to be the average time that you blocked on
>> either an unsuccessful poll or a successful poll.
>>
>> So instead of splitting the statistic, I'm going to rename them as:
>> halt_poll_time -> halt_block_time_successful_poll
>> halt_wait_time -> halt_block_time_waited
> The downside of having only these 2 stats is there is no way to see
> the total time spent halt-polling. Halt-polling shows up as host
> kernel CPU usage on the VCPU thread, despite it really being idle
> cycles that could be reclaimed. It's useful to have the total amount
> of time spent halt-polling (halt_poll_fail + halt_poll_success) to
> feed into provisioning/monitoring systems that look at CPU usage.
>
> FWIW, I have a very similar patch internally. It adds 2 stats,
> halt_poll_success_ns and halt_poll_fail_ns, to the halt-polling code
> in virt/kvm/kvm_main.c. So if you agree splitting the stats makes
> sense, it would be helpful to us if we can adopt the same naming
> convention.

Ok, I didn't realise that was a use case.

Makes sense, I'll split it and adopt those names.

Thanks


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-15  7:52               ` Suraj Jitindar Singh
@ 2016-07-18  7:17                 ` Christian Borntraeger
  2016-07-18  8:24                   ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Christian Borntraeger @ 2016-07-18  7:17 UTC (permalink / raw)
  To: Suraj Jitindar Singh, Paolo Bonzini, David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář

On 07/15/2016 09:52 AM, Suraj Jitindar Singh wrote:
> 
> 
> On 14/07/16 19:42, Paolo Bonzini wrote:
>>
>> On 13/07/2016 20:00, Christian Borntraeger wrote:
>>>>>>> I thought u64 still existed on 32-bit architectures. unsigned long
>>>>>>> would be fine but with the caveat that certain stats would overflow on
>>>>>>> 32-bit architectures.
>>>>> Yes, but not all 32-bit architectures can do atomic read-modify-write
>>>>> (e.g. add) operations on 64-bit values.
>>> So what about only doing it for the VCPU events? Those should only be
>>> modified by one CPU. We would have some odd values on 32-bit overflow, but
>>> this will certainly be better than just starting from 0.
>> If that's good enough for PPC, that's fine.
>>
>> Paolo
> 
> I don't feel great about having vcpu_stats as u64 and vm_stats still as u32;
> it's just a bit inconsistent.
> 
> That being said, it's only the vcpu_stats which I require to be u64 at this
> stage, so it's possible to just upgrade those.

Yes, it's not nice, but we probably want to avoid the overhead of atomics.
What about using u64 for vcpu_stats and unsigned long for vm_stats? This will be
correct for everyone, and on 64-bit systems we get 64 bits for everything.
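
The shape this lands on looks something like the following sketch; the
fields are illustrative, not the struct definitions from the series:

    struct kvm_vcpu_stat {                  /* one writer per vCPU */
            u64 halt_poll_success_ns;
            u64 halt_poll_fail_ns;
            u64 halt_wait_ns;
            u64 halt_successful_poll;       /* after a wholesale u32 -> u64 upgrade */
    };

    struct kvm_vm_stat {                    /* updated from many vCPU threads */
            unsigned long remote_tlb_flush; /* full 64 bits on 64-bit hosts only */
    };

On 64-bit hosts everything ends up 64 bits wide; on 32-bit hosts the per-VM
counters trade width for word-sized, tear-free accesses.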




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-18  7:17                 ` Christian Borntraeger
@ 2016-07-18  8:24                   ` Paolo Bonzini
  2016-07-19  1:31                     ` Suraj Jitindar Singh
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2016-07-18  8:24 UTC (permalink / raw)
  To: Christian Borntraeger, Suraj Jitindar Singh, David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 18/07/2016 09:17, Christian Borntraeger wrote:
> On 07/15/2016 09:52 AM, Suraj Jitindar Singh wrote:
>>
>>
>> On 14/07/16 19:42, Paolo Bonzini wrote:
>>>
>>> On 13/07/2016 20:00, Christian Borntraeger wrote:
>>>>>>>> I thought u64 still existed on 32-bit architectures. unsigned long
>>>>>>>> would be fine but with the caveat that certain stats would overflow on
>>>>>>>> 32-bit architectures.
>>>>>> Yes, but not all 32-bit architectures can do atomic read-modify-write
>>>>>> (e.g. add) operations on 64-bit values.
>>>> So what about only doing it for the VCPU events? Those should only be
>>>> modified by one CPU. We would have some odd values on 32-bit overflow, but
>>>> this will certainly be better than just starting from 0.
>>> If that's good enough for PPC, that's fine.
>>>
>>> Paolo
>>
>> I don't feel great about having vcpu_stats as u64 and vm_stats still as u32;
>> it's just a bit inconsistent.
>>
>> That being said, it's only the vcpu_stats which I require to be u64 at this
>> stage, so it's possible to just upgrade those.
> 
> Yes, it's not nice, but we probably want to avoid the overhead of atomics.
> What about using u64 for vcpu_stats and unsigned long for vm_stats? This will be
> correct for everyone, and on 64-bit systems we get 64 bits for everything.

That makes sense.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics
  2016-07-18  8:24                   ` Paolo Bonzini
@ 2016-07-19  1:31                     ` Suraj Jitindar Singh
  0 siblings, 0 replies; 26+ messages in thread
From: Suraj Jitindar Singh @ 2016-07-19  1:31 UTC (permalink / raw)
  To: Paolo Bonzini, Christian Borntraeger, David Matlack
  Cc: linuxppc-dev, kvm-ppc, mpe, paulus, benh, kvm list, agraf,
	Radim Krčmář



On 18/07/16 18:24, Paolo Bonzini wrote:
>
> On 18/07/2016 09:17, Christian Borntraeger wrote:
>> On 07/15/2016 09:52 AM, Suraj Jitindar Singh wrote:
>>>
>>> On 14/07/16 19:42, Paolo Bonzini wrote:
>>>> On 13/07/2016 20:00, Christian Borntraeger wrote:
>>>>>>>>> I thought u64 still existed on 32-bit architectures. unsigned long
>>>>>>>>> would be fine but with the caveat that certain stats would overflow on
>>>>>>>>> 32-bit architectures.
>>>>>>> Yes, but not all 32-bit architectures can do atomic read-modify-write
>>>>>>> (e.g. add) operations on 64-bit values.
>>>>> So what about only doing it for the VCPU events? Those should only be
>>>>> modified by one CPU. We would have some odd values on 32-bit overflow, but
>>>>> this will certainly be better than just starting from 0.
>>>> If that's good enough for PPC, that's fine.
>>>>
>>>> Paolo
>>> I don't feel great about having vcpu_stats as u64 and vm_stats still as u32;
>>> it's just a bit inconsistent.
>>>
>>> That being said, it's only the vcpu_stats which I require to be u64 at this
>>> stage, so it's possible to just upgrade those.
>> Yes, it's not nice, but we probably want to avoid the overhead of atomics.
>> What about using u64 for vcpu_stats and unsigned long for vm_stats? This will be
>> correct for everyone, and on 64-bit systems we get 64 bits for everything.
> That makes sense.
>
> Paolo

Sounds good, I am happy with this.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2016-07-19  1:31 UTC | newest]

Thread overview: 26+ messages
2016-07-11  7:08 [PATCH V2 1/5] kvm/ppc/book3s: Move struct kvmppc_vcore from kvm_host.h to kvm_book3s.h Suraj Jitindar Singh
2016-07-11  7:08 ` [PATCH V2 2/5] kvm/ppc/book3s_hv: Change vcore element runnable_threads from linked-list to array Suraj Jitindar Singh
2016-07-11  7:08 ` [PATCH V2 3/5] kvm/ppc/book3s_hv: Implement halt polling in the kvm_hv kernel module Suraj Jitindar Singh
2016-07-11 16:57   ` David Matlack
2016-07-11 17:07     ` Paolo Bonzini
2016-07-11 17:26       ` David Matlack
2016-07-12  6:33         ` Suraj Jitindar Singh
2016-07-11  7:08 ` [PATCH V2 4/5] kvm/stats: Add provisioning for 64-bit vcpu statistics Suraj Jitindar Singh
2016-07-11 16:51   ` David Matlack
2016-07-11 17:05     ` Paolo Bonzini
2016-07-11 17:30       ` David Matlack
2016-07-11 19:31         ` Paolo Bonzini
2016-07-11 19:45           ` David Matlack
2016-07-12  6:24             ` Suraj Jitindar Singh
2016-07-13 18:00           ` Christian Borntraeger
2016-07-14  9:42             ` Paolo Bonzini
2016-07-15  7:52               ` Suraj Jitindar Singh
2016-07-18  7:17                 ` Christian Borntraeger
2016-07-18  8:24                   ` Paolo Bonzini
2016-07-19  1:31                     ` Suraj Jitindar Singh
2016-07-11  7:08 ` [PATCH V2 5/5] powerpc/kvm/stats: Implement existing and add new halt polling vcpu stats Suraj Jitindar Singh
2016-07-11 16:49   ` David Matlack
2016-07-12  6:17     ` Suraj Jitindar Singh
2016-07-13  6:07       ` Suraj Jitindar Singh
2016-07-13 17:20         ` David Matlack
2016-07-15  7:53           ` Suraj Jitindar Singh
