* [PATCH 0/3] KVM: selftests: Randomize memory access of dirty_log_perf_test
@ 2022-08-10 17:58 Colton Lewis
  2022-08-10 17:58 ` [PATCH 1/3] KVM: selftests: Add random table to randomize memory access Colton Lewis
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Colton Lewis @ 2022-08-10 17:58 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, maz, dmatlack, seanjc, oupton, ricarkol, Colton Lewis

This patch series adds the ability to randomize parts of
dirty_log_perf_test, specifically the order in which pages are
accessed and whether each page is read or written. This is implemented
through a table of random numbers stored in VM memory and refreshed
between test iterations.

Patch series based on kvm/queue

Colton Lewis (3):
  KVM: selftests: Add random table to randomize memory access
  KVM: selftests: Randomize which pages are written vs read
  KVM: selftests: Randomize page access order

 .../selftests/kvm/dirty_log_perf_test.c       | 34 ++++++++--
 .../selftests/kvm/include/perf_test_util.h    |  6 ++
 .../selftests/kvm/lib/perf_test_util.c        | 68 ++++++++++++++++++-
 3 files changed, 98 insertions(+), 10 deletions(-)

-- 
2.37.1.559.g78731f0fdb-goog



* [PATCH 1/3] KVM: selftests: Add random table to randomize memory access
  2022-08-10 17:58 [PATCH 0/3] KVM: selftests: Randomize memory access of dirty_log_perf_test Colton Lewis
@ 2022-08-10 17:58 ` Colton Lewis
  2022-08-10 23:18   ` David Matlack
  2022-08-10 17:58 ` [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read Colton Lewis
  2022-08-10 17:58 ` [PATCH 3/3] KVM: selftests: Randomize page access order Colton Lewis
  2 siblings, 1 reply; 14+ messages in thread
From: Colton Lewis @ 2022-08-10 17:58 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, maz, dmatlack, seanjc, oupton, ricarkol, Colton Lewis

Linear access through all pages does not seem to replicate performance
problems with realistic dirty logging workloads. Make the test more
sophisticated through random access. Each vcpu has its own sequence of
random numbers that are refilled after every iteration. Having the
main thread fill the table for every vcpu is less efficient than
having each vcpu generate its own numbers, but this ensures threading
nondeterminism won't destroy reproducibility with a given random seed.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/dirty_log_perf_test.c       | 13 ++++-
 .../selftests/kvm/include/perf_test_util.h    |  4 ++
 .../selftests/kvm/lib/perf_test_util.c        | 47 +++++++++++++++++++
 3 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index f99e39a672d3..80a1cbe7fbb0 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -132,6 +132,7 @@ struct test_params {
 	bool partition_vcpu_memory_access;
 	enum vm_mem_backing_src_type backing_src;
 	int slots;
+	uint32_t random_seed;
 };
 
 static void toggle_dirty_logging(struct kvm_vm *vm, int slots, bool enable)
@@ -243,6 +244,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	/* Start the iterations */
 	iteration = 0;
 	host_quit = false;
+	srandom(p->random_seed);
+	pr_info("Random seed: %d\n", p->random_seed);
+	alloc_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
+	fill_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
 
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	for (i = 0; i < nr_vcpus; i++)
@@ -270,6 +275,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		ts_diff.tv_sec, ts_diff.tv_nsec);
 
 	while (iteration < p->iterations) {
+		fill_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
 		/*
 		 * Incrementing the iteration number will start the vCPUs
 		 * dirtying memory again.
@@ -380,6 +386,7 @@ static void help(char *name)
 	printf(" -v: specify the number of vCPUs to run.\n");
 	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
 	       "     them into a separate region of memory for each vCPU.\n");
+	printf(" -r: specify the starting random seed.\n");
 	backing_src_help("-s");
 	printf(" -x: Split the memory region into this number of memslots.\n"
 	       "     (default: 1)\n");
@@ -396,6 +403,7 @@ int main(int argc, char *argv[])
 		.partition_vcpu_memory_access = true,
 		.backing_src = DEFAULT_VM_MEM_SRC,
 		.slots = 1,
+		.random_seed = time(NULL),
 	};
 	int opt;
 
@@ -406,7 +414,7 @@ int main(int argc, char *argv[])
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:os:x:")) != -1) {
+	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:or:s:x:")) != -1) {
 		switch (opt) {
 		case 'e':
 			/* 'e' is for evil. */
@@ -442,6 +450,9 @@ int main(int argc, char *argv[])
 		case 'o':
 			p.partition_vcpu_memory_access = false;
 			break;
+		case 'r':
+			p.random_seed = atoi(optarg);
+			break;
 		case 's':
 			p.backing_src = parse_backing_src_type(optarg);
 			break;
diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
index eaa88df0555a..597875d0c3db 100644
--- a/tools/testing/selftests/kvm/include/perf_test_util.h
+++ b/tools/testing/selftests/kvm/include/perf_test_util.h
@@ -44,6 +44,10 @@ struct perf_test_args {
 };
 
 extern struct perf_test_args perf_test_args;
+extern uint32_t **random_table;
+
+void alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);
+void fill_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);
 
 struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 				   uint64_t vcpu_memory_bytes, int slots,
diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index 9618b37c66f7..b04e8d2c0f37 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -9,6 +9,10 @@
 #include "processor.h"
 
 struct perf_test_args perf_test_args;
+/* This pointer points to guest memory and must be converted with
+ * addr_gva2hva to be accessed from the host.
+ */
+uint32_t **random_table;
 
 /*
  * Guest virtual memory offset of the testing memory slot.
@@ -70,6 +74,49 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 	}
 }
 
+void alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms)
+{
+	struct perf_test_args *pta = &perf_test_args;
+	uint32_t **host_random_table;
+
+	random_table = (uint32_t **)vm_vaddr_alloc(
+		pta->vm,
+		nr_vcpus * sizeof(uint32_t *),
+		(vm_vaddr_t)0);
+	host_random_table = addr_gva2hva(pta->vm, (vm_vaddr_t)random_table);
+	pr_debug("Random start addr: %p %p.\n", random_table, host_random_table);
+
+	for (uint32_t i = 0; i < nr_vcpus; i++) {
+		host_random_table[i] = (uint32_t *)vm_vaddr_alloc(
+			pta->vm,
+			nr_randoms * sizeof(uint32_t),
+			(vm_vaddr_t)0);
+		pr_debug("Random row addr: %p %p.\n",
+			 host_random_table[i],
+			 addr_gva2hva(pta->vm, (vm_vaddr_t)host_random_table[i]));
+	}
+}
+
+void fill_random_table(uint32_t nr_vcpus, uint32_t nr_randoms)
+{
+	struct perf_test_args *pta = &perf_test_args;
+	uint32_t **host_random_table = addr_gva2hva(pta->vm, (vm_vaddr_t)random_table);
+	uint32_t *host_row;
+
+	pr_debug("Random start addr: %p %p.\n", random_table, host_random_table);
+
+	for (uint32_t i = 0; i < nr_vcpus; i++) {
+		host_row = addr_gva2hva(pta->vm, (vm_vaddr_t)host_random_table[i]);
+		pr_debug("Random row addr: %p %p.\n", host_random_table[i], host_row);
+
+		for (uint32_t j = 0; j < nr_randoms; j++)
+			host_row[j] = random();
+
+		pr_debug("New randoms row %d: %d, %d, %d...\n",
+			 i, host_row[0], host_row[1], host_row[2]);
+	}
+}
+
 void perf_test_setup_vcpus(struct kvm_vm *vm, int nr_vcpus,
 			   struct kvm_vcpu *vcpus[],
 			   uint64_t vcpu_memory_bytes,
-- 
2.37.1.559.g78731f0fdb-goog



* [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read
  2022-08-10 17:58 [PATCH 0/3] KVM: selftests: Randomize memory access of dirty_log_perf_test Colton Lewis
  2022-08-10 17:58 ` [PATCH 1/3] KVM: selftests: Add random table to randomize memory access Colton Lewis
@ 2022-08-10 17:58 ` Colton Lewis
  2022-08-10 23:33   ` David Matlack
  2022-08-10 17:58 ` [PATCH 3/3] KVM: selftests: Randomize page access order Colton Lewis
  2 siblings, 1 reply; 14+ messages in thread
From: Colton Lewis @ 2022-08-10 17:58 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, maz, dmatlack, seanjc, oupton, ricarkol, Colton Lewis

Randomize which pages are written vs read by using the random number
table for each page modulo 100. This changes how the -w argument
works. It is now a percentage from 0 to 100 inclusive that represents
what percentage of accesses are writes. It keeps the same default of
100 percent writes.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 12 +++++++-----
 tools/testing/selftests/kvm/lib/perf_test_util.c  |  4 ++--
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 80a1cbe7fbb0..dcc5d44fc757 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -381,8 +381,8 @@ static void help(char *name)
 	       "     (default: 1G)\n");
 	printf(" -f: specify the fraction of pages which should be written to\n"
 	       "     as opposed to simply read, in the form\n"
-	       "     1/<fraction of pages to write>.\n"
-	       "     (default: 1 i.e. all pages are written to.)\n");
+	       "     [0-100]%% of pages to write.\n"
+	       "     (default: 100 i.e. all pages are written to.)\n");
 	printf(" -v: specify the number of vCPUs to run.\n");
 	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
 	       "     them into a separate region of memory for each vCPU.\n");
@@ -399,7 +399,7 @@ int main(int argc, char *argv[])
 	int max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
 	struct test_params p = {
 		.iterations = TEST_HOST_LOOP_N,
-		.wr_fract = 1,
+		.wr_fract = 100,
 		.partition_vcpu_memory_access = true,
 		.backing_src = DEFAULT_VM_MEM_SRC,
 		.slots = 1,
@@ -439,8 +439,10 @@ int main(int argc, char *argv[])
 			break;
 		case 'f':
 			p.wr_fract = atoi(optarg);
-			TEST_ASSERT(p.wr_fract >= 1,
-				    "Write fraction cannot be less than one");
+			TEST_ASSERT(p.wr_fract >= 0,
+				    "Write fraction cannot be less than 0");
+			TEST_ASSERT(p.wr_fract <= 100,
+				    "Write fraction cannot be greater than 100");
 			break;
 		case 'v':
 			nr_vcpus = atoi(optarg);
diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index b04e8d2c0f37..3c7b93349fef 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -64,7 +64,7 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 		for (i = 0; i < pages; i++) {
 			uint64_t addr = gva + (i * pta->guest_page_size);
 
-			if (i % pta->wr_fract == 0)
+			if (random_table[vcpu_idx][i] % 100 < pta->wr_fract)
 				*(uint64_t *)addr = 0x0123456789ABCDEF;
 			else
 				READ_ONCE(*(uint64_t *)addr);
@@ -168,7 +168,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 	pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode));
 
 	/* By default vCPUs will write to memory. */
-	pta->wr_fract = 1;
+	pta->wr_fract = 100;
 
 	/*
 	 * Snapshot the non-huge page size.  This is used by the guest code to
-- 
2.37.1.559.g78731f0fdb-goog



* [PATCH 3/3] KVM: selftests: Randomize page access order
  2022-08-10 17:58 [PATCH 0/3] KVM: selftests: Randomize memory access of dirty_log_perf_test Colton Lewis
  2022-08-10 17:58 ` [PATCH 1/3] KVM: selftests: Add random table to randomize memory access Colton Lewis
  2022-08-10 17:58 ` [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read Colton Lewis
@ 2022-08-10 17:58 ` Colton Lewis
  2022-08-10 23:49   ` David Matlack
  2 siblings, 1 reply; 14+ messages in thread
From: Colton Lewis @ 2022-08-10 17:58 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, maz, dmatlack, seanjc, oupton, ricarkol, Colton Lewis

Add the ability to use random_table to randomize the order in which
pages are accessed. Add the -a argument to enable this new
behavior. This should make accesses less predictable and make for a
more realistic test. It includes the possibility that the same pages
may be hit multiple times during an iteration.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../testing/selftests/kvm/dirty_log_perf_test.c | 11 +++++++++--
 .../selftests/kvm/include/perf_test_util.h      |  2 ++
 .../testing/selftests/kvm/lib/perf_test_util.c  | 17 ++++++++++++++++-
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index dcc5d44fc757..265cb4f7e088 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -132,6 +132,7 @@ struct test_params {
 	bool partition_vcpu_memory_access;
 	enum vm_mem_backing_src_type backing_src;
 	int slots;
+	bool random_access;
 	uint32_t random_seed;
 };
 
@@ -227,6 +228,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 				 p->partition_vcpu_memory_access);
 
 	perf_test_set_wr_fract(vm, p->wr_fract);
+	perf_test_set_random_access(vm, p->random_access);
 
 	guest_num_pages = (nr_vcpus * guest_percpu_mem_size) >> vm->page_shift;
 	guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);
@@ -357,10 +359,11 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 static void help(char *name)
 {
 	puts("");
-	printf("usage: %s [-h] [-i iterations] [-p offset] [-g] "
+	printf("usage: %s [-h] [-a] [-r random seed] [-i iterations] [-p offset] [-g] "
 	       "[-m mode] [-n] [-b vcpu bytes] [-v vcpus] [-o] [-s mem type]"
 	       "[-x memslots]\n", name);
 	puts("");
+	printf(" -a: access memory randomly rather than in order.\n");
 	printf(" -i: specify iteration counts (default: %"PRIu64")\n",
 	       TEST_HOST_LOOP_N);
 	printf(" -g: Do not enable KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2. This\n"
@@ -403,6 +406,7 @@ int main(int argc, char *argv[])
 		.partition_vcpu_memory_access = true,
 		.backing_src = DEFAULT_VM_MEM_SRC,
 		.slots = 1,
+		.random_access = false,
 		.random_seed = time(NULL),
 	};
 	int opt;
@@ -414,8 +418,11 @@ int main(int argc, char *argv[])
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:or:s:x:")) != -1) {
+	while ((opt = getopt(argc, argv, "aeghi:p:m:nb:f:v:or:s:x:")) != -1) {
 		switch (opt) {
+		case 'a':
+			p.random_access = true;
+			break;
 		case 'e':
 			/* 'e' is for evil. */
 			run_vcpus_while_disabling_dirty_logging = true;
diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
index 597875d0c3db..6c6f81ce2216 100644
--- a/tools/testing/selftests/kvm/include/perf_test_util.h
+++ b/tools/testing/selftests/kvm/include/perf_test_util.h
@@ -39,6 +39,7 @@ struct perf_test_args {
 
 	/* Run vCPUs in L2 instead of L1, if the architecture supports it. */
 	bool nested;
+	bool random_access;
 
 	struct perf_test_vcpu_args vcpu_args[KVM_MAX_VCPUS];
 };
@@ -56,6 +57,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 void perf_test_destroy_vm(struct kvm_vm *vm);
 
 void perf_test_set_wr_fract(struct kvm_vm *vm, int wr_fract);
+void perf_test_set_random_access(struct kvm_vm *vm, bool random_access);
 
 void perf_test_start_vcpu_threads(int vcpus, void (*vcpu_fn)(struct perf_test_vcpu_args *));
 void perf_test_join_vcpu_threads(int vcpus);
diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index 3c7b93349fef..9838d1ad9166 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -52,6 +52,9 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 	struct perf_test_vcpu_args *vcpu_args = &pta->vcpu_args[vcpu_idx];
 	uint64_t gva;
 	uint64_t pages;
+	uint64_t addr;
+	bool random_access = pta->random_access;
+	bool populated = false;
 	int i;
 
 	gva = vcpu_args->gva;
@@ -62,7 +65,11 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 
 	while (true) {
 		for (i = 0; i < pages; i++) {
-			uint64_t addr = gva + (i * pta->guest_page_size);
+			if (populated && random_access)
+				addr = gva +
+					((random_table[vcpu_idx][i] % pages) * pta->guest_page_size);
+			else
+				addr = gva + (i * pta->guest_page_size);
 
 			if (random_table[vcpu_idx][i] % 100 < pta->wr_fract)
 				*(uint64_t *)addr = 0x0123456789ABCDEF;
@@ -70,6 +77,7 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 				READ_ONCE(*(uint64_t *)addr);
 		}
 
+		populated = true;
 		GUEST_SYNC(1);
 	}
 }
@@ -169,6 +177,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 
 	/* By default vCPUs will write to memory. */
 	pta->wr_fract = 100;
+	pta->random_access = false;
 
 	/*
 	 * Snapshot the non-huge page size.  This is used by the guest code to
@@ -276,6 +285,12 @@ void perf_test_set_wr_fract(struct kvm_vm *vm, int wr_fract)
 	sync_global_to_guest(vm, perf_test_args);
 }
 
+void perf_test_set_random_access(struct kvm_vm *vm, bool random_access)
+{
+	perf_test_args.random_access = random_access;
+	sync_global_to_guest(vm, perf_test_args);
+}
+
 uint64_t __weak perf_test_nested_pages(int nr_vcpus)
 {
 	return 0;
-- 
2.37.1.559.g78731f0fdb-goog



* Re: [PATCH 1/3] KVM: selftests: Add random table to randomize memory access
  2022-08-10 17:58 ` [PATCH 1/3] KVM: selftests: Add random table to randomize memory access Colton Lewis
@ 2022-08-10 23:18   ` David Matlack
  2022-08-10 23:26     ` David Matlack
  0 siblings, 1 reply; 14+ messages in thread
From: David Matlack @ 2022-08-10 23:18 UTC (permalink / raw)
  To: Colton Lewis; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 05:58:28PM +0000, Colton Lewis wrote:
> Linear access through all pages does not seem to replicate performance

State what the patch does first, then the background/motivation.

> problems with realistic dirty logging workloads. Make the test more
> sophisticated through random access. Each vcpu has its own sequence of
> random numbers that are refilled after every iteration. Having the
> main thread fill the table for every vcpu is less efficient than
> having each vcpu generate its own numbers, but this ensures threading
> nondeterminism won't destroy reproducibility with a given random seed.

Make it clear what this patch does specifically. e.g. "Make the test
more sophisticated through random access" is a bit misleading since all
this patch does is create a table of random numbers.

> 
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
>  .../selftests/kvm/dirty_log_perf_test.c       | 13 ++++-
>  .../selftests/kvm/include/perf_test_util.h    |  4 ++
>  .../selftests/kvm/lib/perf_test_util.c        | 47 +++++++++++++++++++
>  3 files changed, 63 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index f99e39a672d3..80a1cbe7fbb0 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -132,6 +132,7 @@ struct test_params {
>  	bool partition_vcpu_memory_access;
>  	enum vm_mem_backing_src_type backing_src;
>  	int slots;
> +	uint32_t random_seed;
>  };
>  
>  static void toggle_dirty_logging(struct kvm_vm *vm, int slots, bool enable)
> @@ -243,6 +244,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  	/* Start the iterations */
>  	iteration = 0;
>  	host_quit = false;
> +	srandom(p->random_seed);
> +	pr_info("Random seed: %d\n", p->random_seed);
> +	alloc_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
> +	fill_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);

Drive the allocate and filling of the random table in perf_test_util.c
as part of VM setup, and also move random_seed to perf_test_args.

This will reduce the amount of code needed in the test to use
perf_test_util with random accesses.  dirty_log_perf_test is the only
test using random accesses right now, but I could see us wanting to use
it in demand_paging_test and access_tracking_perf_test in the near
future.

You can still have the test refresh the random table every iteration by
exporting e.g. perf_test_refresh_random_table() for use by tests.
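
A rough sketch of what I'm imagining, assuming the per-vCPU rows end
up hanging off perf_test_vcpu_args as suggested further down (names
illustrative, untested):

void perf_test_refresh_random_table(struct kvm_vm *vm, int nr_vcpus,
				    uint32_t nr_randoms)
{
	struct perf_test_args *pta = &perf_test_args;
	uint32_t *host_row;
	int i, j;

	for (i = 0; i < nr_vcpus; i++) {
		/* The table lives in guest memory, so convert the GVA. */
		host_row = addr_gva2hva(vm, pta->vcpu_args[i].random_table);
		for (j = 0; j < nr_randoms; j++)
			host_row[j] = random();
	}
}

Then each test only needs srandom() plus one call per iteration (or
just one call at setup, see below).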

>  
>  	clock_gettime(CLOCK_MONOTONIC, &start);
>  	for (i = 0; i < nr_vcpus; i++)
> @@ -270,6 +275,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  		ts_diff.tv_sec, ts_diff.tv_nsec);
>  
>  	while (iteration < p->iterations) {
> +		fill_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);

I wonder if it would be better to use the same random access pattern
across iterations. One of the reasons to have multiple iterations is to
see how the guest performance changes as the memory moves through
different phases of dirty tracking. e.g. KVM might be splitting huge
pages during the first iteration but not the second. If the access
pattern is also changing across iterations that could make it harder to
identify performance changes due to KVM.

>  		/*
>  		 * Incrementing the iteration number will start the vCPUs
>  		 * dirtying memory again.
> @@ -380,6 +386,7 @@ static void help(char *name)
>  	printf(" -v: specify the number of vCPUs to run.\n");
>  	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
>  	       "     them into a separate region of memory for each vCPU.\n");
> +	printf(" -r: specify the starting random seed.\n");
>  	backing_src_help("-s");
>  	printf(" -x: Split the memory region into this number of memslots.\n"
>  	       "     (default: 1)\n");
> @@ -396,6 +403,7 @@ int main(int argc, char *argv[])
>  		.partition_vcpu_memory_access = true,
>  		.backing_src = DEFAULT_VM_MEM_SRC,
>  		.slots = 1,
> +		.random_seed = time(NULL),

Perhaps the default seed should be a hard-coded value so that users
running the test with default arguments get deterministic results across
runs.

>  	};
>  	int opt;
>  
> @@ -406,7 +414,7 @@ int main(int argc, char *argv[])
>  
>  	guest_modes_append_default();
>  
> -	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:os:x:")) != -1) {
> +	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:or:s:x:")) != -1) {
>  		switch (opt) {
>  		case 'e':
>  			/* 'e' is for evil. */
> @@ -442,6 +450,9 @@ int main(int argc, char *argv[])
>  		case 'o':
>  			p.partition_vcpu_memory_access = false;
>  			break;
> +		case 'r':
> +			p.random_seed = atoi(optarg);
> +			break;
>  		case 's':
>  			p.backing_src = parse_backing_src_type(optarg);
>  			break;
> diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
> index eaa88df0555a..597875d0c3db 100644
> --- a/tools/testing/selftests/kvm/include/perf_test_util.h
> +++ b/tools/testing/selftests/kvm/include/perf_test_util.h
> @@ -44,6 +44,10 @@ struct perf_test_args {
>  };
>  
>  extern struct perf_test_args perf_test_args;
> +extern uint32_t **random_table;

Adding random_table to perf_test_util.h is unnecessary in this commit
(it's only used in perf_test_util.c).

> +
> +void alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);
> +void fill_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);

Use perf_test_ prefixes for symbols visible outside of perf_test_util.c.

e.g.

  perf_test_random_table
  perf_test_alloc_random_table()
  perf_test_fill_random_table()

>  
>  struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
>  				   uint64_t vcpu_memory_bytes, int slots,
> diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> index 9618b37c66f7..b04e8d2c0f37 100644
> --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> @@ -9,6 +9,10 @@
>  #include "processor.h"
>  
>  struct perf_test_args perf_test_args;
> +/* This pointer points to guest memory and must be converted with
> + * addr_gva2hva to be accessed from the host.
> + */
> +uint32_t **random_table;

Use vm_vaddr_t for variables that contain guest virtual addresses
(exception within guest_code(), of course).

>  
>  /*
>   * Guest virtual memory offset of the testing memory slot.
> @@ -70,6 +74,49 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  	}
>  }
>  
> +void alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms)
> +{
> +	struct perf_test_args *pta = &perf_test_args;
> +	uint32_t **host_random_table;
> +
> +	random_table = (uint32_t **)vm_vaddr_alloc(
> +		pta->vm,
> +		nr_vcpus * sizeof(uint32_t *),
> +		(vm_vaddr_t)0);

I notice vm_vaddr_alloc_pages() and vcpu_alloc_cpuid() use
KVM_UTIL_MIN_VADDR for the min. Should we use that here too?

If so, this is a good opportunity to rename vm_vaddr_alloc() to
__vm_vaddr_alloc() and introduce:

vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz)
{
        return __vm_vaddr_alloc(vm, sz, KVM_UTIL_MIN_VADDR);
}

> +	host_random_table = addr_gva2hva(pta->vm, (vm_vaddr_t)random_table);
> +	pr_debug("Random start addr: %p %p.\n", random_table, host_random_table);
> +
> +	for (uint32_t i = 0; i < nr_vcpus; i++) {
> +		host_random_table[i] = (uint32_t *)vm_vaddr_alloc(

The per-vCPU random table should go in perf_test_vcpu_args along with
all the other per-vCPU information that is set up by the test and
consumed by the guest code.

This will reduce some of the complexity here because you won't need to
allocate the top-level array of pointers.
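
i.e. something like this (sketch; field name illustrative):

	/* In struct perf_test_vcpu_args: */
	vm_vaddr_t random_table;	/* GVA of this vCPU's random numbers */

which also addresses the vm_vaddr_t comment above, since the guest can
cast vcpu_args->random_table to a uint32_t * directly.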

> +			pta->vm,
> +			nr_randoms * sizeof(uint32_t),
> +			(vm_vaddr_t)0);
> +		pr_debug("Random row addr: %p %p.\n",
> +			 host_random_table[i],
> +			 addr_gva2hva(pta->vm, (vm_vaddr_t)host_random_table[i]));

Logging the host virtual addresses of the random table would probably
not be valuable. But logging the guest virtual address would probably be
more useful. The guest virtual address space management is pretty
ad-hoc.

> +	}
> +}
> +
> +void fill_random_table(uint32_t nr_vcpus, uint32_t nr_randoms)
> +{
> +	struct perf_test_args *pta = &perf_test_args;
> +	uint32_t **host_random_table = addr_gva2hva(pta->vm, (vm_vaddr_t)random_table);
> +	uint32_t *host_row;
> +
> +	pr_debug("Random start addr: %p %p.\n", random_table, host_random_table);
> +
> +	for (uint32_t i = 0; i < nr_vcpus; i++) {
> +		host_row = addr_gva2hva(pta->vm, (vm_vaddr_t)host_random_table[i]);
> +		pr_debug("Random row addr: %p %p.\n", host_random_table[i], host_row);
> +
> +		for (uint32_t j = 0; j < nr_randoms; j++)
> +			host_row[j] = random();
> +
> +		pr_debug("New randoms row %d: %d, %d, %d...\n",
> +			 i, host_row[0], host_row[1], host_row[2]);
> +	}
> +}
> +
>  void perf_test_setup_vcpus(struct kvm_vm *vm, int nr_vcpus,
>  			   struct kvm_vcpu *vcpus[],
>  			   uint64_t vcpu_memory_bytes,
> -- 
> 2.37.1.559.g78731f0fdb-goog
> 


* Re: [PATCH 1/3] KVM: selftests: Add random table to randomize memory access
  2022-08-10 23:18   ` David Matlack
@ 2022-08-10 23:26     ` David Matlack
  2022-08-12 16:09       ` Colton Lewis
  0 siblings, 1 reply; 14+ messages in thread
From: David Matlack @ 2022-08-10 23:26 UTC (permalink / raw)
  To: Colton Lewis; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 04:18:10PM -0700, David Matlack wrote:
> On Wed, Aug 10, 2022 at 05:58:28PM +0000, Colton Lewis wrote:
> > Linear access through all pages does not seem to replicate performance
> 
> State what the patch does first, then the background/motivation.
> 
> > problems with realistic dirty logging workloads. Make the test more
> > sophisticated through random access. Each vcpu has its own sequence of
> > random numbers that are refilled after every iteration. Having the
> > main thread fill the table for every vcpu is less efficient than
> > having each vcpu generate its own numbers, but this ensures threading
> > nondeterminism won't destroy reproducibility with a given random seed.
> 
> Make it clear what this patch does specifically. e.g. "Make the test
> more sophisticated through random access" is a bit misleading since all
> this patch does is create a table of random numbers.

Please also call out how this change affects memory usage of the test.
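
For reference, with one 32-bit entry per page, the default 1G of
per-vCPU memory with 4K pages works out to 262144 entries, i.e. an
extra 1MiB of guest memory per vCPU.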



* Re: [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read
  2022-08-10 17:58 ` [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read Colton Lewis
@ 2022-08-10 23:33   ` David Matlack
  2022-08-10 23:37     ` David Matlack
  0 siblings, 1 reply; 14+ messages in thread
From: David Matlack @ 2022-08-10 23:33 UTC (permalink / raw)
  To: Colton Lewis; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 05:58:29PM +0000, Colton Lewis wrote:
> Randomize which pages are written vs read by using the random number

Same thing here about stating what the patch does first.

> table for each page modulo 100. This changes how the -w argument

s/-w/-f/

Although I would love it if you renamed -f to -w in the code instead.

> works. It is now a percentage from 0 to 100 inclusive that represents
> what percentage of accesses are writes. It keeps the same default of
> 100 percent writes.
> 
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 12 +++++++-----

access_tracking_perf_test.c also uses wr_fract and will need to be
updated.

>  tools/testing/selftests/kvm/lib/perf_test_util.c  |  4 ++--
>  2 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 80a1cbe7fbb0..dcc5d44fc757 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -381,8 +381,8 @@ static void help(char *name)
>  	       "     (default: 1G)\n");
>  	printf(" -f: specify the fraction of pages which should be written to\n"

s/fraction/percentage/

>  	       "     as opposed to simply read, in the form\n"
> -	       "     1/<fraction of pages to write>.\n"
> -	       "     (default: 1 i.e. all pages are written to.)\n");
> +	       "     [0-100]%% of pages to write.\n"
> +	       "     (default: 100 i.e. all pages are written to.)\n");

Mention that the implementation is probabilistic.
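
e.g. something along these lines (wording illustrative):

	printf(" -f: specify the percentage of pages which should be written to\n"
	       "     as opposed to simply read. Each access is randomly chosen\n"
	       "     to be a write with this probability, so the split is\n"
	       "     probabilistic rather than exact.\n"
	       "     (default: 100 i.e. all pages are written to.)\n");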

>  	printf(" -v: specify the number of vCPUs to run.\n");
>  	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
>  	       "     them into a separate region of memory for each vCPU.\n");
> @@ -399,7 +399,7 @@ int main(int argc, char *argv[])
>  	int max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
>  	struct test_params p = {
>  		.iterations = TEST_HOST_LOOP_N,
> -		.wr_fract = 1,
> +		.wr_fract = 100,

Please rename wr_fract to e.g. write_percent to reflect the new
semantics. Same goes for perf_test_set_wr_fract(),
perf_test_args.wr_fract, etc.

>  		.partition_vcpu_memory_access = true,
>  		.backing_src = DEFAULT_VM_MEM_SRC,
>  		.slots = 1,
> @@ -439,8 +439,10 @@ int main(int argc, char *argv[])
>  			break;
>  		case 'f':
>  			p.wr_fract = atoi(optarg);
> -			TEST_ASSERT(p.wr_fract >= 1,
> -				    "Write fraction cannot be less than one");
> +			TEST_ASSERT(p.wr_fract >= 0,
> +				    "Write fraction cannot be less than 0");
> +			TEST_ASSERT(p.wr_fract <= 100,
> +				    "Write fraction cannot be greater than 100");

nit: This could be combined into a single assert pretty easily.
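
e.g.:

TEST_ASSERT(p.wr_fract >= 0 && p.wr_fract <= 100,
	    "Write percentage must be between 0 and 100");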

>  			break;
>  		case 'v':
>  			nr_vcpus = atoi(optarg);
> diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> index b04e8d2c0f37..3c7b93349fef 100644
> --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> @@ -64,7 +64,7 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  		for (i = 0; i < pages; i++) {
>  			uint64_t addr = gva + (i * pta->guest_page_size);
>  
> -			if (i % pta->wr_fract == 0)
> +			if (random_table[vcpu_idx][i] % 100 < pta->wr_fract)
>  				*(uint64_t *)addr = 0x0123456789ABCDEF;
>  			else
>  				READ_ONCE(*(uint64_t *)addr);
> @@ -168,7 +168,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
>  	pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode));
>  
>  	/* By default vCPUs will write to memory. */
> -	pta->wr_fract = 1;
> +	pta->wr_fract = 100;
>  
>  	/*
>  	 * Snapshot the non-huge page size.  This is used by the guest code to
> -- 
> 2.37.1.559.g78731f0fdb-goog
> 


* Re: [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read
  2022-08-10 23:33   ` David Matlack
@ 2022-08-10 23:37     ` David Matlack
  2022-08-12 16:11       ` Colton Lewis
  0 siblings, 1 reply; 14+ messages in thread
From: David Matlack @ 2022-08-10 23:37 UTC (permalink / raw)
  To: Colton Lewis; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 04:33:44PM -0700, David Matlack wrote:
> On Wed, Aug 10, 2022 at 05:58:29PM +0000, Colton Lewis wrote:
> > Randomize which pages are written vs read by using the random number
> 
> Same thing here about stating what the patch does first.

Sorry -- you do state what the patch does first here. But I think it
could just be a little more direct and specific. e.g.

  Replace the -f<fraction> option in dirty_log_perf_test.c with
  -w<percent>, to allow the user to specify the percentage of pages
  that are written.

> 
> > table for each page modulo 100. This changes how the -w argument
> > works. It is now a percentage from 0 to 100 inclusive that represents
> > what percentage of accesses are writes. It keeps the same default of
> > 100 percent writes.


* Re: [PATCH 3/3] KVM: selftests: Randomize page access order
  2022-08-10 17:58 ` [PATCH 3/3] KVM: selftests: Randomize page access order Colton Lewis
@ 2022-08-10 23:49   ` David Matlack
  2022-08-12 16:24     ` Colton Lewis
  0 siblings, 1 reply; 14+ messages in thread
From: David Matlack @ 2022-08-10 23:49 UTC (permalink / raw)
  To: Colton Lewis; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 05:58:30PM +0000, Colton Lewis wrote:
> Add the ability to use random_table to randomize the order in which
> pages are accessed. Add the -a argument to enable this new
> behavior. This should make accesses less predictable and make for a
> more realistic test. It includes the possibility that the same pages
> may be hit multiple times during an iteration.
> 
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
>  .../testing/selftests/kvm/dirty_log_perf_test.c | 11 +++++++++--
>  .../selftests/kvm/include/perf_test_util.h      |  2 ++
>  .../testing/selftests/kvm/lib/perf_test_util.c  | 17 ++++++++++++++++-
>  3 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index dcc5d44fc757..265cb4f7e088 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -132,6 +132,7 @@ struct test_params {
>  	bool partition_vcpu_memory_access;
>  	enum vm_mem_backing_src_type backing_src;
>  	int slots;
> +	bool random_access;
>  	uint32_t random_seed;
>  };
>  
> @@ -227,6 +228,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  				 p->partition_vcpu_memory_access);
>  
>  	perf_test_set_wr_fract(vm, p->wr_fract);
> +	perf_test_set_random_access(vm, p->random_access);
>  
>  	guest_num_pages = (nr_vcpus * guest_percpu_mem_size) >> vm->page_shift;
>  	guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);
> @@ -357,10 +359,11 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  static void help(char *name)
>  {
>  	puts("");
> -	printf("usage: %s [-h] [-i iterations] [-p offset] [-g] "
> +	printf("usage: %s [-h] [-a] [-r random seed] [-i iterations] [-p offset] [-g] "
>  	       "[-m mode] [-n] [-b vcpu bytes] [-v vcpus] [-o] [-s mem type]"
>  	       "[-x memslots]\n", name);
>  	puts("");
> +	printf(" -a: access memory randomly rather than in order.\n");
>  	printf(" -i: specify iteration counts (default: %"PRIu64")\n",
>  	       TEST_HOST_LOOP_N);
>  	printf(" -g: Do not enable KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2. This\n"
> @@ -403,6 +406,7 @@ int main(int argc, char *argv[])
>  		.partition_vcpu_memory_access = true,
>  		.backing_src = DEFAULT_VM_MEM_SRC,
>  		.slots = 1,
> +		.random_access = false,
>  		.random_seed = time(NULL),
>  	};
>  	int opt;
> @@ -414,8 +418,11 @@ int main(int argc, char *argv[])
>  
>  	guest_modes_append_default();
>  
> -	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:or:s:x:")) != -1) {
> +	while ((opt = getopt(argc, argv, "aeghi:p:m:nb:f:v:or:s:x:")) != -1) {
>  		switch (opt) {
> +		case 'a':
> +			p.random_access = true;
> +			break;
>  		case 'e':
>  			/* 'e' is for evil. */
>  			run_vcpus_while_disabling_dirty_logging = true;
> diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
> index 597875d0c3db..6c6f81ce2216 100644
> --- a/tools/testing/selftests/kvm/include/perf_test_util.h
> +++ b/tools/testing/selftests/kvm/include/perf_test_util.h
> @@ -39,6 +39,7 @@ struct perf_test_args {
>  
>  	/* Run vCPUs in L2 instead of L1, if the architecture supports it. */
>  	bool nested;
> +	bool random_access;
>  
>  	struct perf_test_vcpu_args vcpu_args[KVM_MAX_VCPUS];
>  };
> @@ -56,6 +57,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
>  void perf_test_destroy_vm(struct kvm_vm *vm);
>  
>  void perf_test_set_wr_fract(struct kvm_vm *vm, int wr_fract);
> +void perf_test_set_random_access(struct kvm_vm *vm, bool random_access);
>  
>  void perf_test_start_vcpu_threads(int vcpus, void (*vcpu_fn)(struct perf_test_vcpu_args *));
>  void perf_test_join_vcpu_threads(int vcpus);
> diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> index 3c7b93349fef..9838d1ad9166 100644
> --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> @@ -52,6 +52,9 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  	struct perf_test_vcpu_args *vcpu_args = &pta->vcpu_args[vcpu_idx];
>  	uint64_t gva;
>  	uint64_t pages;
> +	uint64_t addr;
> +	bool random_access = pta->random_access;
> +	bool populated = false;
>  	int i;
>  
>  	gva = vcpu_args->gva;
> @@ -62,7 +65,11 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  
>  	while (true) {
>  		for (i = 0; i < pages; i++) {
> -			uint64_t addr = gva + (i * pta->guest_page_size);
> +			if (populated && random_access)

Skipping random access during the populate phase makes sense to ensure
everything is populated, I guess. What was your rationale?

Either way I think this policy should be driven by the test, rather than
hard-coded in perf_test_guest_code(), i.e. move the call
perf_test_set_random_access() in dirty_log_perf_test.c to just after the
population phase.
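
i.e. something like (sketch):

	/* In run_test(), once the populate pass has completed: */
	perf_test_set_random_access(vm, p->random_access);

with random_access left at its default of false during setup, so the
guest no longer needs the populated flag.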

> +				addr = gva +
> +					((random_table[vcpu_idx][i] % pages) * pta->guest_page_size);
> +			else
> +				addr = gva + (i * pta->guest_page_size);
>  
>  			if (random_table[vcpu_idx][i] % 100 < pta->wr_fract)
>  				*(uint64_t *)addr = 0x0123456789ABCDEF;
> @@ -70,6 +77,7 @@ void perf_test_guest_code(uint32_t vcpu_idx)
>  				READ_ONCE(*(uint64_t *)addr);
>  		}
>  
> +		populated = true;
>  		GUEST_SYNC(1);
>  	}
>  }
> @@ -169,6 +177,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
>  
>  	/* By default vCPUs will write to memory. */
>  	pta->wr_fract = 100;
> +	pta->random_access = false;
>  
>  	/*
>  	 * Snapshot the non-huge page size.  This is used by the guest code to
> @@ -276,6 +285,12 @@ void perf_test_set_wr_fract(struct kvm_vm *vm, int wr_fract)
>  	sync_global_to_guest(vm, perf_test_args);
>  }
>  
> +void perf_test_set_random_access(struct kvm_vm *vm, bool random_access)
> +{
> +	perf_test_args.random_access = random_access;
> +	sync_global_to_guest(vm, perf_test_args);
> +}
> +
>  uint64_t __weak perf_test_nested_pages(int nr_vcpus)
>  {
>  	return 0;
> -- 
> 2.37.1.559.g78731f0fdb-goog
> 


* Re: [PATCH 1/3] KVM: selftests: Add random table to randomize memory access
  2022-08-10 23:26     ` David Matlack
@ 2022-08-12 16:09       ` Colton Lewis
  0 siblings, 0 replies; 14+ messages in thread
From: Colton Lewis @ 2022-08-12 16:09 UTC (permalink / raw)
  To: David Matlack; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 04:26:39PM -0700, David Matlack wrote:
> On Wed, Aug 10, 2022 at 04:18:10PM -0700, David Matlack wrote:
> > On Wed, Aug 10, 2022 at 05:58:28PM +0000, Colton Lewis wrote:
> > > Linear access through all pages does not seem to replicate performance
> > 
> > State what the patch does first, then the background/motivation.

Will do.

> > 
> > > problems with realistic dirty logging workloads. Make the test more
> > > sophisticated through random access. Each vcpu has its own sequence of
> > > random numbers that are refilled after every iteration. Having the
> > > main thread fill the table for every vcpu is less efficient than
> > > having each vcpu generate its own numbers, but this ensures threading
> > > nondeterminism won't destroy reproducibility with a given random seed.
> > 
> > Make it clear what this patch does specifically. e.g. "Make the test
> > more sophisticated through random access" is a bit misleading since all
> > this patch does is create a table of random numbers.
> 
> Please also call out how this change affects memory usage of the test.

Yes that seems important to mention.

> 
> > 
> > > 
> > > Signed-off-by: Colton Lewis <coltonlewis@google.com>
> > > ---
> > >  .../selftests/kvm/dirty_log_perf_test.c       | 13 ++++-
> > >  .../selftests/kvm/include/perf_test_util.h    |  4 ++
> > >  .../selftests/kvm/lib/perf_test_util.c        | 47 +++++++++++++++++++
> > >  3 files changed, 63 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > > index f99e39a672d3..80a1cbe7fbb0 100644
> > > --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > > +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > > @@ -132,6 +132,7 @@ struct test_params {
> > >  	bool partition_vcpu_memory_access;
> > >  	enum vm_mem_backing_src_type backing_src;
> > >  	int slots;
> > > +	uint32_t random_seed;
> > >  };
> > >  
> > >  static void toggle_dirty_logging(struct kvm_vm *vm, int slots, bool enable)
> > > @@ -243,6 +244,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> > >  	/* Start the iterations */
> > >  	iteration = 0;
> > >  	host_quit = false;
> > > +	srandom(p->random_seed);
> > > +	pr_info("Random seed: %d\n", p->random_seed);
> > > +	alloc_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
> > > +	fill_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
> > 
> > Drive the allocate and filling of the random table in perf_test_util.c
> > as part of VM setup, and also move random_seed to perf_test_args.
> > 
> > This will reduce the amount of code needed in the test to use
> > perf_test_util with random accesses.  dirty_log_perf_test is the only
> > test using random accesses right now, but I could see us wanting to use
> > it in demand_paging_test and access_tracking_perf_test in the near
> > future.
> > 
> > You can still have the test refresh the random table every iteration by
> > exporting e.g. perf_test_refresh_random_table() for use by tests.

Will do.

> > 
> > >  
> > >  	clock_gettime(CLOCK_MONOTONIC, &start);
> > >  	for (i = 0; i < nr_vcpus; i++)
> > > @@ -270,6 +275,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> > >  		ts_diff.tv_sec, ts_diff.tv_nsec);
> > >  
> > >  	while (iteration < p->iterations) {
> > > +		fill_random_table(nr_vcpus, guest_percpu_mem_size >> vm->page_shift);
> > 
> > I wonder if it would be better to use the same random access pattern
> > across iterations. One of the reasons to have multiple iterations is to
> > see how the guest performance changes as the memory moves through
> > different phases of dirty tracking. e.g. KVM might be splitting huge
> > pages during the first iteration but not the second. If the access
> > pattern is also changing across iterations that could make it harder to
> > identify performance changes due to KVM.

I hadn't thought about it like that. I agree and it means we don't
have to refresh the random table every iteration.

> > 
> > >  		/*
> > >  		 * Incrementing the iteration number will start the vCPUs
> > >  		 * dirtying memory again.
> > > @@ -380,6 +386,7 @@ static void help(char *name)
> > >  	printf(" -v: specify the number of vCPUs to run.\n");
> > >  	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
> > >  	       "     them into a separate region of memory for each vCPU.\n");
> > > +	printf(" -r: specify the starting random seed.\n");
> > >  	backing_src_help("-s");
> > >  	printf(" -x: Split the memory region into this number of memslots.\n"
> > >  	       "     (default: 1)\n");
> > > @@ -396,6 +403,7 @@ int main(int argc, char *argv[])
> > >  		.partition_vcpu_memory_access = true,
> > >  		.backing_src = DEFAULT_VM_MEM_SRC,
> > >  		.slots = 1,
> > > +		.random_seed = time(NULL),
> > 
> > Perhaps the default seed should be a hard-coded value so that users
> > running the test with default arguments get deterministic results across
> > runs.

I don't think that's a good idea. Always using the same random seed
every time seems to negate the point of making it random. If people
want deterministic results across runs, they can always set the random
seed themselves.

> > 
> > >  	};
> > >  	int opt;
> > >  
> > > @@ -406,7 +414,7 @@ int main(int argc, char *argv[])
> > >  
> > >  	guest_modes_append_default();
> > >  
> > > -	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:os:x:")) != -1) {
> > > +	while ((opt = getopt(argc, argv, "eghi:p:m:nb:f:v:or:s:x:")) != -1) {
> > >  		switch (opt) {
> > >  		case 'e':
> > >  			/* 'e' is for evil. */
> > > @@ -442,6 +450,9 @@ int main(int argc, char *argv[])
> > >  		case 'o':
> > >  			p.partition_vcpu_memory_access = false;
> > >  			break;
> > > +		case 'r':
> > > +			p.random_seed = atoi(optarg);
> > > +			break;
> > >  		case 's':
> > >  			p.backing_src = parse_backing_src_type(optarg);
> > >  			break;
> > > diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
> > > index eaa88df0555a..597875d0c3db 100644
> > > --- a/tools/testing/selftests/kvm/include/perf_test_util.h
> > > +++ b/tools/testing/selftests/kvm/include/perf_test_util.h
> > > @@ -44,6 +44,10 @@ struct perf_test_args {
> > >  };
> > >  
> > >  extern struct perf_test_args perf_test_args;
> > > +extern uint32_t **random_table;
> > 
> > Adding random_table to perf_test_util.h is unnecessary in this commit
> > (it's only used in perf_test_util.c).
> >

Will move this to the second commit.

> > > +
> > > +void alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);
> > > +void fill_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);
> > 
> > Use perf_test_ prefixes for symbols visible outside of perf_test_util.c.
> > 
> > e.g.
> > 
> >   perf_test_random_table
> >   perf_test_alloc_random_table()
> >   perf_test_fill_random_table()
> > 

Will do.
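
So the declarations would read (sketch):

	void perf_test_alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);
	void perf_test_fill_random_table(uint32_t nr_vcpus, uint32_t nr_randoms);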

> > >  
> > >  struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
> > >  				   uint64_t vcpu_memory_bytes, int slots,
> > > diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > index 9618b37c66f7..b04e8d2c0f37 100644
> > > --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > @@ -9,6 +9,10 @@
> > >  #include "processor.h"
> > >  
> > >  struct perf_test_args perf_test_args;
> > > +/* This pointer points to guest memory and must be converted with
> > > + * addr_gva2hva to be accessed from the host.
> > > + */
> > > +uint32_t **random_table;
> > 
> > Use vm_vaddr_t for variables that contain guest virtual addresses
> > (exception within guest_code(), of course).

Will do.

> > 
> > >  
> > >  /*
> > >   * Guest virtual memory offset of the testing memory slot.
> > > @@ -70,6 +74,49 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> > >  	}
> > >  }
> > >  
> > > +void alloc_random_table(uint32_t nr_vcpus, uint32_t nr_randoms)
> > > +{
> > > +	struct perf_test_args *pta = &perf_test_args;
> > > +	uint32_t **host_random_table;
> > > +
> > > +	random_table = (uint32_t **)vm_vaddr_alloc(
> > > +		pta->vm,
> > > +		nr_vcpus * sizeof(uint32_t *),
> > > +		(vm_vaddr_t)0);
> > 
> > I notice vm_vaddr_alloc_pages() and vcpu_alloc_cpuid() use
> > KVM_UTIL_MIN_VADDR for the min. Should we use that here too?

I was curious, so I looked at the history of KVM_UTIL_MIN_VADDR to see
if there's a reason for its value. According to Sean in commit
106a2e76, the value is arbitrary.

But I'll still use it for consistency's sake.
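
So the allocation would become something like this (sketch, with
random_table switched to vm_vaddr_t per your earlier comment):

	random_table = vm_vaddr_alloc(pta->vm,
				      nr_vcpus * sizeof(uint32_t *),
				      KVM_UTIL_MIN_VADDR);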

> > 
> > If so, this is a good opportunity to rename vm_vaddr_alloc() to
> > __vm_vaddr_alloc() and introduce:
> > 
> > vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz)
> > {
> >         return __vm_vaddr_alloc(vm, sz, KVM_UTIL_MIN_VADDR);
> > }
> > 

vm_vaddr_alloc() is called with values other than KVM_UTIL_MIN_VADDR
for that argument in several places, so changing those call sites is
beyond the scope of this series.

> > > +	host_random_table = addr_gva2hva(pta->vm, (vm_vaddr_t)random_table);
> > > +	pr_debug("Random start addr: %p %p.\n", random_table, host_random_table);
> > > +
> > > +	for (uint32_t i = 0; i < nr_vcpus; i++) {
> > > +		host_random_table[i] = (uint32_t *)vm_vaddr_alloc(
> > 
> > The per-vCPU random table should go in perf_test_vcpu_args along with
> > all the other per-vCPU information that is set up by the test and
> > consumed by the guest code.
> > 
> > This will reduce some of the complexity here because you won't need to
> > allocate the top-level array of pointers.
> >

Good idea. Will do.
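
A sketch of the vcpu_args change (the field name is my placeholder):

	/* In struct perf_test_vcpu_args: */
	vm_vaddr_t random_table;	/* GVA of this vCPU's random numbers */

With the table hanging off vcpu_args, alloc_random_table() can allocate
one array per vCPU directly and the top-level array of pointers goes
away.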

> > > +			pta->vm,
> > > +			nr_randoms * sizeof(uint32_t),
> > > +			(vm_vaddr_t)0);
> > > +		pr_debug("Random row addr: %p %p.\n",
> > > +			 host_random_table[i],
> > > +			 addr_gva2hva(pta->vm, (vm_vaddr_t)host_random_table[i]));
> > 
> > Logging the host virtual addresses of the random table is probably
> > not valuable, but logging the guest virtual addresses would be more
> > useful. The guest virtual address space management is pretty
> > ad-hoc.
> >

Will do. I added that to make sure the random numbers were different
every iteration, but it is probably not useful.
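
If any logging is kept, it would be just the guest addresses, e.g.
(sketch, assuming the per-vCPU table discussed above):

	pr_debug("vCPU %u random table GVA: 0x%lx\n", i,
		 pta->vcpu_args[i].random_table);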

* Re: [PATCH 2/3] KVM: selftests: Randomize which pages are written vs read
  2022-08-10 23:37     ` David Matlack
@ 2022-08-12 16:11       ` Colton Lewis
  0 siblings, 0 replies; 14+ messages in thread
From: Colton Lewis @ 2022-08-12 16:11 UTC (permalink / raw)
  To: David Matlack; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 04:37:47PM -0700, David Matlack wrote:
> On Wed, Aug 10, 2022 at 04:33:44PM -0700, David Matlack wrote:
> > On Wed, Aug 10, 2022 at 05:58:29PM +0000, Colton Lewis wrote:
> > > Randomize which pages are written vs read by using the random number
> > 
> > Same thing here about stating what the patch does first.
> 
> Sorry -- you do state what the patch does first here. But I think it
> could just be a little more direct and specific. e.g.
> 
>   Replace the -f<fraction> option in dirty_log_perf_test.c with
>   -w<percent>, to allow the user to specify the percentage of
>   accesses that are writes.
> 
> > 
> > > table for each page modulo 100. This changes how the -w argument
> > > works. It is now a percentage from 0 to 100 inclusive that represents
> > > what percentage of accesses are writes. It keeps the same default of
> > > 100 percent writes.

Will do.
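
Concretely, something like this (sketch; "write_percent" as the field
name is my assumption):

	case 'w':
		p.write_percent = atoi(optarg);
		TEST_ASSERT(p.write_percent >= 0 && p.write_percent <= 100,
			    "-w must be between 0 and 100");
		break;

and on the guest side, keyed off the per-vCPU random numbers:

	if (random_table[i] % 100 < pta->write_percent)
		*(uint64_t *)addr = 0x0123456789ABCDEF;
	else
		READ_ONCE(*(uint64_t *)addr);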

* Re: [PATCH 3/3] KVM: selftests: Randomize page access order
  2022-08-10 23:49   ` David Matlack
@ 2022-08-12 16:24     ` Colton Lewis
  2022-08-12 16:28       ` David Matlack
  0 siblings, 1 reply; 14+ messages in thread
From: Colton Lewis @ 2022-08-12 16:24 UTC (permalink / raw)
  To: David Matlack; +Cc: kvm, pbonzini, maz, seanjc, oupton, ricarkol

On Wed, Aug 10, 2022 at 04:49:23PM -0700, David Matlack wrote:
> On Wed, Aug 10, 2022 at 05:58:30PM +0000, Colton Lewis wrote:
> > diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > index 3c7b93349fef..9838d1ad9166 100644
> > --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> > +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > @@ -52,6 +52,9 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> >  	struct perf_test_vcpu_args *vcpu_args = &pta->vcpu_args[vcpu_idx];
> >  	uint64_t gva;
> >  	uint64_t pages;
> > +	uint64_t addr;
> > +	bool random_access = pta->random_access;
> > +	bool populated = false;
> >  	int i;
> >  
> >  	gva = vcpu_args->gva;
> > @@ -62,7 +65,11 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> >  
> >  	while (true) {
> >  		for (i = 0; i < pages; i++) {
> > -			uint64_t addr = gva + (i * pta->guest_page_size);
> > +			if (populated && random_access)
> 
> Skipping the populate phase makes sense to ensure everything is
> populated, I guess. What was your rationale?

That's it. I wanted to ensure everything was populated. Random
population won't hit every page, but those unpopulated pages might be
hit on subsequent iterations. I originally let population be random
too, and I suspect that was driving an odd behavior I noticed early in
testing, where later iterations would be much faster than earlier ones.

> Either way I think this policy should be driven by the test, rather than
> hard-coded in perf_test_guest_code(). i.e. Move the call to
> perf_test_set_random_access() in dirty_log_perf_test.c to just after the
> population phase.

That makes sense. Will do.

* Re: [PATCH 3/3] KVM: selftests: Randomize page access order
  2022-08-12 16:24     ` Colton Lewis
@ 2022-08-12 16:28       ` David Matlack
  2022-08-12 16:40         ` Colton Lewis
  0 siblings, 1 reply; 14+ messages in thread
From: David Matlack @ 2022-08-12 16:28 UTC (permalink / raw)
  To: Colton Lewis
  Cc: kvm list, Paolo Bonzini, Marc Zyngier, Sean Christopherson,
	Oliver Upton, Ricardo Koller

On Fri, Aug 12, 2022 at 9:24 AM Colton Lewis <coltonlewis@google.com> wrote:
>
> On Wed, Aug 10, 2022 at 04:49:23PM -0700, David Matlack wrote:
> > On Wed, Aug 10, 2022 at 05:58:30PM +0000, Colton Lewis wrote:
> > > diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > index 3c7b93349fef..9838d1ad9166 100644
> > > --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > @@ -52,6 +52,9 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> > >     struct perf_test_vcpu_args *vcpu_args = &pta->vcpu_args[vcpu_idx];
> > >     uint64_t gva;
> > >     uint64_t pages;
> > > +   uint64_t addr;
> > > +   bool random_access = pta->random_access;
> > > +   bool populated = false;
> > >     int i;
> > >
> > >     gva = vcpu_args->gva;
> > > @@ -62,7 +65,11 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> > >
> > >     while (true) {
> > >             for (i = 0; i < pages; i++) {
> > > -                   uint64_t addr = gva + (i * pta->guest_page_size);
> > > +                   if (populated && random_access)
> >
> > Skipping the populate phase makes sense to ensure everything is
> > populated, I guess. What was your rationale?
>
> That's it. I wanted to ensure everything was populated. Random
> population won't hit every page, but those unpopulated pages might be
> hit on subsequent iterations. I originally let population be random
> too, and I suspect that was driving an odd behavior I noticed early in
> testing, where later iterations would be much faster than earlier ones.
>
> > Either way I think this policy should be driven by the test, rather than
> > hard-coded in perf_test_guest_code(). i.e. Move the call to
> > perf_test_set_random_access() in dirty_log_perf_test.c to just after the
> > population phase.
>
> That makes sense. Will do.

Ah, but if you get rid of the table refill between iterations, each
vCPU will access the same pages every iteration. At that point there's
no reason to distinguish the populate phase from the other phases, so
perhaps just drop the special case for the populate phase altogether?

* Re: [PATCH 3/3] KVM: selftests: Randomize page access order
  2022-08-12 16:28       ` David Matlack
@ 2022-08-12 16:40         ` Colton Lewis
  0 siblings, 0 replies; 14+ messages in thread
From: Colton Lewis @ 2022-08-12 16:40 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm list, Paolo Bonzini, Marc Zyngier, Sean Christopherson,
	Oliver Upton, Ricardo Koller

On Fri, Aug 12, 2022 at 09:28:05AM -0700, David Matlack wrote:
> On Fri, Aug 12, 2022 at 9:24 AM Colton Lewis <coltonlewis@google.com> wrote:
> >
> > On Wed, Aug 10, 2022 at 04:49:23PM -0700, David Matlack wrote:
> > > On Wed, Aug 10, 2022 at 05:58:30PM +0000, Colton Lewis wrote:
> > > > diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > > index 3c7b93349fef..9838d1ad9166 100644
> > > > --- a/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > > +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
> > > > @@ -52,6 +52,9 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> > > >     struct perf_test_vcpu_args *vcpu_args = &pta->vcpu_args[vcpu_idx];
> > > >     uint64_t gva;
> > > >     uint64_t pages;
> > > > +   uint64_t addr;
> > > > +   bool random_access = pta->random_access;
> > > > +   bool populated = false;
> > > >     int i;
> > > >
> > > >     gva = vcpu_args->gva;
> > > > @@ -62,7 +65,11 @@ void perf_test_guest_code(uint32_t vcpu_idx)
> > > >
> > > >     while (true) {
> > > >             for (i = 0; i < pages; i++) {
> > > > -                   uint64_t addr = gva + (i * pta->guest_page_size);
> > > > +                   if (populated && random_access)
> > >
> > > Skipping the populate phase makes sense to ensure everything is
> > > populated, I guess. What was your rationale?
> >
> > That's it. I wanted to ensure everything was populated. Random
> > population won't hit every page, but those unpopulated pages might be
> > hit on subsequent iterations. I originally let population be random
> > too, and I suspect that was driving an odd behavior I noticed early in
> > testing, where later iterations would be much faster than earlier ones.
> >
> > > Either way I think this policy should be driven by the test, rather than
> > > hard-coded in perf_test_guest_code(). i.e. Move the call to
> > > perf_test_set_random_access() in dirty_log_perf_test.c to just after the
> > > population phase.
> >
> > That makes sense. Will do.
> 
> Ah but if you get rid of the table refill between iterations, each
> vCPU will access the same pages every iteration. At that point there's
> no reason to distinguish the populate phase from the other phases, so
> perhaps just drop the special case for the populate phase altogether?

You're right. Will do.
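
For reference, the guest loop would then simplify to roughly this
(sketch of the perf_test_guest_code() body; random_table here is the
per-vCPU table from the earlier discussion, and the read-vs-write
choice is the -w logic from patch 2):

	while (true) {
		for (i = 0; i < pages; i++) {
			if (pta->random_access)
				page_idx = random_table[i] % pages;
			else
				page_idx = i;
			addr = gva + (page_idx * pta->guest_page_size);
			/* read or write *addr per the -w percentage */
		}
		GUEST_SYNC(1);
	}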
