* [PATCH 0/9] KVM: arm64: Use MMU read lock for clearing dirty logs
From: Vipin Sharma @ 2023-04-21 16:52 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

This patch series improves guest vCPU performance on Arm during
clear-dirty-log operations by taking the MMU lock for read instead of
write.

On Arm, vCPU write-protection faults are resolved while holding the
MMU read lock. However, when userspace clears dirty logs via the
KVM_CLEAR_DIRTY_LOG ioctl, the kernel takes the MMU write lock. This
blocks the handling of vCPU write-protection faults and degrades guest
performance, and the degradation gets worse as the VM grows in memory
size and vCPU count.

In this series, use of the MMU read lock is made possible by passing
the KVM_PGTABLE_WALK_SHARED flag to the page-table walker.
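
The contention pattern is easy to demonstrate in miniature with a
POSIX rwlock (an analogy only; KVM uses its own mmu_lock, and the
stage2 walker must use atomic PTE updates to run safely under the
read lock):

/* Analogy: many "vCPU fault handlers" can share a read lock, while a
 * "clear dirty log" writer excludes them all. Build: cc -pthread demo.c
 */
#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t mmu_lock = PTHREAD_RWLOCK_INITIALIZER;

static void *vcpu_fault(void *arg)
{
	/* Read lock: runs concurrently with other readers (patched). */
	pthread_rwlock_rdlock(&mmu_lock);
	printf("vCPU %ld handling write-protection fault\n", (long)arg);
	pthread_rwlock_unlock(&mmu_lock);
	return NULL;
}

static void clear_dirty_log(void)
{
	/* Write lock: excludes every fault handler (unpatched). */
	pthread_rwlock_wrlock(&mmu_lock);
	printf("userspace clearing dirty log\n");
	pthread_rwlock_unlock(&mmu_lock);
}

int main(void)
{
	pthread_t t[4];
	long i;

	for (i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, vcpu_fault, (void *)i);
	clear_dirty_log();
	for (i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return 0;
}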

Patches 1 to 5:
These patches modify dirty_log_perf_test. The intent is to mimic
production scenarios where the guest keeps executing while userspace
threads collect and clear dirty logs independently.

Three new command line options are added (an example invocation
follows the list):
1. j: Allows guest vCPUs and the main thread collecting dirty logs to
      run independently of each other after initialization is complete.
2. k: Allows clearing dirty logs in smaller chunks rather than the
      existing whole-memslot clear in one call.
3. l: Allows adding a configurable wait between consecutive clear
      dirty log calls to mimic sending dirty memory to the destination.
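
For example, an invocation combining the new options might look like
this (illustrative values only):

  ./dirty_log_perf_test -v 8 -j -k 256M -l 10

i.e. run 8 vCPUs independently of the dirty-log thread, clear dirty
logs in 256M chunks, and wait 10 ms between consecutive clear calls.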

Patches 7 to 9:
These patches move MMU lock operations to arch-specific code (patch
7), refactor Arm's page-table walker APIs to take walker flags (patch
8), and switch clearing of dirty logs from the MMU write lock to the
read lock (patch 9). Patch 9 has results showing improvements based
on dirty_log_perf_test.

Vipin Sharma (9):
  KVM: selftests: Allow dirty_log_perf_test to clear dirty memory in
    chunks
  KVM: selftests: Add optional delay between consecutive Clear-Dirty-Log
    calls
  KVM: selftests: Pass count of read and write accesses from guest to
    host
  KVM: selftests: Print read and write accesses of pages by vCPUs in
    dirty_log_perf_test
  KVM: selftests: Allow independent execution of vCPUs in
    dirty_log_perf_test
  KVM: arm64: Correct the kvm_pgtable_stage2_flush() documentation
  KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
  KVM: arm64: Allow stage2_apply_range_sched() to pass page table walker
    flags
  KVM: arm64: Run clear-dirty-log under MMU read lock

 arch/arm64/include/asm/kvm_pgtable.h          |  17 ++-
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         |   4 +-
 arch/arm64/kvm/hyp/pgtable.c                  |  16 ++-
 arch/arm64/kvm/mmu.c                          |  36 ++++--
 arch/mips/kvm/mmu.c                           |   2 +
 arch/riscv/kvm/mmu.c                          |   2 +
 arch/x86/kvm/mmu/mmu.c                        |   3 +
 .../selftests/kvm/dirty_log_perf_test.c       | 108 ++++++++++++++----
 .../testing/selftests/kvm/include/memstress.h |  13 ++-
 tools/testing/selftests/kvm/lib/memstress.c   |  43 +++++--
 virt/kvm/dirty_ring.c                         |   2 -
 virt/kvm/kvm_main.c                           |   4 -
 12 files changed, 185 insertions(+), 65 deletions(-)


base-commit: 95b9779c1758f03cf494e8550d6249a40089ed1c
-- 
2.40.0.634.g4ca3ef3211-goog


* [PATCH 1/9] KVM: selftests: Allow dirty_log_perf_test to clear dirty memory in chunks
From: Vipin Sharma @ 2023-04-21 16:52 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

In dirty_log_perf_test, provide option 'k' to specify the chunk size
and clear dirty memory in chunks in each iteration. If this option is
not provided, fall back to the old behavior of clearing the whole
memslot in one call per iteration.

In production environments the whole memslot is rarely cleared in a
single call; instead, the clearing operation is split across multiple
calls to reduce the time between clearing memory and sending it to a
remote host. This change mimics that production use case and allows
collecting metrics based on it.
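
The clamping of the final chunk matters when the chunk size does not
divide the slot size evenly. A standalone sketch of the loop added
below, with a stub standing in for kvm_vm_clear_dirty_log():

#include <stdint.h>
#include <stdio.h>

/* Stub standing in for kvm_vm_clear_dirty_log(). */
static void clear_stub(uint64_t from, uint64_t count)
{
	printf("clear pages [%lu, %lu)\n", (unsigned long)from,
	       (unsigned long)(from + count));
}

int main(void)
{
	uint64_t pages_per_slot = 10, pages_per_clear = 4;
	uint64_t from = 0, clear_pages_count = pages_per_clear;

	while (from < pages_per_slot) {
		/* Clamp the final chunk to the end of the slot. */
		if (from + clear_pages_count > pages_per_slot)
			clear_pages_count = pages_per_slot - from;
		clear_stub(from, clear_pages_count);
		from += clear_pages_count;
	}
	return 0;	/* prints [0, 4), [4, 8), [8, 10) */
}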

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 .../selftests/kvm/dirty_log_perf_test.c       | 19 ++++++++++++---
 .../testing/selftests/kvm/include/memstress.h | 12 ++++++++--
 tools/testing/selftests/kvm/lib/memstress.c   | 24 ++++++++++++++-----
 3 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 416719e20518..0852a7ba42e1 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -134,6 +134,7 @@ struct test_params {
 	uint32_t write_percent;
 	uint32_t random_seed;
 	bool random_access;
+	uint64_t clear_chunk_size;
 };
 
 static void run_test(enum vm_guest_mode mode, void *arg)
@@ -144,6 +145,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	uint64_t guest_num_pages;
 	uint64_t host_num_pages;
 	uint64_t pages_per_slot;
+	uint64_t pages_per_clear;
 	struct timespec start;
 	struct timespec ts_diff;
 	struct timespec get_dirty_log_total = (struct timespec){0};
@@ -164,6 +166,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);
 	host_num_pages = vm_num_host_pages(mode, guest_num_pages);
 	pages_per_slot = host_num_pages / p->slots;
+	pages_per_clear = p->clear_chunk_size / getpagesize();
 
 	bitmaps = memstress_alloc_bitmaps(p->slots, pages_per_slot);
 
@@ -244,8 +247,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 
 		if (dirty_log_manual_caps) {
 			clock_gettime(CLOCK_MONOTONIC, &start);
-			memstress_clear_dirty_log(vm, bitmaps, p->slots,
-						  pages_per_slot);
+			memstress_clear_dirty_log_in_chunks(vm, bitmaps, p->slots,
+							    pages_per_slot,
+							    pages_per_clear);
 			ts_diff = timespec_elapsed(start);
 			clear_dirty_log_total = timespec_add(clear_dirty_log_total,
 							     ts_diff);
@@ -343,6 +347,11 @@ static void help(char *name)
 	       "     To leave the application task unpinned, drop the final entry:\n\n"
 	       "         ./dirty_log_perf_test -v 3 -c 22,23,24\n\n"
 	       "     (default: no pinning)\n");
+	printf(" -k: Specify the chunk size in which dirty memory gets cleared\n"
+	       "     in memslots in each iteration. If the size is bigger than\n"
+	       "     the memslot size then whole memslot is cleared in one call.\n"
+	       "     Size must be aligned to the host page size. e.g. 10M or 3G\n"
+	       "     (default: UINT64_MAX, clears whole memslot in one call)\n");
 	puts("");
 	exit(0);
 }
@@ -358,6 +367,7 @@ int main(int argc, char *argv[])
 		.slots = 1,
 		.random_seed = 1,
 		.write_percent = 100,
+		.clear_chunk_size = UINT64_MAX,
 	};
 	int opt;
 
@@ -368,7 +378,7 @@ int main(int argc, char *argv[])
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "ab:c:eghi:m:nop:r:s:v:x:w:")) != -1) {
+	while ((opt = getopt(argc, argv, "ab:c:eghi:k:m:nop:r:s:v:x:w:")) != -1) {
 		switch (opt) {
 		case 'a':
 			p.random_access = true;
@@ -392,6 +402,9 @@ int main(int argc, char *argv[])
 		case 'i':
 			p.iterations = atoi_positive("Number of iterations", optarg);
 			break;
+		case 'k':
+			p.clear_chunk_size = parse_size(optarg);
+			break;
 		case 'm':
 			guest_modes_cmdline(optarg);
 			break;
diff --git a/tools/testing/selftests/kvm/include/memstress.h b/tools/testing/selftests/kvm/include/memstress.h
index ce4e603050ea..2acc93f76fc3 100644
--- a/tools/testing/selftests/kvm/include/memstress.h
+++ b/tools/testing/selftests/kvm/include/memstress.h
@@ -75,8 +75,16 @@ void memstress_setup_nested(struct kvm_vm *vm, int nr_vcpus, struct kvm_vcpu *vc
 void memstress_enable_dirty_logging(struct kvm_vm *vm, int slots);
 void memstress_disable_dirty_logging(struct kvm_vm *vm, int slots);
 void memstress_get_dirty_log(struct kvm_vm *vm, unsigned long *bitmaps[], int slots);
-void memstress_clear_dirty_log(struct kvm_vm *vm, unsigned long *bitmaps[],
-			       int slots, uint64_t pages_per_slot);
+void memstress_clear_dirty_log_in_chunks(struct kvm_vm *vm,
+					 unsigned long *bitmaps[], int slots,
+					 uint64_t pages_per_slot,
+					 uint64_t pages_per_clear);
+static inline void memstress_clear_dirty_log(struct kvm_vm *vm,
+					     unsigned long *bitmaps[], int slots,
+					     uint64_t pages_per_slot) {
+	memstress_clear_dirty_log_in_chunks(vm, bitmaps, slots, pages_per_slot,
+					    pages_per_slot);
+}
 unsigned long **memstress_alloc_bitmaps(int slots, uint64_t pages_per_slot);
 void memstress_free_bitmaps(unsigned long *bitmaps[], int slots);
 
diff --git a/tools/testing/selftests/kvm/lib/memstress.c b/tools/testing/selftests/kvm/lib/memstress.c
index 3632956c6bcf..e0c701ab4e9a 100644
--- a/tools/testing/selftests/kvm/lib/memstress.c
+++ b/tools/testing/selftests/kvm/lib/memstress.c
@@ -355,16 +355,28 @@ void memstress_get_dirty_log(struct kvm_vm *vm, unsigned long *bitmaps[], int sl
 	}
 }
 
-void memstress_clear_dirty_log(struct kvm_vm *vm, unsigned long *bitmaps[],
-			       int slots, uint64_t pages_per_slot)
+void memstress_clear_dirty_log_in_chunks(struct kvm_vm *vm,
+					 unsigned long *bitmaps[], int slots,
+					 uint64_t pages_per_slot,
+					 uint64_t pages_per_clear)
 {
-	int i;
+	int i, slot;
+	uint64_t from, clear_pages_count;
 
 	for (i = 0; i < slots; i++) {
-		int slot = MEMSTRESS_MEM_SLOT_INDEX + i;
-
-		kvm_vm_clear_dirty_log(vm, slot, bitmaps[i], 0, pages_per_slot);
+		slot = MEMSTRESS_MEM_SLOT_INDEX + i;
+		from = 0;
+		clear_pages_count = pages_per_clear;
+
+		while (from < pages_per_slot) {
+			if (from + clear_pages_count > pages_per_slot)
+				clear_pages_count = pages_per_slot - from;
+			kvm_vm_clear_dirty_log(vm, slot, bitmaps[i], from,
+					       clear_pages_count);
+			from += clear_pages_count;
+		}
 	}
+
 }
 
 unsigned long **memstress_alloc_bitmaps(int slots, uint64_t pages_per_slot)
-- 
2.40.0.634.g4ca3ef3211-goog


* [PATCH 2/9] KVM: selftests: Add optional delay between consecutive Clear-Dirty-Log calls
From: Vipin Sharma @ 2023-04-21 16:52 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

In dirty_log_perf_test, add option "-l" to wait between consecutive
Clear-Dirty-Log calls. Accept delay in milliseconds.

This allows dirty_log_perf_test to mimic real-world use, where after
clearing dirty memory some time is spent transferring it before
making a subsequent Clear-Dirty-Log call.
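
The millisecond-to-timespec split the patch uses can be shown
standalone (same arithmetic and the same nanosleep() call as in the
diff below):

#include <stdio.h>
#include <time.h>

int main(void)
{
	int wait_ms = 1500;
	struct timespec wait = {
		.tv_sec = wait_ms / 1000,
		.tv_nsec = (wait_ms % 1000) * 1000000ull,
	};

	printf("sleeping %ld s + %ld ns\n", (long)wait.tv_sec, wait.tv_nsec);
	nanosleep(&wait, NULL);
	return 0;
}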

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 .../testing/selftests/kvm/dirty_log_perf_test.c | 17 +++++++++++++++--
 tools/testing/selftests/kvm/include/memstress.h |  5 +++--
 tools/testing/selftests/kvm/lib/memstress.c     | 10 +++++++++-
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 0852a7ba42e1..338f03a4a550 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -135,6 +135,7 @@ struct test_params {
 	uint32_t random_seed;
 	bool random_access;
 	uint64_t clear_chunk_size;
+	int clear_chunk_wait_time_ms;
 };
 
 static void run_test(enum vm_guest_mode mode, void *arg)
@@ -249,7 +250,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			clock_gettime(CLOCK_MONOTONIC, &start);
 			memstress_clear_dirty_log_in_chunks(vm, bitmaps, p->slots,
 							    pages_per_slot,
-							    pages_per_clear);
+							    pages_per_clear,
+							    p->clear_chunk_wait_time_ms);
 			ts_diff = timespec_elapsed(start);
 			clear_dirty_log_total = timespec_add(clear_dirty_log_total,
 							     ts_diff);
@@ -352,6 +354,11 @@ static void help(char *name)
 	       "     the memslot size then whole memslot is cleared in one call.\n"
 	       "     Size must be aligned to the host page size. e.g. 10M or 3G\n"
 	       "     (default: UINT64_MAX, clears whole memslot in one call)\n");
+	printf(" -l: Specify time in milliseconds to wait after Clear-Dirty-Log\n"
+	       "     call. This allows to mimic use cases where flow is to get\n"
+	       "     dirty log followed by multiple clear dirty log calls and\n"
+	       "     sending corresponding memory to destination (in this test\n"
+	       "     sending will be just idle waiting)\n");
 	puts("");
 	exit(0);
 }
@@ -368,6 +375,7 @@ int main(int argc, char *argv[])
 		.random_seed = 1,
 		.write_percent = 100,
 		.clear_chunk_size = UINT64_MAX,
+		.clear_chunk_wait_time_ms = 0,
 	};
 	int opt;
 
@@ -378,7 +386,7 @@ int main(int argc, char *argv[])
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "ab:c:eghi:k:m:nop:r:s:v:x:w:")) != -1) {
+	while ((opt = getopt(argc, argv, "ab:c:eghi:k:l:m:nop:r:s:v:x:w:")) != -1) {
 		switch (opt) {
 		case 'a':
 			p.random_access = true;
@@ -405,6 +413,11 @@ int main(int argc, char *argv[])
 		case 'k':
 			p.clear_chunk_size = parse_size(optarg);
 			break;
+		case 'l':
+			p.clear_chunk_wait_time_ms =
+					atoi_non_negative("Clear dirty log chunks wait time",
+							  optarg);
+			break;
 		case 'm':
 			guest_modes_cmdline(optarg);
 			break;
diff --git a/tools/testing/selftests/kvm/include/memstress.h b/tools/testing/selftests/kvm/include/memstress.h
index 2acc93f76fc3..01fdcea80360 100644
--- a/tools/testing/selftests/kvm/include/memstress.h
+++ b/tools/testing/selftests/kvm/include/memstress.h
@@ -78,12 +78,13 @@ void memstress_get_dirty_log(struct kvm_vm *vm, unsigned long *bitmaps[], int sl
 void memstress_clear_dirty_log_in_chunks(struct kvm_vm *vm,
 					 unsigned long *bitmaps[], int slots,
 					 uint64_t pages_per_slot,
-					 uint64_t pages_per_clear);
+					 uint64_t pages_per_clear,
+					 int wait_ms);
 static inline void memstress_clear_dirty_log(struct kvm_vm *vm,
 					     unsigned long *bitmaps[], int slots,
 					     uint64_t pages_per_slot) {
 	memstress_clear_dirty_log_in_chunks(vm, bitmaps, slots, pages_per_slot,
-					    pages_per_slot);
+					    pages_per_slot, 0);
 }
 unsigned long **memstress_alloc_bitmaps(int slots, uint64_t pages_per_slot);
 void memstress_free_bitmaps(unsigned long *bitmaps[], int slots);
diff --git a/tools/testing/selftests/kvm/lib/memstress.c b/tools/testing/selftests/kvm/lib/memstress.c
index e0c701ab4e9a..483ecbc53a5b 100644
--- a/tools/testing/selftests/kvm/lib/memstress.c
+++ b/tools/testing/selftests/kvm/lib/memstress.c
@@ -358,10 +358,15 @@ void memstress_get_dirty_log(struct kvm_vm *vm, unsigned long *bitmaps[], int sl
 void memstress_clear_dirty_log_in_chunks(struct kvm_vm *vm,
 					 unsigned long *bitmaps[], int slots,
 					 uint64_t pages_per_slot,
-					 uint64_t pages_per_clear)
+					 uint64_t pages_per_clear,
+					 int wait_ms)
 {
 	int i, slot;
 	uint64_t from, clear_pages_count;
+	struct timespec wait = {
+		.tv_sec = wait_ms / 1000,
+		.tv_nsec = (wait_ms % 1000) * 1000000ull,
+	};
 
 	for (i = 0; i < slots; i++) {
 		slot = MEMSTRESS_MEM_SLOT_INDEX + i;
@@ -374,6 +379,9 @@ void memstress_clear_dirty_log_in_chunks(struct kvm_vm *vm,
 			kvm_vm_clear_dirty_log(vm, slot, bitmaps[i], from,
 					       clear_pages_count);
 			from += clear_pages_count;
+			if (wait_ms)
+				nanosleep(&wait, NULL);
+
 		}
 	}
 
-- 
2.40.0.634.g4ca3ef3211-goog


* [PATCH 3/9] KVM: selftests: Pass count of read and write accesses from guest to host
From: Vipin Sharma @ 2023-04-21 16:52 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Pass the number of read and write accesses done in memstress' guest
code to userspace.

These counts provide one way to measure vCPU performance during
memstress and dirty-logging related tests. For example, in
dirty_log_perf_test they can be used to measure the impact of the
dirty and clear log APIs on vCPU performance.

In the current dirty_log_perf_test, each vCPU executes in lockstep
with the current iteration in userspace; therefore, these access
counts will not provide much useful information beyond observing each
vCPU's read vs. write accesses.

However, in future commits, dirty_log_perf_test behavior will be
changed to allow vCPUs to execute independently of userspace
iterations. This will mimic real-world workloads where the guest
keeps executing while the VMM collects and clears dirty logs
separately. With read and write accesses known for each vCPU, the
impact of the get and clear dirty log APIs can be quantified. Note
that these access counts are not a fully reliable measure of vCPU
performance, since vCPU scheduling can affect progress.
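
A standalone sketch of the accounting this patch adds to
memstress_guest_code(); rand() stands in for guest_random_u32(), and
the final printf stands in for GUEST_SYNC_ARGS():

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	static uint64_t buf[1024];
	uint64_t read_access = 0, write_access = 0;
	int write_percent = 30;	/* mirrors args->write_percent */
	int i;

	srand(1);
	for (i = 0; i < 1024; i++) {
		if (rand() % 100 < write_percent) {
			buf[i] = 0x0123456789ABCDEF;	/* write access */
			write_access++;
		} else {
			(void)buf[i];			/* read access */
			read_access++;
		}
	}
	printf("reads=%llu writes=%llu\n",
	       (unsigned long long)read_access,
	       (unsigned long long)write_access);
	return 0;
}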

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 tools/testing/selftests/kvm/lib/memstress.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/memstress.c b/tools/testing/selftests/kvm/lib/memstress.c
index 483ecbc53a5b..9c2e360e610f 100644
--- a/tools/testing/selftests/kvm/lib/memstress.c
+++ b/tools/testing/selftests/kvm/lib/memstress.c
@@ -50,6 +50,8 @@ void memstress_guest_code(uint32_t vcpu_idx)
 	struct memstress_args *args = &memstress_args;
 	struct memstress_vcpu_args *vcpu_args = &args->vcpu_args[vcpu_idx];
 	struct guest_random_state rand_state;
+	uint64_t write_access;
+	uint64_t read_access;
 	uint64_t gva;
 	uint64_t pages;
 	uint64_t addr;
@@ -65,6 +67,8 @@ void memstress_guest_code(uint32_t vcpu_idx)
 	GUEST_ASSERT(vcpu_args->vcpu_idx == vcpu_idx);
 
 	while (true) {
+		write_access = 0;
+		read_access = 0;
 		for (i = 0; i < pages; i++) {
 			if (args->random_access)
 				page = guest_random_u32(&rand_state) % pages;
@@ -73,13 +77,16 @@ void memstress_guest_code(uint32_t vcpu_idx)
 
 			addr = gva + (page * args->guest_page_size);
 
-			if (guest_random_u32(&rand_state) % 100 < args->write_percent)
+			if (guest_random_u32(&rand_state) % 100 < args->write_percent) {
 				*(uint64_t *)addr = 0x0123456789ABCDEF;
-			else
+				write_access++;
+			} else {
 				READ_ONCE(*(uint64_t *)addr);
+				read_access++;
+			}
 		}
 
-		GUEST_SYNC(1);
+		GUEST_SYNC_ARGS(1, read_access, write_access, 0, 0);
 	}
 }
 
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 4/9] KVM: selftests: Print read and write accesses of pages by vCPUs in dirty_log_perf_test
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Fetch the read and write access counts of pages from guest code and
print the totals across all vCPUs in dirty_log_perf_test.

This data shows the progress made by vCPUs during dirty logging
operations. Since vCPUs execute in lockstep with userspace dirty log
iterations, this metric is not very interesting yet. However, once
future commits allow dirty_log_perf_test to execute vCPUs independently
of dirty log iterations, this metric will give a good measure of vCPU
performance during dirty logging.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 .../selftests/kvm/dirty_log_perf_test.c        | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 338f03a4a550..0a08a3d21123 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -12,6 +12,7 @@
 #include <stdlib.h>
 #include <time.h>
 #include <pthread.h>
+#include <stdatomic.h>
 #include <linux/bitmap.h>
 
 #include "kvm_util.h"
@@ -66,17 +67,22 @@ static u64 dirty_log_manual_caps;
 static bool host_quit;
 static int iteration;
 static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
+static atomic_ullong total_reads;
+static atomic_ullong total_writes;
 
 static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 {
 	struct kvm_vcpu *vcpu = vcpu_args->vcpu;
 	int vcpu_idx = vcpu_args->vcpu_idx;
 	uint64_t pages_count = 0;
+	uint64_t reads = 0;
+	uint64_t writes = 0;
 	struct kvm_run *run;
 	struct timespec start;
 	struct timespec ts_diff;
 	struct timespec total = (struct timespec){0};
 	struct timespec avg;
+	struct ucall uc = {};
 	int ret;
 
 	run = vcpu->run;
@@ -89,7 +95,7 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 		ts_diff = timespec_elapsed(start);
 
 		TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret);
-		TEST_ASSERT(get_ucall(vcpu, NULL) == UCALL_SYNC,
+		TEST_ASSERT(get_ucall(vcpu, &uc) == UCALL_SYNC,
 			    "Invalid guest sync status: exit_reason=%s\n",
 			    exit_reason_str(run->exit_reason));
 
@@ -101,6 +107,8 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 		if (current_iteration) {
 			pages_count += vcpu_args->pages;
 			total = timespec_add(total, ts_diff);
+			reads += uc.args[2];
+			writes += uc.args[3];
 			pr_debug("vCPU %d iteration %d dirty memory time: %ld.%.9lds\n",
 				vcpu_idx, current_iteration, ts_diff.tv_sec,
 				ts_diff.tv_nsec);
@@ -123,6 +131,8 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 	pr_debug("\nvCPU %d dirtied 0x%lx pages over %d iterations in %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
 		vcpu_idx, pages_count, vcpu_last_completed_iteration[vcpu_idx],
 		total.tv_sec, total.tv_nsec, avg.tv_sec, avg.tv_nsec);
+	atomic_fetch_add(&total_reads, reads);
+	atomic_fetch_add(&total_writes, writes);
 }
 
 struct test_params {
@@ -176,6 +186,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			      dirty_log_manual_caps);
 
 	arch_setup_vm(vm, nr_vcpus);
+	atomic_store(&total_reads, 0);
+	atomic_store(&total_writes, 0);
 
 	/* Start the iterations */
 	iteration = 0;
@@ -295,6 +307,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			clear_dirty_log_total.tv_nsec, avg.tv_sec, avg.tv_nsec);
 	}
 
+	pr_info("Total pages touched: %llu (Reads: %llu, Writes: %llu)\n",
+		atomic_load(&total_reads) + atomic_load(&total_writes),
+		atomic_load(&total_reads), atomic_load(&total_writes));
+
 	memstress_free_bitmaps(bitmaps, p->slots);
 	arch_cleanup_vm(vm);
 	memstress_destroy_vm(vm);
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread
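
The per-vCPU read/write tallies above are folded into the process-wide
totals exactly once per worker, keeping the hot loop free of shared-memory
traffic. Below is a minimal standalone sketch of that pattern using plain
pthreads and C11 atomics; it is not KVM selftest code, and the thread and
loop counts are arbitrary:

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define NR_THREADS 4

static atomic_ullong total_reads;
static atomic_ullong total_writes;

static void *worker(void *arg)
{
	/* Tally locally; touch the shared atomics only once, at exit. */
	uint64_t reads = 0, writes = 0;
	int i;

	(void)arg;
	for (i = 0; i < 1000; i++) {
		if (i % 100 < 50)
			writes++;	/* stand-in for a guest write */
		else
			reads++;	/* stand-in for a guest read */
	}

	atomic_fetch_add(&total_reads, reads);
	atomic_fetch_add(&total_writes, writes);
	return NULL;
}

int main(void)
{
	pthread_t threads[NR_THREADS];
	int i;

	for (i = 0; i < NR_THREADS; i++)
		pthread_create(&threads[i], NULL, worker, NULL);
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(threads[i], NULL);

	printf("Total: %llu (Reads: %llu, Writes: %llu)\n",
	       atomic_load(&total_reads) + atomic_load(&total_writes),
	       atomic_load(&total_reads), atomic_load(&total_writes));
	return 0;
}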

* [PATCH 5/9]  KVM: selftests: Allow independent execution of vCPUs in dirty_log_perf_test
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Allow vCPUs to execute independently of dirty log iterations after
initialization is complete. Hide this feature behind the new option
"-j".

This change makes dirty_log_perf_test execute like real-world workloads
where guest vCPUs keep executing while the VMM collects dirty logs. The
total pages touched during the test give a good estimate of how vCPUs
perform while dirty logging is enabled.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 .../selftests/kvm/dirty_log_perf_test.c       | 60 ++++++++++++-------
 1 file changed, 40 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 0a08a3d21123..ffdad535fdaa 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -69,6 +69,7 @@ static int iteration;
 static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
 static atomic_ullong total_reads;
 static atomic_ullong total_writes;
+static bool lockstep_iterations;
 
 static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 {
@@ -83,12 +84,16 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 	struct timespec total = (struct timespec){0};
 	struct timespec avg;
 	struct ucall uc = {};
+	int current_iteration = -1;
 	int ret;
 
 	run = vcpu->run;
 
 	while (!READ_ONCE(host_quit)) {
-		int current_iteration = READ_ONCE(iteration);
+		if (lockstep_iterations)
+			current_iteration = READ_ONCE(iteration);
+		else
+			current_iteration++;
 
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		ret = _vcpu_run(vcpu);
@@ -118,13 +123,19 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 				ts_diff.tv_nsec);
 		}
 
-		/*
-		 * Keep running the guest while dirty logging is being disabled
-		 * (iteration is negative) so that vCPUs are accessing memory
-		 * for the entire duration of zapping collapsible SPTEs.
-		 */
-		while (current_iteration == READ_ONCE(iteration) &&
-		       READ_ONCE(iteration) >= 0 && !READ_ONCE(host_quit)) {}
+		if (lockstep_iterations) {
+			/*
+			 * Keep running the guest while dirty logging is being disabled
+			 * (iteration is negative) so that vCPUs are accessing memory
+			 * for the entire duration of zapping collapsible SPTEs.
+			 */
+			while (current_iteration == READ_ONCE(iteration) &&
+			       READ_ONCE(iteration) >= 0 && !READ_ONCE(host_quit))
+				;
+		} else {
+			while (!READ_ONCE(iteration))
+				;
+		}
 	}
 
 	avg = timespec_div(total, vcpu_last_completed_iteration[vcpu_idx]);
@@ -238,17 +249,19 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		iteration++;
 
-		pr_debug("Starting iteration %d\n", iteration);
-		for (i = 0; i < nr_vcpus; i++) {
-			while (READ_ONCE(vcpu_last_completed_iteration[i])
-			       != iteration)
-				;
-		}
+		if (lockstep_iterations) {
+			pr_debug("Starting iteration %d\n", iteration);
+			for (i = 0; i < nr_vcpus; i++) {
+				while (READ_ONCE(vcpu_last_completed_iteration[i])
+				       != iteration)
+					;
+			}
 
-		ts_diff = timespec_elapsed(start);
-		vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
-		pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
-			iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
+			ts_diff = timespec_elapsed(start);
+			vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
+			pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
+				iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
+		}
 
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		memstress_get_dirty_log(vm, bitmaps, p->slots);
@@ -365,6 +378,10 @@ static void help(char *name)
 	       "     To leave the application task unpinned, drop the final entry:\n\n"
 	       "         ./dirty_log_perf_test -v 3 -c 22,23,24\n\n"
 	       "     (default: no pinning)\n");
+	printf(" -j: Execute vCPUs independent of dirty log iterations\n"
+	       "     Independent vCPUs execution will allow them to continuously\n"
+	       "     dirty memory while main thread is collecting and clearing\n"
+	       "     dirty logs in the main thread's iterations.\n");
 	printf(" -k: Specify the chunk size in which dirty memory gets cleared\n"
 	       "     in memslots in each iteration. If the size is bigger than\n"
 	       "     the memslot size then whole memslot is cleared in one call.\n"
@@ -399,10 +416,10 @@ int main(int argc, char *argv[])
 		kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
 	dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
 				  KVM_DIRTY_LOG_INITIALLY_SET);
-
+	lockstep_iterations = true;
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "ab:c:eghi:k:l:m:nop:r:s:v:x:w:")) != -1) {
+	while ((opt = getopt(argc, argv, "ab:c:eghi:jk:l:m:nop:r:s:v:x:w:")) != -1) {
 		switch (opt) {
 		case 'a':
 			p.random_access = true;
@@ -426,6 +443,9 @@ int main(int argc, char *argv[])
 		case 'i':
 			p.iterations = atoi_positive("Number of iterations", optarg);
 			break;
+		case 'j':
+			lockstep_iterations = false;
+			break;
 		case 'k':
 			p.clear_chunk_size = parse_size(optarg);
 			break;
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread
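
The two modes differ only in what the worker polls between passes: the
lockstep gate spins until the main thread advances the shared iteration
counter, while the free-running mode goes straight into the next pass. A
stripped-down sketch of that hand-off, using C11 atomics in place of the
test's READ_ONCE/WRITE_ONCE helpers (the variable names mirror the test,
everything else is simplified and not the actual selftest code):

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int iteration;
static atomic_bool host_quit;
static atomic_int last_completed;
static bool lockstep_iterations = true;	/* false ~= running with -j */

static void *vcpu_loop(void *arg)
{
	int current_iteration = -1;

	(void)arg;
	while (!atomic_load(&host_quit)) {
		if (lockstep_iterations)
			current_iteration = atomic_load(&iteration);
		else
			current_iteration++;

		/* One pass of dirtying guest memory would go here. */
		atomic_store(&last_completed, current_iteration);

		if (lockstep_iterations) {
			/* Gate: wait for the main thread to advance. */
			while (current_iteration == atomic_load(&iteration) &&
			       !atomic_load(&host_quit))
				;
		}
		/* Free-running mode loops straight back around. */
	}
	return NULL;
}

int main(void)
{
	pthread_t vcpu;
	int i;

	pthread_create(&vcpu, NULL, vcpu_loop, NULL);
	for (i = 1; i <= 3; i++) {
		atomic_store(&iteration, i);
		/* Get/clear dirty logs here, then wait for the vCPU. */
		while (atomic_load(&last_completed) != i)
			;
		printf("iteration %d done\n", i);
	}
	atomic_store(&host_quit, true);
	pthread_join(vcpu, NULL);
	return 0;
}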

* [PATCH 6/9] KVM: arm64: Correct the kvm_pgtable_stage2_flush() documentation
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Remove the _range suffix from kvm_pgtable_stage2_flush_range, which is
used in the kernel-doc comment for kvm_pgtable_stage2_flush(). There is
no function named kvm_pgtable_stage2_flush_range().

Fixes: 93c66b40d728 ("KVM: arm64: Add support for stage-2 cache flushing in generic page-table")
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 4cd6762bda80..4cd62506c198 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -605,9 +605,8 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr,
 bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr);
 
 /**
- * kvm_pgtable_stage2_flush_range() - Clean and invalidate data cache to Point
- * 				      of Coherency for guest stage-2 address
- *				      range.
+ * kvm_pgtable_stage2_flush() - Clean and invalidate data cache to Point of
+ *				Coherency for guest stage-2 address range.
  * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:	Intermediate physical address from which to flush.
  * @size:	Size of the range.
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread
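
For background, kernel-doc comments are keyed on the name in the opening
"/**" line, so a stale name like the one fixed above leaves the comment
describing a function that does not exist. The expected shape is as
follows (a generic example, not taken from this header):

/**
 * my_function() - Short one-line description.
 * @arg1: Description of the first argument.
 * @arg2: Description of the second argument.
 *
 * Optional longer description of the behavior.
 *
 * Return: What the function returns.
 */
int my_function(int arg1, int arg2);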

* [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Move the mmu_lock lock and unlock calls from common code in
kvm_clear_dirty_log_protect() to arch specific code in
kvm_arch_mmu_enable_log_dirty_pt_masked(). None of the other code inside
the for loop of kvm_clear_dirty_log_protect() needs mmu_lock exclusivity
apart from the arch specific call to
kvm_arch_mmu_enable_log_dirty_pt_masked().

Future commits will change the clear dirty log operation to run under
the mmu read lock instead of the write lock for ARM and, potentially,
x86.

No functional changes intended.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/kvm/mmu.c   | 2 ++
 arch/mips/kvm/mmu.c    | 2 ++
 arch/riscv/kvm/mmu.c   | 2 ++
 arch/x86/kvm/mmu/mmu.c | 3 +++
 virt/kvm/dirty_ring.c  | 2 --
 virt/kvm/kvm_main.c    | 4 ----
 6 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..dc1c9059604e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1002,7 +1002,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
+	write_lock(&kvm->mmu_lock);
 	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
+	write_unlock(&kvm->mmu_lock);
 }
 
 static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c
index e8c08988ed37..b8d4723d197e 100644
--- a/arch/mips/kvm/mmu.c
+++ b/arch/mips/kvm/mmu.c
@@ -415,11 +415,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
 	gfn_t base_gfn = slot->base_gfn + gfn_offset;
 	gfn_t start = base_gfn +  __ffs(mask);
 	gfn_t end = base_gfn + __fls(mask);
 
+	spin_lock(&kvm->mmu_lock);
 	kvm_mips_mkclean_gpa_pt(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 /*
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 78211aed36fa..425fa11dcf9c 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -395,11 +395,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 					     gfn_t gfn_offset,
 					     unsigned long mask)
 {
 	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
 	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 
+	spin_lock(&kvm->mmu_lock);
 	gstage_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 144c5a01cd77..f1dc549b01cb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1367,6 +1367,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 				struct kvm_memory_slot *slot,
 				gfn_t gfn_offset, unsigned long mask)
 {
+	write_lock(&kvm->mmu_lock);
 	/*
 	 * Huge pages are NOT write protected when we start dirty logging in
 	 * initially-all-set mode; must write protect them here so that they
@@ -1397,6 +1398,8 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		kvm_mmu_clear_dirty_pt_masked(kvm, slot, gfn_offset, mask);
 	else
 		kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
+
+	write_unlock(&kvm->mmu_lock);
 }
 
 int kvm_cpu_dirty_log_size(void)
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index c1cd7dfe4a90..d894c58d2152 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -66,9 +66,7 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask)
 	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
 		return;
 
-	KVM_MMU_LOCK(kvm);
 	kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask);
-	KVM_MMU_UNLOCK(kvm);
 }
 
 int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f40b72eb0e7b..378c40e958b6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2157,7 +2157,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 		dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot);
 		memset(dirty_bitmap_buffer, 0, n);
 
-		KVM_MMU_LOCK(kvm);
 		for (i = 0; i < n / sizeof(long); i++) {
 			unsigned long mask;
 			gfn_t offset;
@@ -2173,7 +2172,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 			kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
 								offset, mask);
 		}
-		KVM_MMU_UNLOCK(kvm);
 	}
 
 	if (flush)
@@ -2268,7 +2266,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
 		return -EFAULT;
 
-	KVM_MMU_LOCK(kvm);
 	for (offset = log->first_page, i = offset / BITS_PER_LONG,
 		 n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
 	     i++, offset += BITS_PER_LONG) {
@@ -2291,7 +2288,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 								offset, mask);
 		}
 	}
-	KVM_MMU_UNLOCK(kvm);
 
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread
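
The shape of this refactor — a generic loop that takes no lock itself and
delegates locking to a per-arch hook, so each architecture can later pick
its own lock flavor — can be sketched outside the kernel with a pthreads
rwlock. All names below are illustrative, not KVM's:

#include <pthread.h>
#include <stdio.h>

struct vm {
	pthread_rwlock_t mmu_lock;
};

static struct vm the_vm = { .mmu_lock = PTHREAD_RWLOCK_INITIALIZER };

/* Arch hook: each architecture takes whichever lock flavor it needs. */
static void arch_clear_dirty_word(struct vm *vm, unsigned long word)
{
	pthread_rwlock_wrlock(&vm->mmu_lock);	/* arm64 could use rdlock */
	/* ... write-protect / clear dirty state for the set bits ... */
	pthread_rwlock_unlock(&vm->mmu_lock);
}

/* Common code: iterates the dirty bitmap and takes no lock itself. */
static void clear_dirty_log(struct vm *vm, unsigned long *bitmap, int words)
{
	int i;

	for (i = 0; i < words; i++)
		if (bitmap[i])
			arch_clear_dirty_word(vm, bitmap[i]);
}

int main(void)
{
	unsigned long bitmap[] = { 0xf0UL, 0, 0x3UL };

	clear_dirty_log(&the_vm, bitmap, 3);
	printf("bitmap processed\n");
	return 0;
}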

* [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Move mmu_lock lock and unlock calls from common code in
kvm_clear_dirty_log_protect() to arch specific code in
kvm_arch_mmu_enable_log_dirty_pt_masked(). None of the other code inside
the for loop of kvm_arch_mmu_enable_log_dirty_pt_masked() needs mmu_lock
exclusivity apart from the arch specific API call.

Future commits will change clear dirty log operations under mmu read
lock instead of write lock for ARM and, potentially, x86 architectures.

No functional changes intended.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/kvm/mmu.c   | 2 ++
 arch/mips/kvm/mmu.c    | 2 ++
 arch/riscv/kvm/mmu.c   | 2 ++
 arch/x86/kvm/mmu/mmu.c | 3 +++
 virt/kvm/dirty_ring.c  | 2 --
 virt/kvm/kvm_main.c    | 4 ----
 6 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..dc1c9059604e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1002,7 +1002,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
+	write_lock(&kvm->mmu_lock);
 	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
+	write_unlock(&kvm->mmu_lock);
 }
 
 static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c
index e8c08988ed37..b8d4723d197e 100644
--- a/arch/mips/kvm/mmu.c
+++ b/arch/mips/kvm/mmu.c
@@ -415,11 +415,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
+	spin_lock(&kvm->mmu_lock);
 	gfn_t base_gfn = slot->base_gfn + gfn_offset;
 	gfn_t start = base_gfn +  __ffs(mask);
 	gfn_t end = base_gfn + __fls(mask);
 
 	kvm_mips_mkclean_gpa_pt(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 /*
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 78211aed36fa..425fa11dcf9c 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -395,11 +395,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 					     gfn_t gfn_offset,
 					     unsigned long mask)
 {
+	spin_lock(&kvm->mmu_lock);
 	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
 	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 
 	gstage_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 144c5a01cd77..f1dc549b01cb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1367,6 +1367,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 				struct kvm_memory_slot *slot,
 				gfn_t gfn_offset, unsigned long mask)
 {
+	write_lock(&kvm->mmu_lock);
 	/*
 	 * Huge pages are NOT write protected when we start dirty logging in
 	 * initially-all-set mode; must write protect them here so that they
@@ -1397,6 +1398,8 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		kvm_mmu_clear_dirty_pt_masked(kvm, slot, gfn_offset, mask);
 	else
 		kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
+
+	write_unlock(&kvm->mmu_lock);
 }
 
 int kvm_cpu_dirty_log_size(void)
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index c1cd7dfe4a90..d894c58d2152 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -66,9 +66,7 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask)
 	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
 		return;
 
-	KVM_MMU_LOCK(kvm);
 	kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask);
-	KVM_MMU_UNLOCK(kvm);
 }
 
 int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f40b72eb0e7b..378c40e958b6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2157,7 +2157,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 		dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot);
 		memset(dirty_bitmap_buffer, 0, n);
 
-		KVM_MMU_LOCK(kvm);
 		for (i = 0; i < n / sizeof(long); i++) {
 			unsigned long mask;
 			gfn_t offset;
@@ -2173,7 +2172,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 			kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
 								offset, mask);
 		}
-		KVM_MMU_UNLOCK(kvm);
 	}
 
 	if (flush)
@@ -2268,7 +2266,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
 		return -EFAULT;
 
-	KVM_MMU_LOCK(kvm);
 	for (offset = log->first_page, i = offset / BITS_PER_LONG,
 		 n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
 	     i++, offset += BITS_PER_LONG) {
@@ -2291,7 +2288,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 								offset, mask);
 		}
 	}
-	KVM_MMU_UNLOCK(kvm);
 
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
-- 
2.40.0.634.g4ca3ef3211-goog


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Move mmu_lock lock and unlock calls from common code in
kvm_clear_dirty_log_protect() to arch specific code in
kvm_arch_mmu_enable_log_dirty_pt_masked(). None of the other code inside
the for loop of kvm_arch_mmu_enable_log_dirty_pt_masked() needs mmu_lock
exclusivity apart from the arch specific API call.

Future commits will change clear dirty log operations under mmu read
lock instead of write lock for ARM and, potentially, x86 architectures.

No functional changes intended.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/kvm/mmu.c   | 2 ++
 arch/mips/kvm/mmu.c    | 2 ++
 arch/riscv/kvm/mmu.c   | 2 ++
 arch/x86/kvm/mmu/mmu.c | 3 +++
 virt/kvm/dirty_ring.c  | 2 --
 virt/kvm/kvm_main.c    | 4 ----
 6 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..dc1c9059604e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1002,7 +1002,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
+	write_lock(&kvm->mmu_lock);
 	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
+	write_unlock(&kvm->mmu_lock);
 }
 
 static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c
index e8c08988ed37..b8d4723d197e 100644
--- a/arch/mips/kvm/mmu.c
+++ b/arch/mips/kvm/mmu.c
@@ -415,11 +415,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
+	spin_lock(&kvm->mmu_lock);
 	gfn_t base_gfn = slot->base_gfn + gfn_offset;
 	gfn_t start = base_gfn +  __ffs(mask);
 	gfn_t end = base_gfn + __fls(mask);
 
 	kvm_mips_mkclean_gpa_pt(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 /*
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 78211aed36fa..425fa11dcf9c 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -395,11 +395,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 					     gfn_t gfn_offset,
 					     unsigned long mask)
 {
+	spin_lock(&kvm->mmu_lock);
 	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
 	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 
 	gstage_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 144c5a01cd77..f1dc549b01cb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1367,6 +1367,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 				struct kvm_memory_slot *slot,
 				gfn_t gfn_offset, unsigned long mask)
 {
+	write_lock(&kvm->mmu_lock);
 	/*
 	 * Huge pages are NOT write protected when we start dirty logging in
 	 * initially-all-set mode; must write protect them here so that they
@@ -1397,6 +1398,8 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		kvm_mmu_clear_dirty_pt_masked(kvm, slot, gfn_offset, mask);
 	else
 		kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
+
+	write_unlock(&kvm->mmu_lock);
 }
 
 int kvm_cpu_dirty_log_size(void)
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index c1cd7dfe4a90..d894c58d2152 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -66,9 +66,7 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask)
 	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
 		return;
 
-	KVM_MMU_LOCK(kvm);
 	kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask);
-	KVM_MMU_UNLOCK(kvm);
 }
 
 int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f40b72eb0e7b..378c40e958b6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2157,7 +2157,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 		dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot);
 		memset(dirty_bitmap_buffer, 0, n);
 
-		KVM_MMU_LOCK(kvm);
 		for (i = 0; i < n / sizeof(long); i++) {
 			unsigned long mask;
 			gfn_t offset;
@@ -2173,7 +2172,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 			kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
 								offset, mask);
 		}
-		KVM_MMU_UNLOCK(kvm);
 	}
 
 	if (flush)
@@ -2268,7 +2266,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
 		return -EFAULT;
 
-	KVM_MMU_LOCK(kvm);
 	for (offset = log->first_page, i = offset / BITS_PER_LONG,
 		 n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
 	     i++, offset += BITS_PER_LONG) {
@@ -2291,7 +2288,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 								offset, mask);
 		}
 	}
-	KVM_MMU_UNLOCK(kvm);
 
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 8/9] KVM: arm64: Allow stage2_apply_range_resched() to pass page table walker flags
  2023-04-21 16:52 ` Vipin Sharma
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Allow stage2_apply_range_resched() to pass an enum
kvm_pgtable_walk_flags value to stage-2 walkers. Pass 0 as the flags to
make this change a no-op.

This capability will be used in future commits to enable the
clear-dirty-log operation under the MMU read lock.

Current users of the stage2_apply_range_*() API run under the assumption
that the MMU write lock is held, and the stage-2 page table walkers run
under the same assumption. Once future commits add a clear-dirty-log
operation under the MMU read lock, there needs to be a way to pass this
shared intent to the page table walkers.

No functional changes intended.
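
As a sketch of the intended usage (with this patch every caller passes 0;
the KVM_PGTABLE_WALK_SHARED caller only arrives in a later patch):

  /* Existing behavior: the walker assumes the MMU write lock is held. */
  stage2_wp_range(&kvm->arch.mmu, start, end, 0);

  /* Future clear-dirty-log path: the walker runs under the MMU read lock. */
  stage2_wp_range(&kvm->arch.mmu, start, end, KVM_PGTABLE_WALK_SHARED);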

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h  | 12 +++++++++---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c |  4 ++--
 arch/arm64/kvm/hyp/pgtable.c          | 16 ++++++++++------
 arch/arm64/kvm/mmu.c                  | 26 ++++++++++++++++----------
 4 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 4cd62506c198..79a452d78e08 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -508,6 +508,7 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
  * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:	Intermediate physical address from which to remove the mapping.
  * @size:	Size of the mapping.
+ * @flags:	Page-table walker flags.
  *
  * The offset of @addr within a page is ignored and @size is rounded-up to
  * the next page boundary.
@@ -520,7 +521,8 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
+int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size,
+			     enum kvm_pgtable_walk_flags flags);
 
 /**
  * kvm_pgtable_stage2_wrprotect() - Write-protect guest stage-2 address range
@@ -528,6 +530,7 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
  * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:	Intermediate physical address from which to write-protect,
  * @size:	Size of the range.
+ * @flags:	Page-table walker flags.
  *
  * The offset of @addr within a page is ignored and @size is rounded-up to
  * the next page boundary.
@@ -538,7 +541,8 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_wrprotect(struct kvm_pgtable *pgt, u64 addr, u64 size);
+int kvm_pgtable_stage2_wrprotect(struct kvm_pgtable *pgt, u64 addr, u64 size,
+				 enum kvm_pgtable_walk_flags flags);
 
 /**
  * kvm_pgtable_stage2_mkyoung() - Set the access flag in a page-table entry.
@@ -610,13 +614,15 @@ bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr);
  * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:	Intermediate physical address from which to flush.
  * @size:	Size of the range.
+ * @flags:	Page-table walker flags.
  *
  * The offset of @addr within a page is ignored and @size is rounded-up to
  * the next page boundary.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size);
+int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size,
+			     enum kvm_pgtable_walk_flags flags);
 
 /**
  * kvm_pgtable_walk() - Walk a page-table.
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 552653fa18be..bac3c2c31cbe 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -326,11 +326,11 @@ static int host_stage2_unmap_dev_all(void)
 	/* Unmap all non-memory regions to recycle the pages */
 	for (i = 0; i < hyp_memblock_nr; i++, addr = reg->base + reg->size) {
 		reg = &hyp_memory[i];
-		ret = kvm_pgtable_stage2_unmap(pgt, addr, reg->base - addr);
+		ret = kvm_pgtable_stage2_unmap(pgt, addr, reg->base - addr, 0);
 		if (ret)
 			return ret;
 	}
-	return kvm_pgtable_stage2_unmap(pgt, addr, BIT(pgt->ia_bits) - addr);
+	return kvm_pgtable_stage2_unmap(pgt, addr, BIT(pgt->ia_bits) - addr, 0);
 }
 
 struct kvm_mem_range {
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 3d61bd3e591d..3a585e1fba11 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1024,12 +1024,14 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	return 0;
 }
 
-int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
+int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size,
+			     enum kvm_pgtable_walk_flags flags)
 {
 	struct kvm_pgtable_walker walker = {
 		.cb	= stage2_unmap_walker,
 		.arg	= pgt,
-		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+		.flags	= flags | KVM_PGTABLE_WALK_LEAF |
+				KVM_PGTABLE_WALK_TABLE_POST,
 	};
 
 	return kvm_pgtable_walk(pgt, addr, size, &walker);
@@ -1108,11 +1110,12 @@ static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr,
 	return 0;
 }
 
-int kvm_pgtable_stage2_wrprotect(struct kvm_pgtable *pgt, u64 addr, u64 size)
+int kvm_pgtable_stage2_wrprotect(struct kvm_pgtable *pgt, u64 addr, u64 size,
+				 enum kvm_pgtable_walk_flags flags)
 {
 	return stage2_update_leaf_attrs(pgt, addr, size, 0,
 					KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W,
-					NULL, NULL, 0);
+					NULL, NULL, flags);
 }
 
 kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr)
@@ -1193,11 +1196,12 @@ static int stage2_flush_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	return 0;
 }
 
-int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
+int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size,
+			     enum kvm_pgtable_walk_flags flags)
 {
 	struct kvm_pgtable_walker walker = {
 		.cb	= stage2_flush_walker,
-		.flags	= KVM_PGTABLE_WALK_LEAF,
+		.flags	= flags | KVM_PGTABLE_WALK_LEAF,
 		.arg	= pgt,
 	};
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index dc1c9059604e..e0189cdda43d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -48,7 +48,9 @@ static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end)
  */
 static int stage2_apply_range(struct kvm_s2_mmu *mmu, phys_addr_t addr,
 			      phys_addr_t end,
-			      int (*fn)(struct kvm_pgtable *, u64, u64),
+			      enum kvm_pgtable_walk_flags flags,
+			      int (*fn)(struct kvm_pgtable *, u64, u64,
+					enum kvm_pgtable_walk_flags),
 			      bool resched)
 {
 	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
@@ -61,7 +63,7 @@ static int stage2_apply_range(struct kvm_s2_mmu *mmu, phys_addr_t addr,
 			return -EINVAL;
 
 		next = stage2_range_addr_end(addr, end);
-		ret = fn(pgt, addr, next - addr);
+		ret = fn(pgt, addr, next - addr, flags);
 		if (ret)
 			break;
 
@@ -72,8 +74,8 @@ static int stage2_apply_range(struct kvm_s2_mmu *mmu, phys_addr_t addr,
 	return ret;
 }
 
-#define stage2_apply_range_resched(mmu, addr, end, fn)			\
-	stage2_apply_range(mmu, addr, end, fn, true)
+#define stage2_apply_range_resched(mmu, addr, end, flags, fn)		\
+	stage2_apply_range(mmu, addr, end, flags, fn, true)
 
 static bool memslot_is_logging(struct kvm_memory_slot *memslot)
 {
@@ -236,7 +238,7 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
 
 	lockdep_assert_held_write(&kvm->mmu_lock);
 	WARN_ON(size & ~PAGE_MASK);
-	WARN_ON(stage2_apply_range(mmu, start, end, kvm_pgtable_stage2_unmap,
+	WARN_ON(stage2_apply_range(mmu, start, end, 0, kvm_pgtable_stage2_unmap,
 				   may_block));
 }
 
@@ -251,7 +253,8 @@ static void stage2_flush_memslot(struct kvm *kvm,
 	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
 	phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
 
-	stage2_apply_range_resched(&kvm->arch.mmu, addr, end, kvm_pgtable_stage2_flush);
+	stage2_apply_range_resched(&kvm->arch.mmu, addr, end, 0,
+				   kvm_pgtable_stage2_flush);
 }
 
 /**
@@ -932,10 +935,13 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
  * @mmu:        The KVM stage-2 MMU pointer
  * @addr:	Start address of range
  * @end:	End address of range
+ * @flags:	Page-table walker flags.
  */
-static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end)
+static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end,
+			    enum kvm_pgtable_walk_flags flags)
 {
-	stage2_apply_range_resched(mmu, addr, end, kvm_pgtable_stage2_wrprotect);
+	stage2_apply_range_resched(mmu, addr, end, flags,
+				   kvm_pgtable_stage2_wrprotect);
 }
 
 /**
@@ -964,7 +970,7 @@ static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
 
 	write_lock(&kvm->mmu_lock);
-	stage2_wp_range(&kvm->arch.mmu, start, end);
+	stage2_wp_range(&kvm->arch.mmu, start, end, 0);
 	write_unlock(&kvm->mmu_lock);
 	kvm_flush_remote_tlbs(kvm);
 }
@@ -988,7 +994,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 
-	stage2_wp_range(&kvm->arch.mmu, start, end);
+	stage2_wp_range(&kvm->arch.mmu, start, end, 0);
 }
 
 /*
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 9/9] KVM: arm64: Run clear-dirty-log under MMU read lock
  2023-04-21 16:52 ` Vipin Sharma
@ 2023-04-21 16:53   ` Vipin Sharma
  0 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-21 16:53 UTC (permalink / raw)
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol
  Cc: linux-arm-kernel, kvmarm, linux-mips, kvm-riscv, linux-riscv,
	linux-kselftest, kvm, linux-kernel, Vipin Sharma

Take the MMU read lock for write-protecting PTEs and use a shared page
table walker for clearing dirty logs.

Clearing dirty logs is currently performed under the MMU write lock.
This means vCPU write-protection faults, which take the MMU read lock,
will be blocked during this operation. This degrades guest performance,
and the degradation is especially noticeable on VMs with many vCPUs.

Taking the MMU read lock instead allows vCPUs to execute in parallel
and reduces the impact on vCPU performance.
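
For context, a minimal sketch of why a shared walk can remain safe,
assuming the atomic-update behavior the pgtable code uses for
KVM_PGTABLE_WALK_SHARED walks (the helper below is illustrative, not
from this patch):

  static bool try_set_pte_shared(u64 *ptep, u64 old, u64 new)
  {
          /* Succeeds only if no concurrent walker changed the PTE. */
          return cmpxchg64(ptep, old, new) == old;
  }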

Tested the improvement on an ARM Ampere Altra host (64 CPUs, 256 GB
memory, single NUMA node) via dirty_log_perf_test with 48 vCPUs, 96 GB
memory, an 8 GB clear chunk size, and a 1 second wait between
Clear-Dirty-Log calls:

Test command:
./dirty_log_perf_test -s anonymous_hugetlb_2mb -b 2G -v 48 -l 1 -k 8G -j -m 2

Before:
Total pages touched: 50331648 (Reads: 0, Writes: 50331648)

After:
Total pages touched: 125304832 (Reads: 0, Writes: 125304832)

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/kvm/mmu.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index e0189cdda43d..3f2117d93998 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -67,8 +67,12 @@ static int stage2_apply_range(struct kvm_s2_mmu *mmu, phys_addr_t addr,
 		if (ret)
 			break;
 
-		if (resched && next != end)
-			cond_resched_rwlock_write(&kvm->mmu_lock);
+		if (resched && next != end) {
+			if (flags & KVM_PGTABLE_WALK_SHARED)
+				cond_resched_rwlock_read(&kvm->mmu_lock);
+			else
+				cond_resched_rwlock_write(&kvm->mmu_lock);
+		}
 	} while (addr = next, addr != end);
 
 	return ret;
@@ -994,7 +998,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 
-	stage2_wp_range(&kvm->arch.mmu, start, end, 0);
+	stage2_wp_range(&kvm->arch.mmu, start, end, KVM_PGTABLE_WALK_SHARED);
 }
 
 /*
@@ -1008,9 +1012,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
-	write_lock(&kvm->mmu_lock);
+	read_lock(&kvm->mmu_lock);
 	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
-	write_unlock(&kvm->mmu_lock);
+	read_unlock(&kvm->mmu_lock);
 }
 
 static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 9/9] KVM: arm64: Run clear-dirty-log under MMU read lock
  2023-04-21 16:53   ` Vipin Sharma
@ 2023-04-21 17:10     ` Marc Zyngier
  0 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2023-04-21 17:10 UTC (permalink / raw)
  To: Vipin Sharma
  Cc: oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol, linux-arm-kernel, kvmarm,
	linux-mips, kvm-riscv, linux-riscv, linux-kselftest, kvm,
	linux-kernel

On Fri, 21 Apr 2023 17:53:05 +0100,
Vipin Sharma <vipinsh@google.com> wrote:
> 
> Take the MMU read lock for write-protecting PTEs and use a shared page
> table walker for clearing dirty logs.
>
> Clearing dirty logs is currently performed under the MMU write lock.
> This means vCPU write-protection faults, which take the MMU read lock,
> will be blocked during this operation. This degrades guest performance,
> and the degradation is especially noticeable on VMs with many vCPUs.
>
> Taking the MMU read lock instead allows vCPUs to execute in parallel
> and reduces the impact on vCPU performance.

Sure. Taking no lock whatsoever would be even better.

What I don't see is the detailed explanation that gives me the warm
feeling that this is safe and correct. Such an explanation is the
minimum condition for me to even read the patch.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
  2023-04-21 16:53   ` Vipin Sharma
@ 2023-04-21 19:43     ` kernel test robot
  0 siblings, 0 replies; 42+ messages in thread
From: kernel test robot @ 2023-04-21 19:43 UTC (permalink / raw)
  To: Vipin Sharma, maz, oliver.upton, james.morse, suzuki.poulose,
	yuzenghui, catalin.marinas, will, chenhuacai,
	aleksandar.qemu.devel, tsbogend, anup, atishp, paul.walmsley,
	palmer, aou, seanjc, pbonzini, dmatlack, ricarkol
  Cc: oe-kbuild-all, linux-arm-kernel, kvmarm, linux-mips, kvm-riscv,
	linux-riscv, linux-kselftest, kvm, linux-kernel, Vipin Sharma

Hi Vipin,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 95b9779c1758f03cf494e8550d6249a40089ed1c]

url:    https://github.com/intel-lab-lkp/linux/commits/Vipin-Sharma/KVM-selftests-Allow-dirty_log_perf_test-to-clear-dirty-memory-in-chunks/20230422-005708
base:   95b9779c1758f03cf494e8550d6249a40089ed1c
patch link:    https://lore.kernel.org/r/20230421165305.804301-8-vipinsh%40google.com
patch subject: [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
config: riscv-allyesconfig (https://download.01.org/0day-ci/archive/20230422/202304220315.bpwbgH5n-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/e7505b53d53e3bb5e7f1c43233ef3644673edb75
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Vipin-Sharma/KVM-selftests-Allow-dirty_log_perf_test-to-clear-dirty-memory-in-chunks/20230422-005708
        git checkout e7505b53d53e3bb5e7f1c43233ef3644673edb75
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash arch/riscv/kvm/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304220315.bpwbgH5n-lkp@intel.com/

All warnings (new ones prefixed by >>):

   arch/riscv/kvm/mmu.c: In function 'kvm_arch_mmu_enable_log_dirty_pt_masked':
>> arch/riscv/kvm/mmu.c:399:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
     399 |         phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
         |         ^~~~~~~~~~~


vim +399 arch/riscv/kvm/mmu.c

c9d57373fc87a3 Anup Patel   2022-07-29  392  
9d05c1fee83757 Anup Patel   2021-09-27  393  void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
9d05c1fee83757 Anup Patel   2021-09-27  394  					     struct kvm_memory_slot *slot,
9d05c1fee83757 Anup Patel   2021-09-27  395  					     gfn_t gfn_offset,
9d05c1fee83757 Anup Patel   2021-09-27  396  					     unsigned long mask)
9d05c1fee83757 Anup Patel   2021-09-27  397  {
e7505b53d53e3b Vipin Sharma 2023-04-21  398  	spin_lock(&kvm->mmu_lock);
9d05c1fee83757 Anup Patel   2021-09-27 @399  	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
9d05c1fee83757 Anup Patel   2021-09-27  400  	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
9d05c1fee83757 Anup Patel   2021-09-27  401  	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
9d05c1fee83757 Anup Patel   2021-09-27  402  
26708234eb12e7 Anup Patel   2022-05-09  403  	gstage_wp_range(kvm, start, end);
e7505b53d53e3b Vipin Sharma 2023-04-21  404  	spin_unlock(&kvm->mmu_lock);
9d05c1fee83757 Anup Patel   2021-09-27  405  }
99cdc6c18c2d81 Anup Patel   2021-09-27  406  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
  2023-04-21 19:43     ` kernel test robot
@ 2023-04-24 16:45       ` Vipin Sharma
  -1 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-04-24 16:45 UTC (permalink / raw)
  To: kernel test robot
  Cc: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol, oe-kbuild-all, linux-arm-kernel,
	kvmarm, linux-mips, kvm-riscv, linux-riscv, linux-kselftest, kvm,
	linux-kernel

On Fri, Apr 21, 2023 at 12:43 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Vipin,
>
> All warnings (new ones prefixed by >>):
>
>    arch/riscv/kvm/mmu.c: In function 'kvm_arch_mmu_enable_log_dirty_pt_masked':
> >> arch/riscv/kvm/mmu.c:399:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
>      399 |         phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
>          |         ^~~~~~~~~~~
>
>

I will fix it in v2.
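
For reference, the fix is straightforward: keep the declarations ahead of
any statement. A minimal sketch of what v2 presumably does, assuming the
locking itself stays as in this patch:

	void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
						     struct kvm_memory_slot *slot,
						     gfn_t gfn_offset,
						     unsigned long mask)
	{
		/* Declarations first; ISO C90 forbids statements before them. */
		phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
		phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT;
		phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;

		spin_lock(&kvm->mmu_lock);
		gstage_wp_range(kvm, start, end);
		spin_unlock(&kvm->mmu_lock);
	}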

* Re: [PATCH 9/9] KVM: arm64: Run clear-dirty-log under MMU read lock
  2023-04-21 17:10     ` Marc Zyngier
@ 2023-05-06  0:55       ` Vipin Sharma
  -1 siblings, 0 replies; 42+ messages in thread
From: Vipin Sharma @ 2023-05-06  0:55 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, chenhuacai, aleksandar.qemu.devel,
	tsbogend, anup, atishp, paul.walmsley, palmer, aou, seanjc,
	pbonzini, dmatlack, ricarkol, linux-arm-kernel, kvmarm,
	linux-mips, kvm-riscv, linux-riscv, linux-kselftest, kvm,
	linux-kernel

On Fri, Apr 21, 2023 at 10:11 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 21 Apr 2023 17:53:05 +0100,
> Vipin Sharma <vipinsh@google.com> wrote:
> >
> > Take MMU read lock for write protecting PTEs and use shared page table
> > walker for clearing dirty logs.
> >
> > Clearing dirty logs is currently performed under the MMU write lock.
> > This means vCPU write-protection faults, which also take the MMU read
> > lock, will be blocked during this operation. This causes guest
> > degradation that is especially noticeable on VMs with many vCPUs.
> >
> > Taking the MMU read lock allows vCPUs to execute in parallel and
> > reduces the impact on vCPU performance.
>
> Sure. Taking no lock whatsoever would be even better.
>
> What I don't see is the detailed explanation that gives me the warm
> feeling that this is safe and correct. Such an explanation is the
> minimum condition for me to even read the patch.
>

Thanks for freaking me out. Your hunch about the missing warm feeling
was right: the stage2_attr_walker() and stage2_update_leaf_attrs()
combo does not retry if cmpxchg fails for write protection. The
write-protection callers don't check the API's return status and
simply ignore cmpxchg failures. This means a vCPU (an MMU read lock
user) can make the cmpxchg for a write-protection operation fail
(under the read lock this patch takes), and the clear ioctl will
happily return as if everything were fine.

I will update the series and also work on validating the correctness
to instill more confidence.

Thanks
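
For the archive, a hypothetical sketch of the retry pattern the shared
walker needs. The function and its shape are illustrative, not the
series' actual code; kvm_pte_valid() and KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W
are taken from the existing arm64 stage-2 page table code:

	static void stage2_wp_pte_shared(kvm_pte_t *ptep)
	{
		kvm_pte_t old, new;

		do {
			old = READ_ONCE(*ptep);
			if (!kvm_pte_valid(old))
				return;
			/* Drop the stage-2 write permission. */
			new = old & ~KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
			/*
			 * Under the MMU read lock a vCPU fault handler may
			 * update the PTE concurrently, so a failed cmpxchg
			 * must be retried (or reported), not ignored.
			 */
		} while (cmpxchg64(ptep, old, new) != old);
	}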

end of thread [~2023-05-06  0:57 UTC]

Thread overview: 42+ messages
2023-04-21 16:52 [PATCH 0/9] KVM: arm64: Use MMU read lock for clearing dirty logs Vipin Sharma
2023-04-21 16:52 ` [PATCH 1/9] KVM: selftests: Allow dirty_log_perf_test to clear dirty memory in chunks Vipin Sharma
2023-04-21 16:52 ` [PATCH 2/9] KVM: selftests: Add optional delay between consecutive Clear-Dirty-Log calls Vipin Sharma
2023-04-21 16:52 ` [PATCH 3/9] KVM: selftests: Pass count of read and write accesses from guest to host Vipin Sharma
2023-04-21 16:53 ` [PATCH 4/9] KVM: selftests: Print read and write accesses of pages by vCPUs in dirty_log_perf_test Vipin Sharma
2023-04-21 16:53 ` [PATCH 5/9] KVM: selftests: Allow independent execution of vCPUs in dirty_log_perf_test Vipin Sharma
2023-04-21 16:53 ` [PATCH 6/9] KVM: arm64: Correct the kvm_pgtable_stage2_flush() documentation Vipin Sharma
2023-04-21 16:53 ` [PATCH 7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log Vipin Sharma
2023-04-21 19:43   ` kernel test robot
2023-04-24 16:45     ` Vipin Sharma
2023-04-21 16:53 ` [PATCH 8/9] KMV: arm64: Allow stage2_apply_range_sched() to pass page table walker flags Vipin Sharma
2023-04-21 16:53 ` [PATCH 9/9] KVM: arm64: Run clear-dirty-log under MMU read lock Vipin Sharma
2023-04-21 17:10   ` Marc Zyngier
2023-05-06  0:55     ` Vipin Sharma
