* [PATCH v4 00/13] KVM: selftests: Add aarch64/page_fault_test
@ 2022-06-24 21:32 ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

This series adds a new aarch64 selftest that exercises stage 2 fault handling for
various combinations of guest accesses (e.g., write, S1PTW), backing sources
(e.g., anon), and fault types (e.g., read on hugetlbfs with a hole, write
on a readonly memslot). Each test tries a different combination and then checks
that the access results in the expected behavior (e.g., a uffd fault with the
right address and write/read flag; a sketch of such a check follows the list
below). Some interesting combinations are:

- loading an instruction leads to a stage 1 page-table walk that misses on
  stage 2 because the backing memslot for the page table is not in host memory
  (a hole was punched right there) and the fault is handled using userfaultfd.
  The expected behavior is that this leads to a userfaultfd fault marked as a
  write. See commit c4ad98e4b72c ("KVM: arm64: Assume write fault on S1PTW
  permission fault on instruction fetch") for why that's a write.
- a CAS (compare-and-swap) on a readonly memslot leads to a failed vcpu run.
- write-faulting on a memslot that's marked for userfaultfd handling and dirty
  logging should result in a uffd fault and in the respective bit being set in
  the dirty log.
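
As a minimal sketch (not taken from the series itself), the uffd handler check
referenced above could look like the following; expected_addr and expected_write
are hypothetical variables that the test case would set up, and the handler
signature matches the series' uffd_handler_t callback type:

#include <stdbool.h>
#include <stdint.h>
#include <linux/userfaultfd.h>

#include "test_util.h"		/* for TEST_ASSERT() */

static uint64_t expected_addr;	/* hypothetical: set by the test case */
static bool expected_write;	/* hypothetical: set by the test case */

static int check_uffd_fault(int uffd_mode, int uffd, struct uffd_msg *msg)
{
	uint64_t addr = msg->arg.pagefault.address;
	bool is_write = msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE;

	TEST_ASSERT(msg->event == UFFD_EVENT_PAGEFAULT, "unexpected uffd event");
	TEST_ASSERT(addr == expected_addr, "uffd fault at unexpected address");
	TEST_ASSERT(is_write == expected_write, "unexpected write/read flag");

	/* ... resolve the fault with UFFDIO_COPY or UFFDIO_CONTINUE ... */
	return 0;
}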

The first 8 commits of this series add library support. The first one adds a
new userfaultfd library (carved out of demand_paging_test.c). The next 3 add
library functions to get the hva of a given GVA's PTE, and to get the fd of a
backing source. Commit 6 fixes a leaked fd when using shared backing stores.
The last 5 commits add the new selftest, one type of test at a time: first the
core tests, then uffd, then dirty logging, then readonly memslot tests, and
finally combinations of the previous ones (like uffd and dirty logging at the
same time).

v3 -> v4: https://lore.kernel.org/kvmarm/20220408004120.1969099-1-ricarkol@google.com/
- rebased on top of latest kvm/queue.
- addressed Oliver's comments: vm_get_pte_gpa rename, page_fault_test and
  other nits.
- added a MAIR entry for MT_DEVICE_nGnRnE. The value and the index are both
  0, so the change is purely cosmetic.
- allocated less memory for the test (smaller memslots).
- improved comments, including an ASCII diagram of how memory is laid out
  for the test.

v2 -> v3:
Thank you very much, Oliver and Ben.
- collected r-b's from Ben. [Ben]
- moved some definitions (like TCR_EL1_HA) to common headers. [Oliver]
- use FIELD_GET and ARM64_FEATURE_MASK. [Oliver]
- put test data in a macro. [Oliver]
- check for DCZID_EL1.DZP=0b0 before using "dc zva". [Oliver]
- various new comments. [Oliver]
- use 'asm' instead of hand assembly. [Oliver]
- don't copy test descriptors into the guest. [Oliver]
- rename large_page_size into backing_page_size. [Oliver]
- add enumeration for memory types (4 is MT_NORMAL). [Oliver]
- refactored the test macro definitions.

v1 -> v2: https://lore.kernel.org/kvmarm/20220323225405.267155-1-ricarkol@google.com/
- collect r-b from Ben for the memslot lib commit. [Ben]
- move userfaultfd desc struct to header. [Ben]
- move commit "KVM: selftests: Add vm_mem_region_get_src_fd library function"
  to right before it's used. [Ben]
- nit: wrong indentation in patch 6. [Ben]

Ricardo Koller (13):
  KVM: selftests: Add a userfaultfd library
  KVM: selftests: aarch64: Add virt_get_pte_hva library function
  KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
  KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  KVM: selftests: Add vm_mem_region_get_src_fd library function
  KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h
    macros
  tools: Copy bitfield.h from the kernel sources
  KVM: selftests: aarch64: Add aarch64/page_fault_test
  KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test
  KVM: selftests: aarch64: Add dirty logging tests into page_fault_test
  KVM: selftests: aarch64: Add readonly memslot tests into
    page_fault_test
  KVM: selftests: aarch64: Add mix of tests into page_fault_test

 tools/include/linux/bitfield.h                |  176 +++
 tools/testing/selftests/kvm/Makefile          |    2 +
 .../selftests/kvm/aarch64/page_fault_test.c   | 1236 +++++++++++++++++
 .../selftests/kvm/demand_paging_test.c        |  228 +--
 .../selftests/kvm/include/aarch64/processor.h |   36 +-
 .../selftests/kvm/include/kvm_util_base.h     |    2 +
 .../selftests/kvm/include/userfaultfd_util.h  |   46 +
 .../selftests/kvm/lib/aarch64/processor.c     |   27 +-
 tools/testing/selftests/kvm/lib/kvm_util.c    |   37 +-
 .../selftests/kvm/lib/userfaultfd_util.c      |  187 +++
 10 files changed, 1762 insertions(+), 215 deletions(-)
 create mode 100644 tools/include/linux/bitfield.h
 create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c
 create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c

-- 
2.37.0.rc0.161.g10f37bed90-goog



* [PATCH v4 01/13] KVM: selftests: Add a userfaultfd library
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Move the generic userfaultfd code out of demand_paging_test.c into a
common library, userfaultfd_util. This library consists of a setup and a
stop function. The setup function starts a thread that handles page
faults using the given handler callback, and returns a uffd_desc object
which is then passed to the stop function (to signal the handler thread
to quit, join it, and free the descriptor).
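
As a usage sketch (not part of the patch itself), a caller that already has a
host virtual address range and a handler callback would drive the new API
roughly like this:

#include "userfaultfd_util.h"

static void demand_page_region(void *hva, uint64_t len, uffd_handler_t handler)
{
	struct uffd_desc *desc;

	/* Start a handler thread for MISSING faults on [hva, hva + len). */
	desc = uffd_setup_demand_paging(UFFDIO_REGISTER_MODE_MISSING, 0,
					hva, len, handler);

	/* ... run vCPUs that touch the region; 'handler' resolves the faults ... */

	/* Tell the handler thread to quit, join it, and free the descriptor. */
	uffd_stop_demand_paging(desc);
}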

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/demand_paging_test.c        | 228 +++---------------
 .../selftests/kvm/include/userfaultfd_util.h  |  46 ++++
 .../selftests/kvm/lib/userfaultfd_util.c      | 187 ++++++++++++++
 4 files changed, 264 insertions(+), 198 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index b52c130f7b2f..e4497a3a27d4 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -46,6 +46,7 @@ LIBKVM += lib/perf_test_util.c
 LIBKVM += lib/rbtree.c
 LIBKVM += lib/sparsebit.c
 LIBKVM += lib/test_util.c
+LIBKVM += lib/userfaultfd_util.c
 
 LIBKVM_x86_64 += lib/x86_64/apic.c
 LIBKVM_x86_64 += lib/x86_64/handlers.S
diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index 779ae54f89c4..8e1fe4ffcccd 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -22,23 +22,13 @@
 #include "test_util.h"
 #include "perf_test_util.h"
 #include "guest_modes.h"
+#include "userfaultfd_util.h"
 
 #ifdef __NR_userfaultfd
 
-#ifdef PRINT_PER_PAGE_UPDATES
-#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
-#else
-#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
-#endif
-
-#ifdef PRINT_PER_VCPU_UPDATES
-#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
-#else
-#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
-#endif
-
 static int nr_vcpus = 1;
 static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
+
 static size_t demand_paging_size;
 static char *guest_data_prototype;
 
@@ -67,9 +57,11 @@ static void vcpu_worker(struct perf_test_vcpu_args *vcpu_args)
 		       ts_diff.tv_sec, ts_diff.tv_nsec);
 }
 
-static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
+static int handle_uffd_page_request(int uffd_mode, int uffd,
+		struct uffd_msg *msg)
 {
 	pid_t tid = syscall(__NR_gettid);
+	uint64_t addr = msg->arg.pagefault.address;
 	struct timespec start;
 	struct timespec ts_diff;
 	int r;
@@ -116,174 +108,32 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
 	return 0;
 }
 
-bool quit_uffd_thread;
-
-struct uffd_handler_args {
+struct test_params {
 	int uffd_mode;
-	int uffd;
-	int pipefd;
-	useconds_t delay;
+	useconds_t uffd_delay;
+	enum vm_mem_backing_src_type src_type;
+	bool partition_vcpu_memory_access;
 };
 
-static void *uffd_handler_thread_fn(void *arg)
-{
-	struct uffd_handler_args *uffd_args = (struct uffd_handler_args *)arg;
-	int uffd = uffd_args->uffd;
-	int pipefd = uffd_args->pipefd;
-	useconds_t delay = uffd_args->delay;
-	int64_t pages = 0;
-	struct timespec start;
-	struct timespec ts_diff;
-
-	clock_gettime(CLOCK_MONOTONIC, &start);
-	while (!quit_uffd_thread) {
-		struct uffd_msg msg;
-		struct pollfd pollfd[2];
-		char tmp_chr;
-		int r;
-		uint64_t addr;
-
-		pollfd[0].fd = uffd;
-		pollfd[0].events = POLLIN;
-		pollfd[1].fd = pipefd;
-		pollfd[1].events = POLLIN;
-
-		r = poll(pollfd, 2, -1);
-		switch (r) {
-		case -1:
-			pr_info("poll err");
-			continue;
-		case 0:
-			continue;
-		case 1:
-			break;
-		default:
-			pr_info("Polling uffd returned %d", r);
-			return NULL;
-		}
-
-		if (pollfd[0].revents & POLLERR) {
-			pr_info("uffd revents has POLLERR");
-			return NULL;
-		}
-
-		if (pollfd[1].revents & POLLIN) {
-			r = read(pollfd[1].fd, &tmp_chr, 1);
-			TEST_ASSERT(r == 1,
-				    "Error reading pipefd in UFFD thread\n");
-			return NULL;
-		}
-
-		if (!(pollfd[0].revents & POLLIN))
-			continue;
-
-		r = read(uffd, &msg, sizeof(msg));
-		if (r == -1) {
-			if (errno == EAGAIN)
-				continue;
-			pr_info("Read of uffd got errno %d\n", errno);
-			return NULL;
-		}
-
-		if (r != sizeof(msg)) {
-			pr_info("Read on uffd returned unexpected size: %d bytes", r);
-			return NULL;
-		}
-
-		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
-			continue;
-
-		if (delay)
-			usleep(delay);
-		addr =  msg.arg.pagefault.address;
-		r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
-		if (r < 0)
-			return NULL;
-		pages++;
-	}
-
-	ts_diff = timespec_elapsed(start);
-	PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
-		       pages, ts_diff.tv_sec, ts_diff.tv_nsec,
-		       pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
-
-	return NULL;
-}
-
-static void setup_demand_paging(struct kvm_vm *vm,
-				pthread_t *uffd_handler_thread, int pipefd,
-				int uffd_mode, useconds_t uffd_delay,
-				struct uffd_handler_args *uffd_args,
-				void *hva, void *alias, uint64_t len)
+static void prefault_mem(void *alias, uint64_t len)
 {
-	bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
-	int uffd;
-	struct uffdio_api uffdio_api;
-	struct uffdio_register uffdio_register;
-	uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
-	int ret;
+	size_t p;
 
-	PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
-		       is_minor ? "MINOR" : "MISSING",
-		       is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
-
-	/* In order to get minor faults, prefault via the alias. */
-	if (is_minor) {
-		size_t p;
-
-		expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
-
-		TEST_ASSERT(alias != NULL, "Alias required for minor faults");
-		for (p = 0; p < (len / demand_paging_size); ++p) {
-			memcpy(alias + (p * demand_paging_size),
-			       guest_data_prototype, demand_paging_size);
-		}
+	TEST_ASSERT(alias != NULL, "Alias required for minor faults");
+	for (p = 0; p < (len / demand_paging_size); ++p) {
+		memcpy(alias + (p * demand_paging_size),
+		       guest_data_prototype, demand_paging_size);
 	}
-
-	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
-	TEST_ASSERT(uffd >= 0, __KVM_SYSCALL_ERROR("userfaultfd()", uffd));
-
-	uffdio_api.api = UFFD_API;
-	uffdio_api.features = 0;
-	ret = ioctl(uffd, UFFDIO_API, &uffdio_api);
-	TEST_ASSERT(ret != -1, __KVM_SYSCALL_ERROR("UFFDIO_API", ret));
-
-	uffdio_register.range.start = (uint64_t)hva;
-	uffdio_register.range.len = len;
-	uffdio_register.mode = uffd_mode;
-	ret = ioctl(uffd, UFFDIO_REGISTER, &uffdio_register);
-	TEST_ASSERT(ret != -1, __KVM_SYSCALL_ERROR("UFFDIO_REGISTER", ret));
-	TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
-		    expected_ioctls, "missing userfaultfd ioctls");
-
-	uffd_args->uffd_mode = uffd_mode;
-	uffd_args->uffd = uffd;
-	uffd_args->pipefd = pipefd;
-	uffd_args->delay = uffd_delay;
-	pthread_create(uffd_handler_thread, NULL, uffd_handler_thread_fn,
-		       uffd_args);
-
-	PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
-		       hva, hva + len);
 }
 
-struct test_params {
-	int uffd_mode;
-	useconds_t uffd_delay;
-	enum vm_mem_backing_src_type src_type;
-	bool partition_vcpu_memory_access;
-};
-
 static void run_test(enum vm_guest_mode mode, void *arg)
 {
 	struct test_params *p = arg;
-	pthread_t *uffd_handler_threads = NULL;
-	struct uffd_handler_args *uffd_args = NULL;
+	struct uffd_desc **uffd_descs = NULL;
 	struct timespec start;
 	struct timespec ts_diff;
-	int *pipefds = NULL;
 	struct kvm_vm *vm;
-	int r, i;
+	int i;
 
 	vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
 				 p->src_type, p->partition_vcpu_memory_access);
@@ -296,15 +146,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	memset(guest_data_prototype, 0xAB, demand_paging_size);
 
 	if (p->uffd_mode) {
-		uffd_handler_threads =
-			malloc(nr_vcpus * sizeof(*uffd_handler_threads));
-		TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
-
-		uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
-		TEST_ASSERT(uffd_args, "Memory allocation failed");
-
-		pipefds = malloc(sizeof(int) * nr_vcpus * 2);
-		TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
+		uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
+		TEST_ASSERT(uffd_descs, "Memory allocation failed");
 
 		for (i = 0; i < nr_vcpus; i++) {
 			struct perf_test_vcpu_args *vcpu_args;
@@ -317,19 +160,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
 			vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
 
+			prefault_mem(vcpu_alias,
+				vcpu_args->pages * perf_test_args.guest_page_size);
+
 			/*
 			 * Set up user fault fd to handle demand paging
 			 * requests.
 			 */
-			r = pipe2(&pipefds[i * 2],
-				  O_CLOEXEC | O_NONBLOCK);
-			TEST_ASSERT(!r, "Failed to set up pipefd");
-
-			setup_demand_paging(vm, &uffd_handler_threads[i],
-					    pipefds[i * 2], p->uffd_mode,
-					    p->uffd_delay, &uffd_args[i],
-					    vcpu_hva, vcpu_alias,
-					    vcpu_args->pages * perf_test_args.guest_page_size);
+			uffd_descs[i] = uffd_setup_demand_paging(
+				p->uffd_mode, p->uffd_delay, vcpu_hva,
+				vcpu_args->pages * perf_test_args.guest_page_size,
+				&handle_uffd_page_request);
 		}
 	}
 
@@ -344,15 +185,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	pr_info("All vCPU threads joined\n");
 
 	if (p->uffd_mode) {
-		char c;
-
 		/* Tell the user fault fd handler threads to quit */
-		for (i = 0; i < nr_vcpus; i++) {
-			r = write(pipefds[i * 2 + 1], &c, 1);
-			TEST_ASSERT(r == 1, "Unable to write to pipefd");
-
-			pthread_join(uffd_handler_threads[i], NULL);
-		}
+		for (i = 0; i < nr_vcpus; i++)
+			uffd_stop_demand_paging(uffd_descs[i]);
 	}
 
 	pr_info("Total guest execution time: %ld.%.9lds\n",
@@ -364,11 +199,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	perf_test_destroy_vm(vm);
 
 	free(guest_data_prototype);
-	if (p->uffd_mode) {
-		free(uffd_handler_threads);
-		free(uffd_args);
-		free(pipefds);
-	}
+	if (p->uffd_mode)
+		free(uffd_descs);
 }
 
 static void help(char *name)
diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
new file mode 100644
index 000000000000..a1a386c083b0
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM userfaultfd util
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2019-2022 Google LLC
+ */
+
+#define _GNU_SOURCE /* for pipe2 */
+
+#include <inttypes.h>
+#include <time.h>
+#include <pthread.h>
+#include <linux/userfaultfd.h>
+
+#include "test_util.h"
+
+typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
+
+struct uffd_desc {
+	int uffd_mode;
+	int uffd;
+	int pipefds[2];
+	useconds_t delay;
+	uffd_handler_t handler;
+	pthread_t thread;
+};
+
+struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
+		useconds_t uffd_delay, void *hva, uint64_t len,
+		uffd_handler_t handler);
+
+void uffd_stop_demand_paging(struct uffd_desc *uffd);
+
+#ifdef PRINT_PER_PAGE_UPDATES
+#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
+#else
+#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
+#endif
+
+#ifdef PRINT_PER_VCPU_UPDATES
+#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
+#else
+#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
+#endif
diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
new file mode 100644
index 000000000000..4395032ccbe4
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM userfaultfd util
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2019, Google, Inc.
+ * Copyright (C) 2022, Google, Inc.
+ */
+
+#define _GNU_SOURCE /* for pipe2 */
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <poll.h>
+#include <pthread.h>
+#include <linux/userfaultfd.h>
+#include <sys/syscall.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "perf_test_util.h"
+#include "userfaultfd_util.h"
+
+#ifdef __NR_userfaultfd
+
+static void *uffd_handler_thread_fn(void *arg)
+{
+	struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
+	int uffd = uffd_desc->uffd;
+	int pipefd = uffd_desc->pipefds[0];
+	useconds_t delay = uffd_desc->delay;
+	int64_t pages = 0;
+	struct timespec start;
+	struct timespec ts_diff;
+
+	clock_gettime(CLOCK_MONOTONIC, &start);
+	while (1) {
+		struct uffd_msg msg;
+		struct pollfd pollfd[2];
+		char tmp_chr;
+		int r;
+
+		pollfd[0].fd = uffd;
+		pollfd[0].events = POLLIN;
+		pollfd[1].fd = pipefd;
+		pollfd[1].events = POLLIN;
+
+		r = poll(pollfd, 2, -1);
+		switch (r) {
+		case -1:
+			pr_info("poll err");
+			continue;
+		case 0:
+			continue;
+		case 1:
+			break;
+		default:
+			pr_info("Polling uffd returned %d", r);
+			return NULL;
+		}
+
+		if (pollfd[0].revents & POLLERR) {
+			pr_info("uffd revents has POLLERR");
+			return NULL;
+		}
+
+		if (pollfd[1].revents & POLLIN) {
+			r = read(pollfd[1].fd, &tmp_chr, 1);
+			TEST_ASSERT(r == 1,
+				    "Error reading pipefd in UFFD thread\n");
+			return NULL;
+		}
+
+		if (!(pollfd[0].revents & POLLIN))
+			continue;
+
+		r = read(uffd, &msg, sizeof(msg));
+		if (r == -1) {
+			if (errno == EAGAIN)
+				continue;
+			pr_info("Read of uffd got errno %d\n", errno);
+			return NULL;
+		}
+
+		if (r != sizeof(msg)) {
+			pr_info("Read on uffd returned unexpected size: %d bytes", r);
+			return NULL;
+		}
+
+		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
+			continue;
+
+		if (delay)
+			usleep(delay);
+		r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
+		if (r < 0)
+			return NULL;
+		pages++;
+	}
+
+	ts_diff = timespec_elapsed(start);
+	PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
+		       pages, ts_diff.tv_sec, ts_diff.tv_nsec,
+		       pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
+
+	return NULL;
+}
+
+struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
+		useconds_t uffd_delay, void *hva, uint64_t len,
+		uffd_handler_t handler)
+{
+	struct uffd_desc *uffd_desc;
+	bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
+	int uffd;
+	struct uffdio_api uffdio_api;
+	struct uffdio_register uffdio_register;
+	uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
+	int ret;
+
+	PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
+		       is_minor ? "MINOR" : "MISSING",
+		       is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
+
+	uffd_desc = malloc(sizeof(struct uffd_desc));
+	TEST_ASSERT(uffd_desc, "malloc failed");
+
+	/* In order to get minor faults, prefault via the alias. */
+	if (is_minor)
+		expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
+
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+	TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
+
+	uffdio_api.api = UFFD_API;
+	uffdio_api.features = 0;
+	TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
+		    "ioctl UFFDIO_API failed: %" PRIu64,
+		    (uint64_t)uffdio_api.api);
+
+	uffdio_register.range.start = (uint64_t)hva;
+	uffdio_register.range.len = len;
+	uffdio_register.mode = uffd_mode;
+	TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
+		    "ioctl UFFDIO_REGISTER failed");
+	TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
+			expected_ioctls, "missing userfaultfd ioctls");
+
+	ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
+	TEST_ASSERT(!ret, "Failed to set up pipefd");
+
+	uffd_desc->uffd_mode = uffd_mode;
+	uffd_desc->uffd = uffd;
+	uffd_desc->delay = uffd_delay;
+	uffd_desc->handler = handler;
+	pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
+		       uffd_desc);
+
+	PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
+		       hva, hva + len);
+
+	return uffd_desc;
+}
+
+void uffd_stop_demand_paging(struct uffd_desc *uffd)
+{
+	char c = 0;
+	int ret;
+
+	ret = write(uffd->pipefds[1], &c, 1);
+	TEST_ASSERT(ret == 1, "Unable to write to pipefd");
+
+	ret = pthread_join(uffd->thread, NULL);
+	TEST_ASSERT(ret == 0, "Pthread_join failed.");
+
+	close(uffd->uffd);
+
+	close(uffd->pipefds[1]);
+	close(uffd->pipefds[0]);
+
+	free(uffd);
+}
+
+#endif /* __NR_userfaultfd */
-- 
2.37.0.rc0.161.g10f37bed90-goog



* [PATCH v4 02/13] KVM: selftests: aarch64: Add virt_get_pte_hva library function
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add a library function to get a pointer (a host virtual address) to the
PTE of a given GVA. This will be used in a future commit by a test to
clear and check the access flag of a particular page.
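
For context, a hedged sketch (not part of this patch) of how a test could use
the new function to clear and later re-check the access flag of an
already-mapped guest address; PTE_AF is a name introduced here for
illustration, with the AF being bit 10 of the descriptor:

#include "kvm_util.h"
#include "test_util.h"
#include "processor.h"

#define PTE_AF		(1ULL << 10)	/* Access Flag in the descriptor */

static void clear_and_check_af(struct kvm_vm *vm, uint64_t gva)
{
	volatile uint64_t *ptep = virt_get_pte_hva(vm, gva);

	*ptep &= ~PTE_AF;	/* clear the access flag... */
	/* ... run the guest so that it accesses 'gva' again ... */
	TEST_ASSERT(*ptep & PTE_AF, "AF was not set by the access");
}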

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h       |  2 ++
 tools/testing/selftests/kvm/lib/aarch64/processor.c | 13 ++++++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index a8124f9dd68a..df4bfac69551 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -109,6 +109,8 @@ void vm_install_exception_handler(struct kvm_vm *vm,
 void vm_install_sync_handler(struct kvm_vm *vm,
 		int vector, int ec, handler_fn handler);
 
+uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva);
+
 static inline void cpu_relax(void)
 {
 	asm volatile("yield" ::: "memory");
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 6f5551368944..63ef3c78e55e 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -138,7 +138,7 @@ void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 	_virt_pg_map(vm, vaddr, paddr, attr_idx);
 }
 
-vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
+uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva)
 {
 	uint64_t *ptep;
 
@@ -169,11 +169,18 @@ vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
 		TEST_FAIL("Page table levels must be 2, 3, or 4");
 	}
 
-	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
+	return ptep;
 
 unmapped_gva:
 	TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);
-	exit(1);
+	exit(EXIT_FAILURE);
+}
+
+vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
+{
+	uint64_t *ptep = virt_get_pte_hva(vm, gva);
+
+	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
 }
 
 static void pte_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent, uint64_t page, int level)
-- 
2.37.0.rc0.161.g10f37bed90-goog



* [PATCH v4 03/13] KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add a library function to allocate a page-table physical page in a
particular memslot.  The default behavior is to create new page-table
pages in memslot 0.
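
A hedged usage sketch (not part of the patch); PT_MEMSLOT is a hypothetical
memslot that the test has already added for page-table pages:

#include "kvm_util.h"

#define PT_MEMSLOT	1	/* hypothetical: added by the test beforehand */

static vm_paddr_t alloc_pt_page(struct kvm_vm *vm)
{
	/* Like vm_alloc_page_table(), but backed by PT_MEMSLOT instead of 0. */
	return vm_alloc_page_table_in_memslot(vm, PT_MEMSLOT);
}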

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
 tools/testing/selftests/kvm/lib/kvm_util.c          | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 7ebfc8c7de17..54ede9fc923c 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -579,6 +579,7 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
 vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
 			      vm_paddr_t paddr_min, uint32_t memslot);
 vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
+vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot);
 
 /*
  * ____vm_create() does KVM_CREATE_VM and little else.  __vm_create() also
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index f8c104dba258..5ee20d4da222 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1784,9 +1784,15 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
 /* Arbitrary minimum physical address used for virtual translation tables. */
 #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
 
+vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot)
+{
+	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
+			pt_memslot);
+}
+
 vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm)
 {
-	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0);
+	return vm_alloc_page_table_in_memslot(vm, 0);
 }
 
 /*
-- 
2.37.0.rc0.161.g10f37bed90-goog



* [PATCH v4 04/13] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add an argument, pt_memslot, to _virt_pg_map so that a specific memslot
can be used for the page-table allocations performed when creating a new
mapping. This will be used in a future commit to test having PTEs stored
in memslots with different setups (e.g., hugetlb with a hole).
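
A hedged usage sketch (not part of the patch); PT_MEMSLOT is hypothetical, and
attribute index 4 is NORMAL memory as in virt_arch_pg_map():

#include "kvm_util.h"
#include "processor.h"

#define PT_MEMSLOT	2	/* hypothetical: holds the page-table pages */

static void map_page_with_pt_memslot(struct kvm_vm *vm, uint64_t vaddr,
				     uint64_t paddr)
{
	uint64_t attr_idx = 4;	/* NORMAL (see DEFAULT_MAIR_EL1) */

	/* Any page-table pages needed for this mapping come from PT_MEMSLOT. */
	_virt_pg_map(vm, vaddr, paddr, attr_idx, PT_MEMSLOT);
}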

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h        |  3 +++
 tools/testing/selftests/kvm/lib/aarch64/processor.c  | 12 ++++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index df4bfac69551..6649671fa7c1 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -109,6 +109,9 @@ void vm_install_exception_handler(struct kvm_vm *vm,
 void vm_install_sync_handler(struct kvm_vm *vm,
 		int vector, int ec, handler_fn handler);
 
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot);
+
 uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva);
 
 static inline void cpu_relax(void)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 63ef3c78e55e..8dd511aa79c2 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -85,8 +85,8 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm)
 	}
 }
 
-static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
-			 uint64_t flags)
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot)
 {
 	uint8_t attr_idx = flags & 7;
 	uint64_t *ptep;
@@ -107,18 +107,18 @@ static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 	ptep = addr_gpa2hva(vm, vm->pgd) + pgd_index(vm, vaddr) * 8;
 	if (!*ptep)
-		*ptep = vm_alloc_page_table(vm) | 3;
+		*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 
 	switch (vm->pgtable_levels) {
 	case 4:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pud_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 3:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pmd_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 2:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pte_index(vm, vaddr) * 8;
@@ -135,7 +135,7 @@ void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
 	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
 
-	_virt_pg_map(vm, vaddr, paddr, attr_idx);
+	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
 }
 
 uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva)
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 04/13] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
@ 2022-06-24 21:32   ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, dmatlack, pbonzini, axelrasmussen

Add an argument, pt_memslot, to _virt_pg_map so that it uses a specific
memslot for the page-table allocations performed when creating a new
map. This will be used in a future commit to test having PTEs stored in
memslots with different setups (e.g., hugetlb with a hole).

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h        |  3 +++
 tools/testing/selftests/kvm/lib/aarch64/processor.c  | 12 ++++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index df4bfac69551..6649671fa7c1 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -109,6 +109,9 @@ void vm_install_exception_handler(struct kvm_vm *vm,
 void vm_install_sync_handler(struct kvm_vm *vm,
 		int vector, int ec, handler_fn handler);
 
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot);
+
 uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva);
 
 static inline void cpu_relax(void)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 63ef3c78e55e..8dd511aa79c2 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -85,8 +85,8 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm)
 	}
 }
 
-static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
-			 uint64_t flags)
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot)
 {
 	uint8_t attr_idx = flags & 7;
 	uint64_t *ptep;
@@ -107,18 +107,18 @@ static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 	ptep = addr_gpa2hva(vm, vm->pgd) + pgd_index(vm, vaddr) * 8;
 	if (!*ptep)
-		*ptep = vm_alloc_page_table(vm) | 3;
+		*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 
 	switch (vm->pgtable_levels) {
 	case 4:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pud_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 3:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pmd_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 2:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pte_index(vm, vaddr) * 8;
@@ -135,7 +135,7 @@ void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
 	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
 
-	_virt_pg_map(vm, vaddr, paddr, attr_idx);
+	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
 }
 
 uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva)
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 05/13] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Deleting a memslot (when freeing a VM) does not close the backing fd,
nor does it unmap the alias mapping. Fix by adding the missing close
and munmap.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 5ee20d4da222..3e45e3776bdf 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -531,6 +531,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 	sparsebit_free(&region->unused_phy_pages);
 	ret = munmap(region->mmap_start, region->mmap_size);
 	TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
+	if (region->fd >= 0) {
+		/* There's an extra map when using shared memory. */
+		ret = munmap(region->mmap_alias, region->mmap_size);
+		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
+		close(region->fd);
+	}
 
 	free(region);
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 05/13] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
@ 2022-06-24 21:32   ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, dmatlack, pbonzini, axelrasmussen

Deleting a memslot (when freeing a VM) does not close the backing fd,
nor does it unmap the alias mapping. Fix by adding the missing close
and munmap.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 5ee20d4da222..3e45e3776bdf 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -531,6 +531,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 	sparsebit_free(&region->unused_phy_pages);
 	ret = munmap(region->mmap_start, region->mmap_size);
 	TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
+	if (region->fd >= 0) {
+		/* There's an extra map when using shared memory. */
+		ret = munmap(region->mmap_alias, region->mmap_size);
+		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
+		close(region->fd);
+	}
 
 	free(region);
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 06/13] KVM: selftests: Add vm_mem_region_get_src_fd library function
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add a library function to get the backing source FD of a memslot.
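
One intended use, sketched below with an illustrative slot index and size, is
punching a hole in a memslot that is backed by a file (e.g., memfd or
hugetlbfs); this is what the page_fault_test patch later in the series does:

	int ret, fd;

	fd = vm_mem_region_get_src_fd(vm, slot);
	if (fd != -1) {
		/* File-backed: punch a hole covering the whole memslot. */
		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
				0, size);
		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
	}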

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 54ede9fc923c..72c8881fe8fb 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -322,6 +322,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
 void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
 void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
+int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
 struct kvm_vcpu *__vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id);
 vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
 vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 3e45e3776bdf..7c81028f23d8 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -466,6 +466,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
 	return &region->region;
 }
 
+/*
+ * KVM Userspace Memory Get Backing Source FD
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   memslot - KVM memory slot ID
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   Backing source file descriptor, -1 if the memslot is an anonymous region.
+ *
+ * Returns the backing source fd of a memslot, so tests can use it to punch
+ * holes, or to set up permissions.
+ */
+int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
+{
+	struct userspace_mem_region *region;
+
+	region = memslot2region(vm, memslot);
+	return region->fd;
+}
+
 /*
  * VM VCPU Remove
  *
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 06/13] KVM: selftests: Add vm_mem_region_get_src_fd library function
@ 2022-06-24 21:32   ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, dmatlack, pbonzini, axelrasmussen

Add a library function to get the backing source FD of a memslot.

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 54ede9fc923c..72c8881fe8fb 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -322,6 +322,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
 void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
 void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
+int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
 struct kvm_vcpu *__vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id);
 vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
 vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 3e45e3776bdf..7c81028f23d8 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -466,6 +466,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
 	return &region->region;
 }
 
+/*
+ * KVM Userspace Memory Get Backing Source FD
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   memslot - KVM memory slot ID
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   Backing source file descriptor, -1 if the memslot is an anonymous region.
+ *
+ * Returns the backing source fd of a memslot, so tests can use it to punch
+ * holes, or to set up permissions.
+ */
+int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
+{
+	struct userspace_mem_region *region;
+
+	region = memslot2region(vm, memslot);
+	return region->fd;
+}
+
 /*
  * VM VCPU Remove
  *
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 07/13] KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macros
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Define macros for memory type indexes and construct DEFAULT_MAIR_EL1
with macros from asm/sysreg.h.  The index macros can then be used when
constructing PTEs (instead of using raw numbers).
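
For example, a caller building a normal-memory stage 1 mapping can now write
(a sketch; virt_arch_pg_map() in the diff below does essentially this):

	uint64_t attr_idx = MT_NORMAL;	/* index 4 in DEFAULT_MAIR_EL1 */

	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);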

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h | 25 ++++++++++++++-----
 .../selftests/kvm/lib/aarch64/processor.c     |  2 +-
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 6649671fa7c1..74f10d006e15 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -38,12 +38,25 @@
  * NORMAL             4     1111:1111
  * NORMAL_WT          5     1011:1011
  */
-#define DEFAULT_MAIR_EL1 ((0x00ul << (0 * 8)) | \
-			  (0x04ul << (1 * 8)) | \
-			  (0x0cul << (2 * 8)) | \
-			  (0x44ul << (3 * 8)) | \
-			  (0xfful << (4 * 8)) | \
-			  (0xbbul << (5 * 8)))
+
+/* Linux doesn't use these memory types, so let's define them. */
+#define MAIR_ATTR_DEVICE_GRE	UL(0x0c)
+#define MAIR_ATTR_NORMAL_WT	UL(0xbb)
+
+#define MT_DEVICE_nGnRnE	0
+#define MT_DEVICE_nGnRE		1
+#define MT_DEVICE_GRE		2
+#define MT_NORMAL_NC		3
+#define MT_NORMAL		4
+#define MT_NORMAL_WT		5
+
+#define DEFAULT_MAIR_EL1							\
+	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |			\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |			\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |				\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
 
 #define MPIDR_HWID_BITMASK (0xff00fffffful)
 
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 8dd511aa79c2..733a2b713580 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -133,7 +133,7 @@ void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
-	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
+	uint64_t attr_idx = MT_NORMAL;
 
 	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 07/13] KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macros
@ 2022-06-24 21:32   ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, dmatlack, pbonzini, axelrasmussen

Define macros for memory type indexes and construct DEFAULT_MAIR_EL1
with macros from asm/sysreg.h.  The index macros can then be used when
constructing PTEs (instead of using raw numbers).

Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h | 25 ++++++++++++++-----
 .../selftests/kvm/lib/aarch64/processor.c     |  2 +-
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 6649671fa7c1..74f10d006e15 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -38,12 +38,25 @@
  * NORMAL             4     1111:1111
  * NORMAL_WT          5     1011:1011
  */
-#define DEFAULT_MAIR_EL1 ((0x00ul << (0 * 8)) | \
-			  (0x04ul << (1 * 8)) | \
-			  (0x0cul << (2 * 8)) | \
-			  (0x44ul << (3 * 8)) | \
-			  (0xfful << (4 * 8)) | \
-			  (0xbbul << (5 * 8)))
+
+/* Linux doesn't use these memory types, so let's define them. */
+#define MAIR_ATTR_DEVICE_GRE	UL(0x0c)
+#define MAIR_ATTR_NORMAL_WT	UL(0xbb)
+
+#define MT_DEVICE_nGnRnE	0
+#define MT_DEVICE_nGnRE		1
+#define MT_DEVICE_GRE		2
+#define MT_NORMAL_NC		3
+#define MT_NORMAL		4
+#define MT_NORMAL_WT		5
+
+#define DEFAULT_MAIR_EL1							\
+	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |			\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |			\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |				\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
 
 #define MPIDR_HWID_BITMASK (0xff00fffffful)
 
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 8dd511aa79c2..733a2b713580 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -133,7 +133,7 @@ void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
-	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
+	uint64_t attr_idx = MT_NORMAL;
 
 	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 08/13] tools: Copy bitfield.h from the kernel sources
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller,
	Jakub Kicinski, Arnaldo Carvalho de Melo

Copy bitfield.h from include/linux/bitfield.h.  A subsequent change will
make use of some FIELD_{GET,PREP} macros defined in this header.

The header was copied as-is, no changes needed.
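
The kind of use the later selftest patches make of it looks like this (a
sketch; the register and feature field are taken from the page_fault_test
patch later in the series):

	uint64_t isar0 = read_sysreg(id_aa64isar0_el1);
	uint64_t atomic;

	/* Extract the ATOMIC field to check for LSE atomics support. */
	atomic = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS), isar0);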

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/include/linux/bitfield.h | 176 +++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 tools/include/linux/bitfield.h

diff --git a/tools/include/linux/bitfield.h b/tools/include/linux/bitfield.h
new file mode 100644
index 000000000000..6093fa6db260
--- /dev/null
+++ b/tools/include/linux/bitfield.h
@@ -0,0 +1,176 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2014 Felix Fietkau <nbd@nbd.name>
+ * Copyright (C) 2004 - 2009 Ivo van Doorn <IvDoorn@gmail.com>
+ */
+
+#ifndef _LINUX_BITFIELD_H
+#define _LINUX_BITFIELD_H
+
+#include <linux/build_bug.h>
+#include <asm/byteorder.h>
+
+/*
+ * Bitfield access macros
+ *
+ * FIELD_{GET,PREP} macros take as first parameter shifted mask
+ * from which they extract the base mask and shift amount.
+ * Mask must be a compilation time constant.
+ *
+ * Example:
+ *
+ *  #define REG_FIELD_A  GENMASK(6, 0)
+ *  #define REG_FIELD_B  BIT(7)
+ *  #define REG_FIELD_C  GENMASK(15, 8)
+ *  #define REG_FIELD_D  GENMASK(31, 16)
+ *
+ * Get:
+ *  a = FIELD_GET(REG_FIELD_A, reg);
+ *  b = FIELD_GET(REG_FIELD_B, reg);
+ *
+ * Set:
+ *  reg = FIELD_PREP(REG_FIELD_A, 1) |
+ *	  FIELD_PREP(REG_FIELD_B, 0) |
+ *	  FIELD_PREP(REG_FIELD_C, c) |
+ *	  FIELD_PREP(REG_FIELD_D, 0x40);
+ *
+ * Modify:
+ *  reg &= ~REG_FIELD_C;
+ *  reg |= FIELD_PREP(REG_FIELD_C, c);
+ */
+
+#define __bf_shf(x) (__builtin_ffsll(x) - 1)
+
+#define __scalar_type_to_unsigned_cases(type)				\
+		unsigned type:	(unsigned type)0,			\
+		signed type:	(unsigned type)0
+
+#define __unsigned_scalar_typeof(x) typeof(				\
+		_Generic((x),						\
+			char:	(unsigned char)0,			\
+			__scalar_type_to_unsigned_cases(char),		\
+			__scalar_type_to_unsigned_cases(short),		\
+			__scalar_type_to_unsigned_cases(int),		\
+			__scalar_type_to_unsigned_cases(long),		\
+			__scalar_type_to_unsigned_cases(long long),	\
+			default: (x)))
+
+#define __bf_cast_unsigned(type, x)	((__unsigned_scalar_typeof(type))(x))
+
+#define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\
+	({								\
+		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
+				 _pfx "mask is not constant");		\
+		BUILD_BUG_ON_MSG((_mask) == 0, _pfx "mask is zero");	\
+		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
+				 ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \
+				 _pfx "value too large for the field"); \
+		BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >	\
+				 __bf_cast_unsigned(_reg, ~0ull),	\
+				 _pfx "type of reg too small for mask"); \
+		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
+					      (1ULL << __bf_shf(_mask))); \
+	})
+
+/**
+ * FIELD_MAX() - produce the maximum value representable by a field
+ * @_mask: shifted mask defining the field's length and position
+ *
+ * FIELD_MAX() returns the maximum value that can be held in the field
+ * specified by @_mask.
+ */
+#define FIELD_MAX(_mask)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_MAX: ");	\
+		(typeof(_mask))((_mask) >> __bf_shf(_mask));		\
+	})
+
+/**
+ * FIELD_FIT() - check if value fits in the field
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to test against the field
+ *
+ * Return: true if @_val can fit inside @_mask, false if @_val is too big.
+ */
+#define FIELD_FIT(_mask, _val)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_FIT: ");	\
+		!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask)); \
+	})
+
+/**
+ * FIELD_PREP() - prepare a bitfield element
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to put in the field
+ *
+ * FIELD_PREP() masks and shifts up the value.  The result should
+ * be combined with other fields of the bitfield using logical OR.
+ */
+#define FIELD_PREP(_mask, _val)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");	\
+		((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask);	\
+	})
+
+/**
+ * FIELD_GET() - extract a bitfield element
+ * @_mask: shifted mask defining the field's length and position
+ * @_reg:  value of entire bitfield
+ *
+ * FIELD_GET() extracts the field specified by @_mask from the
+ * bitfield passed in as @_reg by masking and shifting it down.
+ */
+#define FIELD_GET(_mask, _reg)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: ");	\
+		(typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask));	\
+	})
+
+extern void __compiletime_error("value doesn't fit into mask")
+__field_overflow(void);
+extern void __compiletime_error("bad bitfield mask")
+__bad_mask(void);
+static __always_inline u64 field_multiplier(u64 field)
+{
+	if ((field | (field - 1)) & ((field | (field - 1)) + 1))
+		__bad_mask();
+	return field & -field;
+}
+static __always_inline u64 field_mask(u64 field)
+{
+	return field / field_multiplier(field);
+}
+#define field_max(field)	((typeof(field))field_mask(field))
+#define ____MAKE_OP(type,base,to,from)					\
+static __always_inline __##type type##_encode_bits(base v, base field)	\
+{									\
+	if (__builtin_constant_p(v) && (v & ~field_mask(field)))	\
+		__field_overflow();					\
+	return to((v & field_mask(field)) * field_multiplier(field));	\
+}									\
+static __always_inline __##type type##_replace_bits(__##type old,	\
+					base val, base field)		\
+{									\
+	return (old & ~to(field)) | type##_encode_bits(val, field);	\
+}									\
+static __always_inline void type##p_replace_bits(__##type *p,		\
+					base val, base field)		\
+{									\
+	*p = (*p & ~to(field)) | type##_encode_bits(val, field);	\
+}									\
+static __always_inline base type##_get_bits(__##type v, base field)	\
+{									\
+	return (from(v) & field)/field_multiplier(field);		\
+}
+#define __MAKE_OP(size)							\
+	____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu)	\
+	____MAKE_OP(be##size,u##size,cpu_to_be##size,be##size##_to_cpu)	\
+	____MAKE_OP(u##size,u##size,,)
+____MAKE_OP(u8,u8,,)
+__MAKE_OP(16)
+__MAKE_OP(32)
+__MAKE_OP(64)
+#undef __MAKE_OP
+#undef ____MAKE_OP
+
+#endif
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 08/13] tools: Copy bitfield.h from the kernel sources
@ 2022-06-24 21:32   ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: maz, bgardon, Arnaldo Carvalho de Melo, Jakub Kicinski, dmatlack,
	pbonzini, axelrasmussen

Copy bitfield.h from include/linux/bitfield.h.  A subsequent change will
make use of some FIELD_{GET,PREP} macros defined in this header.

The header was copied as-is, no changes needed.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/include/linux/bitfield.h | 176 +++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 tools/include/linux/bitfield.h

diff --git a/tools/include/linux/bitfield.h b/tools/include/linux/bitfield.h
new file mode 100644
index 000000000000..6093fa6db260
--- /dev/null
+++ b/tools/include/linux/bitfield.h
@@ -0,0 +1,176 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2014 Felix Fietkau <nbd@nbd.name>
+ * Copyright (C) 2004 - 2009 Ivo van Doorn <IvDoorn@gmail.com>
+ */
+
+#ifndef _LINUX_BITFIELD_H
+#define _LINUX_BITFIELD_H
+
+#include <linux/build_bug.h>
+#include <asm/byteorder.h>
+
+/*
+ * Bitfield access macros
+ *
+ * FIELD_{GET,PREP} macros take as first parameter shifted mask
+ * from which they extract the base mask and shift amount.
+ * Mask must be a compilation time constant.
+ *
+ * Example:
+ *
+ *  #define REG_FIELD_A  GENMASK(6, 0)
+ *  #define REG_FIELD_B  BIT(7)
+ *  #define REG_FIELD_C  GENMASK(15, 8)
+ *  #define REG_FIELD_D  GENMASK(31, 16)
+ *
+ * Get:
+ *  a = FIELD_GET(REG_FIELD_A, reg);
+ *  b = FIELD_GET(REG_FIELD_B, reg);
+ *
+ * Set:
+ *  reg = FIELD_PREP(REG_FIELD_A, 1) |
+ *	  FIELD_PREP(REG_FIELD_B, 0) |
+ *	  FIELD_PREP(REG_FIELD_C, c) |
+ *	  FIELD_PREP(REG_FIELD_D, 0x40);
+ *
+ * Modify:
+ *  reg &= ~REG_FIELD_C;
+ *  reg |= FIELD_PREP(REG_FIELD_C, c);
+ */
+
+#define __bf_shf(x) (__builtin_ffsll(x) - 1)
+
+#define __scalar_type_to_unsigned_cases(type)				\
+		unsigned type:	(unsigned type)0,			\
+		signed type:	(unsigned type)0
+
+#define __unsigned_scalar_typeof(x) typeof(				\
+		_Generic((x),						\
+			char:	(unsigned char)0,			\
+			__scalar_type_to_unsigned_cases(char),		\
+			__scalar_type_to_unsigned_cases(short),		\
+			__scalar_type_to_unsigned_cases(int),		\
+			__scalar_type_to_unsigned_cases(long),		\
+			__scalar_type_to_unsigned_cases(long long),	\
+			default: (x)))
+
+#define __bf_cast_unsigned(type, x)	((__unsigned_scalar_typeof(type))(x))
+
+#define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\
+	({								\
+		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
+				 _pfx "mask is not constant");		\
+		BUILD_BUG_ON_MSG((_mask) == 0, _pfx "mask is zero");	\
+		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
+				 ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \
+				 _pfx "value too large for the field"); \
+		BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >	\
+				 __bf_cast_unsigned(_reg, ~0ull),	\
+				 _pfx "type of reg too small for mask"); \
+		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
+					      (1ULL << __bf_shf(_mask))); \
+	})
+
+/**
+ * FIELD_MAX() - produce the maximum value representable by a field
+ * @_mask: shifted mask defining the field's length and position
+ *
+ * FIELD_MAX() returns the maximum value that can be held in the field
+ * specified by @_mask.
+ */
+#define FIELD_MAX(_mask)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_MAX: ");	\
+		(typeof(_mask))((_mask) >> __bf_shf(_mask));		\
+	})
+
+/**
+ * FIELD_FIT() - check if value fits in the field
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to test against the field
+ *
+ * Return: true if @_val can fit inside @_mask, false if @_val is too big.
+ */
+#define FIELD_FIT(_mask, _val)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_FIT: ");	\
+		!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask)); \
+	})
+
+/**
+ * FIELD_PREP() - prepare a bitfield element
+ * @_mask: shifted mask defining the field's length and position
+ * @_val:  value to put in the field
+ *
+ * FIELD_PREP() masks and shifts up the value.  The result should
+ * be combined with other fields of the bitfield using logical OR.
+ */
+#define FIELD_PREP(_mask, _val)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");	\
+		((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask);	\
+	})
+
+/**
+ * FIELD_GET() - extract a bitfield element
+ * @_mask: shifted mask defining the field's length and position
+ * @_reg:  value of entire bitfield
+ *
+ * FIELD_GET() extracts the field specified by @_mask from the
+ * bitfield passed in as @_reg by masking and shifting it down.
+ */
+#define FIELD_GET(_mask, _reg)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: ");	\
+		(typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask));	\
+	})
+
+extern void __compiletime_error("value doesn't fit into mask")
+__field_overflow(void);
+extern void __compiletime_error("bad bitfield mask")
+__bad_mask(void);
+static __always_inline u64 field_multiplier(u64 field)
+{
+	if ((field | (field - 1)) & ((field | (field - 1)) + 1))
+		__bad_mask();
+	return field & -field;
+}
+static __always_inline u64 field_mask(u64 field)
+{
+	return field / field_multiplier(field);
+}
+#define field_max(field)	((typeof(field))field_mask(field))
+#define ____MAKE_OP(type,base,to,from)					\
+static __always_inline __##type type##_encode_bits(base v, base field)	\
+{									\
+	if (__builtin_constant_p(v) && (v & ~field_mask(field)))	\
+		__field_overflow();					\
+	return to((v & field_mask(field)) * field_multiplier(field));	\
+}									\
+static __always_inline __##type type##_replace_bits(__##type old,	\
+					base val, base field)		\
+{									\
+	return (old & ~to(field)) | type##_encode_bits(val, field);	\
+}									\
+static __always_inline void type##p_replace_bits(__##type *p,		\
+					base val, base field)		\
+{									\
+	*p = (*p & ~to(field)) | type##_encode_bits(val, field);	\
+}									\
+static __always_inline base type##_get_bits(__##type v, base field)	\
+{									\
+	return (from(v) & field)/field_multiplier(field);		\
+}
+#define __MAKE_OP(size)							\
+	____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu)	\
+	____MAKE_OP(be##size,u##size,cpu_to_be##size,be##size##_to_cpu)	\
+	____MAKE_OP(u##size,u##size,,)
+____MAKE_OP(u8,u8,,)
+__MAKE_OP(16)
+__MAKE_OP(32)
+__MAKE_OP(64)
+#undef __MAKE_OP
+#undef ____MAKE_OP
+
+#endif
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add a new test for stage 2 faults when using different combinations of
guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
and types of faults (e.g., read on hugetlbfs with a hole). The next
commits will add different handling methods and more faults (e.g., uffd
and dirty logging). This first commit starts by adding two sanity checks
for all types of accesses: AF setting by the hw, and accessing memslots
with holes.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/aarch64/page_fault_test.c   | 695 ++++++++++++++++++
 .../selftests/kvm/include/aarch64/processor.h |   6 +
 3 files changed, 702 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index e4497a3a27d4..13b913225ae7 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -139,6 +139,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
 TEST_GEN_PROGS_aarch64 += aarch64/hypercalls
+TEST_GEN_PROGS_aarch64 += aarch64/page_fault_test
 TEST_GEN_PROGS_aarch64 += aarch64/psci_test
 TEST_GEN_PROGS_aarch64 += aarch64/vcpu_width_config
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
new file mode 100644
index 000000000000..bdda4e3fcdaa
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -0,0 +1,695 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * page_fault_test.c - Test stage 2 faults.
+ *
+ * This test tries different combinations of guest accesses (e.g., write,
+ * S1PTW), backing source type (e.g., anon) and types of faults (e.g., read on
+ * hugetlbfs with a hole). It checks that the expected handling method is
+ * called (e.g., uffd faults with the right address and write/read flag).
+ */
+
+#define _GNU_SOURCE
+#include <linux/bitmap.h>
+#include <fcntl.h>
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+#include <asm/sysreg.h>
+#include <linux/bitfield.h>
+#include "guest_modes.h"
+#include "userfaultfd_util.h"
+
+#define TEST_MEM_SLOT_INDEX			1
+#define TEST_PT_SLOT_INDEX			2
+
+/* Guest virtual addresses that point to the test page and its PTE. */
+#define TEST_GVA				0xc0000000
+#define TEST_EXEC_GVA				0xc0000008
+#define TEST_PTE_GVA				0xb0000000
+#define TEST_DATA				0x0123456789ABCDEF
+
+static uint64_t *guest_test_memory = (uint64_t *)TEST_GVA;
+
+#define CMD_NONE				(0)
+#define CMD_SKIP_TEST				(1ULL << 1)
+#define CMD_HOLE_PT				(1ULL << 2)
+#define CMD_HOLE_TEST				(1ULL << 3)
+
+#define PREPARE_FN_NR				10
+#define CHECK_FN_NR				10
+
+uint64_t pte_gpa;
+
+enum {
+	PT,
+	TEST,
+	NR_MEMSLOTS
+};
+
+struct memslot_desc {
+	void *hva;
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t guest_pages;
+	enum vm_mem_backing_src_type src_type;
+	uint32_t idx;
+} memslot[NR_MEMSLOTS] = {
+	{
+		.idx = TEST_PT_SLOT_INDEX,
+	},
+	{
+		.idx = TEST_MEM_SLOT_INDEX,
+	},
+};
+
+static struct event_cnt {
+	int aborts;
+	int fail_vcpu_runs;
+} events;
+
+struct test_desc {
+	const char *name;
+	uint64_t mem_mark_cmd;
+	/* Skip the test if any prepare function returns false */
+	bool (*guest_prepare[PREPARE_FN_NR])(void);
+	void (*guest_test)(void);
+	void (*guest_test_check[CHECK_FN_NR])(void);
+	void (*dabt_handler)(struct ex_regs *regs);
+	void (*iabt_handler)(struct ex_regs *regs);
+	uint32_t pt_memslot_flags;
+	uint32_t test_memslot_flags;
+	bool skip;
+	struct event_cnt expected_events;
+};
+
+struct test_params {
+	enum vm_mem_backing_src_type src_type;
+	struct test_desc *test_desc;
+};
+
+static inline void flush_tlb_page(uint64_t vaddr)
+{
+	uint64_t page = vaddr >> 12;
+
+	dsb(ishst);
+	asm volatile("tlbi vaae1is, %0" :: "r" (page));
+	dsb(ish);
+	isb();
+}
+
+static void guest_write64(void)
+{
+	uint64_t val;
+
+	WRITE_ONCE(*guest_test_memory, TEST_DATA);
+	val = READ_ONCE(*guest_test_memory);
+	GUEST_ASSERT_EQ(val, TEST_DATA);
+}
+
+/* Check the system for atomic instructions. */
+static bool guest_check_lse(void)
+{
+	uint64_t isar0 = read_sysreg(id_aa64isar0_el1);
+	uint64_t atomic;
+
+	atomic = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS), isar0);
+	return atomic >= 2;
+}
+
+static bool guest_check_dc_zva(void)
+{
+	uint64_t dczid = read_sysreg(dczid_el0);
+	uint64_t dzp = FIELD_GET(ARM64_FEATURE_MASK(DCZID_DZP), dczid);
+
+	return dzp == 0;
+}
+
+/* Compare and swap instruction. */
+static void guest_cas(void)
+{
+	uint64_t val;
+
+	GUEST_ASSERT_EQ(guest_check_lse(), 1);
+	asm volatile(".arch_extension lse\n"
+		     "casal %0, %1, [%2]\n"
+			:: "r" (0), "r" (TEST_DATA), "r" (guest_test_memory));
+	val = READ_ONCE(*guest_test_memory);
+	GUEST_ASSERT_EQ(val, TEST_DATA);
+}
+
+static void guest_read64(void)
+{
+	uint64_t val;
+
+	val = READ_ONCE(*guest_test_memory);
+	GUEST_ASSERT_EQ(val, 0);
+}
+
+/* Address translation instruction */
+static void guest_at(void)
+{
+	uint64_t par;
+	uint64_t paddr;
+
+	asm volatile("at s1e1r, %0" :: "r" (guest_test_memory));
+	par = read_sysreg(par_el1);
+
+	/* PAR_EL1.F (bit 0) indicates whether the AT was successful */
+	GUEST_ASSERT_EQ(par & 1, 0);
+	/* The PA is in bits [51:12] */
+	paddr = par & (((1ULL << 40) - 1) << 12);
+	GUEST_ASSERT_EQ(paddr, memslot[TEST].gpa);
+}
+
+/*
+ * The size of the block written by "dc zva" is guaranteed to be between 4
+ * bytes and 2KB, which is safe in our case as we need the write to happen
+ * for at least a word, and not more than a page.
+ */
+static void guest_dc_zva(void)
+{
+	uint16_t val;
+
+	asm volatile("dc zva, %0\n"
+			"dsb ish\n"
+			:: "r" (guest_test_memory));
+	val = READ_ONCE(*guest_test_memory);
+	GUEST_ASSERT_EQ(val, 0);
+}
+
+/*
+ * Pre-indexing loads and stores don't have a valid syndrome (ESR_EL2.ISV==0).
+ * And that's special because KVM must take special care with those: they
+ * should still count as accesses for dirty logging or user-faulting, but
+ * should be handled differently on mmio.
+ */
+static void guest_ld_preidx(void)
+{
+	uint64_t val;
+	uint64_t addr = TEST_GVA - 8;
+
+	/*
+	 * This ends up accessing "TEST_GVA + 8 - 8", where "TEST_GVA - 8" is
+	 * in a gap between memslots not backed by anything.
+	 */
+	asm volatile("ldr %0, [%1, #8]!"
+			: "=r" (val), "+r" (addr));
+	GUEST_ASSERT_EQ(val, 0);
+	GUEST_ASSERT_EQ(addr, TEST_GVA);
+}
+
+static void guest_st_preidx(void)
+{
+	uint64_t val = TEST_DATA;
+	uint64_t addr = TEST_GVA - 8;
+
+	asm volatile("str %0, [%1, #8]!"
+			: "+r" (val), "+r" (addr));
+
+	GUEST_ASSERT_EQ(addr, TEST_GVA);
+	val = READ_ONCE(*guest_test_memory);
+}
+
+static bool guest_set_ha(void)
+{
+	uint64_t mmfr1 = read_sysreg(id_aa64mmfr1_el1);
+	uint64_t hadbs, tcr;
+
+	/* Skip if HA is not supported. */
+	hadbs = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS), mmfr1);
+	if (hadbs == 0)
+		return false;
+
+	tcr = read_sysreg(tcr_el1) | TCR_EL1_HA;
+	write_sysreg(tcr, tcr_el1);
+	isb();
+
+	return true;
+}
+
+static bool guest_clear_pte_af(void)
+{
+	*((uint64_t *)TEST_PTE_GVA) &= ~PTE_AF;
+	flush_tlb_page(TEST_PTE_GVA);
+
+	return true;
+}
+
+static void guest_check_pte_af(void)
+{
+	flush_tlb_page(TEST_PTE_GVA);
+	GUEST_ASSERT_EQ(*((uint64_t *)TEST_PTE_GVA) & PTE_AF, PTE_AF);
+}
+
+static void guest_exec(void)
+{
+	int (*code)(void) = (int (*)(void))TEST_EXEC_GVA;
+	int ret;
+
+	ret = code();
+	GUEST_ASSERT_EQ(ret, 0x77);
+}
+
+static bool guest_prepare(struct test_desc *test)
+{
+	bool (*prepare_fn)(void);
+	int i;
+
+	for (i = 0; i < PREPARE_FN_NR; i++) {
+		prepare_fn = test->guest_prepare[i];
+		if (prepare_fn && !prepare_fn())
+			return false;
+	}
+
+	return true;
+}
+
+static void guest_test_check(struct test_desc *test)
+{
+	void (*check_fn)(void);
+	int i;
+
+	for (i = 0; i < CHECK_FN_NR; i++) {
+		check_fn = test->guest_test_check[i];
+		if (check_fn)
+			check_fn();
+	}
+}
+
+static void guest_code(struct test_desc *test)
+{
+	if (!guest_prepare(test))
+		GUEST_SYNC(CMD_SKIP_TEST);
+
+	GUEST_SYNC(test->mem_mark_cmd);
+
+	if (test->guest_test)
+		test->guest_test();
+
+	guest_test_check(test);
+	GUEST_DONE();
+}
+
+static void no_dabt_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_1(false, read_sysreg(far_el1));
+}
+
+static void no_iabt_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_1(false, regs->pc);
+}
+
+/* Returns true to continue the test, and false if it should be skipped. */
+static bool punch_hole_in_memslot(struct kvm_vm *vm,
+		struct memslot_desc *memslot)
+{
+	int ret, fd;
+	void *hva;
+
+	fd = vm_mem_region_get_src_fd(vm, memslot->idx);
+	if (fd != -1) {
+		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				0, memslot->size);
+		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
+	} else {
+		if (is_backing_src_hugetlb(memslot->src_type))
+			return false;
+
+		hva = addr_gpa2hva(vm, memslot->gpa);
+		ret = madvise(hva, memslot->size, MADV_DONTNEED);
+		TEST_ASSERT(ret == 0, "madvise failed, errno: %d\n", errno);
+	}
+
+	return true;
+}
+
+/* Returns true to continue the test, and false if it should be skipped. */
+static bool handle_cmd(struct kvm_vm *vm, int cmd)
+{
+	bool continue_test = true;
+
+	if (cmd == CMD_SKIP_TEST)
+		continue_test = false;
+
+	if (cmd & CMD_HOLE_PT)
+		continue_test = punch_hole_in_memslot(vm, &memslot[PT]);
+	if (cmd & CMD_HOLE_TEST)
+		continue_test = punch_hole_in_memslot(vm, &memslot[TEST]);
+
+	return continue_test;
+}
+
+static void sync_stats_from_guest(struct kvm_vm *vm)
+{
+	struct event_cnt *ec = addr_gva2hva(vm, (uint64_t)&events);
+
+	events.aborts += ec->aborts;
+}
+
+void fail_vcpu_run_no_handler(int ret)
+{
+	TEST_FAIL("Unexpected vcpu run failure\n");
+}
+
+extern unsigned char __exec_test;
+
+void noinline __return_0x77(void)
+{
+	asm volatile("__exec_test: mov x0, #0x77\n"
+			"ret\n");
+}
+
+static void load_exec_code_for_test(void)
+{
+	uint64_t *code, *c;
+
+	assert(TEST_EXEC_GVA - TEST_GVA);
+	code = memslot[TEST].hva + 8;
+
+	/*
+	 * We need the cast to be separate in order for the compiler to not
+	 * complain with: "‘memcpy’ forming offset [1, 7] is out of the bounds
+	 * [0, 1] of object ‘__exec_test’ with type ‘unsigned char’"
+	 */
+	c = (uint64_t *)&__exec_test;
+	memcpy(code, c, 8);
+}
+
+static void setup_abort_handlers(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
+		struct test_desc *test)
+{
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vcpu);
+	if (!test->dabt_handler)
+		test->dabt_handler = no_dabt_handler;
+	if (!test->iabt_handler)
+		test->iabt_handler = no_iabt_handler;
+	vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT,
+			0x25, test->dabt_handler);
+	vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT,
+			0x21, test->iabt_handler);
+}
+
+/*
+ * Create a memslot for test data (memslot[TEST]) and another one for PT
+ * tables (memslot[PT]). This diagram show the resulting guest virtual and
+ * physical address space when using 4K backing pages for the memslots, and
+ * 4K guest pages.
+ *
+ *                   Guest physical            Guest virtual
+ *
+ *                  |              |          |             |
+ *                  |              |          |             |
+ *                  +--------------+          +-------------+
+ * max_gfn - 0x1000 | TEST memslot |<---------+  test data  | 0xc0000000
+ *                  +--------------+          +-------------+
+ * max_gfn - 0x2000 |     gap      |<---------+     gap     | 0xbffff000
+ *                  +--------------+          +-------------+
+ *                  |              |          |             |
+ *                  |              |          |             |
+ *                  |  PT memslot  |          |             |
+ *                  |              |          +-------------+
+ * max_gfn - 0x6000 |              |<----+    |             |
+ *                  +--------------+     |    |             |
+ *                  |              |     |    | PTE for the |
+ *                  |              |     |    | test data   |
+ *                  |              |     +----+ page        | 0xb0000000
+ *                  |              |          +-------------+
+ *                  |              |          |             |
+ *                  |              |          |             |
+ *
+ * Using different guest page sizes or backing pages will result in the
+ * same layout but at different addresses. In particular, the memslot
+ * sizes need to be a multiple of the backing page size (e.g., 2MB).
+ */
+static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
+		struct test_params *p)
+{
+	uint64_t backing_page_size = get_backing_src_pagesz(p->src_type);
+	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
+	struct test_desc *test = p->test_desc;
+	uint64_t gap_gpa;
+	uint64_t alignment;
+	int i;
+
+	memslot[TEST].size = align_up(guest_page_size, backing_page_size);
+	/*
+	 * We need one guest page for the PT table containing the PTE (for
+	 * TEST_GVA), but might need more in case the higher level PT tables
+	 * were not allocated yet.
+	 */
+	memslot[PT].size = align_up(4 * guest_page_size, backing_page_size);
+
+	for (i = 0; i < NR_MEMSLOTS; i++) {
+		memslot[i].guest_pages = memslot[i].size / guest_page_size;
+		memslot[i].src_type = p->src_type;
+	}
+
+	/* Place the memslots GPAs at the end of physical memory */
+	alignment = max(backing_page_size, guest_page_size);
+	memslot[TEST].gpa = (vm->max_gfn - memslot[TEST].guest_pages) *
+		guest_page_size;
+	memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
+
+	/* Add a 1-guest_page gap between the two memslots */
+	gap_gpa = memslot[TEST].gpa - guest_page_size;
+	/* Map the gap so it's still addressable from the guest. */
+	virt_pg_map(vm, TEST_GVA - guest_page_size, gap_gpa);
+
+	memslot[PT].gpa = gap_gpa - memslot[PT].size;
+	memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
+
+	vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
+			memslot[PT].idx, memslot[PT].guest_pages,
+			test->pt_memslot_flags);
+	vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
+			memslot[TEST].idx, memslot[TEST].guest_pages,
+			test->test_memslot_flags);
+
+	for (i = 0; i < NR_MEMSLOTS; i++)
+		memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
+
+	/* Map the test page (TEST_GVA) using the PT memslot. */
+	_virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa, MT_NORMAL,
+			TEST_PT_SLOT_INDEX);
+
+	/*
+	 * Find the PTE of the test page and map it in the guest so it can
+	 * clear the AF.
+	 */
+	pte_gpa = addr_hva2gpa(vm, virt_get_pte_hva(vm, TEST_GVA));
+	TEST_ASSERT(memslot[PT].gpa <= pte_gpa &&
+			pte_gpa < (memslot[PT].gpa + memslot[PT].size),
+			"The PTE should be in the PT memslot.");
+	/* This is an arbitrary requirement just to make things simpler. */
+	TEST_ASSERT(pte_gpa % guest_page_size == 0,
+			"The pte_gpa (%p) should be aligned to the guest page (%lx).",
+			(void *)pte_gpa, guest_page_size);
+	virt_pg_map(vm, TEST_PTE_GVA, pte_gpa);
+}
+
+static void check_event_counts(struct test_desc *test)
+{
+	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
+}
+
+static void print_test_banner(enum vm_guest_mode mode, struct test_params *p)
+{
+	struct test_desc *test = p->test_desc;
+
+	pr_debug("Test: %s\n", test->name);
+	pr_debug("Testing guest mode: %s\n", vm_guest_mode_string(mode));
+	pr_debug("Testing memory backing src type: %s\n",
+			vm_mem_backing_src_alias(p->src_type)->name);
+}
+
+static void reset_event_counts(void)
+{
+	memset(&events, 0, sizeof(events));
+}
+
+static bool vcpu_run_loop(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
+		struct test_desc *test)
+{
+	bool skip_test = false;
+	struct ucall uc;
+	int stage;
+
+	for (stage = 0; ; stage++) {
+		vcpu_run(vcpu);
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_SYNC:
+			if (!handle_cmd(vm, uc.args[1])) {
+				pr_debug("Skipped.\n");
+				skip_test = true;
+				goto done;
+			}
+			break;
+		case UCALL_ABORT:
+			TEST_FAIL("%s at %s:%ld\n\tvalues: %#lx, %#lx",
+				(const char *)uc.args[0],
+				__FILE__, uc.args[1], uc.args[2], uc.args[3]);
+			break;
+		case UCALL_DONE:
+			pr_debug("Done.\n");
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	return skip_test;
+}
+
+static void run_test(enum vm_guest_mode mode, void *arg)
+{
+	struct test_params *p = (struct test_params *)arg;
+	struct test_desc *test = p->test_desc;
+	struct kvm_vm *vm;
+	struct kvm_vcpu *vcpus[1], *vcpu;
+	bool skip_test = false;
+
+	print_test_banner(mode, p);
+
+	vm = __vm_create_with_vcpus(mode, 1, 6, guest_code, vcpus);
+	vcpu = vcpus[0];
+	ucall_init(vm, NULL);
+
+	reset_event_counts();
+	setup_memslots(vm, mode, p);
+
+	load_exec_code_for_test();
+	setup_abort_handlers(vm, vcpu, test);
+	vcpu_args_set(vcpu, 1, test);
+
+	sync_global_to_guest(vm, memslot);
+
+	skip_test = vcpu_run_loop(vm, vcpu, test);
+
+	sync_stats_from_guest(vm);
+	ucall_uninit(vm);
+	kvm_vm_free(vm);
+
+	if (!skip_test)
+		check_event_counts(test);
+}
+
+static void help(char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-s mem-type]\n", name);
+	puts("");
+	guest_modes_help();
+	backing_src_help("-s");
+	puts("");
+}
+
+#define SNAME(s)			#s
+#define SCAT2(a, b)			SNAME(a ## _ ## b)
+#define SCAT3(a, b, c)			SCAT2(a, SCAT2(b, c))
+
+#define _CHECK(_test)			_CHECK_##_test
+#define _PREPARE(_test)			_PREPARE_##_test
+#define _PREPARE_guest_read64		NULL
+#define _PREPARE_guest_ld_preidx	NULL
+#define _PREPARE_guest_write64		NULL
+#define _PREPARE_guest_st_preidx	NULL
+#define _PREPARE_guest_exec		NULL
+#define _PREPARE_guest_at		NULL
+#define _PREPARE_guest_dc_zva		guest_check_dc_zva
+#define _PREPARE_guest_cas		guest_check_lse
+
+/* With or without access flag checks */
+#define _PREPARE_with_af		guest_set_ha, guest_clear_pte_af
+#define _PREPARE_no_af			NULL
+#define _CHECK_with_af			guest_check_pte_af
+#define _CHECK_no_af			NULL
+
+/* Performs an access and checks that no faults (no events) were triggered. */
+#define TEST_ACCESS(_access, _with_af, _mark_cmd)				\
+{										\
+	.name			= SCAT3(_access, _with_af, #_mark_cmd),		\
+	.guest_prepare		= { _PREPARE(_with_af),				\
+				    _PREPARE(_access) },			\
+	.mem_mark_cmd		= _mark_cmd,					\
+	.guest_test		= _access,					\
+	.guest_test_check	= { _CHECK(_with_af) },				\
+	.expected_events	= { 0 },					\
+}
+
+static struct test_desc tests[] = {
+	/* Check that HW is setting the Access Flag (AF) (sanity checks). */
+	TEST_ACCESS(guest_read64, with_af, CMD_NONE),
+	TEST_ACCESS(guest_ld_preidx, with_af, CMD_NONE),
+	TEST_ACCESS(guest_cas, with_af, CMD_NONE),
+	TEST_ACCESS(guest_write64, with_af, CMD_NONE),
+	TEST_ACCESS(guest_st_preidx, with_af, CMD_NONE),
+	TEST_ACCESS(guest_dc_zva, with_af, CMD_NONE),
+	TEST_ACCESS(guest_exec, with_af, CMD_NONE),
+
+	/*
+	 * Accessing a hole in the test memslot (punched with fallocate or
+	 * madvise) shouldn't fault (more sanity checks).
+	 */
+	TEST_ACCESS(guest_read64, no_af, CMD_HOLE_TEST),
+	TEST_ACCESS(guest_cas, no_af, CMD_HOLE_TEST),
+	TEST_ACCESS(guest_ld_preidx, no_af, CMD_HOLE_TEST),
+	TEST_ACCESS(guest_write64, no_af, CMD_HOLE_TEST),
+	TEST_ACCESS(guest_st_preidx, no_af, CMD_HOLE_TEST),
+	TEST_ACCESS(guest_at, no_af, CMD_HOLE_TEST),
+	TEST_ACCESS(guest_dc_zva, no_af, CMD_HOLE_TEST),
+
+	{ 0 }
+};
+
+static void for_each_test_and_guest_mode(
+		void (*func)(enum vm_guest_mode m, void *a),
+		enum vm_mem_backing_src_type src_type)
+{
+	struct test_desc *t;
+
+	for (t = &tests[0]; t->name; t++) {
+		if (t->skip)
+			continue;
+
+		struct test_params p = {
+			.src_type = src_type,
+			.test_desc = t,
+		};
+
+		for_each_guest_mode(run_test, &p);
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	enum vm_mem_backing_src_type src_type;
+	int opt;
+
+	setbuf(stdout, NULL);
+
+	src_type = DEFAULT_VM_MEM_SRC;
+
+	guest_modes_append_default();
+
+	while ((opt = getopt(argc, argv, "hm:s:")) != -1) {
+		switch (opt) {
+		case 'm':
+			guest_modes_cmdline(optarg);
+			break;
+		case 's':
+			src_type = parse_backing_src_type(optarg);
+			break;
+		case 'h':
+		default:
+			help(argv[0]);
+			exit(0);
+		}
+	}
+
+	for_each_test_and_guest_mode(run_test, src_type);
+	return 0;
+}
diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 74f10d006e15..818665e86f32 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -110,6 +110,12 @@ enum {
 #define ESR_EC_WP_CURRENT	0x35
 #define ESR_EC_BRK_INS		0x3c
 
+/* Access flag */
+#define PTE_AF			(1ULL << 10)
+
+/* Access flag update enable/disable */
+#define TCR_EL1_HA		(1ULL << 39)
+
 void aarch64_get_supported_page_sizes(uint32_t ipa,
 				      bool *ps4k, bool *ps16k, bool *ps64k);
 
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 10/13] KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add some userfaultfd tests into page_fault_test. Punch holes into the
data and/or page-table memslots, perform some accesses, and check that
the faults are taken (or not taken) when expected.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
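
Illustrative note (not part of the patch): for reviewers less familiar with
userfaultfd, the sketch below shows, in isolation, the MISSING-mode flow the
new handlers build on -- register a range, read one fault message, and
resolve it with UFFDIO_COPY. demo_uffd_missing(), the page-aligned area/len
assumption, and the single blocking read() are illustrative only; the
selftest itself uses the userfaultfd_util helpers and a dedicated handler
thread, with full error checking.

#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>

static int demo_uffd_missing(void *area, size_t len, void *src_page)
{
	long page_size = sysconf(_SC_PAGESIZE);
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)area, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MISSING,
	};
	struct uffdio_copy copy;
	struct uffd_msg msg;
	int uffd;

	/* Create the uffd and register the (page-aligned) area for MISSING faults. */
	uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
	ioctl(uffd, UFFDIO_API, &api);
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	/* Normally done from a handler thread; blocks until a fault arrives. */
	read(uffd, &msg, sizeof(msg));

	/* Resolve the fault by copying one page into the hole. */
	copy.src = (unsigned long)src_page;
	copy.dst = msg.arg.pagefault.address & ~(page_size - 1);
	copy.len = page_size;
	copy.mode = 0;
	ioctl(uffd, UFFDIO_COPY, &copy);

	return uffd;
}
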
 .../selftests/kvm/aarch64/page_fault_test.c   | 165 +++++++++++++++++-
 1 file changed, 163 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index bdda4e3fcdaa..26318d39940b 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -47,6 +47,8 @@ enum {
 };
 
 struct memslot_desc {
+	size_t paging_size;
+	char *data_copy;
 	void *hva;
 	uint64_t gpa;
 	uint64_t size;
@@ -65,6 +67,9 @@ struct memslot_desc {
 static struct event_cnt {
 	int aborts;
 	int fail_vcpu_runs;
+	int uffd_faults;
+	/* uffd_faults is incremented from multiple threads. */
+	pthread_mutex_t uffd_faults_mutex;
 } events;
 
 struct test_desc {
@@ -74,6 +79,8 @@ struct test_desc {
 	bool (*guest_prepare[PREPARE_FN_NR])(void);
 	void (*guest_test)(void);
 	void (*guest_test_check[CHECK_FN_NR])(void);
+	int (*uffd_pt_handler)(int mode, int uffd, struct uffd_msg *msg);
+	int (*uffd_test_handler)(int mode, int uffd, struct uffd_msg *msg);
 	void (*dabt_handler)(struct ex_regs *regs);
 	void (*iabt_handler)(struct ex_regs *regs);
 	uint32_t pt_memslot_flags;
@@ -301,6 +308,57 @@ static void no_iabt_handler(struct ex_regs *regs)
 }
 
 /* Returns true to continue the test, and false if it should be skipped. */
+static int uffd_generic_handler(int uffd_mode, int uffd,
+		struct uffd_msg *msg, struct memslot_desc *memslot,
+		bool expect_write)
+{
+	uint64_t addr = msg->arg.pagefault.address;
+	uint64_t flags = msg->arg.pagefault.flags;
+	struct uffdio_copy copy;
+	int ret;
+
+	TEST_ASSERT(uffd_mode == UFFDIO_REGISTER_MODE_MISSING,
+			"The only expected UFFD mode is MISSING");
+	ASSERT_EQ(!!(flags & UFFD_PAGEFAULT_FLAG_WRITE), expect_write);
+	ASSERT_EQ(addr, (uint64_t)memslot->hva);
+
+	pr_debug("uffd fault: addr=%p write=%d\n",
+			(void *)addr, !!(flags & UFFD_PAGEFAULT_FLAG_WRITE));
+
+	copy.src = (uint64_t)memslot->data_copy;
+	copy.dst = addr;
+	copy.len = memslot->paging_size;
+	copy.mode = 0;
+
+	ret = ioctl(uffd, UFFDIO_COPY, &copy);
+	if (ret == -1) {
+		pr_info("Failed UFFDIO_COPY in 0x%lx with errno: %d\n",
+				addr, errno);
+		return ret;
+	}
+
+	pthread_mutex_lock(&events.uffd_faults_mutex);
+	events.uffd_faults += 1;
+	pthread_mutex_unlock(&events.uffd_faults_mutex);
+	return 0;
+}
+
+static int uffd_pt_write_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[PT], true);
+}
+
+static int uffd_test_write_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], true);
+}
+
+static int uffd_test_read_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], false);
+}
+
+/* Returns false if the test should be skipped. */
 static bool punch_hole_in_memslot(struct kvm_vm *vm,
 		struct memslot_desc *memslot)
 {
@@ -310,14 +368,14 @@ static bool punch_hole_in_memslot(struct kvm_vm *vm,
 	fd = vm_mem_region_get_src_fd(vm, memslot->idx);
 	if (fd != -1) {
 		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-				0, memslot->size);
+				0, memslot->paging_size);
 		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
 	} else {
 		if (is_backing_src_hugetlb(memslot->src_type))
 			return false;
 
 		hva = addr_gpa2hva(vm, memslot->gpa);
-		ret = madvise(hva, memslot->size, MADV_DONTNEED);
+		ret = madvise(hva, memslot->paging_size, MADV_DONTNEED);
 		TEST_ASSERT(ret == 0, "madvise failed, errno: %d\n", errno);
 	}
 
@@ -487,11 +545,56 @@ static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
 			"The pte_gpa (%p) should be aligned to the guest page (%lx).",
 			(void *)pte_gpa, guest_page_size);
 	virt_pg_map(vm, TEST_PTE_GVA, pte_gpa);
+
+	memslot[PT].paging_size = memslot[PT].size;
+	memslot[TEST].paging_size = memslot[TEST].size;
+}
+
+static void setup_uffd(enum vm_guest_mode mode, struct test_params *p,
+		struct uffd_desc **uffd)
+{
+	struct test_desc *test = p->test_desc;
+	int i;
+
+
+	for (i = 0; i < NR_MEMSLOTS; i++) {
+		memslot[i].data_copy = malloc(memslot[i].paging_size);
+		TEST_ASSERT(memslot[i].data_copy, "Failed malloc.");
+		memcpy(memslot[i].data_copy, memslot[i].hva,
+				memslot[i].paging_size);
+	}
+
+	uffd[PT] = NULL;
+	if (test->uffd_pt_handler)
+		uffd[PT] = uffd_setup_demand_paging(
+				UFFDIO_REGISTER_MODE_MISSING, 0,
+				memslot[PT].hva, memslot[PT].paging_size,
+				test->uffd_pt_handler);
+
+	uffd[TEST] = NULL;
+	if (test->uffd_test_handler)
+		uffd[TEST] = uffd_setup_demand_paging(
+				UFFDIO_REGISTER_MODE_MISSING, 0,
+				memslot[TEST].hva, memslot[TEST].paging_size,
+				test->uffd_test_handler);
 }
 
 static void check_event_counts(struct test_desc *test)
 {
 	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
+	ASSERT_EQ(test->expected_events.uffd_faults, events.uffd_faults);
+}
+
+static void free_uffd(struct test_desc *test, struct uffd_desc **uffd)
+{
+	int i;
+
+	if (test->uffd_pt_handler)
+		uffd_stop_demand_paging(uffd[PT]);
+	if (test->uffd_test_handler)
+		uffd_stop_demand_paging(uffd[TEST]);
+	for (i = 0; i < NR_MEMSLOTS; i++)
+		free(memslot[i].data_copy);
 }
 
 static void print_test_banner(enum vm_guest_mode mode, struct test_params *p)
@@ -550,6 +653,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct test_desc *test = p->test_desc;
 	struct kvm_vm *vm;
 	struct kvm_vcpu *vcpus[1], *vcpu;
+	struct uffd_desc *uffd[NR_MEMSLOTS];
 	bool skip_test = false;
 
 	print_test_banner(mode, p);
@@ -561,7 +665,14 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	reset_event_counts();
 	setup_memslots(vm, mode, p);
 
+	/*
+	 * Set some code at memslot[TEST].hva for the guest to execute (only
+	 * applicable to the EXEC tests). This has to be done before
+	 * setup_uffd() as that function copies the memslot data for the uffd
+	 * handler.
+	 */
 	load_exec_code_for_test();
+	setup_uffd(mode, p, uffd);
 	setup_abort_handlers(vm, vcpu, test);
 	vcpu_args_set(vcpu, 1, test);
 
@@ -572,7 +683,12 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	sync_stats_from_guest(vm);
 	ucall_uninit(vm);
 	kvm_vm_free(vm);
+	free_uffd(test, uffd);
 
+	/*
+	 * Make sure this is called after the uffd threads have exited (and
+	 * updated their respective event counters).
+	 */
 	if (!skip_test)
 		check_event_counts(test);
 }
@@ -590,6 +706,7 @@ static void help(char *name)
 #define SNAME(s)			#s
 #define SCAT2(a, b)			SNAME(a ## _ ## b)
 #define SCAT3(a, b, c)			SCAT2(a, SCAT2(b, c))
+#define SCAT4(a, b, c, d)		SCAT2(a, SCAT3(b, c, d))
 
 #define _CHECK(_test)			_CHECK_##_test
 #define _PREPARE(_test)			_PREPARE_##_test
@@ -620,6 +737,20 @@ static void help(char *name)
 	.expected_events	= { 0 },					\
 }
 
+#define TEST_UFFD(_access, _with_af, _mark_cmd,					\
+		  _uffd_test_handler, _uffd_pt_handler, _uffd_faults)		\
+{										\
+	.name			= SCAT4(uffd, _access, _with_af, #_mark_cmd),	\
+	.guest_prepare		= { _PREPARE(_with_af),				\
+				    _PREPARE(_access) },			\
+	.guest_test		= _access,					\
+	.mem_mark_cmd		= _mark_cmd,					\
+	.guest_test_check	= { _CHECK(_with_af) },				\
+	.uffd_test_handler	= _uffd_test_handler,				\
+	.uffd_pt_handler	= _uffd_pt_handler,				\
+	.expected_events	= { .uffd_faults = _uffd_faults, },		\
+}
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the Access Flag (AF) (sanity checks). */
 	TEST_ACCESS(guest_read64, with_af, CMD_NONE),
@@ -642,6 +773,36 @@ static struct test_desc tests[] = {
 	TEST_ACCESS(guest_at, no_af, CMD_HOLE_TEST),
 	TEST_ACCESS(guest_dc_zva, no_af, CMD_HOLE_TEST),
 
+	/*
+	 * Punch holes in the test and PT memslots and mark them for
+	 * userfaultfd handling. This should result in 2 faults: the test
+	 * access and its respective S1 page table walk (S1PTW).
+	 */
+	TEST_UFFD(guest_read64, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_read_handler, uffd_pt_write_handler, 2),
+	/* no_af should also lead to a PT write. */
+	TEST_UFFD(guest_read64, no_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_read_handler, uffd_pt_write_handler, 2),
+	/* Note that cas invokes the read handler. */
+	TEST_UFFD(guest_cas, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_read_handler, uffd_pt_write_handler, 2),
+	/*
+	 * Can't test guest_at with_af as it's IMPDEF whether the AF is set.
+	 * The S1PTW fault should still be marked as a write.
+	 */
+	TEST_UFFD(guest_at, no_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_read_handler, uffd_pt_write_handler, 1),
+	TEST_UFFD(guest_ld_preidx, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_read_handler, uffd_pt_write_handler, 2),
+	TEST_UFFD(guest_write64, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_write_handler, uffd_pt_write_handler, 2),
+	TEST_UFFD(guest_dc_zva, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_write_handler, uffd_pt_write_handler, 2),
+	TEST_UFFD(guest_st_preidx, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_write_handler, uffd_pt_write_handler, 2),
+	TEST_UFFD(guest_exec, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
+			uffd_test_read_handler, uffd_pt_write_handler, 2),
+
 	{ 0 }
 };
 
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 11/13] KVM: selftests: aarch64: Add dirty logging tests into page_fault_test
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add some dirty logging tests into page_fault_test. Mark the data and/or
page-table memslots for dirty logging, perform some accesses, and check
that the dirty log bits are set or clear when expected.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
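
Illustrative note (not part of the patch): at the ioctl level, the dirty-log
checks added here boil down to the sketch below. demo_page_is_dirty() is not
the selftest's kvm_vm_get_dirty_log() path; it assumes the memslot was
created with KVM_MEM_LOG_DIRTY_PAGES and that vm_fd/slot/npages come from
the caller.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static bool demo_page_is_dirty(int vm_fd, uint32_t slot, uint64_t npages,
			       uint64_t page)
{
	size_t bytes = ((npages + 63) / 64) * sizeof(uint64_t);
	struct kvm_dirty_log dlog = { .slot = slot };
	uint64_t *bitmap;
	bool dirty;

	bitmap = calloc(1, bytes);
	dlog.dirty_bitmap = bitmap;

	/* Fetch (and clear) the per-page write bits for this memslot. */
	ioctl(vm_fd, KVM_GET_DIRTY_LOG, &dlog);

	dirty = bitmap[page / 64] & (1ULL << (page % 64));
	free(bitmap);
	return dirty;
}
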
 .../selftests/kvm/aarch64/page_fault_test.c   | 74 +++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index 26318d39940b..d44a024dca89 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -34,6 +34,12 @@ static uint64_t *guest_test_memory = (uint64_t *)TEST_GVA;
 #define CMD_SKIP_TEST				(1ULL << 1)
 #define CMD_HOLE_PT				(1ULL << 2)
 #define CMD_HOLE_TEST				(1ULL << 3)
+#define CMD_RECREATE_PT_MEMSLOT_WR		(1ULL << 4)
+#define CMD_CHECK_WRITE_IN_DIRTY_LOG		(1ULL << 5)
+#define CMD_CHECK_S1PTW_WR_IN_DIRTY_LOG		(1ULL << 6)
+#define CMD_CHECK_NO_WRITE_IN_DIRTY_LOG		(1ULL << 7)
+#define CMD_CHECK_NO_S1PTW_WR_IN_DIRTY_LOG	(1ULL << 8)
+#define CMD_SET_PTE_AF				(1ULL << 9)
 
 #define PREPARE_FN_NR				10
 #define CHECK_FN_NR				10
@@ -248,6 +254,21 @@ static void guest_check_pte_af(void)
 	GUEST_ASSERT_EQ(*((uint64_t *)TEST_PTE_GVA) & PTE_AF, PTE_AF);
 }
 
+static void guest_check_write_in_dirty_log(void)
+{
+	GUEST_SYNC(CMD_CHECK_WRITE_IN_DIRTY_LOG);
+}
+
+static void guest_check_no_write_in_dirty_log(void)
+{
+	GUEST_SYNC(CMD_CHECK_NO_WRITE_IN_DIRTY_LOG);
+}
+
+static void guest_check_s1ptw_wr_in_dirty_log(void)
+{
+	GUEST_SYNC(CMD_CHECK_S1PTW_WR_IN_DIRTY_LOG);
+}
+
 static void guest_exec(void)
 {
 	int (*code)(void) = (int (*)(void))TEST_EXEC_GVA;
@@ -382,6 +403,19 @@ static bool punch_hole_in_memslot(struct kvm_vm *vm,
 	return true;
 }
 
+static bool check_write_in_dirty_log(struct kvm_vm *vm,
+		struct memslot_desc *ms, uint64_t host_pg_nr)
+{
+	unsigned long *bmap;
+	bool first_page_dirty;
+
+	bmap = bitmap_zalloc(ms->size / getpagesize());
+	kvm_vm_get_dirty_log(vm, ms->idx, bmap);
+	first_page_dirty = test_bit(host_pg_nr, bmap);
+	free(bmap);
+	return first_page_dirty;
+}
+
 /* Returns true to continue the test, and false if it should be skipped. */
 static bool handle_cmd(struct kvm_vm *vm, int cmd)
 {
@@ -394,6 +428,18 @@ static bool handle_cmd(struct kvm_vm *vm, int cmd)
 		continue_test = punch_hole_in_memslot(vm, &memslot[PT]);
 	if (cmd & CMD_HOLE_TEST)
 		continue_test = punch_hole_in_memslot(vm, &memslot[TEST]);
+	if (cmd & CMD_CHECK_WRITE_IN_DIRTY_LOG)
+		TEST_ASSERT(check_write_in_dirty_log(vm, &memslot[TEST], 0),
+				"Missing write in dirty log");
+	if (cmd & CMD_CHECK_S1PTW_WR_IN_DIRTY_LOG)
+		TEST_ASSERT(check_write_in_dirty_log(vm, &memslot[PT], 0),
+				"Missing s1ptw write in dirty log");
+	if (cmd & CMD_CHECK_NO_WRITE_IN_DIRTY_LOG)
+		TEST_ASSERT(!check_write_in_dirty_log(vm, &memslot[TEST], 0),
+				"Unexpected write in dirty log");
+	if (cmd & CMD_CHECK_NO_S1PTW_WR_IN_DIRTY_LOG)
+		TEST_ASSERT(!check_write_in_dirty_log(vm, &memslot[PT], 0),
+				"Unexpected s1ptw write in dirty log");
 
 	return continue_test;
 }
@@ -751,6 +797,19 @@ static void help(char *name)
 	.expected_events	= { .uffd_faults = _uffd_faults, },		\
 }
 
+#define TEST_DIRTY_LOG(_access, _with_af, _test_check)				\
+{										\
+	.name			= SCAT3(dirty_log, _access, _with_af),		\
+	.test_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.pt_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_prepare		= { _PREPARE(_with_af),				\
+				    _PREPARE(_access) },			\
+	.guest_test		= _access,					\
+	.guest_test_check	= { _CHECK(_with_af), _test_check,		\
+				    guest_check_s1ptw_wr_in_dirty_log},		\
+	.expected_events	= { 0 },					\
+}
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the Access Flag (AF) (sanity checks). */
 	TEST_ACCESS(guest_read64, with_af, CMD_NONE),
@@ -803,6 +862,21 @@ static struct test_desc tests[] = {
 	TEST_UFFD(guest_exec, with_af, CMD_HOLE_TEST | CMD_HOLE_PT,
 			uffd_test_read_handler, uffd_pt_write_handler, 2),
 
+	/*
+	 * Try accesses when the test and PT memslots are both tracked for
+	 * dirty logging.
+	 */
+	TEST_DIRTY_LOG(guest_read64, with_af, guest_check_no_write_in_dirty_log),
+	/* no_af should also lead to a PT write. */
+	TEST_DIRTY_LOG(guest_read64, no_af, guest_check_no_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_ld_preidx, with_af, guest_check_no_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_at, no_af, guest_check_no_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_exec, with_af, guest_check_no_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_write64, with_af, guest_check_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_cas, with_af, guest_check_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_dc_zva, with_af, guest_check_write_in_dirty_log),
+	TEST_DIRTY_LOG(guest_st_preidx, with_af, guest_check_write_in_dirty_log),
+
 	{ 0 }
 };
 
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 12/13] KVM: selftests: aarch64: Add readonly memslot tests into page_fault_test
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add some readonly memslot tests into page_fault_test. Mark the data
and/or page-table memslots as readonly, perform some accesses, and check
that the right fault is triggered when expected (e.g., a store with no
write-back should lead to an mmio exit).

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
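
Illustrative note (not part of the patch): the userspace side being
exercised looks roughly like the sketch below. A guest write that hits a
KVM_MEM_READONLY slot either comes back as a KVM_EXIT_MMIO (when the ESR has
a valid syndrome) or, for accesses KVM cannot decode (e.g., CAS), as a
KVM_RUN failure with ENOSYS. demo_run_once() and its printf()s are
illustrative, not the test's handlers; vcpu_fd and the mmap'ed kvm_run are
assumed to already exist.

#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static void demo_run_once(int vcpu_fd, struct kvm_run *run)
{
	int ret = ioctl(vcpu_fd, KVM_RUN, NULL);

	if (ret == -1 && errno == ENOSYS) {
		/* No syndrome to emulate the write (e.g., CAS, pre-indexed store). */
		printf("vcpu run failed: write could not be emulated\n");
		return;
	}

	if (run->exit_reason == KVM_EXIT_MMIO && run->mmio.is_write) {
		uint64_t data = 0;

		memcpy(&data, run->mmio.data, run->mmio.len);
		printf("MMIO write of 0x%llx at GPA 0x%llx\n",
		       (unsigned long long)data,
		       (unsigned long long)run->mmio.phys_addr);
	}
}
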
 .../selftests/kvm/aarch64/page_fault_test.c   | 134 +++++++++++++++++-
 1 file changed, 131 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index d44a024dca89..c96fc2fd3390 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -72,6 +72,7 @@ struct memslot_desc {
 
 static struct event_cnt {
 	int aborts;
+	int mmio_exits;
 	int fail_vcpu_runs;
 	int uffd_faults;
 	/* uffd_faults is incremented from multiple threads. */
@@ -89,6 +90,8 @@ struct test_desc {
 	int (*uffd_test_handler)(int mode, int uffd, struct uffd_msg *msg);
 	void (*dabt_handler)(struct ex_regs *regs);
 	void (*iabt_handler)(struct ex_regs *regs);
+	void (*mmio_handler)(struct kvm_run *run);
+	void (*fail_vcpu_run_handler)(int ret);
 	uint32_t pt_memslot_flags;
 	uint32_t test_memslot_flags;
 	bool skip;
@@ -318,6 +321,20 @@ static void guest_code(struct test_desc *test)
 	GUEST_DONE();
 }
 
+static void dabt_s1ptw_on_ro_memslot_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(read_sysreg(far_el1), TEST_GVA);
+	events.aborts += 1;
+	GUEST_SYNC(CMD_RECREATE_PT_MEMSLOT_WR);
+}
+
+static void iabt_s1ptw_on_ro_memslot_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(regs->pc, TEST_EXEC_GVA);
+	events.aborts += 1;
+	GUEST_SYNC(CMD_RECREATE_PT_MEMSLOT_WR);
+}
+
 static void no_dabt_handler(struct ex_regs *regs)
 {
 	GUEST_ASSERT_1(false, read_sysreg(far_el1));
@@ -403,6 +420,32 @@ static bool punch_hole_in_memslot(struct kvm_vm *vm,
 	return true;
 }
 
+static void recreate_memslot(struct kvm_vm *vm, struct memslot_desc *ms,
+		uint32_t flags)
+{
+	vm_set_user_memory_region(vm, ms->idx, 0, ms->gpa, 0, ms->hva);
+	vm_set_user_memory_region(vm, ms->idx, flags, ms->gpa, ms->size, ms->hva);
+}
+
+static void mmio_on_test_gpa_handler(struct kvm_run *run)
+{
+	ASSERT_EQ(run->mmio.phys_addr, memslot[TEST].gpa);
+
+	memcpy(memslot[TEST].hva, run->mmio.data, run->mmio.len);
+	events.mmio_exits += 1;
+}
+
+static void mmio_no_handler(struct kvm_run *run)
+{
+	uint64_t data;
+
+	memcpy(&data, run->mmio.data, sizeof(data));
+	pr_debug("addr=%lld len=%d w=%d data=%lx\n",
+			run->mmio.phys_addr, run->mmio.len,
+			run->mmio.is_write, data);
+	TEST_FAIL("No MMIO exit was expected.");
+}
+
 static bool check_write_in_dirty_log(struct kvm_vm *vm,
 		struct memslot_desc *ms, uint64_t host_pg_nr)
 {
@@ -440,6 +483,8 @@ static bool handle_cmd(struct kvm_vm *vm, int cmd)
 	if (cmd & CMD_CHECK_NO_S1PTW_WR_IN_DIRTY_LOG)
 		TEST_ASSERT(!check_write_in_dirty_log(vm, &memslot[PT], 0),
 				"Unexpected s1ptw write in dirty log");
+	if (cmd & CMD_RECREATE_PT_MEMSLOT_WR)
+		recreate_memslot(vm, &memslot[PT], 0);
 
 	return continue_test;
 }
@@ -456,6 +501,13 @@ void fail_vcpu_run_no_handler(int ret)
 	TEST_FAIL("Unexpected vcpu run failure\n");
 }
 
+void fail_vcpu_run_mmio_no_syndrome_handler(int ret)
+{
+	TEST_ASSERT(errno == ENOSYS,
+		"The mmio handler should have returned not implemented.");
+	events.fail_vcpu_runs += 1;
+}
+
 extern unsigned char __exec_test;
 
 void noinline __return_0x77(void)
@@ -625,10 +677,21 @@ static void setup_uffd(enum vm_guest_mode mode, struct test_params *p,
 				test->uffd_test_handler);
 }
 
+static void setup_default_handlers(struct test_desc *test)
+{
+	if (!test->mmio_handler)
+		test->mmio_handler = mmio_no_handler;
+
+	if (!test->fail_vcpu_run_handler)
+		test->fail_vcpu_run_handler = fail_vcpu_run_no_handler;
+}
+
 static void check_event_counts(struct test_desc *test)
 {
 	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
 	ASSERT_EQ(test->expected_events.uffd_faults, events.uffd_faults);
+	ASSERT_EQ(test->expected_events.mmio_exits, events.mmio_exits);
+	ASSERT_EQ(test->expected_events.fail_vcpu_runs, events.fail_vcpu_runs);
 }
 
 static void free_uffd(struct test_desc *test, struct uffd_desc **uffd)
@@ -661,12 +724,20 @@ static void reset_event_counts(void)
 static bool vcpu_run_loop(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
 		struct test_desc *test)
 {
+	struct kvm_run *run;
 	bool skip_test = false;
 	struct ucall uc;
-	int stage;
+	int stage, ret;
+
+	run = vcpu->run;
 
 	for (stage = 0; ; stage++) {
-		vcpu_run(vcpu);
+		ret = _vcpu_run(vcpu);
+		if (ret) {
+			test->fail_vcpu_run_handler(ret);
+			pr_debug("Done.\n");
+			goto done;
+		}
 
 		switch (get_ucall(vcpu, &uc)) {
 		case UCALL_SYNC:
@@ -684,6 +755,10 @@ static bool vcpu_run_loop(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
 		case UCALL_DONE:
 			pr_debug("Done.\n");
 			goto done;
+		case UCALL_NONE:
+			if (run->exit_reason == KVM_EXIT_MMIO)
+				test->mmio_handler(run);
+			break;
 		default:
 			TEST_FAIL("Unknown ucall %lu", uc.cmd);
 		}
@@ -709,6 +784,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	ucall_init(vm, NULL);
 
 	reset_event_counts();
+	setup_abort_handlers(vm, vcpu, test);
 	setup_memslots(vm, mode, p);
 
 	/*
@@ -719,7 +795,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	 */
 	load_exec_code_for_test();
 	setup_uffd(mode, p, uffd);
-	setup_abort_handlers(vm, vcpu, test);
+	setup_default_handlers(test);
 	vcpu_args_set(vcpu, 1, test);
 
 	sync_global_to_guest(vm, memslot);
@@ -810,6 +886,32 @@ static void help(char *name)
 	.expected_events	= { 0 },					\
 }
 
+#define TEST_RO_MEMSLOT(_access, _mmio_handler, _mmio_exits,			\
+			_iabt_handler, _dabt_handler, _aborts)			\
+{										\
+	.name			= SCAT3(ro_memslot, _access, _with_af),		\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_prepare		= { _PREPARE(_access) },			\
+	.guest_test		= _access,					\
+	.mmio_handler		= _mmio_handler,				\
+	.iabt_handler		= _iabt_handler,				\
+	.dabt_handler		= _dabt_handler,				\
+	.expected_events	= { .mmio_exits = _mmio_exits,			\
+				    .aborts = _aborts},				\
+}
+
+#define TEST_RO_MEMSLOT_NO_SYNDROME(_access)					\
+{										\
+	.name			= SCAT2(ro_memslot_no_syndrome, _access),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test		= _access,					\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .aborts = 1, .fail_vcpu_runs = 1 },		\
+}
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the Access Flag (AF) (sanity checks). */
 	TEST_ACCESS(guest_read64, with_af, CMD_NONE),
@@ -877,6 +979,32 @@ static struct test_desc tests[] = {
 	TEST_DIRTY_LOG(guest_dc_zva, with_af, guest_check_write_in_dirty_log),
 	TEST_DIRTY_LOG(guest_st_preidx, with_af, guest_check_write_in_dirty_log),
 
+	/*
+	 * Try accesses when both the test and PT memslots are marked read-only
+	 * (with KVM_MEM_READONLY). The S1PTW results in a guest abort, whose
+	 * handler asks the host to recreate the memslot as writable. Note that
+	 * guests would typically panic as there's no way of asking the VMM to
+	 * perform the write for the guest (or make the memslot writable).  The
+	 * instruction is then executed: writes with a syndrome result in an
+	 * MMIO exit, writes with no syndrome (e.g., CAS) result in a failed
+	 * vcpu run, and reads/execs with and without syndromes do not fault.
+	 * Check that the expected aborts, failed vcpu runs, and mmio exits
+	 * actually happen.
+	 */
+	TEST_RO_MEMSLOT(guest_read64, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1),
+	TEST_RO_MEMSLOT(guest_ld_preidx, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1),
+	TEST_RO_MEMSLOT(guest_at, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1),
+	TEST_RO_MEMSLOT(guest_exec, 0, 0, iabt_s1ptw_on_ro_memslot_handler,
+			0, 1),
+	TEST_RO_MEMSLOT(guest_write64, mmio_on_test_gpa_handler, 1, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1),
+	TEST_RO_MEMSLOT_NO_SYNDROME(guest_dc_zva),
+	TEST_RO_MEMSLOT_NO_SYNDROME(guest_cas),
+	TEST_RO_MEMSLOT_NO_SYNDROME(guest_st_preidx),
+
 	{ 0 }
 };
 
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v4 13/13] KVM: selftests: aarch64: Add mix of tests into page_fault_test
  2022-06-24 21:32 ` Ricardo Koller
@ 2022-06-24 21:32   ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-24 21:32 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, dmatlack, axelrasmussen, Ricardo Koller

Add some mix of tests into page_fault_test: memslots with all the
pairwise combinations of read-only, userfaultfd, and dirty-logging.  For
example, writing into a read-only memslot which has a hole handled with
userfaultfd.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/aarch64/page_fault_test.c   | 178 ++++++++++++++++++
 1 file changed, 178 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index c96fc2fd3390..4116a35979d5 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -396,6 +396,12 @@ static int uffd_test_read_handler(int mode, int uffd, struct uffd_msg *msg)
 	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], false);
 }
 
+static int uffd_no_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	TEST_FAIL("There was no UFFD fault expected.");
+	return -1;
+}
+
 /* Returns false if the test should be skipped. */
 static bool punch_hole_in_memslot(struct kvm_vm *vm,
 		struct memslot_desc *memslot)
@@ -886,6 +892,22 @@ static void help(char *name)
 	.expected_events	= { 0 },					\
 }
 
+#define TEST_UFFD_AND_DIRTY_LOG(_access, _with_af, _uffd_test_handler,		\
+		_uffd_faults, _test_check)					\
+{										\
+	.name			= SCAT3(uffd_and_dirty_log, _access, _with_af),	\
+	.test_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.pt_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_prepare		= { _PREPARE(_with_af),				\
+				    _PREPARE(_access) },			\
+	.guest_test		= _access,					\
+	.mem_mark_cmd		= CMD_HOLE_TEST | CMD_HOLE_PT,			\
+	.guest_test_check	= { _CHECK(_with_af), _test_check },		\
+	.uffd_test_handler	= _uffd_test_handler,				\
+	.uffd_pt_handler	= uffd_pt_write_handler,			\
+	.expected_events	= { .uffd_faults = _uffd_faults, },		\
+}
+
 #define TEST_RO_MEMSLOT(_access, _mmio_handler, _mmio_exits,			\
 			_iabt_handler, _dabt_handler, _aborts)			\
 {										\
@@ -912,6 +934,71 @@ static void help(char *name)
 	.expected_events	= { .aborts = 1, .fail_vcpu_runs = 1 },		\
 }
 
+#define TEST_RO_MEMSLOT_AND_DIRTY_LOG(_access, _mmio_handler, _mmio_exits,	\
+				      _iabt_handler, _dabt_handler, _aborts,	\
+				      _test_check)				\
+{										\
+	.name			= SCAT3(ro_memslot, _access, _with_af),		\
+	.test_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.pt_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.guest_prepare		= { _PREPARE(_access) },			\
+	.guest_test		= _access,					\
+	.guest_test_check	= { _test_check },				\
+	.mmio_handler		= _mmio_handler,				\
+	.iabt_handler		= _iabt_handler,				\
+	.dabt_handler		= _dabt_handler,				\
+	.expected_events	= { .mmio_exits = _mmio_exits,			\
+				    .aborts = _aborts},				\
+}
+
+#define TEST_RO_MEMSLOT_NO_SYNDROME_AND_DIRTY_LOG(_access, _test_check)		\
+{										\
+	.name			= SCAT2(ro_memslot_no_syn_and_dlog, _access),	\
+	.test_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.pt_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.guest_test		= _access,					\
+	.guest_test_check	= { _test_check },				\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .aborts = 1, .fail_vcpu_runs = 1 },		\
+}
+
+#define TEST_RO_MEMSLOT_AND_UFFD(_access, _mmio_handler, _mmio_exits,		\
+				 _iabt_handler, _dabt_handler, _aborts,		\
+				_uffd_test_handler, _uffd_faults)		\
+{										\
+	.name			= SCAT2(ro_memslot_uffd, _access),		\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.mem_mark_cmd		= CMD_HOLE_TEST | CMD_HOLE_PT,			\
+	.guest_prepare		= { _PREPARE(_access) },			\
+	.guest_test		= _access,					\
+	.uffd_test_handler	= _uffd_test_handler,				\
+	.uffd_pt_handler	= uffd_pt_write_handler,			\
+	.mmio_handler		= _mmio_handler,				\
+	.iabt_handler		= _iabt_handler,				\
+	.dabt_handler		= _dabt_handler,				\
+	.expected_events	= { .mmio_exits = _mmio_exits,			\
+				    .aborts = _aborts,				\
+				    .uffd_faults = _uffd_faults },		\
+}
+
+#define TEST_RO_MEMSLOT_NO_SYNDROME_AND_UFFD(_access, _uffd_test_handler,	\
+					     _uffd_faults)			\
+{										\
+	.name			= SCAT2(ro_memslot_no_syndrome, _access),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.mem_mark_cmd		= CMD_HOLE_TEST | CMD_HOLE_PT,			\
+	.guest_test		= _access,					\
+	.uffd_test_handler	= _uffd_test_handler,				\
+	.uffd_pt_handler	= uffd_pt_write_handler,			\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .aborts = 1, .fail_vcpu_runs = 1,		\
+				    .uffd_faults = _uffd_faults },		\
+}
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the Access Flag (AF) (sanity checks). */
 	TEST_ACCESS(guest_read64, with_af, CMD_NONE),
@@ -979,6 +1066,35 @@ static struct test_desc tests[] = {
 	TEST_DIRTY_LOG(guest_dc_zva, with_af, guest_check_write_in_dirty_log),
 	TEST_DIRTY_LOG(guest_st_preidx, with_af, guest_check_write_in_dirty_log),
 
+	/*
+	 * Access when the test and PT memslots are both marked for dirty
+	 * logging and UFFD at the same time. The expected result is that
+	 * writes should mark the dirty log and trigger a userfaultfd write
+	 * fault.  Reads/execs should result in a read userfaultfd fault, and
+	 * nothing in the dirty log.  The S1PTW in all cases should result in a
+	 * write in the dirty log and a userfaultfd write.
+	 */
+	TEST_UFFD_AND_DIRTY_LOG(guest_read64, with_af, uffd_test_read_handler, 2,
+			guest_check_no_write_in_dirty_log),
+	/* no_af should also lead to a PT write. */
+	TEST_UFFD_AND_DIRTY_LOG(guest_read64, no_af, uffd_test_read_handler, 2,
+			guest_check_no_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_ld_preidx, with_af, uffd_test_read_handler,
+			2, guest_check_no_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_at, with_af, 0, 1,
+			guest_check_no_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_exec, with_af, uffd_test_read_handler, 2,
+			guest_check_no_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_write64, with_af, uffd_test_write_handler,
+			2, guest_check_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_cas, with_af, uffd_test_read_handler, 2,
+			guest_check_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_dc_zva, with_af, uffd_test_write_handler,
+			2, guest_check_write_in_dirty_log),
+	TEST_UFFD_AND_DIRTY_LOG(guest_st_preidx, with_af,
+			uffd_test_write_handler, 2,
+			guest_check_write_in_dirty_log),
+
 	/*
 	 * Try accesses when both the test and PT memslots are marked read-only
 	 * (with KVM_MEM_READONLY). The S1PTW results in a guest abort, whose
@@ -1005,6 +1121,68 @@ static struct test_desc tests[] = {
 	TEST_RO_MEMSLOT_NO_SYNDROME(guest_cas),
 	TEST_RO_MEMSLOT_NO_SYNDROME(guest_st_preidx),
 
+	/*
+	 * Access when both the test and PT memslots are read-only and marked
+	 * for dirty logging at the same time. The expected result is that
+	 * there should be no write in the dirty log. The S1PTW results in an
+	 * abort which is handled by asking the host to recreate the memslot as
+	 * writable. The readonly handling is the same as if the memslots were
+	 * not marked for dirty logging: writes with a syndrome result in an
+	 * MMIO exit, and writes with no syndrome result in a failed vcpu run.
+	 */
+	TEST_RO_MEMSLOT_AND_DIRTY_LOG(guest_read64, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_AND_DIRTY_LOG(guest_ld_preidx, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_AND_DIRTY_LOG(guest_at, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_AND_DIRTY_LOG(guest_exec, 0, 0,
+			iabt_s1ptw_on_ro_memslot_handler, 0, 1,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_AND_DIRTY_LOG(guest_write64, mmio_on_test_gpa_handler,
+			1, 0, dabt_s1ptw_on_ro_memslot_handler, 1,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_NO_SYNDROME_AND_DIRTY_LOG(guest_dc_zva,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_NO_SYNDROME_AND_DIRTY_LOG(guest_cas,
+			guest_check_no_write_in_dirty_log),
+	TEST_RO_MEMSLOT_NO_SYNDROME_AND_DIRTY_LOG(guest_st_preidx,
+			guest_check_no_write_in_dirty_log),
+
+	/*
+	 * Access when both the test and PT memslots are read-only, and punched
+	 * with holes tracked with userfaultfd.  The expected result is the
+	 * union of both userfaultfd and read-only behaviors. For example,
+	 * write accesses result in a userfaultfd write fault and an MMIO exit.
+	 * Writes with no syndrome result in a failed vcpu run and no
+	 * userfaultfd write fault. Reads only result in userfaultfd getting
+	 * triggered.
+	 */
+	TEST_RO_MEMSLOT_AND_UFFD(guest_read64, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			uffd_test_read_handler, 2),
+	TEST_RO_MEMSLOT_AND_UFFD(guest_ld_preidx, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			uffd_test_read_handler, 2),
+	TEST_RO_MEMSLOT_AND_UFFD(guest_at, 0, 0, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			uffd_no_handler, 1),
+	TEST_RO_MEMSLOT_AND_UFFD(guest_exec, 0, 0,
+			iabt_s1ptw_on_ro_memslot_handler, 0, 1,
+			uffd_test_read_handler, 2),
+	TEST_RO_MEMSLOT_AND_UFFD(guest_write64, mmio_on_test_gpa_handler, 1, 0,
+			dabt_s1ptw_on_ro_memslot_handler, 1,
+			uffd_test_write_handler, 2),
+	TEST_RO_MEMSLOT_NO_SYNDROME_AND_UFFD(guest_cas,
+			uffd_test_read_handler, 2),
+	TEST_RO_MEMSLOT_NO_SYNDROME_AND_UFFD(guest_dc_zva,
+			uffd_no_handler, 1),
+	TEST_RO_MEMSLOT_NO_SYNDROME_AND_UFFD(guest_st_preidx,
+			uffd_no_handler, 1),
+
 	{ 0 }
 };
 
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-06-28 23:43     ` Oliver Upton
  -1 siblings, 0 replies; 56+ messages in thread
From: Oliver Upton @ 2022-06-28 23:43 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, drjones, maz, bgardon, dmatlack, pbonzini, axelrasmussen

Hi Ricardo,

On Fri, Jun 24, 2022 at 02:32:53PM -0700, Ricardo Koller wrote:
> Add a new test for stage 2 faults when using different combinations of
> guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> and types of faults (e.g., read on hugetlbfs with a hole). The next
> commits will add different handling methods and more faults (e.g., uffd
> and dirty logging). This first commit starts by adding two sanity checks
> for all types of accesses: AF setting by the hw, and accessing memslots
> with holes.
> 
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/aarch64/page_fault_test.c   | 695 ++++++++++++++++++
>  .../selftests/kvm/include/aarch64/processor.h |   6 +
>  3 files changed, 702 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c
> 
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index e4497a3a27d4..13b913225ae7 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -139,6 +139,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
>  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
>  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
>  TEST_GEN_PROGS_aarch64 += aarch64/hypercalls
> +TEST_GEN_PROGS_aarch64 += aarch64/page_fault_test
>  TEST_GEN_PROGS_aarch64 += aarch64/psci_test
>  TEST_GEN_PROGS_aarch64 += aarch64/vcpu_width_config
>  TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
> diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
> new file mode 100644
> index 000000000000..bdda4e3fcdaa
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c

[...]

> +/* Compare and swap instruction. */
> +static void guest_cas(void)
> +{
> +	uint64_t val;
> +
> +	GUEST_ASSERT_EQ(guest_check_lse(), 1);

Why not just GUEST_ASSERT(guest_check_lse()) ?

> +	asm volatile(".arch_extension lse\n"
> +		     "casal %0, %1, [%2]\n"
> +			:: "r" (0), "r" (TEST_DATA), "r" (guest_test_memory));
> +	val = READ_ONCE(*guest_test_memory);
> +	GUEST_ASSERT_EQ(val, TEST_DATA);
> +}
> +
> +static void guest_read64(void)
> +{
> +	uint64_t val;
> +
> +	val = READ_ONCE(*guest_test_memory);
> +	GUEST_ASSERT_EQ(val, 0);
> +}
> +
> +/* Address translation instruction */
> +static void guest_at(void)
> +{
> +	uint64_t par;
> +	uint64_t paddr;
> +
> +	asm volatile("at s1e1r, %0" :: "r" (guest_test_memory));
> +	par = read_sysreg(par_el1);

I believe you need explicit synchronization (an isb) before the fault
information is guaranteed visible in PAR_EL1.
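
Something like this (untested sketch; isb() is the same helper the test
already uses in guest_set_ha()):

	asm volatile("at s1e1r, %0" :: "r" (guest_test_memory));
	isb();
	par = read_sysreg(par_el1);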

> +	/* Bit 1 indicates whether the AT was successful */
> +	GUEST_ASSERT_EQ(par & 1, 0);
> +	/* The PA in bits [51:12] */
> +	paddr = par & (((1ULL << 40) - 1) << 12);
> +	GUEST_ASSERT_EQ(paddr, memslot[TEST].gpa);
> +}
> +
> +/*
> + * The size of the block written by "dc zva" is guaranteed to be between (2 <<
> + * 0) and (2 << 9), which is safe in our case as we need the write to happen
> + * for at least a word, and not more than a page.
> + */
> +static void guest_dc_zva(void)
> +{
> +	uint16_t val;
> +
> +	asm volatile("dc zva, %0\n"
> +			"dsb ish\n"

nit: use the dsb() macro instead. Extremely minor, but makes it a bit
more obvious to the reader. Or maybe I need to get my eyes checked ;-)
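
i.e., something like (sketch; assumes a dsb() macro along the lines of
the kernel one, taking the domain as its argument):

	asm volatile("dc zva, %0" :: "r" (guest_test_memory));
	dsb(ish);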

> +			:: "r" (guest_test_memory));
> +	val = READ_ONCE(*guest_test_memory);
> +	GUEST_ASSERT_EQ(val, 0);
> +}
> +
> +/*
> + * Pre-indexing loads and stores don't have a valid syndrome (ESR_EL2.ISV==0).
> + * And that's special because KVM must take special care with those: they
> + * should still count as accesses for dirty logging or user-faulting, but
> + * should be handled differently on mmio.
> + */
> +static void guest_ld_preidx(void)
> +{
> +	uint64_t val;
> +	uint64_t addr = TEST_GVA - 8;
> +
> +	/*
> +	 * This ends up accessing "TEST_GVA + 8 - 8", where "TEST_GVA - 8" is
> +	 * in a gap between memslots not backing by anything.
> +	 */
> +	asm volatile("ldr %0, [%1, #8]!"
> +			: "=r" (val), "+r" (addr));
> +	GUEST_ASSERT_EQ(val, 0);
> +	GUEST_ASSERT_EQ(addr, TEST_GVA);
> +}
> +
> +static void guest_st_preidx(void)
> +{
> +	uint64_t val = TEST_DATA;
> +	uint64_t addr = TEST_GVA - 8;
> +
> +	asm volatile("str %0, [%1, #8]!"
> +			: "+r" (val), "+r" (addr));
> +
> +	GUEST_ASSERT_EQ(addr, TEST_GVA);
> +	val = READ_ONCE(*guest_test_memory);
> +}
> +
> +static bool guest_set_ha(void)
> +{
> +	uint64_t mmfr1 = read_sysreg(id_aa64mmfr1_el1);
> +	uint64_t hadbs, tcr;
> +
> +	/* Skip if HA is not supported. */
> +	hadbs = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS), mmfr1);
> +	if (hadbs == 0)
> +		return false;
> +
> +	tcr = read_sysreg(tcr_el1) | TCR_EL1_HA;
> +	write_sysreg(tcr, tcr_el1);
> +	isb();
> +
> +	return true;
> +}
> +
> +static bool guest_clear_pte_af(void)
> +{
> +	*((uint64_t *)TEST_PTE_GVA) &= ~PTE_AF;
> +	flush_tlb_page(TEST_PTE_GVA);

Don't you want to actually flush TEST_GVA to force the TLB fill when you
poke the address again? This looks like you're flushing the VA of the
*PTE* not the test address.

> +	return true;
> +}
> +
> +static void guest_check_pte_af(void)

nit: call this guest_test_pte_af(). You use the guest_check_* pattern
for test preconditions (like guest_check_lse()).

> +{
> +	flush_tlb_page(TEST_PTE_GVA);

What is the purpose of this flush? I believe you are actually depending
on a dsb(ish) between the hardware PTE update and the load below. Or,
that's at least what I gleaned from the jargon of DDI0487H.a D5.4.13 
'Ordering of hardware updates to the translation tables'.
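
i.e., perhaps just (sketch, same dsb() assumption as above):

	dsb(ish);	/* order the HW AF update before reading the PTE */
	GUEST_ASSERT_EQ(*((uint64_t *)TEST_PTE_GVA) & PTE_AF, PTE_AF);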

> +	GUEST_ASSERT_EQ(*((uint64_t *)TEST_PTE_GVA) & PTE_AF, PTE_AF);
> +}

[...]

> +static void sync_stats_from_guest(struct kvm_vm *vm)
> +{
> +	struct event_cnt *ec = addr_gva2hva(vm, (uint64_t)&events);
> +
> +	events.aborts += ec->aborts;
> +}

I believe you can use sync_global_from_guest() instead of this.
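
i.e. (sketch; note this copies the whole struct rather than
accumulating, so any counters the host bumps directly would need care):

	sync_global_from_guest(vm, events);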

> +void fail_vcpu_run_no_handler(int ret)
> +{
> +	TEST_FAIL("Unexpected vcpu run failure\n");
> +}
> +
> +extern unsigned char __exec_test;
> +
> +void noinline __return_0x77(void)
> +{
> +	asm volatile("__exec_test: mov x0, #0x77\n"
> +			"ret\n");
> +}
> +
> +static void load_exec_code_for_test(void)
> +{
> +	uint64_t *code, *c;
> +
> +	assert(TEST_EXEC_GVA - TEST_GVA);
> +	code = memslot[TEST].hva + 8;
> +
> +	/*
> +	 * We need the cast to be separate in order for the compiler to not
> +	 * complain with: "‘memcpy’ forming offset [1, 7] is out of the bounds
> +	 * [0, 1] of object ‘__exec_test’ with type ‘unsigned char’"
> +	 */
> +	c = (uint64_t *)&__exec_test;
> +	memcpy(code, c, 8);

Don't you need to sync D$ and I$?
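
e.g., the usual clean-to-PoU plus I-cache invalidate dance after the
memcpy (sketch):

	asm volatile("dc cvau, %0\n\t"
		     "dsb ish\n\t"
		     "ic ivau, %0\n\t"
		     "dsb ish\n\t"
		     "isb"
		     :: "r" (code) : "memory");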

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-06-28 23:43     ` Oliver Upton
@ 2022-06-29  1:32       ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-29  1:32 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, drjones, maz, bgardon, dmatlack, pbonzini, axelrasmussen

Hey Oliver,

On Tue, Jun 28, 2022 at 04:43:29PM -0700, Oliver Upton wrote:
> Hi Ricardo,
> 
> On Fri, Jun 24, 2022 at 02:32:53PM -0700, Ricardo Koller wrote:
> > Add a new test for stage 2 faults when using different combinations of
> > guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> > and types of faults (e.g., read on hugetlbfs with a hole). The next
> > commits will add different handling methods and more faults (e.g., uffd
> > and dirty logging). This first commit starts by adding two sanity checks
> > for all types of accesses: AF setting by the hw, and accessing memslots
> > with holes.
> > 
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile          |   1 +
> >  .../selftests/kvm/aarch64/page_fault_test.c   | 695 ++++++++++++++++++
> >  .../selftests/kvm/include/aarch64/processor.h |   6 +
> >  3 files changed, 702 insertions(+)
> >  create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c
> > 
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index e4497a3a27d4..13b913225ae7 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -139,6 +139,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
> >  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
> >  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
> >  TEST_GEN_PROGS_aarch64 += aarch64/hypercalls
> > +TEST_GEN_PROGS_aarch64 += aarch64/page_fault_test
> >  TEST_GEN_PROGS_aarch64 += aarch64/psci_test
> >  TEST_GEN_PROGS_aarch64 += aarch64/vcpu_width_config
> >  TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
> > diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
> > new file mode 100644
> > index 000000000000..bdda4e3fcdaa
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
> 
> [...]
> 
> > +/* Compare and swap instruction. */
> > +static void guest_cas(void)
> > +{
> > +	uint64_t val;
> > +
> > +	GUEST_ASSERT_EQ(guest_check_lse(), 1);
> 
> Why not just GUEST_ASSERT(guest_check_lse()) ?
> 
> > +	asm volatile(".arch_extension lse\n"
> > +		     "casal %0, %1, [%2]\n"
> > +			:: "r" (0), "r" (TEST_DATA), "r" (guest_test_memory));
> > +	val = READ_ONCE(*guest_test_memory);
> > +	GUEST_ASSERT_EQ(val, TEST_DATA);
> > +}
> > +
> > +static void guest_read64(void)
> > +{
> > +	uint64_t val;
> > +
> > +	val = READ_ONCE(*guest_test_memory);
> > +	GUEST_ASSERT_EQ(val, 0);
> > +}
> > +
> > +/* Address translation instruction */
> > +static void guest_at(void)
> > +{
> > +	uint64_t par;
> > +	uint64_t paddr;
> > +
> > +	asm volatile("at s1e1r, %0" :: "r" (guest_test_memory));
> > +	par = read_sysreg(par_el1);
> 
> I believe you need explicit synchronization (an isb) before the fault
> information is guaranteed visibile in PAR_EL1.
> 
> > +	/* Bit 1 indicates whether the AT was successful */
> > +	GUEST_ASSERT_EQ(par & 1, 0);
> > +	/* The PA in bits [51:12] */
> > +	paddr = par & (((1ULL << 40) - 1) << 12);
> > +	GUEST_ASSERT_EQ(paddr, memslot[TEST].gpa);
> > +}
> > +
> > +/*
> > + * The size of the block written by "dc zva" is guaranteed to be between (2 <<
> > + * 0) and (2 << 9), which is safe in our case as we need the write to happen
> > + * for at least a word, and not more than a page.
> > + */
> > +static void guest_dc_zva(void)
> > +{
> > +	uint16_t val;
> > +
> > +	asm volatile("dc zva, %0\n"
> > +			"dsb ish\n"
> 
> nit: use the dsb() macro instead. Extremely minor, but makes it a bit
> more obvious to the reader. Or maybe I need to get my eyes checked ;-)
> 
> > +			:: "r" (guest_test_memory));
> > +	val = READ_ONCE(*guest_test_memory);
> > +	GUEST_ASSERT_EQ(val, 0);
> > +}
> > +
> > +/*
> > + * Pre-indexing loads and stores don't have a valid syndrome (ESR_EL2.ISV==0).
> > + * And that's special because KVM must take special care with those: they
> > + * should still count as accesses for dirty logging or user-faulting, but
> > + * should be handled differently on mmio.
> > + */
> > +static void guest_ld_preidx(void)
> > +{
> > +	uint64_t val;
> > +	uint64_t addr = TEST_GVA - 8;
> > +
> > +	/*
> > +	 * This ends up accessing "TEST_GVA + 8 - 8", where "TEST_GVA - 8" is
> > +	 * in a gap between memslots not backing by anything.
> > +	 */
> > +	asm volatile("ldr %0, [%1, #8]!"
> > +			: "=r" (val), "+r" (addr));
> > +	GUEST_ASSERT_EQ(val, 0);
> > +	GUEST_ASSERT_EQ(addr, TEST_GVA);
> > +}
> > +
> > +static void guest_st_preidx(void)
> > +{
> > +	uint64_t val = TEST_DATA;
> > +	uint64_t addr = TEST_GVA - 8;
> > +
> > +	asm volatile("str %0, [%1, #8]!"
> > +			: "+r" (val), "+r" (addr));
> > +
> > +	GUEST_ASSERT_EQ(addr, TEST_GVA);
> > +	val = READ_ONCE(*guest_test_memory);
> > +}
> > +
> > +static bool guest_set_ha(void)
> > +{
> > +	uint64_t mmfr1 = read_sysreg(id_aa64mmfr1_el1);
> > +	uint64_t hadbs, tcr;
> > +
> > +	/* Skip if HA is not supported. */
> > +	hadbs = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS), mmfr1);
> > +	if (hadbs == 0)
> > +		return false;
> > +
> > +	tcr = read_sysreg(tcr_el1) | TCR_EL1_HA;
> > +	write_sysreg(tcr, tcr_el1);
> > +	isb();
> > +
> > +	return true;
> > +}
> > +
> > +static bool guest_clear_pte_af(void)
> > +{
> > +	*((uint64_t *)TEST_PTE_GVA) &= ~PTE_AF;
> > +	flush_tlb_page(TEST_PTE_GVA);
> 
> Don't you want to actually flush TEST_GVA to force the TLB fill when you
> poke the address again? This looks like you're flushing the VA of the
> *PTE* not the test address.

Yes, you are right, this was supposed to be:
flush_tlb_page(TEST_GVA);
(I could swear this was TEST_GVA at one time)

> 
> > +	return true;
> > +}
> > +
> > +static void guest_check_pte_af(void)
> 
> nit: call this guest_test_pte_af(). You use the guest_check_* pattern
> for test preconditions (like guest_check_lse()).
> 
> > +{
> > +	flush_tlb_page(TEST_PTE_GVA);
> 
> What is the purpose of this flush? I believe you are actually depending
> on a dsb(ish) between the hardware PTE update and the load below. Or,
> that's at least what I gleaned from the jargon of DDI0487H.a D5.4.13 
> 'Ordering of hardware updates to the translation tables'.

This was also supposed to be: flush_tlb_page(TEST_GVA)
But will remove it based on D5.4.13, as it's indeed saying that the DSB
should be enough.

> 
> > +	GUEST_ASSERT_EQ(*((uint64_t *)TEST_PTE_GVA) & PTE_AF, PTE_AF);
> > +}
> 
> [...]
> 
> > +static void sync_stats_from_guest(struct kvm_vm *vm)
> > +{
> > +	struct event_cnt *ec = addr_gva2hva(vm, (uint64_t)&events);
> > +
> > +	events.aborts += ec->aborts;
> > +}
> 
> I believe you can use sync_global_from_guest() instead of this.
> 
> > +void fail_vcpu_run_no_handler(int ret)
> > +{
> > +	TEST_FAIL("Unexpected vcpu run failure\n");
> > +}
> > +
> > +extern unsigned char __exec_test;
> > +
> > +void noinline __return_0x77(void)
> > +{
> > +	asm volatile("__exec_test: mov x0, #0x77\n"
> > +			"ret\n");
> > +}
> > +
> > +static void load_exec_code_for_test(void)
> > +{
> > +	uint64_t *code, *c;
> > +
> > +	assert(TEST_EXEC_GVA - TEST_GVA);
> > +	code = memslot[TEST].hva + 8;
> > +
> > +	/*
> > +	 * We need the cast to be separate in order for the compiler to not
> > +	 * complain with: "‘memcpy’ forming offset [1, 7] is out of the bounds
> > +	 * [0, 1] of object ‘__exec_test’ with type ‘unsigned char’"
> > +	 */
> > +	c = (uint64_t *)&__exec_test;
> > +	memcpy(code, c, 8);
> 
> Don't you need to sync D$ and I$?

This is done before running the VM for the first time, and it's only
ever written this one time. I think KVM itself is doing the sync when
mapping new pages for the first time, which would be this case.

> 
> --
> Thanks,
> Oliver

ACK on all the other points, will fix accordingly.

Thanks for the review!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
@ 2022-06-29  1:32       ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-06-29  1:32 UTC (permalink / raw)
  To: Oliver Upton
  Cc: drjones, kvm, maz, axelrasmussen, bgardon, dmatlack, pbonzini, kvmarm

Hey Oliver,

On Tue, Jun 28, 2022 at 04:43:29PM -0700, Oliver Upton wrote:
> Hi Ricardo,
> 
> On Fri, Jun 24, 2022 at 02:32:53PM -0700, Ricardo Koller wrote:
> > Add a new test for stage 2 faults when using different combinations of
> > guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> > and types of faults (e.g., read on hugetlbfs with a hole). The next
> > commits will add different handling methods and more faults (e.g., uffd
> > and dirty logging). This first commit starts by adding two sanity checks
> > for all types of accesses: AF setting by the hw, and accessing memslots
> > with holes.
> > 
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile          |   1 +
> >  .../selftests/kvm/aarch64/page_fault_test.c   | 695 ++++++++++++++++++
> >  .../selftests/kvm/include/aarch64/processor.h |   6 +
> >  3 files changed, 702 insertions(+)
> >  create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c
> > 
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index e4497a3a27d4..13b913225ae7 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -139,6 +139,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
> >  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
> >  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
> >  TEST_GEN_PROGS_aarch64 += aarch64/hypercalls
> > +TEST_GEN_PROGS_aarch64 += aarch64/page_fault_test
> >  TEST_GEN_PROGS_aarch64 += aarch64/psci_test
> >  TEST_GEN_PROGS_aarch64 += aarch64/vcpu_width_config
> >  TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
> > diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
> > new file mode 100644
> > index 000000000000..bdda4e3fcdaa
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
> 
> [...]
> 
> > +/* Compare and swap instruction. */
> > +static void guest_cas(void)
> > +{
> > +	uint64_t val;
> > +
> > +	GUEST_ASSERT_EQ(guest_check_lse(), 1);
> 
> Why not just GUEST_ASSERT(guest_check_lse()) ?
> 
> > +	asm volatile(".arch_extension lse\n"
> > +		     "casal %0, %1, [%2]\n"
> > +			:: "r" (0), "r" (TEST_DATA), "r" (guest_test_memory));
> > +	val = READ_ONCE(*guest_test_memory);
> > +	GUEST_ASSERT_EQ(val, TEST_DATA);
> > +}
> > +
> > +static void guest_read64(void)
> > +{
> > +	uint64_t val;
> > +
> > +	val = READ_ONCE(*guest_test_memory);
> > +	GUEST_ASSERT_EQ(val, 0);
> > +}
> > +
> > +/* Address translation instruction */
> > +static void guest_at(void)
> > +{
> > +	uint64_t par;
> > +	uint64_t paddr;
> > +
> > +	asm volatile("at s1e1r, %0" :: "r" (guest_test_memory));
> > +	par = read_sysreg(par_el1);
> 
> I believe you need explicit synchronization (an isb) before the fault
> information is guaranteed visibile in PAR_EL1.
> 
> > +	/* Bit 1 indicates whether the AT was successful */
> > +	GUEST_ASSERT_EQ(par & 1, 0);
> > +	/* The PA in bits [51:12] */
> > +	paddr = par & (((1ULL << 40) - 1) << 12);
> > +	GUEST_ASSERT_EQ(paddr, memslot[TEST].gpa);
> > +}
> > +
> > +/*
> > + * The size of the block written by "dc zva" is guaranteed to be between (2 <<
> > + * 0) and (2 << 9), which is safe in our case as we need the write to happen
> > + * for at least a word, and not more than a page.
> > + */
> > +static void guest_dc_zva(void)
> > +{
> > +	uint16_t val;
> > +
> > +	asm volatile("dc zva, %0\n"
> > +			"dsb ish\n"
> 
> nit: use the dsb() macro instead. Extremely minor, but makes it a bit
> more obvious to the reader. Or maybe I need to get my eyes checked ;-)
> 
> > +			:: "r" (guest_test_memory));
> > +	val = READ_ONCE(*guest_test_memory);
> > +	GUEST_ASSERT_EQ(val, 0);
> > +}
> > +
> > +/*
> > + * Pre-indexing loads and stores don't have a valid syndrome (ESR_EL2.ISV==0).
> > + * And that's special because KVM must take special care with those: they
> > + * should still count as accesses for dirty logging or user-faulting, but
> > + * should be handled differently on mmio.
> > + */
> > +static void guest_ld_preidx(void)
> > +{
> > +	uint64_t val;
> > +	uint64_t addr = TEST_GVA - 8;
> > +
> > +	/*
> > +	 * This ends up accessing "TEST_GVA + 8 - 8", where "TEST_GVA - 8" is
> > +	 * in a gap between memslots not backing by anything.
> > +	 */
> > +	asm volatile("ldr %0, [%1, #8]!"
> > +			: "=r" (val), "+r" (addr));
> > +	GUEST_ASSERT_EQ(val, 0);
> > +	GUEST_ASSERT_EQ(addr, TEST_GVA);
> > +}
> > +
> > +static void guest_st_preidx(void)
> > +{
> > +	uint64_t val = TEST_DATA;
> > +	uint64_t addr = TEST_GVA - 8;
> > +
> > +	asm volatile("str %0, [%1, #8]!"
> > +			: "+r" (val), "+r" (addr));
> > +
> > +	GUEST_ASSERT_EQ(addr, TEST_GVA);
> > +	val = READ_ONCE(*guest_test_memory);
> > +}
> > +
> > +static bool guest_set_ha(void)
> > +{
> > +	uint64_t mmfr1 = read_sysreg(id_aa64mmfr1_el1);
> > +	uint64_t hadbs, tcr;
> > +
> > +	/* Skip if HA is not supported. */
> > +	hadbs = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS), mmfr1);
> > +	if (hadbs == 0)
> > +		return false;
> > +
> > +	tcr = read_sysreg(tcr_el1) | TCR_EL1_HA;
> > +	write_sysreg(tcr, tcr_el1);
> > +	isb();
> > +
> > +	return true;
> > +}
> > +
> > +static bool guest_clear_pte_af(void)
> > +{
> > +	*((uint64_t *)TEST_PTE_GVA) &= ~PTE_AF;
> > +	flush_tlb_page(TEST_PTE_GVA);
> 
> Don't you want to actually flush TEST_GVA to force the TLB fill when you
> poke the address again? This looks like you're flushing the VA of the
> *PTE* not the test address.

Yes, you are right, this was supposed to be:
flush_tlb_page(TEST_GVA);
(I could swear this was TEST_GVA at one time)

> 
> > +	return true;
> > +}
> > +
> > +static void guest_check_pte_af(void)
> 
> nit: call this guest_test_pte_af(). You use the guest_check_* pattern
> for test preconditions (like guest_check_lse()).
> 
> > +{
> > +	flush_tlb_page(TEST_PTE_GVA);
> 
> What is the purpose of this flush? I believe you are actually depending
> on a dsb(ish) between the hardware PTE update and the load below. Or,
> that's at least what I gleaned from the jargon of DDI0487H.a D5.4.13 
> 'Ordering of hardware updates to the translation tables'.

This was also supposed to be: flush_tlb_page(TEST_GVA)
But I will remove it based on D5.4.13, as it indeed says that the DSB
should be enough.

> 
> > +	GUEST_ASSERT_EQ(*((uint64_t *)TEST_PTE_GVA) & PTE_AF, PTE_AF);
> > +}
> 
> [...]
> 
> > +static void sync_stats_from_guest(struct kvm_vm *vm)
> > +{
> > +	struct event_cnt *ec = addr_gva2hva(vm, (uint64_t)&events);
> > +
> > +	events.aborts += ec->aborts;
> > +}
> 
> I believe you can use sync_global_from_guest() instead of this.
> 
> > +void fail_vcpu_run_no_handler(int ret)
> > +{
> > +	TEST_FAIL("Unexpected vcpu run failure\n");
> > +}
> > +
> > +extern unsigned char __exec_test;
> > +
> > +void noinline __return_0x77(void)
> > +{
> > +	asm volatile("__exec_test: mov x0, #0x77\n"
> > +			"ret\n");
> > +}
> > +
> > +static void load_exec_code_for_test(void)
> > +{
> > +	uint64_t *code, *c;
> > +
> > +	assert(TEST_EXEC_GVA - TEST_GVA);
> > +	code = memslot[TEST].hva + 8;
> > +
> > +	/*
> > +	 * We need the cast to be separate in order for the compiler to not
> > +	 * complain with: "‘memcpy’ forming offset [1, 7] is out of the bounds
> > +	 * [0, 1] of object ‘__exec_test’ with type ‘unsigned char’"
> > +	 */
> > +	c = (uint64_t *)&__exec_test;
> > +	memcpy(code, c, 8);
> 
> Don't you need to sync D$ and I$?

This is done before running the VM for the first time, and it's only
ever written this one time. I think KVM itself does the sync when
mapping new pages in for the first time, which would be the case here.

> 
> --
> Thanks,
> Oliver

ACK on all the other points, will fix accordingly.
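
(For the PAR_EL1 one, the fix is just an isb() between the AT and the
read of the register, roughly:)

	asm volatile("at s1e1r, %0" :: "r" (guest_test_memory));
	isb();
	par = read_sysreg(par_el1);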

Thanks for the review!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 02/13] KVM: selftests: aarch64: Add virt_get_pte_hva library function
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-12  9:12     ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:12 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, pbonzini, maz, alexandru.elisei, eric.auger, oupton,
	reijiw, rananta, bgardon, dmatlack, axelrasmussen

On Fri, Jun 24, 2022 at 02:32:46PM -0700, Ricardo Koller wrote:
> Add a library function to get the PTE (a host virtual address) of a
> given GVA.  This will be used in a future commit by a test to clear and
> check the access flag of a particular page.
> 
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h       |  2 ++
>  tools/testing/selftests/kvm/lib/aarch64/processor.c | 13 ++++++++++---
>  2 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index a8124f9dd68a..df4bfac69551 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -109,6 +109,8 @@ void vm_install_exception_handler(struct kvm_vm *vm,
>  void vm_install_sync_handler(struct kvm_vm *vm,
>  		int vector, int ec, handler_fn handler);
>  
> +uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva);
> +
>  static inline void cpu_relax(void)
>  {
>  	asm volatile("yield" ::: "memory");
> diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> index 6f5551368944..63ef3c78e55e 100644
> --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
> +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> @@ -138,7 +138,7 @@ void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
>  	_virt_pg_map(vm, vaddr, paddr, attr_idx);
>  }
>  
> -vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
> +uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva)
>  {
>  	uint64_t *ptep;
>  
> @@ -169,11 +169,18 @@ vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
>  		TEST_FAIL("Page table levels must be 2, 3, or 4");
>  	}
>  
> -	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
> +	return ptep;
>  
>  unmapped_gva:
>  	TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);
> -	exit(1);
> +	exit(EXIT_FAILURE);
> +}
> +
> +vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
> +{
> +	uint64_t *ptep = virt_get_pte_hva(vm, gva);
> +
> +	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
>  }
>  
>  static void pte_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent, uint64_t page, int level)
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 02/13] KVM: selftests: aarch64: Add virt_get_pte_hva library function
@ 2022-07-12  9:12     ` Andrew Jones
  0 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:12 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, maz, bgardon, dmatlack, pbonzini, axelrasmussen, kvmarm

On Fri, Jun 24, 2022 at 02:32:46PM -0700, Ricardo Koller wrote:
> Add a library function to get the PTE (a host virtual address) of a
> given GVA.  This will be used in a future commit by a test to clear and
> check the access flag of a particular page.
> 
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h       |  2 ++
>  tools/testing/selftests/kvm/lib/aarch64/processor.c | 13 ++++++++++---
>  2 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index a8124f9dd68a..df4bfac69551 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -109,6 +109,8 @@ void vm_install_exception_handler(struct kvm_vm *vm,
>  void vm_install_sync_handler(struct kvm_vm *vm,
>  		int vector, int ec, handler_fn handler);
>  
> +uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva);
> +
>  static inline void cpu_relax(void)
>  {
>  	asm volatile("yield" ::: "memory");
> diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> index 6f5551368944..63ef3c78e55e 100644
> --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
> +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> @@ -138,7 +138,7 @@ void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
>  	_virt_pg_map(vm, vaddr, paddr, attr_idx);
>  }
>  
> -vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
> +uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva)
>  {
>  	uint64_t *ptep;
>  
> @@ -169,11 +169,18 @@ vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
>  		TEST_FAIL("Page table levels must be 2, 3, or 4");
>  	}
>  
> -	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
> +	return ptep;
>  
>  unmapped_gva:
>  	TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);
> -	exit(1);
> +	exit(EXIT_FAILURE);
> +}
> +
> +vm_paddr_t addr_arch_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
> +{
> +	uint64_t *ptep = virt_get_pte_hva(vm, gva);
> +
> +	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
>  }
>  
>  static void pte_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent, uint64_t page, int level)
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 03/13] KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-12  9:13     ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:13 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, drjones, pbonzini, maz, alexandru.elisei,
	eric.auger, oupton, reijiw, rananta, bgardon, dmatlack,
	axelrasmussen

On Fri, Jun 24, 2022 at 02:32:47PM -0700, Ricardo Koller wrote:
> Add a library function to allocate a page-table physical page in a
> particular memslot.  The default behavior is to create new page-table
> pages in memslot 0.
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Reviewed-by: Ben Gardon <bgardon@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c          | 8 +++++++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 7ebfc8c7de17..54ede9fc923c 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -579,6 +579,7 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
>  vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
>  			      vm_paddr_t paddr_min, uint32_t memslot);
>  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
> +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot);
>  
>  /*
>   * ____vm_create() does KVM_CREATE_VM and little else.  __vm_create() also
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index f8c104dba258..5ee20d4da222 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -1784,9 +1784,15 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
>  /* Arbitrary minimum physical address used for virtual translation tables. */
>  #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
>  
> +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot)
> +{
> +	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
> +			pt_memslot);
> +}
> +
>  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm)
>  {
> -	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0);
> +	return vm_alloc_page_table_in_memslot(vm, 0);
>  }
>  
>  /*
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 03/13] KVM: selftests: Add vm_alloc_page_table_in_memslot library function
@ 2022-07-12  9:13     ` Andrew Jones
  0 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:13 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: drjones, kvm, maz, bgardon, dmatlack, pbonzini, axelrasmussen, kvmarm

On Fri, Jun 24, 2022 at 02:32:47PM -0700, Ricardo Koller wrote:
> Add a library function to allocate a page-table physical page in a
> particular memslot.  The default behavior is to create new page-table
> pages in memslot 0.
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Reviewed-by: Ben Gardon <bgardon@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c          | 8 +++++++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 7ebfc8c7de17..54ede9fc923c 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -579,6 +579,7 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
>  vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
>  			      vm_paddr_t paddr_min, uint32_t memslot);
>  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
> +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot);
>  
>  /*
>   * ____vm_create() does KVM_CREATE_VM and little else.  __vm_create() also
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index f8c104dba258..5ee20d4da222 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -1784,9 +1784,15 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
>  /* Arbitrary minimum physical address used for virtual translation tables. */
>  #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
>  
> +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot)
> +{
> +	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
> +			pt_memslot);
> +}
> +
>  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm)
>  {
> -	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0);
> +	return vm_alloc_page_table_in_memslot(vm, 0);
>  }
>  
>  /*
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 04/13] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-12  9:33     ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:33 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, maz, bgardon, dmatlack, pbonzini, axelrasmussen

On Fri, Jun 24, 2022 at 02:32:48PM -0700, Ricardo Koller wrote:
> Add an argument, pt_memslot, into _virt_pg_map in order to use a
> specific memslot for the page-table allocations performed when creating
> a new map. This will be used in a future commit to test having PTEs
> stored on memslots with different setups (e.g., hugetlb with a hole).
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h        |  3 +++
>  tools/testing/selftests/kvm/lib/aarch64/processor.c  | 12 ++++++------
>  2 files changed, 9 insertions(+), 6 deletions(-)
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 04/13] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
@ 2022-07-12  9:33     ` Andrew Jones
  0 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:33 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, maz, axelrasmussen, bgardon, dmatlack, pbonzini, kvmarm

On Fri, Jun 24, 2022 at 02:32:48PM -0700, Ricardo Koller wrote:
> Add an argument, pt_memslot, into _virt_pg_map in order to use a
> specific memslot for the page-table allocations performed when creating
> a new map. This will be used in a future commit to test having PTEs
> stored on memslots with different setups (e.g., hugetlb with a hole).
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h        |  3 +++
>  tools/testing/selftests/kvm/lib/aarch64/processor.c  | 12 ++++++------
>  2 files changed, 9 insertions(+), 6 deletions(-)
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 05/13] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-12  9:35     ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:35 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, pbonzini, maz, alexandru.elisei, eric.auger, oupton,
	reijiw, rananta, bgardon, dmatlack, axelrasmussen

On Fri, Jun 24, 2022 at 02:32:49PM -0700, Ricardo Koller wrote:
> Deleting a memslot (when freeing a VM) is not closing the backing fd,
> nor is it unmapping the alias mapping. Fix by adding the missing close
> and munmap.
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Reviewed-by: Ben Gardon <bgardon@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 5ee20d4da222..3e45e3776bdf 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -531,6 +531,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
>  	sparsebit_free(&region->unused_phy_pages);
>  	ret = munmap(region->mmap_start, region->mmap_size);
>  	TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
> +	if (region->fd >= 0) {
> +		/* There's an extra map when using shared memory. */
> +		ret = munmap(region->mmap_alias, region->mmap_size);
> +		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
> +		close(region->fd);
> +	}
>  
>  	free(region);
>  }
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 05/13] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
@ 2022-07-12  9:35     ` Andrew Jones
  0 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:35 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, maz, bgardon, dmatlack, pbonzini, axelrasmussen, kvmarm

On Fri, Jun 24, 2022 at 02:32:49PM -0700, Ricardo Koller wrote:
> Deleting a memslot (when freeing a VM) is not closing the backing fd,
> nor is it unmapping the alias mapping. Fix by adding the missing close
> and munmap.
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Reviewed-by: Ben Gardon <bgardon@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 5ee20d4da222..3e45e3776bdf 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -531,6 +531,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
>  	sparsebit_free(&region->unused_phy_pages);
>  	ret = munmap(region->mmap_start, region->mmap_size);
>  	TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
> +	if (region->fd >= 0) {
> +		/* There's an extra map when using shared memory. */
> +		ret = munmap(region->mmap_alias, region->mmap_size);
> +		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
> +		close(region->fd);
> +	}
>  
>  	free(region);
>  }
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 06/13] KVM: selftests: Add vm_mem_region_get_src_fd library function
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-12  9:40     ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:40 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, pbonzini, maz, alexandru.elisei, eric.auger, oupton,
	reijiw, rananta, bgardon, dmatlack, axelrasmussen

On Fri, Jun 24, 2022 at 02:32:50PM -0700, Ricardo Koller wrote:
> Add a library function to get the backing source FD of a memslot.
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 54ede9fc923c..72c8881fe8fb 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -322,6 +322,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
>  void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
>  void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
>  void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
> +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
>  struct kvm_vcpu *__vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id);
>  vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
>  vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 3e45e3776bdf..7c81028f23d8 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -466,6 +466,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
>  	return &region->region;
>  }
>  
> +/*
> + * KVM Userspace Memory Get Backing Source FD
> + *
> + * Input Args:
> + *   vm - Virtual Machine
> + *   memslot - KVM memory slot ID
> + *
> + * Output Args: None
> + *
> + * Return:
> + *   Backing source file descriptor, -1 if the memslot is an anonymous region.
> + *
> + * Returns the backing source fd of a memslot, so tests can use it to punch
> + * holes, or to setup permissions.
> + */

nit: We're starting to slowly change these verbose function headers into
smaller headers, so this could be reduced.

> +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
> +{
> +	struct userspace_mem_region *region;
> +
> +	region = memslot2region(vm, memslot);
> +	return region->fd;
> +}
> +
>  /*
>   * VM VCPU Remove
>   *
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
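
P.S. re the verbose header, a condensed version could be as simple as
(just a sketch):

	/*
	 * Return the backing source fd of a memslot (-1 for anonymous memory),
	 * so tests can use it to punch holes or set up permissions.
	 */
	int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);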

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 06/13] KVM: selftests: Add vm_mem_region_get_src_fd library function
@ 2022-07-12  9:40     ` Andrew Jones
  0 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:40 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, maz, bgardon, dmatlack, pbonzini, axelrasmussen, kvmarm

On Fri, Jun 24, 2022 at 02:32:50PM -0700, Ricardo Koller wrote:
> Add a library function to get the backing source FD of a memslot.
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 54ede9fc923c..72c8881fe8fb 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -322,6 +322,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
>  void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
>  void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
>  void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
> +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
>  struct kvm_vcpu *__vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id);
>  vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
>  vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 3e45e3776bdf..7c81028f23d8 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -466,6 +466,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
>  	return &region->region;
>  }
>  
> +/*
> + * KVM Userspace Memory Get Backing Source FD
> + *
> + * Input Args:
> + *   vm - Virtual Machine
> + *   memslot - KVM memory slot ID
> + *
> + * Output Args: None
> + *
> + * Return:
> + *   Backing source file descriptor, -1 if the memslot is an anonymous region.
> + *
> + * Returns the backing source fd of a memslot, so tests can use it to punch
> + * holes, or to setup permissions.
> + */

nit: We're starting to slowly change these verbose function headers into
smaller headers, so this could be reduced.

> +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
> +{
> +	struct userspace_mem_region *region;
> +
> +	region = memslot2region(vm, memslot);
> +	return region->fd;
> +}
> +
>  /*
>   * VM VCPU Remove
>   *
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 07/13] KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macros
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-12  9:46     ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:46 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, pbonzini, maz, alexandru.elisei, eric.auger, oupton,
	reijiw, rananta, bgardon, dmatlack, axelrasmussen

On Fri, Jun 24, 2022 at 02:32:51PM -0700, Ricardo Koller wrote:
> Define macros for memory type indexes and construct DEFAULT_MAIR_EL1
> with macros from asm/sysreg.h.  The index macros can then be used when
> constructing PTEs (instead of using raw numbers).
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h | 25 ++++++++++++++-----
>  .../selftests/kvm/lib/aarch64/processor.c     |  2 +-
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index 6649671fa7c1..74f10d006e15 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -38,12 +38,25 @@
>   * NORMAL             4     1111:1111
>   * NORMAL_WT          5     1011:1011
>   */
> -#define DEFAULT_MAIR_EL1 ((0x00ul << (0 * 8)) | \
> -			  (0x04ul << (1 * 8)) | \
> -			  (0x0cul << (2 * 8)) | \
> -			  (0x44ul << (3 * 8)) | \
> -			  (0xfful << (4 * 8)) | \
> -			  (0xbbul << (5 * 8)))
> +
> +/* Linux doesn't use these memory types, so let's define them. */
> +#define MAIR_ATTR_DEVICE_GRE	UL(0x0c)
> +#define MAIR_ATTR_NORMAL_WT	UL(0xbb)
> +
> +#define MT_DEVICE_nGnRnE	0
> +#define MT_DEVICE_nGnRE		1
> +#define MT_DEVICE_GRE		2
> +#define MT_NORMAL_NC		3
> +#define MT_NORMAL		4
> +#define MT_NORMAL_WT		5
> +
> +#define DEFAULT_MAIR_EL1							\
> +	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |		\
> +	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |		\
> +	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |			\
> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |			\
> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |				\
> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
>  
>  #define MPIDR_HWID_BITMASK (0xff00fffffful)
>  
> diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> index 8dd511aa79c2..733a2b713580 100644
> --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
> +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> @@ -133,7 +133,7 @@ void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
>  
>  void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
>  {
> -	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
> +	uint64_t attr_idx = MT_NORMAL;
>  
>  	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
>  }
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 07/13] KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macros
@ 2022-07-12  9:46     ` Andrew Jones
  0 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-12  9:46 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, maz, bgardon, dmatlack, pbonzini, axelrasmussen, kvmarm

On Fri, Jun 24, 2022 at 02:32:51PM -0700, Ricardo Koller wrote:
> Define macros for memory type indexes and construct DEFAULT_MAIR_EL1
> with macros from asm/sysreg.h.  The index macros can then be used when
> constructing PTEs (instead of using raw numbers).
> 
> Reviewed-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h | 25 ++++++++++++++-----
>  .../selftests/kvm/lib/aarch64/processor.c     |  2 +-
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index 6649671fa7c1..74f10d006e15 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -38,12 +38,25 @@
>   * NORMAL             4     1111:1111
>   * NORMAL_WT          5     1011:1011
>   */
> -#define DEFAULT_MAIR_EL1 ((0x00ul << (0 * 8)) | \
> -			  (0x04ul << (1 * 8)) | \
> -			  (0x0cul << (2 * 8)) | \
> -			  (0x44ul << (3 * 8)) | \
> -			  (0xfful << (4 * 8)) | \
> -			  (0xbbul << (5 * 8)))
> +
> +/* Linux doesn't use these memory types, so let's define them. */
> +#define MAIR_ATTR_DEVICE_GRE	UL(0x0c)
> +#define MAIR_ATTR_NORMAL_WT	UL(0xbb)
> +
> +#define MT_DEVICE_nGnRnE	0
> +#define MT_DEVICE_nGnRE		1
> +#define MT_DEVICE_GRE		2
> +#define MT_NORMAL_NC		3
> +#define MT_NORMAL		4
> +#define MT_NORMAL_WT		5
> +
> +#define DEFAULT_MAIR_EL1							\
> +	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |		\
> +	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |		\
> +	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |			\
> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |			\
> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |				\
> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
>  
>  #define MPIDR_HWID_BITMASK (0xff00fffffful)
>  
> diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> index 8dd511aa79c2..733a2b713580 100644
> --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
> +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> @@ -133,7 +133,7 @@ void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
>  
>  void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
>  {
> -	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
> +	uint64_t attr_idx = MT_NORMAL;
>  
>  	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
>  }
> -- 
> 2.37.0.rc0.161.g10f37bed90-goog
>

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-06-24 21:32   ` Ricardo Koller
@ 2022-07-21  1:29     ` Sean Christopherson
  -1 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2022-07-21  1:29 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, drjones, maz, bgardon, dmatlack, pbonzini, axelrasmussen

On Fri, Jun 24, 2022, Ricardo Koller wrote:
> Add a new test for stage 2 faults when using different combinations of
> guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> and types of faults (e.g., read on hugetlbfs with a hole). The next
> commits will add different handling methods and more faults (e.g., uffd
> and dirty logging). This first commit starts by adding two sanity checks
> for all types of accesses: AF setting by the hw, and accessing memslots
> with holes.
> 
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---

...

> +/*
> + * Create a memslot for test data (memslot[TEST]) and another one for PT
> + * tables (memslot[PT]). This diagram show the resulting guest virtual and
> + * physical address space when using 4K backing pages for the memslots, and
> + * 4K guest pages.
> + *
> + *                   Guest physical            Guest virtual
> + *
> + *                  |              |          |             |
> + *                  |              |          |             |
> + *                  +--------------+          +-------------+
> + * max_gfn - 0x1000 | TEST memslot |<---------+  test data  | 0xc0000000
> + *                  +--------------+          +-------------+
> + * max_gfn - 0x2000 |     gap      |<---------+     gap     | 0xbffff000
> + *                  +--------------+          +-------------+
> + *                  |              |          |             |
> + *                  |              |          |             |
> + *                  |  PT memslot  |          |             |
> + *                  |              |          +-------------+
> + * max_gfn - 0x6000 |              |<----+    |             |
> + *                  +--------------+     |    |             |
> + *                  |              |     |    | PTE for the |
> + *                  |              |     |    | test data   |
> + *                  |              |     +----+ page        | 0xb0000000
> + *                  |              |          +-------------+
> + *                  |              |          |             |
> + *                  |              |          |             |
> + *
> + * Using different guest page sizes or backing pages will result in the
> + * same layout but at different addresses. In particular, the memslot
> + * sizes need to be multiple of the backing page sizes (e.g., 2MB).
> + */
> +static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
> +		struct test_params *p)
> +{
> +	uint64_t backing_page_size = get_backing_src_pagesz(p->src_type);
> +	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
> +	struct test_desc *test = p->test_desc;
> +	uint64_t gap_gpa;
> +	uint64_t alignment;
> +	int i;
> +
> +	memslot[TEST].size = align_up(guest_page_size, backing_page_size);
> +	/*
> +	 * We need one guest page for the PT table containing the PTE (for
> +	 * TEST_GVA), but might need more in case the higher level PT tables
> +	 * were not allocated yet.
> +	 */
> +	memslot[PT].size = align_up(4 * guest_page_size, backing_page_size);
> +
> +	for (i = 0; i < NR_MEMSLOTS; i++) {
> +		memslot[i].guest_pages = memslot[i].size / guest_page_size;
> +		memslot[i].src_type = p->src_type;
> +	}
> +
> +	/* Place the memslots GPAs at the end of physical memory */
> +	alignment = max(backing_page_size, guest_page_size);
> +	memslot[TEST].gpa = (vm->max_gfn - memslot[TEST].guest_pages) *
> +		guest_page_size;
> +	memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
> +
> +	/* Add a 1-guest_page gap between the two memslots */
> +	gap_gpa = memslot[TEST].gpa - guest_page_size;
> +	/* Map the gap so it's still adressable from the guest.  */
> +	virt_pg_map(vm, TEST_GVA - guest_page_size, gap_gpa);
> +
> +	memslot[PT].gpa = gap_gpa - memslot[PT].size;
> +	memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
> +
> +	vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
> +			memslot[PT].idx, memslot[PT].guest_pages,
> +			test->pt_memslot_flags);
> +	vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
> +			memslot[TEST].idx, memslot[TEST].guest_pages,
> +			test->test_memslot_flags);
> +
> +	for (i = 0; i < NR_MEMSLOTS; i++)
> +		memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
> +
> +	/* Map the test TEST_GVA using the PT memslot. */
> +	_virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa, MT_NORMAL,
> +			TEST_PT_SLOT_INDEX);

Use memslot[TEST].idx instead of TEST_PT_SLOT_INDEX to be consistent, though my
preference would be to avoid this API.

IIUC, the goal is to test different backing stores for the memory the guest uses
for its page tables.  But do we care about testing that a single guest's page
tables are backed with different types concurrently?  If we don't care, and maybe
even if we do, then my preference would be to enhance the __vm_create family of
helpers to allow for specifying what backing type should be used for page tables,
i.e. associate the info with the VM instead of passing it around the stack.

One idea would be to do something like David Matlack suggested a while back and
replace extra_mem_pages with a struct, e.g. struct kvm_vm_mem_params.  That struct
can then provide the necessary knobs to control how memory is allocated.  And then
the lib can provide a global

	struct kvm_vm_mem_params kvm_default_vm_mem_params;

or whatever (probably a shorter name) for the tests that don't care.  And then
down the road, if someone wants to control the backing type for code vs. data,
we could add those flags to kvm_vm_mem_params and add vm_vaddr_alloc() wrappers
to alloc code vs. data (with a default to data?).

I don't like the param approach because it bleeds implementation details that
really shouldn't matter, e.g. which memslot is the default, into tests.  And it's
not very easy to use, e.g. if a different test wants to do something similar,
it would have to create its own memslot, populate the tables, etc...
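
Something like the following, purely for illustration (names and fields
are hypothetical, not an existing API):

	struct kvm_vm_mem_params {
		uint64_t extra_mem_pages;
		/* Backing type for regular test memory and for guest page tables. */
		enum vm_mem_backing_src_type backing_src;
		enum vm_mem_backing_src_type pt_backing_src;
	};

	/* Library-provided default for tests that don't care. */
	extern struct kvm_vm_mem_params kvm_default_vm_mem_params;

	struct kvm_vm *__vm_create(enum vm_guest_mode mode, uint32_t nr_runnable_vcpus,
				   const struct kvm_vm_mem_params *mem_params);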

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
@ 2022-07-21  1:29     ` Sean Christopherson
  0 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2022-07-21  1:29 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: drjones, kvm, maz, axelrasmussen, bgardon, dmatlack, pbonzini, kvmarm

On Fri, Jun 24, 2022, Ricardo Koller wrote:
> Add a new test for stage 2 faults when using different combinations of
> guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> and types of faults (e.g., read on hugetlbfs with a hole). The next
> commits will add different handling methods and more faults (e.g., uffd
> and dirty logging). This first commit starts by adding two sanity checks
> for all types of accesses: AF setting by the hw, and accessing memslots
> with holes.
> 
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---

...

> +/*
> + * Create a memslot for test data (memslot[TEST]) and another one for PT
> + * tables (memslot[PT]). This diagram show the resulting guest virtual and
> + * physical address space when using 4K backing pages for the memslots, and
> + * 4K guest pages.
> + *
> + *                   Guest physical            Guest virtual
> + *
> + *                  |              |          |             |
> + *                  |              |          |             |
> + *                  +--------------+          +-------------+
> + * max_gfn - 0x1000 | TEST memslot |<---------+  test data  | 0xc0000000
> + *                  +--------------+          +-------------+
> + * max_gfn - 0x2000 |     gap      |<---------+     gap     | 0xbffff000
> + *                  +--------------+          +-------------+
> + *                  |              |          |             |
> + *                  |              |          |             |
> + *                  |  PT memslot  |          |             |
> + *                  |              |          +-------------+
> + * max_gfn - 0x6000 |              |<----+    |             |
> + *                  +--------------+     |    |             |
> + *                  |              |     |    | PTE for the |
> + *                  |              |     |    | test data   |
> + *                  |              |     +----+ page        | 0xb0000000
> + *                  |              |          +-------------+
> + *                  |              |          |             |
> + *                  |              |          |             |
> + *
> + * Using different guest page sizes or backing pages will result in the
> + * same layout but at different addresses. In particular, the memslot
> + * sizes need to be multiple of the backing page sizes (e.g., 2MB).
> + */
> +static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
> +		struct test_params *p)
> +{
> +	uint64_t backing_page_size = get_backing_src_pagesz(p->src_type);
> +	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
> +	struct test_desc *test = p->test_desc;
> +	uint64_t gap_gpa;
> +	uint64_t alignment;
> +	int i;
> +
> +	memslot[TEST].size = align_up(guest_page_size, backing_page_size);
> +	/*
> +	 * We need one guest page for the PT table containing the PTE (for
> +	 * TEST_GVA), but might need more in case the higher level PT tables
> +	 * were not allocated yet.
> +	 */
> +	memslot[PT].size = align_up(4 * guest_page_size, backing_page_size);
> +
> +	for (i = 0; i < NR_MEMSLOTS; i++) {
> +		memslot[i].guest_pages = memslot[i].size / guest_page_size;
> +		memslot[i].src_type = p->src_type;
> +	}
> +
> +	/* Place the memslots GPAs at the end of physical memory */
> +	alignment = max(backing_page_size, guest_page_size);
> +	memslot[TEST].gpa = (vm->max_gfn - memslot[TEST].guest_pages) *
> +		guest_page_size;
> +	memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
> +
> +	/* Add a 1-guest_page gap between the two memslots */
> +	gap_gpa = memslot[TEST].gpa - guest_page_size;
> +	/* Map the gap so it's still adressable from the guest.  */
> +	virt_pg_map(vm, TEST_GVA - guest_page_size, gap_gpa);
> +
> +	memslot[PT].gpa = gap_gpa - memslot[PT].size;
> +	memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
> +
> +	vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
> +			memslot[PT].idx, memslot[PT].guest_pages,
> +			test->pt_memslot_flags);
> +	vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
> +			memslot[TEST].idx, memslot[TEST].guest_pages,
> +			test->test_memslot_flags);
> +
> +	for (i = 0; i < NR_MEMSLOTS; i++)
> +		memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
> +
> +	/* Map the test TEST_GVA using the PT memslot. */
> +	_virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa, MT_NORMAL,
> +			TEST_PT_SLOT_INDEX);

Use memslot[TEST].idx instead of TEST_PT_SLOT_INDEX to be consistent, though my
preference would be to avoid this API.

IIUC, the goal is to test different backing stores for the memory the guest uses
for its page tables.  But do we care about testing that a single guest's page
tables are backed with different types concurrently?  If we don't care, and maybe
even if we do, then my preference would be to enhance the __vm_create family of
helpers to allow for specifying what backing type should be used for page tables,
i.e. associate the info with the VM instead of passing it around the stack.

One idea would be to do something like David Matlack suggested a while back and
replace extra_mem_pages with a struct, e.g. struct kvm_vm_mem_params.  That struct
can then provide the necessary knobs to control how memory is allocated.  And then
the lib can provide a global

	struct kvm_vm_mem_params kvm_default_vm_mem_params;

or whatever (probably a shorter name) for the tests that don't care.  And then
down the road, if someone wants to control the backing type for code vs. data,
we could add those flags to kvm_vm_mem_params and add vm_vaddr_alloc() wrappers
to alloc code vs. data (with a default to data?).

I don't like the param approach because it bleeds implementation details that
really shouldn't matter, e.g. which memslot is the default, into tests.  And it's
not very easy to use, e.g. if a different test wants to do something similar,
it would have to create its own memslot, populate the tables, etc...

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-07-21  1:29     ` Sean Christopherson
@ 2022-07-22 17:19       ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-07-22 17:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, kvmarm, drjones, maz, bgardon, dmatlack, pbonzini, axelrasmussen

On Thu, Jul 21, 2022 at 01:29:34AM +0000, Sean Christopherson wrote:
> On Fri, Jun 24, 2022, Ricardo Koller wrote:
> > Add a new test for stage 2 faults when using different combinations of
> > guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> > and types of faults (e.g., read on hugetlbfs with a hole). The next
> > commits will add different handling methods and more faults (e.g., uffd
> > and dirty logging). This first commit starts by adding two sanity checks
> > for all types of accesses: AF setting by the hw, and accessing memslots
> > with holes.
> > 
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > ---
> 
> ...
> 
> > +/*
> > + * Create a memslot for test data (memslot[TEST]) and another one for PT
> > + * tables (memslot[PT]). This diagram show the resulting guest virtual and
> > + * physical address space when using 4K backing pages for the memslots, and
> > + * 4K guest pages.
> > + *
> > + *                   Guest physical            Guest virtual
> > + *
> > + *                  |              |          |             |
> > + *                  |              |          |             |
> > + *                  +--------------+          +-------------+
> > + * max_gfn - 0x1000 | TEST memslot |<---------+  test data  | 0xc0000000
> > + *                  +--------------+          +-------------+
> > + * max_gfn - 0x2000 |     gap      |<---------+     gap     | 0xbffff000
> > + *                  +--------------+          +-------------+
> > + *                  |              |          |             |
> > + *                  |              |          |             |
> > + *                  |  PT memslot  |          |             |
> > + *                  |              |          +-------------+
> > + * max_gfn - 0x6000 |              |<----+    |             |
> > + *                  +--------------+     |    |             |
> > + *                  |              |     |    | PTE for the |
> > + *                  |              |     |    | test data   |
> > + *                  |              |     +----+ page        | 0xb0000000
> > + *                  |              |          +-------------+
> > + *                  |              |          |             |
> > + *                  |              |          |             |
> > + *
> > + * Using different guest page sizes or backing pages will result in the
> > + * same layout but at different addresses. In particular, the memslot
> > + * sizes need to be multiple of the backing page sizes (e.g., 2MB).
> > + */
> > +static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
> > +		struct test_params *p)
> > +{
> > +	uint64_t backing_page_size = get_backing_src_pagesz(p->src_type);
> > +	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
> > +	struct test_desc *test = p->test_desc;
> > +	uint64_t gap_gpa;
> > +	uint64_t alignment;
> > +	int i;
> > +
> > +	memslot[TEST].size = align_up(guest_page_size, backing_page_size);
> > +	/*
> > +	 * We need one guest page for the PT table containing the PTE (for
> > +	 * TEST_GVA), but might need more in case the higher level PT tables
> > +	 * were not allocated yet.
> > +	 */
> > +	memslot[PT].size = align_up(4 * guest_page_size, backing_page_size);
> > +
> > +	for (i = 0; i < NR_MEMSLOTS; i++) {
> > +		memslot[i].guest_pages = memslot[i].size / guest_page_size;
> > +		memslot[i].src_type = p->src_type;
> > +	}
> > +
> > +	/* Place the memslots GPAs at the end of physical memory */
> > +	alignment = max(backing_page_size, guest_page_size);
> > +	memslot[TEST].gpa = (vm->max_gfn - memslot[TEST].guest_pages) *
> > +		guest_page_size;
> > +	memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
> > +
> > +	/* Add a 1-guest_page gap between the two memslots */
> > +	gap_gpa = memslot[TEST].gpa - guest_page_size;
> > +	/* Map the gap so it's still adressable from the guest.  */
> > +	virt_pg_map(vm, TEST_GVA - guest_page_size, gap_gpa);
> > +
> > +	memslot[PT].gpa = gap_gpa - memslot[PT].size;
> > +	memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
> > +
> > +	vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
> > +			memslot[PT].idx, memslot[PT].guest_pages,
> > +			test->pt_memslot_flags);
> > +	vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
> > +			memslot[TEST].idx, memslot[TEST].guest_pages,
> > +			test->test_memslot_flags);
> > +
> > +	for (i = 0; i < NR_MEMSLOTS; i++)
> > +		memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
> > +
> > +	/* Map the test TEST_GVA using the PT memslot. */
> > +	_virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa, MT_NORMAL,
> > +			TEST_PT_SLOT_INDEX);
> 
> Use memslot[TEST].idx instead of TEST_PT_SLOT_INDEX to be consistent, though my
> preference would be to avoid this API.
> 
> IIUC, the goal is to test different backing stores for the memory the guest uses
> for its page tables.  But do we care about testing that a single guest's page
> tables are backed with different types concurrently?

This test creates a new VM for each subtest, so an API like that would
definitely make this code simpler/nicer.

> If we don't care, and maybe
> even if we do, then my preference would be to enhance the __vm_create family of
> helpers to allow for specifying what backing type should be used for page tables,
> i.e. associate the info with the VM instead of passing it around the stack.
> 
> One idea would be to do something like David Matlack suggested a while back and
> replace extra_mem_pages with a struct, e.g. struct kvm_vm_mem_params.  That struct
> can then provide the necessary knobs to control how memory is allocated.  And then
> the lib can provide a global
> 
> 	struct kvm_vm_mem_params kvm_default_vm_mem_params;
> 

I like this idea, passing the info at vm creation.

What about dividing the changes into two?

	1. I will add the struct to "__vm_create()" as part of this
	series, and then use it in this commit. There's only one user:

		dirty_log_test.c:   vm = __vm_create(mode, 1, extra_mem_pages);

	so that would avoid having to touch every test as part of this patchset
	(a rough sketch of the resulting call site is below).

	2. I can then send another series to add support for all the other
	vm_create() functions.

Alternatively, I can send a new series that does 1 and 2 afterwards.
WDYT?
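
(Just to illustrate step 1, the single call site would become something
along these lines; struct and field names are hypothetical:)

	struct kvm_vm_mem_params mem_params = kvm_default_vm_mem_params;

	mem_params.extra_mem_pages = extra_mem_pages;
	vm = __vm_create(mode, 1, &mem_params);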

> or whatever (probably a shorter name) for the tests that don't care.  And then
> down the road, if someone wants to control the backing type for code vs. data,
> we could add those flags to kvm_vm_mem_params and add vm_vaddr_alloc() wrappers
> to alloc code vs. data (with a default to data?).
> 
> I don't like the param approach because it bleeds implementation details that
> really shouldn't matter, e.g. which memslot is the default, into tests.  And it's
> not very easy to use, e.g. if a different test wants to do something similar,
> it would have to create its own memslot, populate the tables, etc...
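
To make the simplification concrete, the last mapping in setup_memslots()
could go from this:

	/* the PT memslot index has to be threaded through every mapping call */
	_virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa, MT_NORMAL,
			TEST_PT_SLOT_INDEX);

to something like this (a sketch: virt_pg_map() stands in for whatever
helper ends up consuming the VM-associated page-table memslot):

	virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa);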

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-07-22 17:19       ` Ricardo Koller
@ 2022-07-22 18:20         ` Sean Christopherson
  -1 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2022-07-22 18:20 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, drjones, maz, bgardon, dmatlack, pbonzini, axelrasmussen

On Fri, Jul 22, 2022, Ricardo Koller wrote:
> On Thu, Jul 21, 2022 at 01:29:34AM +0000, Sean Christopherson wrote:
> > If we don't care, and maybe even if we do, then my preference would be to
> > enhance the __vm_create family of helpers to allow for specifying what
> > backing type should be used for page tables, i.e. associate the info with the VM
> > instead of passing it around the stack.
> > 
> > One idea would be to do something like David Matlack suggested a while back
> > and replace extra_mem_pages with a struct, e.g. struct kvm_vm_mem_params.
> > That struct can then provide the necessary knobs to control how memory is
> > allocated.  And then the lib can provide a global
> > 
> > 	struct kvm_vm_mem_params kvm_default_vm_mem_params;
> > 
> 
> I like this idea, passing the info at vm creation.
> 
> What about dividing the changes in two.
> 
> 	1. Will add the struct to "__vm_create()" as part of this
> 	series, and then use it in this commit. There's only one user
> 
> 		dirty_log_test.c:   vm = __vm_create(mode, 1, extra_mem_pages);
> 
> 	so that would avoid having to touch every test as part of this patchset.
> 
> 	2. I can then send another series to add support for all the other
> 	vm_create() functions.
> 
> Alternatively, I can send a new series that does 1 and 2 afterwards.
> WDYT?

Don't do #2, ever. :-)  The intent of having vm_create() versus __vm_create()
is so that tests that don't care about things like backing pages don't have to
pass in extra params.  I very much want to keep that behavior, i.e. I don't want
to extend vm_create() at all.  IMO, adding _anything_ is a slippery slope, e.g.
why are the backing types special enough to get a param, but thing XYZ isn't?

Thinking more, the struct idea probably isn't going to work all that well.  It
again puts the selftests into a state where it becomes difficult to control one
setting and ignore the rest, e.g. the dirty_log_test and anything else with extra
pages suddenly has to care about the backing type for page tables and code.

Rather than adding a struct, what about extending the @mode param?  We already
have vm_mem_backing_src_type, we just need a way to splice things together.  There
are a total of four things we can control: primary mode, and then code, data, and
page tables backing types.

So, turn @mode into a uint32_t and carve out 8 bits for each of those four "modes".
The defaults Just Work because VM_MEM_SRC_ANONYMOUS==0.
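
For example, with the masks from the patch below, a test that wants its
guest page tables backed by hugetlb could build its @mode like so (a
sketch only; FIELD_PREP comes from linux/bitfield.h, and the hugetlb
source is just an arbitrary choice):

	uint32_t mode = VM_MODE_DEFAULT |
			FIELD_PREP(VM_MODE_PAGE_TABLE_MASK,
				   VM_MEM_SRC_ANONYMOUS_HUGETLB);

	vm = __vm_create(mode, 1, 0);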

Lightly tested, but the below should provide the necessary base infrastructure,
then you just need to have ____vm_create() consume the secondary "modes".
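
On the consume side, ____vm_create() could then pull the secondary modes
apart with something like this (also a sketch; where the vm->*_src fields
live is up for grabs):

	vm->code_src = FIELD_GET(VM_MODE_CODE_MASK, mode);
	vm->data_src = FIELD_GET(VM_MODE_DATA_MASK, mode);
	vm->pt_src   = FIELD_GET(VM_MODE_PAGE_TABLE_MASK, mode);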

---
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 22 Jul 2022 10:56:08 -0700
Subject: [PATCH] KVM: selftests: Extend VM creation's @mode to allow control
 of backing types

Carve out space in the @mode passed to the various VM creation helpers to
allow using the mode to control the backing type for code, data, and page
table allocations made by the selftests framework.  E.g. to allow tests
to force guest page tables to be backed with huge pages.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     | 78 ++++++++++++-------
 tools/testing/selftests/kvm/lib/guest_modes.c |  2 +-
 tools/testing/selftests/kvm/lib/kvm_util.c    | 35 +++++----
 3 files changed, 69 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 24fde97f6121..992dcc7b39e7 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -29,6 +29,45 @@
 typedef uint64_t vm_paddr_t; /* Virtual Machine (Guest) physical address */
 typedef uint64_t vm_vaddr_t; /* Virtual Machine (Guest) virtual address */

+
+enum vm_guest_mode {
+	VM_MODE_P52V48_4K,
+	VM_MODE_P52V48_64K,
+	VM_MODE_P48V48_4K,
+	VM_MODE_P48V48_16K,
+	VM_MODE_P48V48_64K,
+	VM_MODE_P40V48_4K,
+	VM_MODE_P40V48_16K,
+	VM_MODE_P40V48_64K,
+	VM_MODE_PXXV48_4K,	/* For 48bits VA but ANY bits PA */
+	VM_MODE_P47V64_4K,
+	VM_MODE_P44V64_4K,
+	VM_MODE_P36V48_4K,
+	VM_MODE_P36V48_16K,
+	VM_MODE_P36V48_64K,
+	VM_MODE_P36V47_16K,
+	NUM_VM_MODES,
+};
+
+/*
+ * There are four flavors of "modes" that tests can control.  The primary mode
+ * defines the physical and virtual address widths, and page sizes configured
+ * in hardware.  The code/data/page_table modifiers control the backing types
+ * for code, data, and page tables that are allocated by the infrastructure,
+ * e.g. to allow tests to force page tables to be backed by huge pages.
+ *
+ * Valid values for the primary mask are "enum vm_guest_mode", and valid values
+ * for code, data, and page tables are "enum vm_mem_backing_src_type".
+ */
+#define VM_MODE_PRIMARY_MASK	GENMASK(7, 0)
+#define VM_MODE_CODE_MASK	GENMASK(15, 8)
+#define VM_MODE_DATA_MASK	GENMASK(23, 16)
+#define VM_MODE_PAGE_TABLE_MASK	GENMASK(31, 24)
+
+/* 8 bits in each mask above, i.e. 256 possible values */
+_Static_assert(NUM_VM_MODES < 256, "primary mode must fit in 8 bits");
+_Static_assert(NUM_SRC_TYPES < 256, "backing src type must fit in 8 bits");
+
 struct userspace_mem_region {
 	struct kvm_userspace_memory_region region;
 	struct sparsebit *unused_phy_pages;
@@ -65,7 +104,7 @@ struct userspace_mem_regions {
 };

 struct kvm_vm {
-	int mode;
+	enum vm_guest_mode mode;
 	unsigned long type;
 	int kvm_fd;
 	int fd;
@@ -111,28 +150,9 @@ memslot2region(struct kvm_vm *vm, uint32_t memslot);
 #define DEFAULT_GUEST_STACK_VADDR_MIN	0xab6000
 #define DEFAULT_STACK_PGS		5

-enum vm_guest_mode {
-	VM_MODE_P52V48_4K,
-	VM_MODE_P52V48_64K,
-	VM_MODE_P48V48_4K,
-	VM_MODE_P48V48_16K,
-	VM_MODE_P48V48_64K,
-	VM_MODE_P40V48_4K,
-	VM_MODE_P40V48_16K,
-	VM_MODE_P40V48_64K,
-	VM_MODE_PXXV48_4K,	/* For 48bits VA but ANY bits PA */
-	VM_MODE_P47V64_4K,
-	VM_MODE_P44V64_4K,
-	VM_MODE_P36V48_4K,
-	VM_MODE_P36V48_16K,
-	VM_MODE_P36V48_64K,
-	VM_MODE_P36V47_16K,
-	NUM_VM_MODES,
-};
-
 #if defined(__aarch64__)

-extern enum vm_guest_mode vm_mode_default;
+extern uint32_t vm_mode_default;

 #define VM_MODE_DEFAULT			vm_mode_default
 #define MIN_PAGE_SHIFT			12U
@@ -642,8 +662,8 @@ vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
  * __vm_create() does NOT create vCPUs, @nr_runnable_vcpus is used purely to
  * calculate the amount of memory needed for per-vCPU data, e.g. stacks.
  */
-struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages);
-struct kvm_vm *__vm_create(enum vm_guest_mode mode, uint32_t nr_runnable_vcpus,
+struct kvm_vm *____vm_create(uint32_t mode, uint64_t nr_pages);
+struct kvm_vm *__vm_create(uint32_t mode, uint32_t nr_runnable_vcpus,
 			   uint64_t nr_extra_pages);

 static inline struct kvm_vm *vm_create_barebones(void)
@@ -656,7 +676,7 @@ static inline struct kvm_vm *vm_create(uint32_t nr_runnable_vcpus)
 	return __vm_create(VM_MODE_DEFAULT, nr_runnable_vcpus, 0);
 }

-struct kvm_vm *__vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
+struct kvm_vm *__vm_create_with_vcpus(uint32_t mode, uint32_t nr_vcpus,
 				      uint64_t extra_mem_pages,
 				      void *guest_code, struct kvm_vcpu *vcpus[]);

@@ -685,11 +705,11 @@ static inline struct kvm_vm *vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
 struct kvm_vcpu *vm_recreate_with_one_vcpu(struct kvm_vm *vm);

 unsigned long vm_compute_max_gfn(struct kvm_vm *vm);
-unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size);
-unsigned int vm_num_host_pages(enum vm_guest_mode mode, unsigned int num_guest_pages);
-unsigned int vm_num_guest_pages(enum vm_guest_mode mode, unsigned int num_host_pages);
-static inline unsigned int
-vm_adjust_num_guest_pages(enum vm_guest_mode mode, unsigned int num_guest_pages)
+unsigned int vm_calc_num_guest_pages(uint32_t mode, size_t size);
+unsigned int vm_num_host_pages(uint32_t mode, unsigned int num_guest_pages);
+unsigned int vm_num_guest_pages(uint32_t mode, unsigned int num_host_pages);
+static inline unsigned int vm_adjust_num_guest_pages(uint32_t mode,
+						     unsigned int num_guest_pages)
 {
 	unsigned int n;
 	n = vm_num_guest_pages(mode, vm_num_host_pages(mode, num_guest_pages));
diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c
index 99a575bbbc52..93c6ca9ebb49 100644
--- a/tools/testing/selftests/kvm/lib/guest_modes.c
+++ b/tools/testing/selftests/kvm/lib/guest_modes.c
@@ -6,7 +6,7 @@

 #ifdef __aarch64__
 #include "processor.h"
-enum vm_guest_mode vm_mode_default;
+uint32_t vm_mode_default;
 #endif

 struct guest_mode guest_modes[NUM_VM_MODES];
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 9889fe0d8919..c2f3c49643b1 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -143,13 +143,10 @@ const struct vm_guest_mode_params vm_guest_mode_params[] = {
 _Static_assert(sizeof(vm_guest_mode_params)/sizeof(struct vm_guest_mode_params) == NUM_VM_MODES,
 	       "Missing new mode params?");

-struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages)
+struct kvm_vm *____vm_create(uint32_t mode, uint64_t nr_pages)
 {
 	struct kvm_vm *vm;

-	pr_debug("%s: mode='%s' pages='%ld'\n", __func__,
-		 vm_guest_mode_string(mode), nr_pages);
-
 	vm = calloc(1, sizeof(*vm));
 	TEST_ASSERT(vm != NULL, "Insufficient Memory");

@@ -158,13 +155,19 @@ struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages)
 	vm->regions.hva_tree = RB_ROOT;
 	hash_init(vm->regions.slot_hash);

-	vm->mode = mode;
 	vm->type = 0;

-	vm->pa_bits = vm_guest_mode_params[mode].pa_bits;
-	vm->va_bits = vm_guest_mode_params[mode].va_bits;
-	vm->page_size = vm_guest_mode_params[mode].page_size;
-	vm->page_shift = vm_guest_mode_params[mode].page_shift;
+	vm->mode = mode & VM_MODE_PRIMARY_MASK;
+	pr_debug("%s: mode='%s' pages='%ld'\n", __func__,
+		 vm_guest_mode_string(vm->mode), nr_pages);
+
+	TEST_ASSERT(vm->mode == mode,
+		    "Code, data, and page tables \"modes\" not yet implemented");
+
+	vm->pa_bits = vm_guest_mode_params[vm->mode].pa_bits;
+	vm->va_bits = vm_guest_mode_params[vm->mode].va_bits;
+	vm->page_size = vm_guest_mode_params[vm->mode].page_size;
+	vm->page_shift = vm_guest_mode_params[vm->mode].page_shift;

 	/* Setup mode specific traits. */
 	switch (vm->mode) {
@@ -222,7 +225,7 @@ struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages)
 		vm->pgtable_levels = 5;
 		break;
 	default:
-		TEST_FAIL("Unknown guest mode, mode: 0x%x", mode);
+		TEST_FAIL("Unknown guest mode, mode: 0x%x", vm->mode);
 	}

 #ifdef __aarch64__
@@ -252,7 +255,7 @@ struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages)
 	return vm;
 }

-static uint64_t vm_nr_pages_required(enum vm_guest_mode mode,
+static uint64_t vm_nr_pages_required(uint32_t mode,
 				     uint32_t nr_runnable_vcpus,
 				     uint64_t extra_mem_pages)
 {
@@ -287,7 +290,7 @@ static uint64_t vm_nr_pages_required(enum vm_guest_mode mode,
 	return vm_adjust_num_guest_pages(mode, nr_pages);
 }

-struct kvm_vm *__vm_create(enum vm_guest_mode mode, uint32_t nr_runnable_vcpus,
+struct kvm_vm *__vm_create(uint32_t mode, uint32_t nr_runnable_vcpus,
 			   uint64_t nr_extra_pages)
 {
 	uint64_t nr_pages = vm_nr_pages_required(mode, nr_runnable_vcpus,
@@ -323,7 +326,7 @@ struct kvm_vm *__vm_create(enum vm_guest_mode mode, uint32_t nr_runnable_vcpus,
  * extra_mem_pages is only used to calculate the maximum page table size,
  * no real memory allocation for non-slot0 memory in this function.
  */
-struct kvm_vm *__vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
+struct kvm_vm *__vm_create_with_vcpus(uint32_t mode, uint32_t nr_vcpus,
 				      uint64_t extra_mem_pages,
 				      void *guest_code, struct kvm_vcpu *vcpus[])
 {
@@ -1849,7 +1852,7 @@ static inline int getpageshift(void)
 }

 unsigned int
-vm_num_host_pages(enum vm_guest_mode mode, unsigned int num_guest_pages)
+vm_num_host_pages(uint32_t mode, unsigned int num_guest_pages)
 {
 	return vm_calc_num_pages(num_guest_pages,
 				 vm_guest_mode_params[mode].page_shift,
@@ -1857,13 +1860,13 @@ vm_num_host_pages(enum vm_guest_mode mode, unsigned int num_guest_pages)
 }

 unsigned int
-vm_num_guest_pages(enum vm_guest_mode mode, unsigned int num_host_pages)
+vm_num_guest_pages(uint32_t mode, unsigned int num_host_pages)
 {
 	return vm_calc_num_pages(num_host_pages, getpageshift(),
 				 vm_guest_mode_params[mode].page_shift, false);
 }

-unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size)
+unsigned int vm_calc_num_guest_pages(uint32_t mode, size_t size)
 {
 	unsigned int n;
 	n = DIV_ROUND_UP(size, vm_guest_mode_params[mode].page_size);

base-commit: 1a4d88a361af4f2e91861d632c6a1fe87a9665c2
--


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-07-22 18:20         ` Sean Christopherson
@ 2022-07-23  8:22           ` Andrew Jones
  -1 siblings, 0 replies; 56+ messages in thread
From: Andrew Jones @ 2022-07-23  8:22 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Ricardo Koller, kvm, kvmarm, drjones, maz, bgardon, dmatlack,
	pbonzini, axelrasmussen

On Fri, Jul 22, 2022 at 06:20:07PM +0000, Sean Christopherson wrote:
> On Fri, Jul 22, 2022, Ricardo Koller wrote:
> > On Thu, Jul 21, 2022 at 01:29:34AM +0000, Sean Christopherson wrote:
> > > If we don't care, and maybe even if we do, then my preference would be to
> > > enhance the __vm_create family of helpers to allow for specifying what
> > > backing type should be used for page tables, i.e. associate the info with the VM
> > > instead of passing it around the stack.
> > > 
> > > One idea would be to do something like David Matlack suggested a while back
> > > and replace extra_mem_pages with a struct, e.g. struct kvm_vm_mem_params.
> > > That struct can then provide the necessary knobs to control how memory is
> > > allocated.  And then the lib can provide a global
> > > 
> > > 	struct kvm_vm_mem_params kvm_default_vm_mem_params;
> > > 
> > 
> > I like this idea, passing the info at vm creation.
> > 
> > What about dividing the changes in two.
> > 
> > 	1. Will add the struct to "__vm_create()" as part of this
> > 	series, and then use it in this commit. There's only one user
> > 
> > 		dirty_log_test.c:   vm = __vm_create(mode, 1, extra_mem_pages);
> > 
> > 	so that would avoid having to touch every test as part of this patchset.
> > 
> > 	2. I can then send another series to add support for all the other
> > 	vm_create() functions.
> > 
> > Alternatively, I can send a new series that does 1 and 2 afterwards.
> > WDYT?
> 
> Don't do #2, ever. :-)  The intent of having vm_create() versus __vm_create()
> is so that tests that don't care about things like backing pages don't have to
> pass in extra params.  I very much want to keep that behavior, i.e. I don't want
> to extend vm_create() at all.  IMO, adding _anything_ is a slippery slope, e.g.
> why are the backing types special enough to get a param, but thing XYZ isn't?
> 
> Thinking more, the struct idea probably isn't going to work all that well.  It
> again puts the selftests into a state where it becomes difficult to control one
> setting and ignore the rest, e.g. the dirty_log_test and anything else with extra
> pages suddenly has to care about the backing type for page tables and code.
> 
> Rather than adding a struct, what about extending the @mode param?  We already
> have vm_mem_backing_src_type, we just need a way to splice things together.  There
> are a total of four things we can control: primary mode, and then code, data, and
> page tables backing types.
> 
> So, turn @mode into a uint32_t and carve out 8 bits for each of those four "modes".
> The defaults Just Work because VM_MEM_SRC_ANONYMOUS==0.

Hi Sean,

How about merging both proposals: turn @mode into a struct and pass
around a pointer to it? Then, when calling something that requires @mode,
if @mode is NULL, the called function would use vm_arch_default_mode()
to get a pointer to the arch-specific default mode struct. If a test needs
to modify the parameters then it can construct a mode struct from scratch
or start with a copy of the default. As long as all members of the struct
representing parameters, such as backing type, have defaults mapped to
zero for the struct members, we shouldn't be adding any burden to
users that don't care about other parameters (other than ensuring their
@mode struct was zero initialized).
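
Something like this, as a rough sketch (the struct layout, the
kvm_vm_mode name, and vm_arch_default_mode() are all placeholders, not a
settled API):

	struct kvm_vm_mode {
		enum vm_guest_mode primary;		/* PA/VA widths, page sizes */
		enum vm_mem_backing_src_type code_src;	/* 0 == VM_MEM_SRC_ANONYMOUS */
		enum vm_mem_backing_src_type data_src;
		enum vm_mem_backing_src_type pt_src;
	};

	/* every @mode consumer would resolve NULL to the arch default first */
	static const struct kvm_vm_mode *resolve_mode(const struct kvm_vm_mode *mode)
	{
		return mode ? mode : vm_arch_default_mode();
	}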

Thanks,
drew

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-07-23  8:22           ` Andrew Jones
@ 2022-07-26 18:15             ` Sean Christopherson
  -1 siblings, 0 replies; 56+ messages in thread
From: Sean Christopherson @ 2022-07-26 18:15 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Ricardo Koller, kvm, kvmarm, drjones, maz, bgardon, dmatlack,
	pbonzini, axelrasmussen

On Sat, Jul 23, 2022, Andrew Jones wrote:
> On Fri, Jul 22, 2022 at 06:20:07PM +0000, Sean Christopherson wrote:
> > On Fri, Jul 22, 2022, Ricardo Koller wrote:
> > > What about dividing the changes in two.
> > > 
> > > 	1. Will add the struct to "__vm_create()" as part of this
> > > 	series, and then use it in this commit. There's only one user
> > > 
> > > 		dirty_log_test.c:   vm = __vm_create(mode, 1, extra_mem_pages);
> > > 
> > > 	so that would avoid having to touch every test as part of this patchset.
> > > 
> > > 	2. I can then send another series to add support for all the other
> > > 	vm_create() functions.
> > > 
> > > Alternatively, I can send a new series that does 1 and 2 afterwards.
> > > WDYT?
> > 
> > Don't do #2, ever. :-)  The intent of having vm_create() versus __vm_create()
> > is so that tests that don't care about things like backing pages don't have to
> > pass in extra params.  I very much want to keep that behavior, i.e. I don't want
> > to extend vm_create() at all.  IMO, adding _anything_ is a slippery slope, e.g.
> > why are the backing types special enough to get a param, but thing XYZ isn't?
> > 
> > Thinking more, the struct idea probably isn't going to work all that well.  It
> > again puts the selftests into a state where it becomes difficult to control one
> > setting and ignore the rest, e.g. the dirty_log_test and anything else with extra
> > pages suddenly has to care about the backing type for page tables and code.
> > 
> > Rather than adding a struct, what about extending the @mode param?  We already
> > have vm_mem_backing_src_type, we just need a way to splice things together.  There
> > are a total of four things we can control: primary mode, and then code, data, and
> > page tables backing types.
> > 
> > So, turn @mode into a uint32_t and carve out 8 bits for each of those four "modes".
> > The defaults Just Work because VM_MEM_SRC_ANONYMOUS==0.
> 
> Hi Sean,
> 
> How about merging both proposals: turn @mode into a struct and pass
> around a pointer to it? Then, when calling something that requires @mode,
> if @mode is NULL, the called function would use vm_arch_default_mode()
> to get a pointer to the arch-specific default mode struct.

One tweak: rather than use @NULL as a magic param, #define VM_MODE_DEFAULT to
point at a global struct, similar to what is already done for __aarch64__.

E.g.

	__vm_create(VM_MODE_DEFAULT, nr_runnable_vcpus, 0);

does a much better job of self-documenting its behavior than this:

	__vm_create(NULL, nr_runnable_vcpus, 0);

> If a test needs to modify the parameters then it can construct a mode struct
> from scratch or start with a copy of the default. As long as all members of
> the struct representing parameters, such as backing type, have defaults
> mapped to zero for the struct members, we shouldn't be adding any burden
> to users that don't care about other parameters (other than ensuring their
> @mode struct was zero initialized).

I was hoping to avoid forcing tests to build a struct, but looking at all the
existing users, they either use for_each_guest_mode() or just pass VM_MODE_DEFAULT,
so in practice it's a complete non-issue.

The page fault usage will likely be similar, e.g. programmatically generate the set
of combinations to test.

So yeah, let's try the struct approach.
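
I.e., roughly (a sketch, assuming the mode struct discussed above):

	/* lib-provided global; arch setup fills in the primary mode, and the
	 * zero-initialized backing fields default to VM_MEM_SRC_ANONYMOUS.
	 */
	extern struct kvm_vm_mode kvm_vm_mode_default;
	#define VM_MODE_DEFAULT	(&kvm_vm_mode_default)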

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-07-26 18:15             ` Sean Christopherson
@ 2022-08-23 23:37               ` Ricardo Koller
  -1 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-08-23 23:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andrew Jones, kvm, kvmarm, drjones, maz, bgardon, dmatlack,
	pbonzini, axelrasmussen

On Tue, Jul 26, 2022 at 06:15:31PM +0000, Sean Christopherson wrote:
> On Sat, Jul 23, 2022, Andrew Jones wrote:
> > On Fri, Jul 22, 2022 at 06:20:07PM +0000, Sean Christopherson wrote:
> > > On Fri, Jul 22, 2022, Ricardo Koller wrote:
> > > > What about dividing the changes in two.
> > > > 
> > > > 	1. Will add the struct to "__vm_create()" as part of this
> > > > 	series, and then use it in this commit. There's only one user
> > > > 
> > > > 		dirty_log_test.c:   vm = __vm_create(mode, 1, extra_mem_pages);
> > > > 
> > > > 	so that would avoid having to touch every test as part of this patchset.
> > > > 
> > > > 	2. I can then send another series to add support for all the other
> > > > 	vm_create() functions.
> > > > 
> > > > Alternatively, I can send a new series that does 1 and 2 afterwards.
> > > > WDYT?
> > > 
> > > Don't do #2, ever. :-)  The intent of having vm_create() versus __vm_create()
> > > is so that tests that don't care about things like backing pages don't have to
> > > pass in extra params.  I very much want to keep that behavior, i.e. I don't want
> > > to extend vm_create() at all.  IMO, adding _anything_ is a slippery slope, e.g.
> > > why are the backing types special enough to get a param, but thing XYZ isn't?
> > > 
> > > Thinking more, the struct idea probably isn't going to work all that well.  It
> > > again puts the selftests into a state where it becomes difficult to control one
> > > setting and ignore the rest, e.g. the dirty_log_test and anything else with extra
> > > pages suddenly has to care about the backing type for page tables and code.
> > > 
> > > Rather than adding a struct, what about extending the @mode param?  We already
> > > have vm_mem_backing_src_type, we just need a way to splice things together.  There
> > > are a total of four things we can control: primary mode, and then code, data, and
> > > page tables backing types.
> > > 
> > > So, turn @mode into a uint32_t and carve out 8 bits for each of those four "modes".
> > > The defaults Just Work because VM_MEM_SRC_ANONYMOUS==0.
> > 
> > Hi Sean,
> > 
> > How about merging both proposals: turn @mode into a struct and pass
> > around a pointer to it? Then, when calling something that requires @mode,
> > if @mode is NULL, the called function would use vm_arch_default_mode()
> > to get a pointer to the arch-specific default mode struct.
> 
> One tweak: rather than use @NULL as a magic param, #define VM_MODE_DEFAULT to
> point at a global struct, similar to what is already done for __aarch64__.
> 
> E.g.
> 
> 	__vm_create(VM_MODE_DEFAULT, nr_runnable_vcpus, 0);
> 
> does a much better job of self-documenting its behavior than this:
> 
> 	__vm_create(NULL, nr_runnable_vcpus, 0);
> 
> > If a test needs to modify the parameters then it can construct a mode struct
> > from scratch or start with a copy of the default. As long as all members of
> > the struct representing parameters, such as backing type, have defaults
> > mapped to zero for the struct members, we shouldn't be adding any burden
> > to users that don't care about other parameters (other than ensuring their
> > @mode struct was zero initialized).
> 
> I was hoping to avoid forcing tests to build a struct, but looking at all the
> existing users, they either use for_each_guest_mode() or just pass VM_MODE_DEFAULT,
> so in practice it's a complete non-issue.
> 
> The page fault usage will likely be similar, e.g. programmatically generate the set
> of combinations to test.
> 
> So yeah, let's try the struct approach.

Thank you both for the suggestions.

About to send v5 with the suggested changes, plus a slight modification.
V5 adds "struct kvm_vm_mem_params" which includes a "guest mode" field.
The suggestion was to overload "guest mode". What I have doesn't change
the meaning of "guest mode", and just keeps everything dealing with
"guest mode" the same (like guest_mode.c).

The main changes are:

1. adding a struct kvm_vm_mem_params that defines the memory layout (a rough sketch follows the list):

	-struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages);
	+struct kvm_vm *____vm_create(struct kvm_vm_mem_params *mem_params);

2. adding memslots vm->memslot.[code|pt|data] and changing all allocators
to use the right memslot, e.g., lib/elf should use the code memslot.

3. changing the new page_fault_test.c setup_memslot() accordingly (much
nicer).
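
Rough shape of the struct (a sketch; the exact fields may differ in the
posted v5):

	struct kvm_vm_mem_params {
		enum vm_guest_mode mode;	/* "guest mode" keeps its meaning */
		/* backing source and memslot for each type of allocation */
		struct {
			enum vm_mem_backing_src_type src_type;
			uint32_t slot;
		} code, pt, data;
	};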

Let me know what you think.

Thanks!
Ricardo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test
@ 2022-08-23 23:37               ` Ricardo Koller
  0 siblings, 0 replies; 56+ messages in thread
From: Ricardo Koller @ 2022-08-23 23:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: drjones, kvm, maz, Andrew Jones, axelrasmussen, bgardon,
	dmatlack, pbonzini, kvmarm

On Tue, Jul 26, 2022 at 06:15:31PM +0000, Sean Christopherson wrote:
> On Sat, Jul 23, 2022, Andrew Jones wrote:
> > On Fri, Jul 22, 2022 at 06:20:07PM +0000, Sean Christopherson wrote:
> > > On Fri, Jul 22, 2022, Ricardo Koller wrote:
> > > > What about dividing the changes in two.
> > > > 
> > > > 	1. Will add the struct to "__vm_create()" as part of this
> > > > 	series, and then use it in this commit. There's only one user
> > > > 
> > > > 		dirty_log_test.c:   vm = __vm_create(mode, 1, extra_mem_pages);
> > > > 
> > > > 	so that would avoid having to touch every test as part of this patchset.
> > > > 
> > > > 	2. I can then send another series to add support for all the other
> > > > 	vm_create() functions.
> > > > 
> > > > Alternatively, I can send a new series that does 1 and 2 afterwards.
> > > > WDYT?
> > > 
> > > Don't do #2, ever. :-)  The intent of having vm_create() versus is __vm_create()
> > > is so that tests that don't care about things like backing pages don't have to
> > > pass in extra params.  I very much want to keep that behavior, i.e. I don't want
> > > to extend vm_create() at all.  IMO, adding _anything_ is a slippery slope, e.g.
> > > why are the backing types special enough to get a param, but thing XYZ isn't?
> > > 
> > > Thinking more, the struct idea probably isn't going to work all that well.  It
> > > again puts the selftests into a state where it becomes difficult to control one
> > > setting and ignore the rest, e.g. the dirty_log_test and anything else with extra
> > > pages suddenly has to care about the backing type for page tables and code.
> > > 
> > > Rather than adding a struct, what about extending the @mode param?  We already
> > > have vm_mem_backing_src_type, we just need a way to splice things together.  There
> > > are a total of four things we can control: primary mode, and then code, data, and
> > > page tables backing types.
> > > 
> > > So, turn @mode into a uint32_t and carve out 8 bits for each of those four "modes".
> > > The defaults Just Work because VM_MEM_SRC_ANONYMOUS==0.
> > 
> > Hi Sean,
> > 
> > How about merging both proposals and turn @mode into a struct and pass
> > around a pointer to it? Then, when calling something that requires @mode,
> > if @mode is NULL, the called function would use vm_arch_default_mode()
> > to get a pointer to the arch-specific default mode struct.
> 
> One tweak: rather than use @NULL as a magic param, #define VM_MODE_DEFAULT to
> point at a global struct, similar to what is already done for __aarch64__.
> 
> E.g.
> 
> 	__vm_create(VM_MODE_DEFAULT, nr_runnable_vcpus, 0);
> 
> does a much better job of self-documenting its behavior than this:
> 
> 	__vm_create(NULL, nr_runnable_vcpus, 0);
> 
> > If a test needs to modify the parameters then it can construct a mode struct
> > from scratch or start with a copy of the default. As long as all
> > struct members representing parameters, such as backing type, have
> > their defaults mapped to zero, we shouldn't be adding any burden to
> > users that don't care about other parameters (other than ensuring
> > their @mode struct was zero-initialized).
> 
> I was hoping to avoid forcing tests to build a struct, but looking at all the
> existing users, they either use for_each_guest_mode() or just pass VM_MODE_DEFAULT,
> so in practice it's a complete non-issue.
> 
> The page fault usage will likely be similar, e.g. programmatically generate the set
> of combinations to test.
> 
> So yeah, let's try the struct approach.

Thank you both for the suggestions.
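
If I read the suggestion right, the shape being proposed is roughly the
following (only a sketch; the struct and member names below are
illustrative, not an existing API):

	/* Sketch: a per-arch global struct holding the default parameters. */
	extern const struct vm_mode_params vm_mode_params_default;
	#define VM_MODE_DEFAULT	(&vm_mode_params_default)

	static struct kvm_vm *create_default_vm(uint32_t nr_runnable_vcpus)
	{
		/* Tests that don't care just pass the default. */
		return __vm_create(VM_MODE_DEFAULT, nr_runnable_vcpus, 0);
	}

	static struct kvm_vm *create_hugetlb_pt_vm(uint32_t nr_runnable_vcpus)
	{
		/* Tests that do care start from a copy of the default. */
		struct vm_mode_params params = *VM_MODE_DEFAULT;

		params.pt_src = VM_MEM_SRC_ANONYMOUS_HUGETLB;
		return __vm_create(&params, nr_runnable_vcpus, 0);
	}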

I'm about to send v5 with the suggested changes, plus a slight
modification. V5 adds a "struct kvm_vm_mem_params" which includes a
"guest mode" field. The suggestion was to overload "guest mode"; what I
have instead leaves the meaning of "guest mode" untouched, so everything
that deals with "guest mode" (like guest_mode.c) stays the same.

The main changes are:

1. adding a struct kvm_vm_mem_params that defines the memory layout:

	-struct kvm_vm *____vm_create(enum vm_guest_mode mode, uint64_t nr_pages);
	+struct kvm_vm *____vm_create(struct kvm_vm_mem_params *mem_params);

2. adding memslots vm->memslot.[code|pt|data] and changing all allocators
to use the right one, e.g., lib/elf should use the code memslot (see the
sketch after this list).

3. changing the new page_fault_test.c setup_memslot() accordingly (much
nicer).
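
For item 2, the idea is that each allocator picks its region from the vm
struct rather than the hard-coded default memslot. Roughly (the new
vm->memslot.* fields are the point here; the helper below is only
illustrative):

	/* Sketch only: page-table allocations come from the pt memslot. */
	static vm_paddr_t pt_page_alloc(struct kvm_vm *vm)
	{
		return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
					 vm->memslot.pt);
	}

Similarly, lib/elf would load the test binary through vm->memslot.code,
and test data allocations would go through vm->memslot.data.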

Let me know what you think.

Thanks!
Ricardo

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2022-08-23 23:38 UTC | newest]

Thread overview: 56+ messages
2022-06-24 21:32 [PATCH v4 00/13] KVM: selftests: Add aarch64/page_fault_test Ricardo Koller
2022-06-24 21:32 ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 01/13] KVM: selftests: Add a userfaultfd library Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 02/13] KVM: selftests: aarch64: Add virt_get_pte_hva library function Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-07-12  9:12   ` Andrew Jones
2022-07-12  9:12     ` Andrew Jones
2022-06-24 21:32 ` [PATCH v4 03/13] KVM: selftests: Add vm_alloc_page_table_in_memslot " Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-07-12  9:13   ` Andrew Jones
2022-07-12  9:13     ` Andrew Jones
2022-06-24 21:32 ` [PATCH v4 04/13] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-07-12  9:33   ` Andrew Jones
2022-07-12  9:33     ` Andrew Jones
2022-06-24 21:32 ` [PATCH v4 05/13] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-07-12  9:35   ` Andrew Jones
2022-07-12  9:35     ` Andrew Jones
2022-06-24 21:32 ` [PATCH v4 06/13] KVM: selftests: Add vm_mem_region_get_src_fd library function Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-07-12  9:40   ` Andrew Jones
2022-07-12  9:40     ` Andrew Jones
2022-06-24 21:32 ` [PATCH v4 07/13] KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macros Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-07-12  9:46   ` Andrew Jones
2022-07-12  9:46     ` Andrew Jones
2022-06-24 21:32 ` [PATCH v4 08/13] tools: Copy bitfield.h from the kernel sources Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 09/13] KVM: selftests: aarch64: Add aarch64/page_fault_test Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-06-28 23:43   ` Oliver Upton
2022-06-28 23:43     ` Oliver Upton
2022-06-29  1:32     ` Ricardo Koller
2022-06-29  1:32       ` Ricardo Koller
2022-07-21  1:29   ` Sean Christopherson
2022-07-21  1:29     ` Sean Christopherson
2022-07-22 17:19     ` Ricardo Koller
2022-07-22 17:19       ` Ricardo Koller
2022-07-22 18:20       ` Sean Christopherson
2022-07-22 18:20         ` Sean Christopherson
2022-07-23  8:22         ` Andrew Jones
2022-07-23  8:22           ` Andrew Jones
2022-07-26 18:15           ` Sean Christopherson
2022-07-26 18:15             ` Sean Christopherson
2022-08-23 23:37             ` Ricardo Koller
2022-08-23 23:37               ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 10/13] KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 11/13] KVM: selftests: aarch64: Add dirty logging " Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 12/13] KVM: selftests: aarch64: Add readonly memslot " Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
2022-06-24 21:32 ` [PATCH v4 13/13] KVM: selftests: aarch64: Add mix of " Ricardo Koller
2022-06-24 21:32   ` Ricardo Koller
