* [PATCH 00/11] KVM: selftests: Add aarch64/page_fault_test
@ 2022-03-11  6:01 ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:01 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

This series adds a new aarch64 selftest for testing stage 2 fault handling for
various combinations of guest accesses (e.g., write, S1PTW), backing sources
(e.g., anon), and types of faults (e.g., read on hugetlbfs with a hole, write
on a readonly memslot). Each test tries a different combination and then checks
that the access results in the right behavior (e.g., uffd faults with the right
address and write/read flag). Some interesting combinations are:

- loading an instruction leads to a stage 1 page-table-walk that misses on
  stage 2 because the backing memslot for the page table is not in host memory
  (a hole was punched right there) and the fault is handled using userfaultfd.
  The expected behavior is that this leads to a userfaultfd fault marked as a
  write. See commit c4ad98e4b72c ("KVM: arm64: Assume write fault on S1PTW
  permission fault on instruction fetch") for why that's a write.
- a cas (compare-and-swap) on a readonly memslot leads to a failed vcpu run.
- write-faulting on a memslot that's marked for userfaultfd handling and dirty
  logging should result in a uffd fault and having the respective bit set in
  the dirty log.
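
As a rough illustration of the last check, here is a minimal sketch of looking
up "the respective bit" for a faulting GPA. It assumes the bitmap layout
returned by KVM_GET_DIRTY_LOG (one bit per page of the memslot, bit 0 being the
slot's first page) on a little-endian host. The helper name and its parameters
are hypothetical and are not part of this series.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: return true if the page containing gpa is marked
 * dirty in bitmap. Assumes one bit per page of the memslot, with bit 0
 * covering the page at slot_base_gpa.
 */
bool gpa_is_dirty(const unsigned long *bitmap, uint64_t slot_base_gpa,
		  uint64_t page_size, uint64_t gpa)
{
	uint64_t page;

	assert(page_size && gpa >= slot_base_gpa);

	/* Index of the page within the memslot. */
	page = (gpa - slot_base_gpa) / page_size;

	/* Select the word that holds the bit, then the bit itself. */
	return (bitmap[page / BITS_PER_ULONG] >> (page % BITS_PER_ULONG)) & 1UL;
}
```

A test would call such a helper after the guest write, passing the memslot's
base GPA and the guest page size, and expect it to return true only for the
page that was written.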

The first 6 commits of this series add library support. The first one adds a
new userfaultfd library (factored out of demand_paging_test.c). The next 4 add
library functions to get the fd of a backing source, to get the GPA of a PTE,
and to allocate and map page tables in a given memslot. Commit 6 fixes a leaked
fd when using shared backing stores. The last 5 commits add the new selftest,
one type of test at a time: first the core tests, then uffd, then dirty logging,
then readonly memslot tests, and finally combinations of the previous ones (like
uffd and dirty logging at the same time).

Ricardo Koller (11):
  KVM: selftests: Add a userfaultfd library
  KVM: selftests: Add vm_mem_region_get_src_fd library function
  KVM: selftests: aarch64: Add vm_get_pte_gpa library function
  KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
  KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  KVM: selftests: aarch64: Add aarch64/page_fault_test
  KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test
  KVM: selftests: aarch64: Add dirty logging tests into page_fault_test
  KVM: selftests: aarch64: Add readonly memslot tests into
    page_fault_test
  KVM: selftests: aarch64: Add mix of tests into page_fault_test

 tools/testing/selftests/kvm/Makefile          |    3 +-
 .../selftests/kvm/aarch64/page_fault_test.c   | 1461 +++++++++++++++++
 .../selftests/kvm/demand_paging_test.c        |  227 +--
 .../selftests/kvm/include/aarch64/processor.h |    5 +
 .../selftests/kvm/include/kvm_util_base.h     |    2 +
 .../selftests/kvm/include/userfaultfd_util.h  |   46 +
 .../selftests/kvm/lib/aarch64/processor.c     |   36 +-
 tools/testing/selftests/kvm/lib/kvm_util.c    |   37 +-
 .../selftests/kvm/lib/userfaultfd_util.c      |  196 +++
 9 files changed, 1805 insertions(+), 208 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c
 create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c

-- 
2.35.1.723.g4982287a31-goog



* [PATCH 01/11] KVM: selftests: Add a userfaultfd library
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:01   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:01 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Move the generic userfaultfd code out of demand_paging_test.c into a
common library, userfaultfd_util. The library consists of a setup and a
stop function. The setup function starts a thread that handles page
faults by calling the supplied handler callback, and returns a uffd_desc
object. The stop function takes that object, signals the thread to exit,
joins it, and frees the descriptor.
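
The setup/stop shape described above can be sketched in isolation. The
following is a simplified model, not the library itself: a plain pipe stands in
for the userfaultfd, and every name in it (handler_desc, handler_setup,
handler_post, handler_stop, sample_handler) is hypothetical. Unlike
uffd_stop_demand_paging(), the stop function here returns the number of handled
events so that the behavior is easy to observe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Every name here is hypothetical; this is not the userfaultfd_util API. */
typedef int (*event_handler_t)(char event);

struct handler_desc {
	int event_pipe[2];	/* stands in for the userfaultfd */
	int stop_pipe[2];	/* written once by handler_stop() */
	event_handler_t handler;
	pthread_t thread;
	int handled;		/* events the callback accepted */
};

static void *handler_thread_fn(void *arg)
{
	struct handler_desc *desc = arg;
	struct pollfd fds[2] = {
		{ .fd = desc->event_pipe[0], .events = POLLIN },
		{ .fd = desc->stop_pipe[0], .events = POLLIN },
	};
	char ev;

	for (;;) {
		if (poll(fds, 2, -1) < 0) {
			if (errno == EINTR)
				continue;
			return NULL;
		}
		/* Drain pending events first so none is lost on stop. */
		while (read(desc->event_pipe[0], &ev, 1) == 1)
			if (desc->handler(ev) == 0)
				desc->handled++;
		if (fds[1].revents & POLLIN)
			return NULL;
	}
}

/* Create both pipes and start the handler thread; NULL on failure. */
struct handler_desc *handler_setup(event_handler_t handler)
{
	struct handler_desc *desc = calloc(1, sizeof(*desc));

	assert(handler != NULL);
	if (!desc)
		return NULL;
	desc->handler = handler;
	if (pipe(desc->event_pipe))
		goto err_free;
	if (pipe(desc->stop_pipe))
		goto err_event;
	/* Non-blocking reads let the thread drain events until EAGAIN. */
	if (fcntl(desc->event_pipe[0], F_SETFL, O_NONBLOCK) ||
	    pthread_create(&desc->thread, NULL, handler_thread_fn, desc))
		goto err_stop;
	return desc;

err_stop:
	close(desc->stop_pipe[0]);
	close(desc->stop_pipe[1]);
err_event:
	close(desc->event_pipe[0]);
	close(desc->event_pipe[1]);
err_free:
	free(desc);
	return NULL;
}

/* Queue one event for the handler thread. */
int handler_post(struct handler_desc *desc, char ev)
{
	return write(desc->event_pipe[1], &ev, 1) == 1 ? 0 : -1;
}

/* Signal the thread, join it, free everything; returns events handled. */
int handler_stop(struct handler_desc *desc)
{
	char c = 0;
	int handled;

	if (write(desc->stop_pipe[1], &c, 1) != 1 ||
	    pthread_join(desc->thread, NULL))
		return -1;
	handled = desc->handled;
	close(desc->event_pipe[0]);
	close(desc->event_pipe[1]);
	close(desc->stop_pipe[0]);
	close(desc->stop_pipe[1]);
	free(desc);
	return handled;
}

/* Example callback: reject the zero byte, accept everything else. */
int sample_handler(char ev)
{
	return ev ? 0 : -1;
}
```

A caller would pair handler_setup(sample_handler) with a later
handler_stop(desc), in the same way the test pairs uffd_setup_demand_paging()
with uffd_stop_demand_paging(). Keeping the thread, the pipe, and the callback
inside one opaque descriptor is what lets the caller drop its own bookkeeping
arrays.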

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   2 +-
 .../selftests/kvm/demand_paging_test.c        | 227 +++---------------
 .../selftests/kvm/include/userfaultfd_util.h  |  46 ++++
 .../selftests/kvm/lib/userfaultfd_util.c      | 196 +++++++++++++++
 4 files changed, 272 insertions(+), 199 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 0e4926bc9a58..bc5f89b3700e 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -37,7 +37,7 @@ ifeq ($(ARCH),riscv)
 	UNAME_M := riscv
 endif
 
-LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
+LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c lib/userfaultfd_util.c
 LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index 6a719d065599..b3d457cecd68 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -22,23 +22,13 @@
 #include "test_util.h"
 #include "perf_test_util.h"
 #include "guest_modes.h"
+#include "userfaultfd_util.h"
 
 #ifdef __NR_userfaultfd
 
-#ifdef PRINT_PER_PAGE_UPDATES
-#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
-#else
-#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
-#endif
-
-#ifdef PRINT_PER_VCPU_UPDATES
-#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
-#else
-#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
-#endif
-
 static int nr_vcpus = 1;
 static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
+
 static size_t demand_paging_size;
 static char *guest_data_prototype;
 
@@ -69,9 +59,11 @@ static void vcpu_worker(struct perf_test_vcpu_args *vcpu_args)
 		       ts_diff.tv_sec, ts_diff.tv_nsec);
 }
 
-static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
+static int handle_uffd_page_request(int uffd_mode, int uffd,
+		struct uffd_msg *msg)
 {
 	pid_t tid = syscall(__NR_gettid);
+	uint64_t addr = msg->arg.pagefault.address;
 	struct timespec start;
 	struct timespec ts_diff;
 	int r;
@@ -118,175 +110,32 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
 	return 0;
 }
 
-bool quit_uffd_thread;
-
-struct uffd_handler_args {
+struct test_params {
 	int uffd_mode;
-	int uffd;
-	int pipefd;
-	useconds_t delay;
+	useconds_t uffd_delay;
+	enum vm_mem_backing_src_type src_type;
+	bool partition_vcpu_memory_access;
 };
 
-static void *uffd_handler_thread_fn(void *arg)
+static void prefault_mem(void *alias, uint64_t len)
 {
-	struct uffd_handler_args *uffd_args = (struct uffd_handler_args *)arg;
-	int uffd = uffd_args->uffd;
-	int pipefd = uffd_args->pipefd;
-	useconds_t delay = uffd_args->delay;
-	int64_t pages = 0;
-	struct timespec start;
-	struct timespec ts_diff;
-
-	clock_gettime(CLOCK_MONOTONIC, &start);
-	while (!quit_uffd_thread) {
-		struct uffd_msg msg;
-		struct pollfd pollfd[2];
-		char tmp_chr;
-		int r;
-		uint64_t addr;
-
-		pollfd[0].fd = uffd;
-		pollfd[0].events = POLLIN;
-		pollfd[1].fd = pipefd;
-		pollfd[1].events = POLLIN;
-
-		r = poll(pollfd, 2, -1);
-		switch (r) {
-		case -1:
-			pr_info("poll err");
-			continue;
-		case 0:
-			continue;
-		case 1:
-			break;
-		default:
-			pr_info("Polling uffd returned %d", r);
-			return NULL;
-		}
-
-		if (pollfd[0].revents & POLLERR) {
-			pr_info("uffd revents has POLLERR");
-			return NULL;
-		}
-
-		if (pollfd[1].revents & POLLIN) {
-			r = read(pollfd[1].fd, &tmp_chr, 1);
-			TEST_ASSERT(r == 1,
-				    "Error reading pipefd in UFFD thread\n");
-			return NULL;
-		}
-
-		if (!(pollfd[0].revents & POLLIN))
-			continue;
-
-		r = read(uffd, &msg, sizeof(msg));
-		if (r == -1) {
-			if (errno == EAGAIN)
-				continue;
-			pr_info("Read of uffd got errno %d\n", errno);
-			return NULL;
-		}
-
-		if (r != sizeof(msg)) {
-			pr_info("Read on uffd returned unexpected size: %d bytes", r);
-			return NULL;
-		}
-
-		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
-			continue;
+	size_t p;
 
-		if (delay)
-			usleep(delay);
-		addr =  msg.arg.pagefault.address;
-		r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
-		if (r < 0)
-			return NULL;
-		pages++;
+	TEST_ASSERT(alias != NULL, "Alias required for minor faults");
+	for (p = 0; p < (len / demand_paging_size); ++p) {
+		memcpy(alias + (p * demand_paging_size),
+		       guest_data_prototype, demand_paging_size);
 	}
-
-	ts_diff = timespec_elapsed(start);
-	PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
-		       pages, ts_diff.tv_sec, ts_diff.tv_nsec,
-		       pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
-
-	return NULL;
 }
 
-static void setup_demand_paging(struct kvm_vm *vm,
-				pthread_t *uffd_handler_thread, int pipefd,
-				int uffd_mode, useconds_t uffd_delay,
-				struct uffd_handler_args *uffd_args,
-				void *hva, void *alias, uint64_t len)
-{
-	bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
-	int uffd;
-	struct uffdio_api uffdio_api;
-	struct uffdio_register uffdio_register;
-	uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
-
-	PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
-		       is_minor ? "MINOR" : "MISSING",
-		       is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
-
-	/* In order to get minor faults, prefault via the alias. */
-	if (is_minor) {
-		size_t p;
-
-		expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
-
-		TEST_ASSERT(alias != NULL, "Alias required for minor faults");
-		for (p = 0; p < (len / demand_paging_size); ++p) {
-			memcpy(alias + (p * demand_paging_size),
-			       guest_data_prototype, demand_paging_size);
-		}
-	}
-
-	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
-	TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
-
-	uffdio_api.api = UFFD_API;
-	uffdio_api.features = 0;
-	TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
-		    "ioctl UFFDIO_API failed: %" PRIu64,
-		    (uint64_t)uffdio_api.api);
-
-	uffdio_register.range.start = (uint64_t)hva;
-	uffdio_register.range.len = len;
-	uffdio_register.mode = uffd_mode;
-	TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
-		    "ioctl UFFDIO_REGISTER failed");
-	TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
-		    expected_ioctls, "missing userfaultfd ioctls");
-
-	uffd_args->uffd_mode = uffd_mode;
-	uffd_args->uffd = uffd;
-	uffd_args->pipefd = pipefd;
-	uffd_args->delay = uffd_delay;
-	pthread_create(uffd_handler_thread, NULL, uffd_handler_thread_fn,
-		       uffd_args);
-
-	PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
-		       hva, hva + len);
-}
-
-struct test_params {
-	int uffd_mode;
-	useconds_t uffd_delay;
-	enum vm_mem_backing_src_type src_type;
-	bool partition_vcpu_memory_access;
-};
-
 static void run_test(enum vm_guest_mode mode, void *arg)
 {
 	struct test_params *p = arg;
-	pthread_t *uffd_handler_threads = NULL;
-	struct uffd_handler_args *uffd_args = NULL;
+	struct uffd_desc **uffd_descs = NULL;
 	struct timespec start;
 	struct timespec ts_diff;
-	int *pipefds = NULL;
 	struct kvm_vm *vm;
 	int vcpu_id;
-	int r;
 
 	vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
 				 p->src_type, p->partition_vcpu_memory_access);
@@ -299,15 +148,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	memset(guest_data_prototype, 0xAB, demand_paging_size);
 
 	if (p->uffd_mode) {
-		uffd_handler_threads =
-			malloc(nr_vcpus * sizeof(*uffd_handler_threads));
-		TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
-
-		uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
-		TEST_ASSERT(uffd_args, "Memory allocation failed");
-
-		pipefds = malloc(sizeof(int) * nr_vcpus * 2);
-		TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
+		uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
+		TEST_ASSERT(uffd_descs, "Memory allocation failed");
 
 		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
 			struct perf_test_vcpu_args *vcpu_args;
@@ -320,19 +162,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
 			vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
 
+			prefault_mem(vcpu_alias,
+				vcpu_args->pages * perf_test_args.guest_page_size);
+
 			/*
 			 * Set up user fault fd to handle demand paging
 			 * requests.
 			 */
-			r = pipe2(&pipefds[vcpu_id * 2],
-				  O_CLOEXEC | O_NONBLOCK);
-			TEST_ASSERT(!r, "Failed to set up pipefd");
-
-			setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
-					    pipefds[vcpu_id * 2], p->uffd_mode,
-					    p->uffd_delay, &uffd_args[vcpu_id],
-					    vcpu_hva, vcpu_alias,
-					    vcpu_args->pages * perf_test_args.guest_page_size);
+			uffd_descs[vcpu_id] = uffd_setup_demand_paging(
+				p->uffd_mode, p->uffd_delay, vcpu_hva,
+				vcpu_args->pages * perf_test_args.guest_page_size,
+				&handle_uffd_page_request);
 		}
 	}
 
@@ -347,15 +187,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	pr_info("All vCPU threads joined\n");
 
 	if (p->uffd_mode) {
-		char c;
-
 		/* Tell the user fault fd handler threads to quit */
-		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
-			r = write(pipefds[vcpu_id * 2 + 1], &c, 1);
-			TEST_ASSERT(r == 1, "Unable to write to pipefd");
-
-			pthread_join(uffd_handler_threads[vcpu_id], NULL);
-		}
+		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
+			uffd_stop_demand_paging(uffd_descs[vcpu_id]);
 	}
 
 	pr_info("Total guest execution time: %ld.%.9lds\n",
@@ -367,11 +201,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	perf_test_destroy_vm(vm);
 
 	free(guest_data_prototype);
-	if (p->uffd_mode) {
-		free(uffd_handler_threads);
-		free(uffd_args);
-		free(pipefds);
-	}
+	if (p->uffd_mode)
+		free(uffd_descs);
 }
 
 static void help(char *name)
diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
new file mode 100644
index 000000000000..7b294ce8147c
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KVM userfaultfd util
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2019, Google, Inc.
+ * Copyright (C) 2022, Google, Inc.
+ */
+
+#define _GNU_SOURCE /* for pipe2 */
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <poll.h>
+#include <pthread.h>
+#include <linux/userfaultfd.h>
+#include <sys/syscall.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "perf_test_util.h"
+
+typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
+
+struct uffd_desc;
+
+struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
+		useconds_t uffd_delay, void *hva, uint64_t len,
+		uffd_handler_t handler);
+
+void uffd_stop_demand_paging(struct uffd_desc *uffd);
+
+#ifdef PRINT_PER_PAGE_UPDATES
+#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
+#else
+#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
+#endif
+
+#ifdef PRINT_PER_VCPU_UPDATES
+#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
+#else
+#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
+#endif
diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
new file mode 100644
index 000000000000..5e0878878a69
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
@@ -0,0 +1,196 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM userfaultfd util
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2019, Google, Inc.
+ * Copyright (C) 2022, Google, Inc.
+ */
+
+#define _GNU_SOURCE /* for pipe2 */
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <poll.h>
+#include <pthread.h>
+#include <linux/userfaultfd.h>
+#include <sys/syscall.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "perf_test_util.h"
+#include "userfaultfd_util.h"
+
+#ifdef __NR_userfaultfd
+
+struct uffd_desc {
+	int uffd_mode;
+	int uffd;
+	int pipefds[2];
+	useconds_t delay;
+	uffd_handler_t handler;
+	pthread_t thread;
+};
+
+static void *uffd_handler_thread_fn(void *arg)
+{
+	struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
+	int uffd = uffd_desc->uffd;
+	int pipefd = uffd_desc->pipefds[0];
+	useconds_t delay = uffd_desc->delay;
+	int64_t pages = 0;
+	struct timespec start;
+	struct timespec ts_diff;
+
+	clock_gettime(CLOCK_MONOTONIC, &start);
+	while (1) {
+		struct uffd_msg msg;
+		struct pollfd pollfd[2];
+		char tmp_chr;
+		int r;
+
+		pollfd[0].fd = uffd;
+		pollfd[0].events = POLLIN;
+		pollfd[1].fd = pipefd;
+		pollfd[1].events = POLLIN;
+
+		r = poll(pollfd, 2, -1);
+		switch (r) {
+		case -1:
+			pr_info("poll err");
+			continue;
+		case 0:
+			continue;
+		case 1:
+			break;
+		default:
+			pr_info("Polling uffd returned %d", r);
+			return NULL;
+		}
+
+		if (pollfd[0].revents & POLLERR) {
+			pr_info("uffd revents has POLLERR");
+			return NULL;
+		}
+
+		if (pollfd[1].revents & POLLIN) {
+			r = read(pollfd[1].fd, &tmp_chr, 1);
+			TEST_ASSERT(r == 1,
+				    "Error reading pipefd in UFFD thread\n");
+			return NULL;
+		}
+
+		if (!(pollfd[0].revents & POLLIN))
+			continue;
+
+		r = read(uffd, &msg, sizeof(msg));
+		if (r == -1) {
+			if (errno == EAGAIN)
+				continue;
+			pr_info("Read of uffd got errno %d\n", errno);
+			return NULL;
+		}
+
+		if (r != sizeof(msg)) {
+			pr_info("Read on uffd returned unexpected size: %d bytes", r);
+			return NULL;
+		}
+
+		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
+			continue;
+
+		if (delay)
+			usleep(delay);
+		r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
+		if (r < 0)
+			return NULL;
+		pages++;
+	}
+
+	ts_diff = timespec_elapsed(start);
+	PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
+		       pages, ts_diff.tv_sec, ts_diff.tv_nsec,
+		       pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1000000000.0));
+
+	return NULL;
+}
+
+struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
+		useconds_t uffd_delay, void *hva, uint64_t len,
+		uffd_handler_t handler)
+{
+	struct uffd_desc *uffd_desc;
+	bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
+	int uffd;
+	struct uffdio_api uffdio_api;
+	struct uffdio_register uffdio_register;
+	uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
+	int ret;
+
+	PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
+		       is_minor ? "MINOR" : "MISSING",
+		       is_minor ? "UFFDIO_CONTINUE" : "UFFDIO_COPY");
+
+	uffd_desc = malloc(sizeof(struct uffd_desc));
+	TEST_ASSERT(uffd_desc, "malloc failed");
+
+	/* Minor faults are resolved with UFFDIO_CONTINUE. */
+	if (is_minor)
+		expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
+
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+	TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
+
+	uffdio_api.api = UFFD_API;
+	uffdio_api.features = 0;
+	TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
+		    "ioctl UFFDIO_API failed: %" PRIu64,
+		    (uint64_t)uffdio_api.api);
+
+	uffdio_register.range.start = (uint64_t)hva;
+	uffdio_register.range.len = len;
+	uffdio_register.mode = uffd_mode;
+	TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
+		    "ioctl UFFDIO_REGISTER failed");
+	TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
+			expected_ioctls, "missing userfaultfd ioctls");
+
+	ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
+	TEST_ASSERT(!ret, "Failed to set up pipefd");
+
+	uffd_desc->uffd_mode = uffd_mode;
+	uffd_desc->uffd = uffd;
+	uffd_desc->delay = uffd_delay;
+	uffd_desc->handler = handler;
+	pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
+		       uffd_desc);
+
+	PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
+		       hva, hva + len);
+
+	return uffd_desc;
+}
+
+void uffd_stop_demand_paging(struct uffd_desc *uffd)
+{
+	char c = 0;
+	int ret;
+
+	ret = write(uffd->pipefds[1], &c, 1);
+	TEST_ASSERT(ret == 1, "Unable to write to pipefd");
+
+	ret = pthread_join(uffd->thread, NULL);
+	TEST_ASSERT(ret == 0, "Pthread_join failed.");
+
+	close(uffd->uffd);
+
+	close(uffd->pipefds[1]);
+	close(uffd->pipefds[0]);
+
+	free(uffd);
+}
+
+#endif /* __NR_userfaultfd */
-- 
2.35.1.723.g4982287a31-goog


 	struct timespec start;
 	struct timespec ts_diff;
-	int *pipefds = NULL;
 	struct kvm_vm *vm;
 	int vcpu_id;
-	int r;
 
 	vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
 				 p->src_type, p->partition_vcpu_memory_access);
@@ -299,15 +148,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	memset(guest_data_prototype, 0xAB, demand_paging_size);
 
 	if (p->uffd_mode) {
-		uffd_handler_threads =
-			malloc(nr_vcpus * sizeof(*uffd_handler_threads));
-		TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
-
-		uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
-		TEST_ASSERT(uffd_args, "Memory allocation failed");
-
-		pipefds = malloc(sizeof(int) * nr_vcpus * 2);
-		TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
+		uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
+		TEST_ASSERT(uffd_descs, "Memory allocation failed");
 
 		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
 			struct perf_test_vcpu_args *vcpu_args;
@@ -320,19 +162,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
 			vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
 
+			if (p->uffd_mode == UFFDIO_REGISTER_MODE_MINOR)
+				prefault_mem(vcpu_alias, vcpu_args->pages *
+					     perf_test_args.guest_page_size);
 			/*
 			 * Set up user fault fd to handle demand paging
 			 * requests.
 			 */
-			r = pipe2(&pipefds[vcpu_id * 2],
-				  O_CLOEXEC | O_NONBLOCK);
-			TEST_ASSERT(!r, "Failed to set up pipefd");
-
-			setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
-					    pipefds[vcpu_id * 2], p->uffd_mode,
-					    p->uffd_delay, &uffd_args[vcpu_id],
-					    vcpu_hva, vcpu_alias,
-					    vcpu_args->pages * perf_test_args.guest_page_size);
+			uffd_descs[vcpu_id] = uffd_setup_demand_paging(
+				p->uffd_mode, p->uffd_delay, vcpu_hva,
+				vcpu_args->pages * perf_test_args.guest_page_size,
+				&handle_uffd_page_request);
 		}
 	}
 
@@ -347,15 +187,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	pr_info("All vCPU threads joined\n");
 
 	if (p->uffd_mode) {
-		char c;
-
 		/* Tell the user fault fd handler threads to quit */
-		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
-			r = write(pipefds[vcpu_id * 2 + 1], &c, 1);
-			TEST_ASSERT(r == 1, "Unable to write to pipefd");
-
-			pthread_join(uffd_handler_threads[vcpu_id], NULL);
-		}
+		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
+			uffd_stop_demand_paging(uffd_descs[vcpu_id]);
 	}
 
 	pr_info("Total guest execution time: %ld.%.9lds\n",
@@ -367,11 +201,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	perf_test_destroy_vm(vm);
 
 	free(guest_data_prototype);
-	if (p->uffd_mode) {
-		free(uffd_handler_threads);
-		free(uffd_args);
-		free(pipefds);
-	}
+	if (p->uffd_mode)
+		free(uffd_descs);
 }
 
 static void help(char *name)
diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
new file mode 100644
index 000000000000..7b294ce8147c
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KVM userfaultfd util
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2019, Google, Inc.
+ * Copyright (C) 2022, Google, Inc.
+ */
+
+#define _GNU_SOURCE /* for pipe2 */
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <poll.h>
+#include <pthread.h>
+#include <linux/userfaultfd.h>
+#include <sys/syscall.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "perf_test_util.h"
+
+typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
+
+struct uffd_desc;
+
+struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
+		useconds_t uffd_delay, void *hva, uint64_t len,
+		uffd_handler_t handler);
+
+void uffd_stop_demand_paging(struct uffd_desc *uffd);
+
+#ifdef PRINT_PER_PAGE_UPDATES
+#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
+#else
+#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
+#endif
+
+#ifdef PRINT_PER_VCPU_UPDATES
+#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
+#else
+#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
+#endif
diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
new file mode 100644
index 000000000000..5e0878878a69
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
@@ -0,0 +1,196 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM userfaultfd util
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2019, Google, Inc.
+ * Copyright (C) 2022, Google, Inc.
+ */
+
+#define _GNU_SOURCE /* for pipe2 */
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <poll.h>
+#include <pthread.h>
+#include <linux/userfaultfd.h>
+#include <sys/syscall.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "perf_test_util.h"
+#include "userfaultfd_util.h"
+
+#ifdef __NR_userfaultfd
+
+struct uffd_desc {
+	int uffd_mode;
+	int uffd;
+	int pipefds[2];
+	useconds_t delay;
+	uffd_handler_t handler;
+	pthread_t thread;
+};
+
+static void *uffd_handler_thread_fn(void *arg)
+{
+	struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
+	int uffd = uffd_desc->uffd;
+	int pipefd = uffd_desc->pipefds[0];
+	useconds_t delay = uffd_desc->delay;
+	int64_t pages = 0;
+	struct timespec start;
+	struct timespec ts_diff;
+
+	clock_gettime(CLOCK_MONOTONIC, &start);
+	while (1) {
+		struct uffd_msg msg;
+		struct pollfd pollfd[2];
+		char tmp_chr;
+		int r;
+
+		pollfd[0].fd = uffd;
+		pollfd[0].events = POLLIN;
+		pollfd[1].fd = pipefd;
+		pollfd[1].events = POLLIN;
+
+		r = poll(pollfd, 2, -1);
+		switch (r) {
+		case -1:
+			pr_info("poll err\n");
+			continue;
+		case 0:
+			continue;
+		case 1:
+			break;
+		default:
+			pr_info("Polling uffd returned %d\n", r);
+			return NULL;
+		}
+
+		if (pollfd[0].revents & POLLERR) {
+			pr_info("uffd revents has POLLERR\n");
+			return NULL;
+		}
+
+		if (pollfd[1].revents & POLLIN) {
+			r = read(pollfd[1].fd, &tmp_chr, 1);
+			TEST_ASSERT(r == 1,
+				    "Error reading pipefd in UFFD thread\n");
+			return NULL;
+		}
+
+		if (!(pollfd[0].revents & POLLIN))
+			continue;
+
+		r = read(uffd, &msg, sizeof(msg));
+		if (r == -1) {
+			if (errno == EAGAIN)
+				continue;
+			pr_info("Read of uffd got errno %d\n", errno);
+			return NULL;
+		}
+
+		if (r != sizeof(msg)) {
+			pr_info("Read on uffd returned unexpected size: %d bytes\n", r);
+			return NULL;
+		}
+
+		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
+			continue;
+
+		if (delay)
+			usleep(delay);
+		r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
+		if (r < 0)
+			return NULL;
+		pages++;
+	}
+
+	ts_diff = timespec_elapsed(start);
+	PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
+		       pages, ts_diff.tv_sec, ts_diff.tv_nsec,
+		       pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1000000000.0));
+
+	return NULL;
+}
+
+struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
+		useconds_t uffd_delay, void *hva, uint64_t len,
+		uffd_handler_t handler)
+{
+	struct uffd_desc *uffd_desc;
+	bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
+	int uffd;
+	struct uffdio_api uffdio_api;
+	struct uffdio_register uffdio_register;
+	uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
+	int ret;
+
+	PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
+		       is_minor ? "MINOR" : "MISSING",
+		       is_minor ? "UFFDIO_CONTINUE" : "UFFDIO_COPY");
+
+	uffd_desc = malloc(sizeof(struct uffd_desc));
+	TEST_ASSERT(uffd_desc, "malloc failed");
+
+	/* Minor faults are resolved with UFFDIO_CONTINUE, not UFFDIO_COPY. */
+	if (is_minor)
+		expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
+
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+	TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
+
+	uffdio_api.api = UFFD_API;
+	uffdio_api.features = 0;
+	TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
+		    "ioctl UFFDIO_API failed: %" PRIu64,
+		    (uint64_t)uffdio_api.api);
+
+	uffdio_register.range.start = (uint64_t)hva;
+	uffdio_register.range.len = len;
+	uffdio_register.mode = uffd_mode;
+	TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
+		    "ioctl UFFDIO_REGISTER failed");
+	TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
+			expected_ioctls, "missing userfaultfd ioctls");
+
+	ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
+	TEST_ASSERT(!ret, "Failed to set up pipefd");
+
+	uffd_desc->uffd_mode = uffd_mode;
+	uffd_desc->uffd = uffd;
+	uffd_desc->delay = uffd_delay;
+	uffd_desc->handler = handler;
+	pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
+		       uffd_desc);
+
+	PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
+		       hva, hva + len);
+
+	return uffd_desc;
+}
+
+void uffd_stop_demand_paging(struct uffd_desc *uffd)
+{
+	char c = 0;
+	int ret;
+
+	ret = write(uffd->pipefds[1], &c, 1);
+	TEST_ASSERT(ret == 1, "Unable to write to pipefd");
+
+	ret = pthread_join(uffd->thread, NULL);
+	TEST_ASSERT(ret == 0, "Pthread_join failed.");
+
+	close(uffd->uffd);
+
+	close(uffd->pipefds[1]);
+	close(uffd->pipefds[0]);
+
+	free(uffd);
+}
+
+#endif /* __NR_userfaultfd */
-- 
2.35.1.723.g4982287a31-goog
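
The stop mechanism used by uffd_stop_demand_paging() (write one byte to a
pipe that the handler thread polls next to the uffd, then join the thread)
can be sketched in isolation. This is only an illustration under stated
assumptions: struct worker_desc, worker_start() and worker_stop() are
hypothetical names, a second pipe stands in for the userfaultfd, assert()
stands in for TEST_ASSERT(), and pipe() plus fcntl() replace pipe2() so that
no feature macro is needed. Unlike the handler in the patch, the sketch
drains pending events before it honours the quit request, so the handled
count is deterministic.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical descriptor, loosely modelled on struct uffd_desc. */
struct worker_desc {
	int event_pipe[2];	/* stands in for the userfaultfd */
	int quit_pipe[2];	/* [0] is polled, [1] is written to stop */
	long handled;
	pthread_t thread;
};

/* pipe() plus fcntl() instead of pipe2(), to avoid needing _GNU_SOURCE. */
static void make_nonblocking_pipe(int fds[2])
{
	int i, r;

	r = pipe(fds);
	assert(r == 0);
	for (i = 0; i < 2; i++) {
		r = fcntl(fds[i], F_SETFL, O_NONBLOCK);
		assert(r == 0);
	}
}

/* Consume every byte currently queued on the non-blocking event pipe. */
static void drain_events(struct worker_desc *d)
{
	char c;

	while (read(d->event_pipe[0], &c, 1) == 1)
		d->handled++;
}

static void *worker_fn(void *arg)
{
	struct worker_desc *d = arg;
	struct pollfd pfd[2] = {
		{ .fd = d->event_pipe[0], .events = POLLIN },
		{ .fd = d->quit_pipe[0], .events = POLLIN },
	};
	char c;

	for (;;) {
		if (poll(pfd, 2, -1) <= 0)
			continue;	/* interrupted or spurious: retry */

		if (pfd[0].revents & POLLIN)
			drain_events(d);

		if ((pfd[1].revents & POLLIN) &&
		    read(d->quit_pipe[0], &c, 1) == 1) {
			/* Stop requested: pick up stragglers, then exit. */
			drain_events(d);
			break;
		}
	}
	return NULL;
}

void worker_start(struct worker_desc *d)
{
	int r;

	d->handled = 0;
	make_nonblocking_pipe(d->event_pipe);
	make_nonblocking_pipe(d->quit_pipe);
	r = pthread_create(&d->thread, NULL, worker_fn, d);
	assert(r == 0);
}

/* Ask the worker to quit, join it, and return how many events it handled. */
long worker_stop(struct worker_desc *d)
{
	char c = 0;
	ssize_t w;
	int r;

	w = write(d->quit_pipe[1], &c, 1);
	assert(w == 1);
	r = pthread_join(d->thread, NULL);
	assert(r == 0);

	close(d->event_pipe[0]);
	close(d->event_pipe[1]);
	close(d->quit_pipe[0]);
	close(d->quit_pipe[1]);
	return d->handled;
}
```

A caller would invoke worker_start(), write bytes to event_pipe[1] to
simulate faults, and then call worker_stop(), which only returns after the
thread has exited.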



* [PATCH 02/11] KVM: selftests: Add vm_mem_region_get_src_fd library function
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:01   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:01 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add a library function to get the backing source FD of a memslot.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
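As a rough illustration of the "punch holes" use case mentioned in the
function comment, the sketch below punches a hole in a file-backed shared
mapping with fallocate(). It is not taken from the series: a memfd stands in
for the fd that vm_mem_region_get_src_fd() would return, and punch_hole()
and punch_demo() are hypothetical helper names.

```c
#define _GNU_SOURCE	/* for memfd_create() and fallocate() */
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Drop the backing pages of [offset, offset + len), keeping the file size. */
int punch_hole(int fd, off_t offset, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}

/*
 * Fill two pages of a shared memfd mapping with 0xAB, punch out the first
 * page, then report the first byte of each page as seen through the mapping.
 * Returns 0 on success, -1 if any step fails.
 */
int punch_demo(unsigned char *first, unsigned char *second)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *vmap;
	unsigned char *map;
	int ret = -1;
	int fd;

	/* Stand-in for the memslot's backing source fd. */
	fd = memfd_create("toy-backing", MFD_CLOEXEC);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, 2 * page))
		goto out_close;

	map = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	memset(map, 0xAB, 2 * page);
	if (punch_hole(fd, 0, page))
		goto out_unmap;

	vmap = map;
	*first = vmap[0];	/* faults on the hole: reads back as zero */
	*second = vmap[page];	/* untouched page keeps its contents */
	ret = 0;

out_unmap:
	munmap(map, 2 * page);
out_close:
	close(fd);
	return ret;
}
```

In a selftest the fd would presumably come from
vm_mem_region_get_src_fd(vm, memslot) rather than memfd_create(), and the
-1 return for anonymous regions would have to be handled separately.
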
 .../selftests/kvm/include/kvm_util_base.h     |  1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 4ed6aa049a91..d6acec0858c0 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -163,6 +163,7 @@ int _kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg);
 void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
 void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
 void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
+int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
 void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid);
 vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
 vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index d8cf851ab119..64ef245b73de 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -580,6 +580,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
 	return &region->region;
 }
 
+/*
+ * KVM Userspace Memory Get Backing Source FD
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   memslot - KVM memory slot ID
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   Backing source file descriptor, -1 if the memslot is an anonymous region.
+ *
+ * Returns the backing source fd of a memslot, so tests can use it to punch
+ * holes, or to set up permissions.
+ */
+int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
+{
+	struct userspace_mem_region *region;
+
+	region = memslot2region(vm, memslot);
+	return region->fd;
+}
+
 /*
  * VCPU Find
  *
-- 
2.35.1.723.g4982287a31-goog



* [PATCH 03/11] KVM: selftests: aarch64: Add vm_get_pte_gpa library function
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:01   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:01 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add a library function (in-guest) to get the GPA of the PTE of a
particular GVA.  This will be used in a future commit by a test to clear
and check the AF (access flag) of a particular page.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
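The split between "find the PTE" and "translate the address" can be shown
with a toy page-table walk. Everything here is an assumption made for
illustration: guest memory is a plain array, a GPA is a byte offset into it,
the layout is fixed at 4 levels with 4K pages, and the names get_pte_gpa(),
map_page() and gva2gpa() are hypothetical stand-ins, not the aarch64 library
code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(UINT64_C(1) << PAGE_SHIFT)
#define PTRS_PER_TABLE	512
#define LEVELS		4
#define NR_PAGES	16
#define ROOT_GPA	PAGE_SIZE	/* page 0 is left unused so 0 means "none" */

/* Toy guest physical memory: a GPA is a byte offset into this array. */
static uint64_t guest_mem[NR_PAGES * PAGE_SIZE / sizeof(uint64_t)];
static uint64_t next_free_page = 2;	/* pages 0 and 1 are taken */

static uint64_t *gpa2hva(uint64_t gpa)
{
	return &guest_mem[gpa / sizeof(uint64_t)];
}

/* Level 0 is the leaf (PTE) level, level LEVELS - 1 is the root. */
static uint64_t table_index(uint64_t gva, int level)
{
	return (gva >> (PAGE_SHIFT + 9 * level)) & (PTRS_PER_TABLE - 1);
}

static uint64_t entry_addr(uint64_t entry)
{
	return entry & ~(PAGE_SIZE - 1);
}

/*
 * Return the GPA of the leaf entry that maps gva, or 0 if an intermediate
 * table is missing. With alloc set, missing tables are allocated instead.
 */
uint64_t get_pte_gpa(uint64_t gva, int alloc)
{
	uint64_t table = ROOT_GPA;
	int level;

	for (level = LEVELS - 1; level > 0; level--) {
		uint64_t *entry = gpa2hva(table) + table_index(gva, level);

		if (!*entry) {
			if (!alloc)
				return 0;
			assert(next_free_page < NR_PAGES);
			*entry = (next_free_page++ * PAGE_SIZE) | 3;
		}
		table = entry_addr(*entry);
	}
	return table + table_index(gva, 0) * sizeof(uint64_t);
}

/* Install a leaf entry; the target gpa itself is never dereferenced. */
void map_page(uint64_t gva, uint64_t gpa)
{
	*gpa2hva(get_pte_gpa(gva, 1)) = entry_addr(gpa) | 3;
}

/* Translate by reading the entry located by get_pte_gpa(). */
uint64_t gva2gpa(uint64_t gva)
{
	uint64_t pte_gpa = get_pte_gpa(gva, 0);

	if (!pte_gpa || !*gpa2hva(pte_gpa))
		return UINT64_MAX;	/* unmapped */
	return entry_addr(*gpa2hva(pte_gpa)) + (gva & (PAGE_SIZE - 1));
}
```

Because the lookup returns the location of the entry rather than its
contents, a caller can also read or modify the entry itself, which is what a
test needs in order to clear or check a flag such as AF.
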
 .../selftests/kvm/include/aarch64/processor.h |  2 ++
 .../selftests/kvm/lib/aarch64/processor.c     | 24 +++++++++++++++++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 8f9f46979a00..caa572d83062 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -125,6 +125,8 @@ void vm_install_exception_handler(struct kvm_vm *vm,
 void vm_install_sync_handler(struct kvm_vm *vm,
 		int vector, int ec, handler_fn handler);
 
+vm_paddr_t vm_get_pte_gpa(struct kvm_vm *vm, vm_vaddr_t gva);
+
 static inline void cpu_relax(void)
 {
 	asm volatile("yield" ::: "memory");
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 9343d82519b4..ee006d354b79 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -139,7 +139,7 @@ void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 	_virt_pg_map(vm, vaddr, paddr, attr_idx);
 }
 
-vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
+vm_paddr_t vm_get_pte_gpa(struct kvm_vm *vm, vm_vaddr_t gva)
 {
 	uint64_t *ptep;
 
@@ -162,7 +162,7 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
 			goto unmapped_gva;
 		/* fall through */
 	case 2:
-		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pte_index(vm, gva) * 8;
+		ptep = (uint64_t *)(pte_addr(vm, *ptep) + pte_index(vm, gva) * 8);
 		if (!ptep)
 			goto unmapped_gva;
 		break;
@@ -170,6 +170,26 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
 		TEST_FAIL("Page table levels must be 2, 3, or 4");
 	}
 
+	return (vm_paddr_t)ptep;
+
+unmapped_gva:
+	TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);
+	exit(1);
+}
+
+vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
+{
+	uint64_t *ptep;
+	vm_paddr_t ptep_gpa;
+
+	ptep_gpa = vm_get_pte_gpa(vm, gva);
+	if (!ptep_gpa)
+		goto unmapped_gva;
+
+	ptep = addr_gpa2hva(vm, ptep_gpa);
+	if (!ptep)
+		goto unmapped_gva;
+
 	return pte_addr(vm, *ptep) + (gva & (vm->page_size - 1));
 
 unmapped_gva:
-- 
2.35.1.723.g4982287a31-goog



* [PATCH 04/11] KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add a library function to allocate a page-table physical page in a
particular memslot.  The default behavior is to create new page-table
pages in memslot 0.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
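A toy model of the delegation may help when reading the diff. It assumes,
for illustration only, that a memslot is a contiguous range of pages and
that allocation returns the first unused page at or above a minimum address;
struct toy_memslot and the toy_* functions are hypothetical and are not the
selftest library's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE	4096ULL
#define PT_MIN_PADDR	0x180000ULL	/* mirrors KVM_GUEST_PAGE_TABLE_MIN_PADDR */
#define SLOT_PAGES	1024

/* Hypothetical memslot: SLOT_PAGES contiguous pages starting at base_gpa. */
struct toy_memslot {
	uint64_t base_gpa;
	bool used[SLOT_PAGES];
};

static struct toy_memslot slots[2] = {
	{ .base_gpa = 0x0 },		/* slot 0: [0, 0x400000) */
	{ .base_gpa = 0x40000000ULL },	/* slot 1: starts at 1 GiB */
};

/* First unused page in the given slot whose address is >= paddr_min. */
uint64_t toy_phy_page_alloc(uint64_t paddr_min, uint32_t memslot)
{
	struct toy_memslot *s = &slots[memslot];
	uint64_t pg;

	for (pg = 0; pg < SLOT_PAGES; pg++) {
		uint64_t gpa = s->base_gpa + pg * PAGE_SIZE;

		if (gpa >= paddr_min && !s->used[pg]) {
			s->used[pg] = true;
			return gpa;
		}
	}
	assert(!"memslot exhausted");
	return 0;
}

/* New entry point: the caller picks the memslot for the page-table page. */
uint64_t toy_alloc_page_table_in_memslot(uint32_t pt_memslot)
{
	return toy_phy_page_alloc(PT_MIN_PADDR, pt_memslot);
}

/* Existing entry point keeps its behavior by delegating with memslot 0. */
uint64_t toy_alloc_page_table(void)
{
	return toy_alloc_page_table_in_memslot(0);
}
```

Existing callers keep getting pages from memslot 0, while a test that wants
its page tables elsewhere passes a different slot.
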
 tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
 tools/testing/selftests/kvm/lib/kvm_util.c          | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index d6acec0858c0..c8dce12a9a52 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -307,6 +307,7 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
 vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
 			      vm_paddr_t paddr_min, uint32_t memslot);
 vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
+vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot);
 
 /*
  * Create a VM with reasonable defaults
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 64ef245b73de..ae21564241c8 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2409,9 +2409,15 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
 /* Arbitrary minimum physical address used for virtual translation tables. */
 #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
 
+vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot)
+{
+	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
+			pt_memslot);
+}
+
 vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm)
 {
-	return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0);
+	return vm_alloc_page_table_in_memslot(vm, 0);
 }
 
 /*
-- 
2.35.1.723.g4982287a31-goog



* [PATCH 05/11] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add an argument, pt_memslot, to _virt_pg_map so that callers can choose the
memslot used for the page-table allocations performed when creating a new
mapping. This will be used in a future commit to test having PTEs stored
in memslots with different setups (e.g., hugetlb with a hole).

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h        |  3 +++
 tools/testing/selftests/kvm/lib/aarch64/processor.c  | 12 ++++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index caa572d83062..3965a5ac778e 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -125,6 +125,9 @@ void vm_install_exception_handler(struct kvm_vm *vm,
 void vm_install_sync_handler(struct kvm_vm *vm,
 		int vector, int ec, handler_fn handler);
 
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot);
+
 vm_paddr_t vm_get_pte_gpa(struct kvm_vm *vm, vm_vaddr_t gva);
 
 static inline void cpu_relax(void)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index ee006d354b79..8f4ec1be4364 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -86,8 +86,8 @@ void virt_pgd_alloc(struct kvm_vm *vm)
 	}
 }
 
-static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
-			 uint64_t flags)
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot)
 {
 	uint8_t attr_idx = flags & 7;
 	uint64_t *ptep;
@@ -108,18 +108,18 @@ static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 	ptep = addr_gpa2hva(vm, vm->pgd) + pgd_index(vm, vaddr) * 8;
 	if (!*ptep)
-		*ptep = vm_alloc_page_table(vm) | 3;
+		*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 
 	switch (vm->pgtable_levels) {
 	case 4:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pud_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 3:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pmd_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 2:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pte_index(vm, vaddr) * 8;
@@ -136,7 +136,7 @@ void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
 	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
 
-	_virt_pg_map(vm, vaddr, paddr, attr_idx);
+	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
 }
 
 vm_paddr_t vm_get_pte_gpa(struct kvm_vm *vm, vm_vaddr_t gva)
-- 
2.35.1.723.g4982287a31-goog


* [PATCH 05/11] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg
@ 2022-03-11  6:02   ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, axelrasmussen, Ricardo Koller

Add an argument, pt_memslot, to _virt_pg_map so that callers can choose the
memslot used for the page-table allocations performed when creating a new
mapping. This will be used in a future commit to test having PTEs stored
in memslots with different setups (e.g., hugetlb with a hole).

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h        |  3 +++
 tools/testing/selftests/kvm/lib/aarch64/processor.c  | 12 ++++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index caa572d83062..3965a5ac778e 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -125,6 +125,9 @@ void vm_install_exception_handler(struct kvm_vm *vm,
 void vm_install_sync_handler(struct kvm_vm *vm,
 		int vector, int ec, handler_fn handler);
 
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot);
+
 vm_paddr_t vm_get_pte_gpa(struct kvm_vm *vm, vm_vaddr_t gva);
 
 static inline void cpu_relax(void)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index ee006d354b79..8f4ec1be4364 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -86,8 +86,8 @@ void virt_pgd_alloc(struct kvm_vm *vm)
 	}
 }
 
-static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
-			 uint64_t flags)
+void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+			 uint64_t flags, uint32_t pt_memslot)
 {
 	uint8_t attr_idx = flags & 7;
 	uint64_t *ptep;
@@ -108,18 +108,18 @@ static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 	ptep = addr_gpa2hva(vm, vm->pgd) + pgd_index(vm, vaddr) * 8;
 	if (!*ptep)
-		*ptep = vm_alloc_page_table(vm) | 3;
+		*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 
 	switch (vm->pgtable_levels) {
 	case 4:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pud_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 3:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pmd_index(vm, vaddr) * 8;
 		if (!*ptep)
-			*ptep = vm_alloc_page_table(vm) | 3;
+			*ptep = vm_alloc_page_table_in_memslot(vm, pt_memslot) | 3;
 		/* fall through */
 	case 2:
 		ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pte_index(vm, vaddr) * 8;
@@ -136,7 +136,7 @@ void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
 	uint64_t attr_idx = 4; /* NORMAL (See DEFAULT_MAIR_EL1) */
 
-	_virt_pg_map(vm, vaddr, paddr, attr_idx);
+	_virt_pg_map(vm, vaddr, paddr, attr_idx, 0);
 }
 
 vm_paddr_t vm_get_pte_gpa(struct kvm_vm *vm, vm_vaddr_t gva)
-- 
2.35.1.723.g4982287a31-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 06/11] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Deleting a memslot (when freeing a VM) neither closes the backing fd
nor unmaps the alias mapping. Fix by adding the missing close and
munmap.
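
For reference, the cleanup pattern can be sketched in isolation as below.
This is a standalone approximation, not kvm_util.c: struct toy_region and
both helpers are hypothetical, and an unlinked temporary file stands in
for the shared backing source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the fields the fix touches. */
struct toy_region {
	int fd;			/* -1 when there is no backing file */
	void *mmap_start;
	void *mmap_alias;	/* second view of the same file */
	size_t mmap_size;
};

/* Create a file-backed region that is mapped twice (start and alias). */
static int toy_region_create(struct toy_region *r, size_t size)
{
	char path[] = "/tmp/toy_regionXXXXXX";

	r->fd = mkstemp(path);
	if (r->fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(r->fd, (off_t)size))
		return -1;

	r->mmap_size = size;
	r->mmap_start = mmap(NULL, size, PROT_READ | PROT_WRITE,
			     MAP_SHARED, r->fd, 0);
	r->mmap_alias = mmap(NULL, size, PROT_READ | PROT_WRITE,
			     MAP_SHARED, r->fd, 0);
	if (r->mmap_start == MAP_FAILED || r->mmap_alias == MAP_FAILED)
		return -1;
	return 0;
}

/* Undo everything: both mappings and the fd, mirroring the fix. */
static int toy_region_delete(struct toy_region *r)
{
	int ret = munmap(r->mmap_start, r->mmap_size);

	if (ret)
		return ret;
	if (r->fd >= 0) {
		/* There is an extra mapping when the memory is shared. */
		ret = munmap(r->mmap_alias, r->mmap_size);
		if (ret)
			return ret;
		ret = close(r->fd);
		r->fd = -1;
	}
	return ret;
}
```

Without the second munmap and the close, every created and deleted region
would leave one mapping and one fd behind.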

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index ae21564241c8..c25c79f97695 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -702,6 +702,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 	sparsebit_free(&region->unused_phy_pages);
 	ret = munmap(region->mmap_start, region->mmap_size);
 	TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
+	if (region->fd >= 0) {
+		/* There's an extra map if shared memory. */
+		ret = munmap(region->mmap_alias, region->mmap_size);
+		TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
+		close(region->fd);
+	}
 
 	free(region);
 }
-- 
2.35.1.723.g4982287a31-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 06/11] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
@ 2022-03-11  6:02   ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, axelrasmussen, Ricardo Koller

Deleting a memslot (when freeing a VM) neither closes the backing fd
nor unmaps the alias mapping. Fix by adding the missing close and
munmap.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index ae21564241c8..c25c79f97695 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -702,6 +702,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 	sparsebit_free(&region->unused_phy_pages);
 	ret = munmap(region->mmap_start, region->mmap_size);
 	TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
+	if (region->fd >= 0) {
+		/* There's an extra map if shared memory. */
+		ret = munmap(region->mmap_alias, region->mmap_size);
+		TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
+		close(region->fd);
+	}
 
 	free(region);
 }
-- 
2.35.1.723.g4982287a31-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 07/11] KVM: selftests: aarch64: Add aarch64/page_fault_test
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add a new test for stage 2 faults when using different combinations of
guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
and types of faults (e.g., read on hugetlbfs with a hole). The next
commits will add different handling methods and more faults (e.g., uffd
and dirty logging). This first commit starts by adding two sanity checks
for all types of accesses: AF setting by the hw, and accessing memslots
with holes.
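
To illustrate why an access to a hole is expected to simply read zeroes,
here is a small host-only sketch of the anonymous-memory case, which the
test handles with madvise(MADV_DONTNEED). The helper name
write_punch_read is hypothetical, and this runs entirely in userspace,
with no guest or stage 2 involved.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Write a pattern, drop the backing pages, and read the same location
 * again. Returns the value read back, or UINT64_MAX on error.
 */
static uint64_t write_punch_read(size_t size)
{
	uint64_t *p;
	uint64_t val;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return UINT64_MAX;

	p[0] = 0x0123456789ABCDEFULL;

	/* "Punch the hole": the kernel may now free the backing pages. */
	if (madvise(p, size, MADV_DONTNEED)) {
		munmap(p, size);
		return UINT64_MAX;
	}

	/* For private anonymous memory this faults in a zero page. */
	val = p[0];
	munmap(p, size);
	return val;
}
```

The read after the madvise does not fail; it is satisfied with a fresh
zero-filled page, which is why the read tests assert a value of 0.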

Note that this commit borrows some code from kvm-unit-tests: RET,
MOV_X0, and flush_tlb_page.
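
For readers unfamiliar with the two opcode macros, the sketch below shows
how they combine into a tiny function that returns an immediate. The
macros follow the patch (with explicit unsigned types added);
emit_return_imm and movz_imm16 are hypothetical helpers for illustration
only.

```c
#include <assert.h>
#include <stdint.h>

/* "ret" (return to the address in x30). */
#define RET		0xd65f03c0u
/* "movz x0, #imm16": the 16-bit immediate lives in bits [20:5]. */
#define MOV_X0(x)	(0xd2800000u | (((uint32_t)(x) & 0xffffu) << 5))

/* Emit "mov x0, #imm; ret" into a two-instruction buffer. */
static void emit_return_imm(uint32_t code[2], uint16_t imm)
{
	code[0] = MOV_X0(imm);
	code[1] = RET;
}

/* Extract the immediate back out of a movz encoding. */
static uint16_t movz_imm16(uint32_t insn)
{
	return (insn >> 5) & 0xffff;
}
```

With imm = 0x77 the first word is 0xd2800ee0, so calling the buffer as a
function returns 0x77, which is the value guest_test_exec checks for.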

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/aarch64/page_fault_test.c   | 667 ++++++++++++++++++
 2 files changed, 668 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index bc5f89b3700e..6a192798b217 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -103,6 +103,7 @@ TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
+TEST_GEN_PROGS_aarch64 += aarch64/page_fault_test
 TEST_GEN_PROGS_aarch64 += aarch64/psci_cpu_on_test
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_irq
diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
new file mode 100644
index 000000000000..00477a4f10cb
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -0,0 +1,667 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * page_fault_test.c - Test stage 2 faults.
+ *
+ * This test tries different combinations of guest accesses (e.g., write,
+ * S1PTW), backing source type (e.g., anon) and types of faults (e.g., read on
+ * hugetlbfs with a hole). It checks that the expected handling method is
+ * called (e.g., uffd faults with the right address and write/read flag).
+ */
+
+#define _GNU_SOURCE
+#include <linux/bitmap.h>
+#include <fcntl.h>
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+#include "guest_modes.h"
+#include "userfaultfd_util.h"
+
+#define VCPU_ID					0
+
+#define TEST_MEM_SLOT_INDEX			1
+#define TEST_PT_SLOT_INDEX			2
+
+/* Max number of backing pages per guest page */
+#define BACKING_PG_PER_GUEST_PG			(64 / 4)
+
+/* Test memslot in backing source pages */
+#define TEST_MEMSLOT_BACKING_SRC_NPAGES		(1 * BACKING_PG_PER_GUEST_PG)
+
+/* PT memslot size in backing source pages */
+#define PT_MEMSLOT_BACKING_SRC_NPAGES		(4 * BACKING_PG_PER_GUEST_PG)
+
+/* Guest virtual addresses that point to the test page and its PTE. */
+#define GUEST_TEST_GVA				0xc0000000
+#define GUEST_TEST_EXEC_GVA			0xc0000008
+#define GUEST_TEST_PTE_GVA			0xd0000000
+
+/* Access flag */
+#define PTE_AF					(1ULL << 10)
+
+/* Access flag update enable/disable */
+#define TCR_EL1_HA				(1ULL << 39)
+
+#define CMD_SKIP_TEST				(-1LL)
+#define CMD_HOLE_PT				(1ULL << 2)
+#define CMD_HOLE_TEST				(1ULL << 3)
+
+#define PREPARE_FN_NR				10
+#define CHECK_FN_NR				10
+
+static const uint64_t test_gva = GUEST_TEST_GVA;
+static const uint64_t test_exec_gva = GUEST_TEST_EXEC_GVA;
+static const uint64_t pte_gva = GUEST_TEST_PTE_GVA;
+uint64_t pte_gpa;
+
+enum { PT, TEST, NR_MEMSLOTS};
+
+struct memslot_desc {
+	void *hva;
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t guest_pages;
+	uint64_t backing_pages;
+	enum vm_mem_backing_src_type src_type;
+	uint32_t idx;
+} memslot[NR_MEMSLOTS] = {
+	{
+		.idx = TEST_PT_SLOT_INDEX,
+		.backing_pages = PT_MEMSLOT_BACKING_SRC_NPAGES,
+	},
+	{
+		.idx = TEST_MEM_SLOT_INDEX,
+		.backing_pages = TEST_MEMSLOT_BACKING_SRC_NPAGES,
+	},
+};
+
+static struct event_cnt {
+	int aborts;
+	int fail_vcpu_runs;
+} events;
+
+struct test_desc {
+	const char *name;
+	uint64_t mem_mark_cmd;
+	/* Skip the test if any prepare function returns false */
+	bool (*guest_prepare[PREPARE_FN_NR])(void);
+	void (*guest_test)(void);
+	void (*guest_test_check[CHECK_FN_NR])(void);
+	void (*dabt_handler)(struct ex_regs *regs);
+	void (*iabt_handler)(struct ex_regs *regs);
+	uint32_t pt_memslot_flags;
+	uint32_t test_memslot_flags;
+	void (*guest_pre_run)(struct kvm_vm *vm);
+	bool skip;
+	struct event_cnt expected_events;
+};
+
+struct test_params {
+	enum vm_mem_backing_src_type src_type;
+	struct test_desc *test_desc;
+};
+
+
+static inline void flush_tlb_page(uint64_t vaddr)
+{
+	uint64_t page = vaddr >> 12;
+
+	dsb(ishst);
+	asm("tlbi vaae1is, %0" :: "r" (page));
+	dsb(ish);
+	isb();
+}
+
+#define RET			0xd65f03c0
+#define MOV_X0(x)		(0xd2800000 | (((x) & 0xffff) << 5))
+
+static void guest_test_nop(void)
+{}
+
+static void guest_test_write64(void)
+{
+	uint64_t val;
+
+	WRITE_ONCE(*((uint64_t *)test_gva), 0x0123456789ABCDEF);
+	val = READ_ONCE(*(uint64_t *)test_gva);
+	GUEST_ASSERT_EQ(val, 0x0123456789ABCDEF);
+}
+
+/* Check the system for atomic instructions. */
+static bool guest_check_lse(void)
+{
+	uint64_t isar0 = read_sysreg(id_aa64isar0_el1);
+	uint64_t atomic = (isar0 >> 20) & 7;
+
+	return atomic >= 2;
+}
+
+/* Compare and swap instruction. */
+static void guest_test_cas(void)
+{
+	uint64_t val;
+	uint64_t addr = test_gva;
+
+	GUEST_ASSERT_EQ(guest_check_lse(), 1);
+	asm volatile(".arch_extension lse\n"
+		     "casal %0, %1, [%2]\n"
+			:: "r" (0), "r" (0x0123456789ABCDEF), "r" (addr));
+	val = READ_ONCE(*(uint64_t *)(addr));
+	GUEST_ASSERT_EQ(val, 0x0123456789ABCDEF);
+}
+
+static void guest_test_read64(void)
+{
+	uint64_t val;
+
+	val = READ_ONCE(*(uint64_t *)test_gva);
+	GUEST_ASSERT_EQ(val, 0);
+}
+
+/* Address translation instruction */
+static void guest_test_at(void)
+{
+	uint64_t par;
+	uint64_t addr = 0;
+
+	asm volatile("at s1e1r, %0" :: "r" (test_gva));
+	par = read_sysreg(par_el1);
+
+	/* Bit 0 (PAR_EL1.F) is clear when the AT was successful */
+	GUEST_ASSERT_EQ(par & 1, 0);
+	/* The PA in bits [51:12] */
+	addr = par & (((1ULL << 40) - 1) << 12);
+	GUEST_ASSERT_EQ(addr, memslot[TEST].gpa);
+}
+
+static void guest_test_dc_zva(void)
+{
+	/* The smallest guaranteed block size (bs) is a word. */
+	uint16_t val;
+
+	asm volatile("dc zva, %0\n"
+			"dsb ish\n"
+			:: "r" (test_gva));
+	val = READ_ONCE(*(uint16_t *)test_gva);
+	GUEST_ASSERT_EQ(val, 0);
+}
+
+static void guest_test_ld_preidx(void)
+{
+	uint64_t val;
+	uint64_t addr = test_gva - 8;
+
+	/*
+	 * This ends up accessing "test_gva + 8 - 8", where "test_gva - 8"
+	 * is not backed by a memslot.
+	 */
+	asm volatile("ldr %0, [%1, #8]!"
+			: "=r" (val), "+r" (addr));
+	GUEST_ASSERT_EQ(val, 0);
+	GUEST_ASSERT_EQ(addr, test_gva);
+}
+
+static void guest_test_st_preidx(void)
+{
+	uint64_t val = 0x0123456789ABCDEF;
+	uint64_t addr = test_gva - 8;
+
+	asm volatile("str %0, [%1, #8]!"
+			: "+r" (val), "+r" (addr));
+
+	GUEST_ASSERT_EQ(addr, test_gva);
+	val = READ_ONCE(*(uint64_t *)test_gva);
+}
+
+static bool guest_set_ha(void)
+{
+	uint64_t mmfr1 = read_sysreg(id_aa64mmfr1_el1);
+	uint64_t hadbs = mmfr1 & 0xf; /* ID_AA64MMFR1_EL1.HAFDBS, bits [3:0] */
+	uint64_t tcr;
+
+	/* Skip if HA is not supported. */
+	if (hadbs == 0)
+		return false;
+
+	tcr = read_sysreg(tcr_el1) | TCR_EL1_HA;
+	write_sysreg(tcr, tcr_el1);
+	isb();
+
+	return true;
+}
+
+static bool guest_clear_pte_af(void)
+{
+	*((uint64_t *)pte_gva) &= ~PTE_AF;
+	flush_tlb_page(pte_gva);
+
+	return true;
+}
+
+static void guest_check_pte_af(void)
+{
+	flush_tlb_page(pte_gva);
+	GUEST_ASSERT_EQ(*((uint64_t *)pte_gva) & PTE_AF, PTE_AF);
+}
+
+static void guest_test_exec(void)
+{
+	int (*code)(void) = (int (*)(void))test_exec_gva;
+	int ret;
+
+	ret = code();
+	GUEST_ASSERT_EQ(ret, 0x77);
+}
+
+static bool guest_prepare(struct test_desc *test)
+{
+	bool (*prepare_fn)(void);
+	int i;
+
+	for (i = 0; i < PREPARE_FN_NR; i++) {
+		prepare_fn = test->guest_prepare[i];
+		if (prepare_fn && !prepare_fn())
+			return false;
+	}
+
+	return true;
+}
+
+static void guest_test_check(struct test_desc *test)
+{
+	void (*check_fn)(void);
+	int i;
+
+	for (i = 0; i < CHECK_FN_NR; i++) {
+		check_fn = test->guest_test_check[i];
+		if (!check_fn)
+			continue;
+		check_fn();
+	}
+}
+
+static void guest_code(struct test_desc *test)
+{
+	if (!test->guest_test)
+		test->guest_test = guest_test_nop;
+
+	if (!guest_prepare(test))
+		GUEST_SYNC(CMD_SKIP_TEST);
+
+	GUEST_SYNC(test->mem_mark_cmd);
+	test->guest_test();
+
+	guest_test_check(test);
+	GUEST_DONE();
+}
+
+static void no_dabt_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_1(false, read_sysreg(far_el1));
+}
+
+static void no_iabt_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_1(false, regs->pc);
+}
+
+static void punch_hole_in_memslot(struct kvm_vm *vm,
+		struct memslot_desc *memslot)
+{
+	int ret, fd;
+	void *hva;
+
+	fd = vm_mem_region_get_src_fd(vm, memslot->idx);
+	if (fd != -1) {
+		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				0, memslot->size);
+		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
+	} else {
+		hva = addr_gpa2hva(vm, memslot->gpa);
+		ret = madvise(hva, memslot->size, MADV_DONTNEED);
+		TEST_ASSERT(ret == 0, "madvise failed, errno: %d\n", errno);
+	}
+}
+
+static void handle_cmd(struct kvm_vm *vm, int cmd)
+{
+	if (cmd & CMD_HOLE_PT)
+		punch_hole_in_memslot(vm, &memslot[PT]);
+	if (cmd & CMD_HOLE_TEST)
+		punch_hole_in_memslot(vm, &memslot[TEST]);
+}
+
+static void sync_stats_from_guest(struct kvm_vm *vm)
+{
+	struct event_cnt *ec = addr_gva2hva(vm, (uint64_t)&events);
+
+	events.aborts += ec->aborts;
+}
+
+void fail_vcpu_run_no_handler(int ret)
+{
+	TEST_FAIL("Unexpected vcpu run failure\n");
+}
+
+static uint64_t get_total_guest_pages(enum vm_guest_mode mode,
+		struct test_params *p)
+{
+	uint64_t large_page_size = get_backing_src_pagesz(p->src_type);
+	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
+	uint64_t size;
+
+	size = PT_MEMSLOT_BACKING_SRC_NPAGES * large_page_size;
+	size += TEST_MEMSLOT_BACKING_SRC_NPAGES * large_page_size;
+
+	return size / guest_page_size;
+}
+
+static void load_exec_code_for_test(void)
+{
+	uint32_t *code;
+
+	/* Write this "code" into test_exec_gva */
+	assert(test_exec_gva - test_gva);
+	code = memslot[TEST].hva + 8;
+
+	code[0] = MOV_X0(0x77);
+	code[1] = RET;
+}
+
+static void setup_guest_args(struct kvm_vm *vm, struct test_desc *test)
+{
+	vm_vaddr_t test_desc_gva;
+
+	test_desc_gva = vm_vaddr_alloc_page(vm);
+	memcpy(addr_gva2hva(vm, test_desc_gva), test,
+			sizeof(struct test_desc));
+	vcpu_args_set(vm, 0, 1, test_desc_gva);
+}
+
+static void setup_abort_handlers(struct kvm_vm *vm, struct test_desc *test)
+{
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vm, VCPU_ID);
+	if (!test->dabt_handler)
+		test->dabt_handler = no_dabt_handler;
+	if (!test->iabt_handler)
+		test->iabt_handler = no_iabt_handler;
+	vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT,
+			0x25, test->dabt_handler);
+	vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT,
+			0x21, test->iabt_handler);
+}
+
+static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
+		struct test_params *p)
+{
+	uint64_t large_page_size = get_backing_src_pagesz(p->src_type);
+	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
+	struct test_desc *test = p->test_desc;
+	uint64_t hole_gpa;
+	uint64_t alignment;
+	int i;
+
+	/* Calculate the test and PT memslot sizes */
+	for (i = 0; i < NR_MEMSLOTS; i++) {
+		memslot[i].size = large_page_size * memslot[i].backing_pages;
+		memslot[i].guest_pages = memslot[i].size / guest_page_size;
+		memslot[i].src_type = p->src_type;
+	}
+
+	TEST_ASSERT(memslot[TEST].size >= guest_page_size,
+			"The test memslot should have space for one guest page.\n");
+	TEST_ASSERT(memslot[PT].size >= (4 * guest_page_size),
+			"The PT memslot should have space for 4 guest pages.\n");
+
+	/* Place the memslots GPAs at the end of physical memory */
+	alignment = max(large_page_size, guest_page_size);
+	memslot[TEST].gpa = (vm_get_max_gfn(vm) - memslot[TEST].guest_pages) *
+		guest_page_size;
+	memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
+	/* Add a 1-guest_page-hole between the two memslots */
+	hole_gpa = memslot[TEST].gpa - guest_page_size;
+	virt_pg_map(vm, test_gva - guest_page_size, hole_gpa);
+	memslot[PT].gpa = hole_gpa - (memslot[PT].guest_pages *
+			guest_page_size);
+	memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
+
+	/* Create memslots for the test data and for the page tables. */
+	vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
+			memslot[PT].idx, memslot[PT].guest_pages,
+			test->pt_memslot_flags);
+	vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
+			memslot[TEST].idx, memslot[TEST].guest_pages,
+			test->test_memslot_flags);
+
+	for (i = 0; i < NR_MEMSLOTS; i++)
+		memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
+
+	/* Map test_gva using the PT memslot. */
+	_virt_pg_map(vm, test_gva, memslot[TEST].gpa,
+			4 /* NORMAL (See DEFAULT_MAIR_EL1) */,
+			TEST_PT_SLOT_INDEX);
+
+	/*
+	 * Find the PTE of the test page and map it in the guest so it can
+	 * clear the AF.
+	 */
+	pte_gpa = vm_get_pte_gpa(vm, test_gva);
+	TEST_ASSERT(memslot[PT].gpa <= pte_gpa &&
+			pte_gpa < (memslot[PT].gpa + memslot[PT].size),
+			"The PTE should be in the PT memslot.");
+	/* This is an arbitrary requirement just to make things simpler. */
+	TEST_ASSERT(pte_gpa % guest_page_size == 0,
+			"The pte_gpa (%p) should be aligned to the guest page (%lx).",
+			(void *)pte_gpa, guest_page_size);
+	virt_pg_map(vm, pte_gva, pte_gpa);
+}
+
+static void check_event_counts(struct test_desc *test)
+{
+	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
+}
+
+static void print_test_banner(enum vm_guest_mode mode, struct test_params *p)
+{
+	struct test_desc *test = p->test_desc;
+
+	pr_debug("Test: %s\n", test->name);
+	pr_debug("Testing guest mode: %s\n", vm_guest_mode_string(mode));
+	pr_debug("Testing memory backing src type: %s\n",
+			vm_mem_backing_src_alias(p->src_type)->name);
+}
+
+static void reset_event_counts(void)
+{
+	memset(&events, 0, sizeof(events));
+}
+
+static bool vcpu_run_loop(struct kvm_vm *vm, struct test_desc *test)
+{
+	bool skip_test = false;
+	struct ucall uc;
+	int stage;
+
+	for (stage = 0; ; stage++) {
+		vcpu_run(vm, VCPU_ID);
+
+		switch (get_ucall(vm, VCPU_ID, &uc)) {
+		case UCALL_SYNC:
+			if (uc.args[1] == CMD_SKIP_TEST) {
+				pr_debug("Skipped.\n");
+				skip_test = true;
+				goto done;
+			}
+			handle_cmd(vm, uc.args[1]);
+			break;
+		case UCALL_ABORT:
+			TEST_FAIL("%s at %s:%ld\n\tvalues: %#lx, %#lx",
+				(const char *)uc.args[0],
+				__FILE__, uc.args[1], uc.args[2], uc.args[3]);
+			break;
+		case UCALL_DONE:
+			pr_debug("Done.\n");
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	return skip_test;
+}
+
+static void run_test(enum vm_guest_mode mode, void *arg)
+{
+	struct test_params *p = (struct test_params *)arg;
+	struct test_desc *test = p->test_desc;
+	struct kvm_vm *vm;
+	bool skip_test = false;
+
+	print_test_banner(mode, p);
+
+	vm = vm_create_with_vcpus(mode, 1, DEFAULT_GUEST_PHY_PAGES,
+			get_total_guest_pages(mode, p), 0, guest_code, NULL);
+	ucall_init(vm, NULL);
+
+	reset_event_counts();
+	setup_memslots(vm, mode, p);
+
+	load_exec_code_for_test();
+	setup_abort_handlers(vm, test);
+	setup_guest_args(vm, test);
+
+	if (test->guest_pre_run)
+		test->guest_pre_run(vm);
+
+	sync_global_to_guest(vm, memslot);
+
+	skip_test = vcpu_run_loop(vm, test);
+
+	sync_stats_from_guest(vm);
+	ucall_uninit(vm);
+	kvm_vm_free(vm);
+
+	if (!skip_test)
+		check_event_counts(test);
+}
+
+static void for_each_test_and_guest_mode(void (*func)(enum vm_guest_mode, void *),
+		enum vm_mem_backing_src_type src_type);
+
+static void help(char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-s mem-type]\n", name);
+	puts("");
+	guest_modes_help();
+	backing_src_help("-s");
+	puts("");
+}
+
+int main(int argc, char *argv[])
+{
+	enum vm_mem_backing_src_type src_type;
+	int opt;
+
+	setbuf(stdout, NULL);
+
+	src_type = DEFAULT_VM_MEM_SRC;
+
+	guest_modes_append_default();
+
+	while ((opt = getopt(argc, argv, "hm:s:")) != -1) {
+		switch (opt) {
+		case 'm':
+			guest_modes_cmdline(optarg);
+			break;
+		case 's':
+			src_type = parse_backing_src_type(optarg);
+			break;
+		case 'h':
+		default:
+			help(argv[0]);
+			exit(0);
+		}
+	}
+
+	for_each_test_and_guest_mode(run_test, src_type);
+	return 0;
+}
+
+#define SNAME(s)		#s
+#define SCAT(a, b)		SNAME(a ## _ ## b)
+
+#define TEST_BASIC_ACCESS(__a, ...)						\
+{										\
+	.name			= SNAME(BASIC_ACCESS ## _ ## __a),		\
+	.guest_test		= __a,						\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+#define __AF_TEST_ARGS								\
+	.guest_prepare		= { guest_set_ha, guest_clear_pte_af, },	\
+	.guest_test_check	= { guest_check_pte_af, },			\
+
+#define __AF_LSE_TEST_ARGS							\
+	.guest_prepare		= { guest_set_ha, guest_clear_pte_af,		\
+				    guest_check_lse, },				\
+	.guest_test_check	= { guest_check_pte_af, },			\
+
+#define __PREPARE_LSE_TEST_ARGS							\
+	.guest_prepare		= { guest_check_lse, },
+
+#define TEST_HW_ACCESS_FLAG(__a)						\
+	TEST_BASIC_ACCESS(__a, __AF_TEST_ARGS)
+
+#define TEST_ACCESS_ON_HOLE_NO_FAULTS(__a, ...)					\
+{										\
+	.name			= SNAME(ACCESS_ON_HOLE_NO_FAULTS ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_TEST,				\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+static struct test_desc tests[] = {
+	/* Check that HW is setting the AF (sanity checks). */
+	TEST_HW_ACCESS_FLAG(guest_test_read64),
+	TEST_HW_ACCESS_FLAG(guest_test_ld_preidx),
+	TEST_BASIC_ACCESS(guest_test_cas, __AF_LSE_TEST_ARGS),
+	TEST_HW_ACCESS_FLAG(guest_test_write64),
+	TEST_HW_ACCESS_FLAG(guest_test_st_preidx),
+	TEST_HW_ACCESS_FLAG(guest_test_dc_zva),
+	TEST_HW_ACCESS_FLAG(guest_test_exec),
+
+	/* Accessing a hole shouldn't fault (more sanity checks). */
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_read64),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_ld_preidx),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_write64),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_at),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_dc_zva),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_st_preidx),
+
+	{ 0 },
+};
+
+static void for_each_test_and_guest_mode(
+		void (*func)(enum vm_guest_mode m, void *a),
+		enum vm_mem_backing_src_type src_type)
+{
+	struct test_desc *t;
+
+	for (t = &tests[0]; t->name; t++) {
+		if (t->skip)
+			continue;
+
+		struct test_params p = {
+			.src_type = src_type,
+			.test_desc = t,
+		};
+
+		for_each_guest_mode(run_test, &p);
+	}
+}
-- 
2.35.1.723.g4982287a31-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 07/11] KVM: selftests: aarch64: Add aarch64/page_fault_test
@ 2022-03-11  6:02   ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, axelrasmussen, Ricardo Koller

Add a new test for stage 2 faults when using different combinations of
guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
and types of faults (e.g., read on hugetlbfs with a hole). The next
commits will add different handling methods and more faults (e.g., uffd
and dirty logging). This first commit starts by adding two sanity checks
for all types of accesses: AF setting by the hw, and accessing memslots
with holes.

Note that this commit borrows some code from kvm-unit-tests: RET,
MOV_X0, and flush_tlb_page.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/aarch64/page_fault_test.c   | 667 ++++++++++++++++++
 2 files changed, 668 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/page_fault_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index bc5f89b3700e..6a192798b217 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -103,6 +103,7 @@ TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
+TEST_GEN_PROGS_aarch64 += aarch64/page_fault_test
 TEST_GEN_PROGS_aarch64 += aarch64/psci_cpu_on_test
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_irq
diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
new file mode 100644
index 000000000000..00477a4f10cb
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -0,0 +1,667 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * page_fault_test.c - Test stage 2 faults.
+ *
+ * This test tries different combinations of guest accesses (e.g., write,
+ * S1PTW), backing source type (e.g., anon) and types of faults (e.g., read on
+ * hugetlbfs with a hole). It checks that the expected handling method is
+ * called (e.g., uffd faults with the right address and write/read flag).
+ */
+
+#define _GNU_SOURCE
+#include <linux/bitmap.h>
+#include <fcntl.h>
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+#include "guest_modes.h"
+#include "userfaultfd_util.h"
+
+#define VCPU_ID					0
+
+#define TEST_MEM_SLOT_INDEX			1
+#define TEST_PT_SLOT_INDEX			2
+
+/* Max number of backing pages per guest page */
+#define BACKING_PG_PER_GUEST_PG			(64 / 4)
+
+/* Test memslot in backing source pages */
+#define TEST_MEMSLOT_BACKING_SRC_NPAGES		(1 * BACKING_PG_PER_GUEST_PG)
+
+/* PT memslot size in backing source pages */
+#define PT_MEMSLOT_BACKING_SRC_NPAGES		(4 * BACKING_PG_PER_GUEST_PG)
+
+/* Guest virtual addresses that point to the test page and its PTE. */
+#define GUEST_TEST_GVA				0xc0000000
+#define GUEST_TEST_EXEC_GVA			0xc0000008
+#define GUEST_TEST_PTE_GVA			0xd0000000
+
+/* Access flag */
+#define PTE_AF					(1ULL << 10)
+
+/* Access flag update enable/disable */
+#define TCR_EL1_HA				(1ULL << 39)
+
+#define CMD_SKIP_TEST				(-1LL)
+#define CMD_HOLE_PT				(1ULL << 2)
+#define CMD_HOLE_TEST				(1ULL << 3)
+
+#define PREPARE_FN_NR				10
+#define CHECK_FN_NR				10
+
+static const uint64_t test_gva = GUEST_TEST_GVA;
+static const uint64_t test_exec_gva = GUEST_TEST_EXEC_GVA;
+static const uint64_t pte_gva = GUEST_TEST_PTE_GVA;
+uint64_t pte_gpa;
+
+enum { PT, TEST, NR_MEMSLOTS};
+
+struct memslot_desc {
+	void *hva;
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t guest_pages;
+	uint64_t backing_pages;
+	enum vm_mem_backing_src_type src_type;
+	uint32_t idx;
+} memslot[NR_MEMSLOTS] = {
+	{
+		.idx = TEST_PT_SLOT_INDEX,
+		.backing_pages = PT_MEMSLOT_BACKING_SRC_NPAGES,
+	},
+	{
+		.idx = TEST_MEM_SLOT_INDEX,
+		.backing_pages = TEST_MEMSLOT_BACKING_SRC_NPAGES,
+	},
+};
+
+static struct event_cnt {
+	int aborts;
+	int fail_vcpu_runs;
+} events;
+
+struct test_desc {
+	const char *name;
+	uint64_t mem_mark_cmd;
+	/* Skip the test if any prepare function returns false */
+	bool (*guest_prepare[PREPARE_FN_NR])(void);
+	void (*guest_test)(void);
+	void (*guest_test_check[CHECK_FN_NR])(void);
+	void (*dabt_handler)(struct ex_regs *regs);
+	void (*iabt_handler)(struct ex_regs *regs);
+	uint32_t pt_memslot_flags;
+	uint32_t test_memslot_flags;
+	void (*guest_pre_run)(struct kvm_vm *vm);
+	bool skip;
+	struct event_cnt expected_events;
+};
+
+struct test_params {
+	enum vm_mem_backing_src_type src_type;
+	struct test_desc *test_desc;
+};
+
+
+static inline void flush_tlb_page(uint64_t vaddr)
+{
+	uint64_t page = vaddr >> 12;
+
+	dsb(ishst);
+	asm("tlbi vaae1is, %0" :: "r" (page));
+	dsb(ish);
+	isb();
+}
+
+#define RET			0xd65f03c0
+#define MOV_X0(x)		(0xd2800000 | (((x) & 0xffff) << 5))
+
+static void guest_test_nop(void)
+{}
+
+static void guest_test_write64(void)
+{
+	uint64_t val;
+
+	WRITE_ONCE(*((uint64_t *)test_gva), 0x0123456789ABCDEF);
+	val = READ_ONCE(*(uint64_t *)test_gva);
+	GUEST_ASSERT_EQ(val, 0x0123456789ABCDEF);
+}
+
+/* Check the system for atomic instructions. */
+static bool guest_check_lse(void)
+{
+	uint64_t isar0 = read_sysreg(id_aa64isar0_el1);
+	uint64_t atomic = (isar0 >> 20) & 7;
+
+	return atomic >= 2;
+}
+
+/* Compare and swap instruction. */
+static void guest_test_cas(void)
+{
+	uint64_t val;
+	uint64_t addr = test_gva;
+
+	GUEST_ASSERT_EQ(guest_check_lse(), 1);
+	asm volatile(".arch_extension lse\n"
+		     "casal %0, %1, [%2]\n"
+			:: "r" (0), "r" (0x0123456789ABCDEF), "r" (addr));
+	val = READ_ONCE(*(uint64_t *)(addr));
+	GUEST_ASSERT_EQ(val, 0x0123456789ABCDEF);
+}
+
+static void guest_test_read64(void)
+{
+	uint64_t val;
+
+	val = READ_ONCE(*(uint64_t *)test_gva);
+	GUEST_ASSERT_EQ(val, 0);
+}
+
+/* Address translation instruction */
+static void guest_test_at(void)
+{
+	uint64_t par;
+	uint64_t addr = 0;
+
+	asm volatile("at s1e1r, %0" :: "r" (test_gva));
+	par = read_sysreg(par_el1);
+
+	/* Bit 0 (PAR_EL1.F) is clear when the AT was successful */
+	GUEST_ASSERT_EQ(par & 1, 0);
+	/* The PA in bits [51:12] */
+	addr = par & (((1ULL << 40) - 1) << 12);
+	GUEST_ASSERT_EQ(addr, memslot[TEST].gpa);
+}
+
+static void guest_test_dc_zva(void)
+{
+	/* The smallest guaranteed block size (bs) is a word. */
+	uint16_t val;
+
+	asm volatile("dc zva, %0\n"
+			"dsb ish\n"
+			:: "r" (test_gva));
+	val = READ_ONCE(*(uint16_t *)test_gva);
+	GUEST_ASSERT_EQ(val, 0);
+}
+
+static void guest_test_ld_preidx(void)
+{
+	uint64_t val;
+	uint64_t addr = test_gva - 8;
+
+	/*
+	 * This ends up accessing "test_gva + 8 - 8", where "test_gva - 8"
+	 * is not backed by a memslot.
+	 */
+	asm volatile("ldr %0, [%1, #8]!"
+			: "=r" (val), "+r" (addr));
+	GUEST_ASSERT_EQ(val, 0);
+	GUEST_ASSERT_EQ(addr, test_gva);
+}
+
+static void guest_test_st_preidx(void)
+{
+	uint64_t val = 0x0123456789ABCDEF;
+	uint64_t addr = test_gva - 8;
+
+	asm volatile("str %0, [%1, #8]!"
+			: "+r" (val), "+r" (addr));
+
+	GUEST_ASSERT_EQ(addr, test_gva);
+	val = READ_ONCE(*(uint64_t *)test_gva);
+}
+
+static bool guest_set_ha(void)
+{
+	uint64_t mmfr1 = read_sysreg(id_aa64mmfr1_el1);
+	uint64_t hadbs = mmfr1 & 0xf;
+	uint64_t tcr;
+
+	/* Skip if HA is not supported. */
+	if (hadbs == 0)
+		return false;
+
+	tcr = read_sysreg(tcr_el1) | TCR_EL1_HA;
+	write_sysreg(tcr, tcr_el1);
+	isb();
+
+	return true;
+}
+
+static bool guest_clear_pte_af(void)
+{
+	*((uint64_t *)pte_gva) &= ~PTE_AF;
+	flush_tlb_page(pte_gva);
+
+	return true;
+}
+
+static void guest_check_pte_af(void)
+{
+	flush_tlb_page(pte_gva);
+	GUEST_ASSERT_EQ(*((uint64_t *)pte_gva) & PTE_AF, PTE_AF);
+}
+
+static void guest_test_exec(void)
+{
+	int (*code)(void) = (int (*)(void))test_exec_gva;
+	int ret;
+
+	ret = code();
+	GUEST_ASSERT_EQ(ret, 0x77);
+}
+
+static bool guest_prepare(struct test_desc *test)
+{
+	bool (*prepare_fn)(void);
+	int i;
+
+	for (i = 0; i < PREPARE_FN_NR; i++) {
+		prepare_fn = test->guest_prepare[i];
+		if (prepare_fn && !prepare_fn())
+			return false;
+	}
+
+	return true;
+}
+
+static void guest_test_check(struct test_desc *test)
+{
+	void (*check_fn)(void);
+	int i;
+
+	for (i = 0; i < CHECK_FN_NR; i++) {
+		check_fn = test->guest_test_check[i];
+		if (!check_fn)
+			continue;
+		check_fn();
+	}
+}
+
+static void guest_code(struct test_desc *test)
+{
+	if (!test->guest_test)
+		test->guest_test = guest_test_nop;
+
+	if (!guest_prepare(test))
+		GUEST_SYNC(CMD_SKIP_TEST);
+
+	GUEST_SYNC(test->mem_mark_cmd);
+	test->guest_test();
+
+	guest_test_check(test);
+	GUEST_DONE();
+}
+
+static void no_dabt_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_1(false, read_sysreg(far_el1));
+}
+
+static void no_iabt_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_1(false, regs->pc);
+}
+
+static void punch_hole_in_memslot(struct kvm_vm *vm,
+		struct memslot_desc *memslot)
+{
+	int ret, fd;
+	void *hva;
+
+	fd = vm_mem_region_get_src_fd(vm, memslot->idx);
+	if (fd != -1) {
+		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				0, memslot->size);
+		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
+	} else {
+		hva = addr_gpa2hva(vm, memslot->gpa);
+		ret = madvise(hva, memslot->size, MADV_DONTNEED);
+		TEST_ASSERT(ret == 0, "madvise failed, errno: %d\n", errno);
+	}
+}
+
+static void handle_cmd(struct kvm_vm *vm, int cmd)
+{
+	if (cmd & CMD_HOLE_PT)
+		punch_hole_in_memslot(vm, &memslot[PT]);
+	if (cmd & CMD_HOLE_TEST)
+		punch_hole_in_memslot(vm, &memslot[TEST]);
+}
+
+static void sync_stats_from_guest(struct kvm_vm *vm)
+{
+	struct event_cnt *ec = addr_gva2hva(vm, (uint64_t)&events);
+
+	events.aborts += ec->aborts;
+}
+
+void fail_vcpu_run_no_handler(int ret)
+{
+	TEST_FAIL("Unexpected vcpu run failure\n");
+}
+
+static uint64_t get_total_guest_pages(enum vm_guest_mode mode,
+		struct test_params *p)
+{
+	uint64_t large_page_size = get_backing_src_pagesz(p->src_type);
+	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
+	uint64_t size;
+
+	size = PT_MEMSLOT_BACKING_SRC_NPAGES * large_page_size;
+	size += TEST_MEMSLOT_BACKING_SRC_NPAGES * large_page_size;
+
+	return size / guest_page_size;
+}
+
+static void load_exec_code_for_test(void)
+{
+	uint32_t *code;
+
+	/* Write this "code" into test_exec_gva */
+	assert(test_exec_gva - test_gva);
+	code = memslot[TEST].hva + 8;
+
+	code[0] = MOV_X0(0x77);
+	code[1] = RET;
+}
+
+static void setup_guest_args(struct kvm_vm *vm, struct test_desc *test)
+{
+	vm_vaddr_t test_desc_gva;
+
+	test_desc_gva = vm_vaddr_alloc_page(vm);
+	memcpy(addr_gva2hva(vm, test_desc_gva), test,
+			sizeof(struct test_desc));
+	vcpu_args_set(vm, 0, 1, test_desc_gva);
+}
+
+static void setup_abort_handlers(struct kvm_vm *vm, struct test_desc *test)
+{
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vm, VCPU_ID);
+	if (!test->dabt_handler)
+		test->dabt_handler = no_dabt_handler;
+	if (!test->iabt_handler)
+		test->iabt_handler = no_iabt_handler;
+	vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT,
+			0x25, test->dabt_handler);
+	vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT,
+			0x21, test->iabt_handler);
+}
+
+static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
+		struct test_params *p)
+{
+	uint64_t large_page_size = get_backing_src_pagesz(p->src_type);
+	uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
+	struct test_desc *test = p->test_desc;
+	uint64_t hole_gpa;
+	uint64_t alignment;
+	int i;
+
+	/* Calculate the test and PT memslot sizes */
+	for (i = 0; i < NR_MEMSLOTS; i++) {
+		memslot[i].size = large_page_size * memslot[i].backing_pages;
+		memslot[i].guest_pages = memslot[i].size / guest_page_size;
+		memslot[i].src_type = p->src_type;
+	}
+
+	TEST_ASSERT(memslot[TEST].size >= guest_page_size,
+			"The test memslot should have space for one guest page.\n");
+	TEST_ASSERT(memslot[PT].size >= (4 * guest_page_size),
+			"The PT memslot should have space for 4 guest pages.\n");
+
+	/* Place the memslots GPAs at the end of physical memory */
+	alignment = max(large_page_size, guest_page_size);
+	memslot[TEST].gpa = (vm_get_max_gfn(vm) - memslot[TEST].guest_pages) *
+		guest_page_size;
+	memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
+	/* Add a 1-guest_page-hole between the two memslots */
+	hole_gpa = memslot[TEST].gpa - guest_page_size;
+	virt_pg_map(vm, test_gva - guest_page_size, hole_gpa);
+	memslot[PT].gpa = hole_gpa - (memslot[PT].guest_pages *
+			guest_page_size);
+	memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
+
+	/* Create memslots for the page tables and the test data. */
+	vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
+			memslot[PT].idx, memslot[PT].guest_pages,
+			test->pt_memslot_flags);
+	vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
+			memslot[TEST].idx, memslot[TEST].guest_pages,
+			test->test_memslot_flags);
+
+	for (i = 0; i < NR_MEMSLOTS; i++)
+		memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
+
+	/* Map test_gva using the PT memslot. */
+	_virt_pg_map(vm, test_gva, memslot[TEST].gpa,
+			4 /* NORMAL (See DEFAULT_MAIR_EL1) */,
+			TEST_PT_SLOT_INDEX);
+
+	/*
+	 * Find the PTE of the test page and map it in the guest so it can
+	 * clear the AF.
+	 */
+	pte_gpa = vm_get_pte_gpa(vm, test_gva);
+	TEST_ASSERT(memslot[PT].gpa <= pte_gpa &&
+			pte_gpa < (memslot[PT].gpa + memslot[PT].size),
+			"The PTE should be in the PT memslot.");
+	/* This is an arbitrary requirement just to make things simpler. */
+	TEST_ASSERT(pte_gpa % guest_page_size == 0,
+			"The pte_gpa (%p) should be aligned to the guest page (%lx).",
+			(void *)pte_gpa, guest_page_size);
+	virt_pg_map(vm, pte_gva, pte_gpa);
+}
+
+static void check_event_counts(struct test_desc *test)
+{
+	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
+}
+
+static void print_test_banner(enum vm_guest_mode mode, struct test_params *p)
+{
+	struct test_desc *test = p->test_desc;
+
+	pr_debug("Test: %s\n", test->name);
+	pr_debug("Testing guest mode: %s\n", vm_guest_mode_string(mode));
+	pr_debug("Testing memory backing src type: %s\n",
+			vm_mem_backing_src_alias(p->src_type)->name);
+}
+
+static void reset_event_counts(void)
+{
+	memset(&events, 0, sizeof(events));
+}
+
+static bool vcpu_run_loop(struct kvm_vm *vm, struct test_desc *test)
+{
+	bool skip_test = false;
+	struct ucall uc;
+	int stage;
+
+	for (stage = 0; ; stage++) {
+		vcpu_run(vm, VCPU_ID);
+
+		switch (get_ucall(vm, VCPU_ID, &uc)) {
+		case UCALL_SYNC:
+			if (uc.args[1] == CMD_SKIP_TEST) {
+				pr_debug("Skipped.\n");
+				skip_test = true;
+				goto done;
+			}
+			handle_cmd(vm, uc.args[1]);
+			break;
+		case UCALL_ABORT:
+			TEST_FAIL("%s at %s:%ld\n\tvalues: %#lx, %#lx",
+				(const char *)uc.args[0],
+				__FILE__, uc.args[1], uc.args[2], uc.args[3]);
+			break;
+		case UCALL_DONE:
+			pr_debug("Done.\n");
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	return skip_test;
+}
+
+static void run_test(enum vm_guest_mode mode, void *arg)
+{
+	struct test_params *p = (struct test_params *)arg;
+	struct test_desc *test = p->test_desc;
+	struct kvm_vm *vm;
+	bool skip_test = false;
+
+	print_test_banner(mode, p);
+
+	vm = vm_create_with_vcpus(mode, 1, DEFAULT_GUEST_PHY_PAGES,
+			get_total_guest_pages(mode, p), 0, guest_code, NULL);
+	ucall_init(vm, NULL);
+
+	reset_event_counts();
+	setup_memslots(vm, mode, p);
+
+	load_exec_code_for_test();
+	setup_abort_handlers(vm, test);
+	setup_guest_args(vm, test);
+
+	if (test->guest_pre_run)
+		test->guest_pre_run(vm);
+
+	sync_global_to_guest(vm, memslot);
+
+	skip_test = vcpu_run_loop(vm, test);
+
+	sync_stats_from_guest(vm);
+	ucall_uninit(vm);
+	kvm_vm_free(vm);
+
+	if (!skip_test)
+		check_event_counts(test);
+}
+
+static void for_each_test_and_guest_mode(void (*func)(enum vm_guest_mode, void *),
+		enum vm_mem_backing_src_type src_type);
+
+static void help(char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-m mode] [-s mem-type]\n", name);
+	puts("");
+	guest_modes_help();
+	backing_src_help("-s");
+	puts("");
+}
+
+int main(int argc, char *argv[])
+{
+	enum vm_mem_backing_src_type src_type;
+	int opt;
+
+	setbuf(stdout, NULL);
+
+	src_type = DEFAULT_VM_MEM_SRC;
+
+	guest_modes_append_default();
+
+	while ((opt = getopt(argc, argv, "hm:s:")) != -1) {
+		switch (opt) {
+		case 'm':
+			guest_modes_cmdline(optarg);
+			break;
+		case 's':
+			src_type = parse_backing_src_type(optarg);
+			break;
+		case 'h':
+		default:
+			help(argv[0]);
+			exit(0);
+		}
+	}
+
+	for_each_test_and_guest_mode(run_test, src_type);
+	return 0;
+}
+
+#define SNAME(s)		#s
+#define SCAT(a, b)		SNAME(a ## _ ## b)
+
+#define TEST_BASIC_ACCESS(__a, ...)						\
+{										\
+	.name			= SNAME(BASIC_ACCESS ## _ ## __a),		\
+	.guest_test		= __a,						\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+#define __AF_TEST_ARGS								\
+	.guest_prepare		= { guest_set_ha, guest_clear_pte_af, },	\
+	.guest_test_check	= { guest_check_pte_af, },			\
+
+#define __AF_LSE_TEST_ARGS							\
+	.guest_prepare		= { guest_set_ha, guest_clear_pte_af,		\
+				    guest_check_lse, },				\
+	.guest_test_check	= { guest_check_pte_af, },			\
+
+#define __PREPARE_LSE_TEST_ARGS							\
+	.guest_prepare		= { guest_check_lse, },
+
+#define TEST_HW_ACCESS_FLAG(__a)						\
+	TEST_BASIC_ACCESS(__a, __AF_TEST_ARGS)
+
+#define TEST_ACCESS_ON_HOLE_NO_FAULTS(__a, ...)					\
+{										\
+	.name			= SNAME(ACCESS_ON_HOLE_NO_FAULTS ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_TEST,				\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+static struct test_desc tests[] = {
+	/* Check that HW is setting the AF (sanity checks). */
+	TEST_HW_ACCESS_FLAG(guest_test_read64),
+	TEST_HW_ACCESS_FLAG(guest_test_ld_preidx),
+	TEST_BASIC_ACCESS(guest_test_cas, __AF_LSE_TEST_ARGS),
+	TEST_HW_ACCESS_FLAG(guest_test_write64),
+	TEST_HW_ACCESS_FLAG(guest_test_st_preidx),
+	TEST_HW_ACCESS_FLAG(guest_test_dc_zva),
+	TEST_HW_ACCESS_FLAG(guest_test_exec),
+
+	/* Accessing a hole shouldn't fault (more sanity checks). */
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_read64),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_ld_preidx),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_write64),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_at),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_dc_zva),
+	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_st_preidx),
+
+	{ 0 },
+};
+
+static void for_each_test_and_guest_mode(
+		void (*func)(enum vm_guest_mode m, void *a),
+		enum vm_mem_backing_src_type src_type)
+{
+	struct test_desc *t;
+
+	for (t = &tests[0]; t->name; t++) {
+		if (t->skip)
+			continue;
+
+		struct test_params p = {
+			.src_type = src_type,
+			.test_desc = t,
+		};
+
+		for_each_guest_mode(run_test, &p);
+	}
+}
-- 
2.35.1.723.g4982287a31-goog
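
The guest-side checks in this patch (guest_check_lse, guest_set_ha, guest_test_at) come down to extracting ID register fields and decoding PAR_EL1. A minimal standalone sketch of that decoding follows, assuming the Arm ARM field positions (ID_AA64ISAR0_EL1.Atomic in bits [23:20], ID_AA64MMFR1_EL1.HAFDBS in bits [3:0], PAR_EL1.F in bit 0, output address in bits [51:12]). The helper names are hypothetical and are not part of the selftest library; the register values are passed in as plain integers so the logic can be exercised on any host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract a 4-bit ID register field that starts at bit `shift`. */
static inline uint64_t id_field(uint64_t reg, unsigned int shift)
{
	return (reg >> shift) & 0xf;
}

/* ID_AA64ISAR0_EL1.Atomic: a value of 2 or more means LSE atomics (e.g. CAS). */
static inline bool has_lse(uint64_t isar0)
{
	return id_field(isar0, 20) >= 2;
}

/* ID_AA64MMFR1_EL1.HAFDBS: non-zero means hardware updates of the Access flag. */
static inline bool has_hw_af(uint64_t mmfr1)
{
	return id_field(mmfr1, 0) != 0;
}

/* PAR_EL1.F (bit 0) is set when the address translation aborted. */
static inline bool par_failed(uint64_t par)
{
	return par & 1;
}

/* On success, PAR_EL1 carries the output address in bits [51:12]. */
static inline uint64_t par_to_pa(uint64_t par)
{
	assert(!par_failed(par));	/* the PA field is only valid when F == 0 */
	return par & (((1ULL << 40) - 1) << 12);
}
```

In the guest these helpers would be fed from read_sysreg(); here any 64-bit value works, for example has_lse(2ULL << 20) for a CPU that reports LSE.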


^ permalink raw reply related	[flat|nested] 42+ messages in thread
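
The address arithmetic in setup_memslots() above can be sketched on its own: the test memslot goes at the top of guest physical memory, one unbacked guest page sits right below it, and the page-table memslot goes below that, with both slots aligned down to the larger of the guest and backing page sizes. This is only an illustration under those assumptions; struct layout and place_memslots() are hypothetical names, and the sketch assumes max_gfn is large enough that the subtractions do not wrap.

```c
#include <assert.h>
#include <stdint.h>

struct layout {
	uint64_t test_gpa;	/* start of the test-data memslot */
	uint64_t hole_gpa;	/* one guest page below it, backed by no memslot */
	uint64_t pt_gpa;	/* start of the page-table memslot */
};

/* Round x down to a multiple of a; a must be a power of two. */
static uint64_t align_down_pow2(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

static struct layout place_memslots(uint64_t max_gfn, uint64_t guest_page_size,
				    uint64_t backing_page_size,
				    uint64_t test_pages, uint64_t pt_pages)
{
	uint64_t alignment = backing_page_size > guest_page_size ?
			     backing_page_size : guest_page_size;
	struct layout l;

	l.test_gpa = align_down_pow2((max_gfn - test_pages) * guest_page_size,
				     alignment);
	l.hole_gpa = l.test_gpa - guest_page_size;
	l.pt_gpa = align_down_pow2(l.hole_gpa - pt_pages * guest_page_size,
				   alignment);
	return l;
}
```

Because of the align-down steps, the gap between the two memslots can end up larger than a single guest page; only the page directly below test_gpa is mapped in the guest as the hole.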

* [PATCH 08/11] KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add some userfaultfd tests into page_fault_test. Punch holes into the
data and/or page-table memslots, perform some accesses, and check that
the faults are taken (or not taken) when expected.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/aarch64/page_fault_test.c   | 232 +++++++++++++++++-
 1 file changed, 229 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index 00477a4f10cb..99449eaddb2b 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -57,6 +57,8 @@ uint64_t pte_gpa;
 enum { PT, TEST, NR_MEMSLOTS};
 
 struct memslot_desc {
+	size_t paging_size;
+	char *data_copy;
 	void *hva;
 	uint64_t gpa;
 	uint64_t size;
@@ -78,6 +80,9 @@ struct memslot_desc {
 static struct event_cnt {
 	int aborts;
 	int fail_vcpu_runs;
+	int uffd_faults;
+	/* uffd_faults is incremented from multiple threads. */
+	pthread_mutex_t uffd_faults_mutex;
 } events;
 
 struct test_desc {
@@ -87,6 +92,8 @@ struct test_desc {
 	bool (*guest_prepare[PREPARE_FN_NR])(void);
 	void (*guest_test)(void);
 	void (*guest_test_check[CHECK_FN_NR])(void);
+	int (*uffd_pt_handler)(int mode, int uffd, struct uffd_msg *msg);
+	int (*uffd_test_handler)(int mode, int uffd, struct uffd_msg *msg);
 	void (*dabt_handler)(struct ex_regs *regs);
 	void (*iabt_handler)(struct ex_regs *regs);
 	uint32_t pt_memslot_flags;
@@ -305,6 +312,56 @@ static void no_iabt_handler(struct ex_regs *regs)
 	GUEST_ASSERT_1(false, regs->pc);
 }
 
+static int uffd_generic_handler(int uffd_mode, int uffd,
+		struct uffd_msg *msg, struct memslot_desc *memslot,
+		bool expect_write)
+{
+	uint64_t addr = msg->arg.pagefault.address;
+	uint64_t flags = msg->arg.pagefault.flags;
+	struct uffdio_copy copy;
+	int ret;
+
+	TEST_ASSERT(uffd_mode == UFFDIO_REGISTER_MODE_MISSING,
+			"The only expected UFFD mode is MISSING");
+	ASSERT_EQ(!!(flags & UFFD_PAGEFAULT_FLAG_WRITE), expect_write);
+	ASSERT_EQ(addr, (uint64_t)memslot->hva);
+
+	pr_debug("uffd fault: addr=%p write=%d\n",
+			(void *)addr, !!(flags & UFFD_PAGEFAULT_FLAG_WRITE));
+
+	copy.src = (uint64_t)memslot->data_copy;
+	copy.dst = addr;
+	copy.len = memslot->paging_size;
+	copy.mode = 0;
+
+	ret = ioctl(uffd, UFFDIO_COPY, &copy);
+	if (ret == -1) {
+		pr_info("Failed UFFDIO_COPY in 0x%lx with errno: %d\n",
+				addr, errno);
+		return ret;
+	}
+
+	pthread_mutex_lock(&events.uffd_faults_mutex);
+	events.uffd_faults += 1;
+	pthread_mutex_unlock(&events.uffd_faults_mutex);
+	return 0;
+}
+
+static int uffd_pt_write_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[PT], true);
+}
+
+static int uffd_test_write_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], true);
+}
+
+static int uffd_test_read_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], false);
+}
+
 static void punch_hole_in_memslot(struct kvm_vm *vm,
 		struct memslot_desc *memslot)
 {
@@ -314,11 +371,11 @@ static void punch_hole_in_memslot(struct kvm_vm *vm,
 	fd = vm_mem_region_get_src_fd(vm, memslot->idx);
 	if (fd != -1) {
 		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-				0, memslot->size);
+				0, memslot->paging_size);
 		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
 	} else {
 		hva = addr_gpa2hva(vm, memslot->gpa);
-		ret = madvise(hva, memslot->size, MADV_DONTNEED);
+		ret = madvise(hva, memslot->paging_size, MADV_DONTNEED);
 		TEST_ASSERT(ret == 0, "madvise failed, errno: %d\n", errno);
 	}
 }
@@ -457,9 +514,60 @@ static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
 	virt_pg_map(vm, pte_gva, pte_gpa);
 }
 
+static void setup_uffd(enum vm_guest_mode mode, struct test_params *p,
+		struct uffd_desc **uffd)
+{
+	struct test_desc *test = p->test_desc;
+	uint64_t large_page_size = get_backing_src_pagesz(p->src_type);
+	int i;
+
+	/*
+	 * When creating the map, we might not only have created a pte page,
+	 * but also an intermediate level (pte_gpa != gpa[PT]). So, we
+	 * might need to demand page both.
+	 */
+	memslot[PT].paging_size = align_up(pte_gpa - memslot[PT].gpa,
+			large_page_size) + large_page_size;
+	memslot[TEST].paging_size = large_page_size;
+
+	for (i = 0; i < NR_MEMSLOTS; i++) {
+		memslot[i].data_copy = malloc(memslot[i].paging_size);
+		TEST_ASSERT(memslot[i].data_copy, "Failed malloc.");
+		memcpy(memslot[i].data_copy, memslot[i].hva,
+				memslot[i].paging_size);
+	}
+
+	uffd[PT] = NULL;
+	if (test->uffd_pt_handler)
+		uffd[PT] = uffd_setup_demand_paging(
+				UFFDIO_REGISTER_MODE_MISSING, 0,
+				memslot[PT].hva, memslot[PT].paging_size,
+				test->uffd_pt_handler);
+
+	uffd[TEST] = NULL;
+	if (test->uffd_test_handler)
+		uffd[TEST] = uffd_setup_demand_paging(
+				UFFDIO_REGISTER_MODE_MISSING, 0,
+				memslot[TEST].hva, memslot[TEST].paging_size,
+				test->uffd_test_handler);
+}
+
 static void check_event_counts(struct test_desc *test)
 {
 	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
+	ASSERT_EQ(test->expected_events.uffd_faults, events.uffd_faults);
+}
+
+static void free_uffd(struct test_desc *test, struct uffd_desc **uffd)
+{
+	int i;
+
+	if (test->uffd_pt_handler)
+		uffd_stop_demand_paging(uffd[PT]);
+	if (test->uffd_test_handler)
+		uffd_stop_demand_paging(uffd[TEST]);
+	for (i = 0; i < NR_MEMSLOTS; i++)
+		free(memslot[i].data_copy);
 }
 
 static void print_test_banner(enum vm_guest_mode mode, struct test_params *p)
@@ -517,6 +625,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct test_params *p = (struct test_params *)arg;
 	struct test_desc *test = p->test_desc;
 	struct kvm_vm *vm;
+	struct uffd_desc *uffd[NR_MEMSLOTS];
 	bool skip_test = false;
 
 	print_test_banner(mode, p);
@@ -528,7 +637,14 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	reset_event_counts();
 	setup_memslots(vm, mode, p);
 
+	/*
+	 * Set some code at memslot[TEST].hva for the guest to execute (only
+	 * applicable to the EXEC tests). This has to be done before
+	 * setup_uffd() as that function copies the memslot data for the uffd
+	 * handler.
+	 */
 	load_exec_code_for_test();
+	setup_uffd(mode, p, uffd);
 	setup_abort_handlers(vm, test);
 	setup_guest_args(vm, test);
 
@@ -542,7 +658,12 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	sync_stats_from_guest(vm);
 	ucall_uninit(vm);
 	kvm_vm_free(vm);
+	free_uffd(test, uffd);
 
+	/*
+	 * Make sure this is called after the uffd threads have exited (and
+	 * updated their respective event counters).
+	 */
 	if (!skip_test)
 		check_event_counts(test);
 }
@@ -625,6 +746,43 @@ int main(int argc, char *argv[])
 	__VA_ARGS__								\
 }
 
+#define TEST_ACCESS_ON_HOLE_UFFD(__a, __uffd_handler, ...)			\
+{										\
+	.name			= SNAME(ACCESS_ON_HOLE_UFFD ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_TEST,				\
+	.uffd_test_handler	= __uffd_handler,				\
+	.expected_events	= { .uffd_faults = 1, },			\
+	__VA_ARGS__								\
+}
+
+#define TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler, ...)			\
+{										\
+	.name			= SNAME(S1PTW_ON_HOLE_UFFD ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_PT,					\
+	.uffd_pt_handler	= __uffd_handler,				\
+	.expected_events	= { .uffd_faults = 1, },			\
+	__VA_ARGS__								\
+}
+
+#define TEST_S1PTW_ON_HOLE_UFFD_AF(__a, __uffd_handler)				\
+	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler, __AF_TEST_ARGS)
+
+#define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, ...)		\
+{										\
+	.name			= SNAME(ACCESS_S1PTW_ON_HOLE_UFFD ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_PT | CMD_HOLE_TEST,			\
+	.uffd_pt_handler	= __ph,						\
+	.uffd_test_handler	= __th,						\
+	.expected_events	= { .uffd_faults = 2, },			\
+	__VA_ARGS__								\
+}
+
+#define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(__a, __th, __ph)			\
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, __AF_TEST_ARGS)
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -640,10 +798,78 @@ static struct test_desc tests[] = {
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_ld_preidx),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_write64),
-	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_at),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_dc_zva),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_st_preidx),
 
+	/* UFFD basic (sanity checks) */
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_read64, uffd_test_read_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_cas, uffd_test_read_handler,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_ld_preidx, uffd_test_read_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_write64, uffd_test_write_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_st_preidx, uffd_test_write_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_dc_zva, uffd_test_write_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_exec, uffd_test_read_handler),
+
+	/* UFFD fault due to S1PTW. Note how they are all write faults. */
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_read64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_cas, uffd_pt_write_handler,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_at, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_ld_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_write64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_dc_zva, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_st_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_exec, uffd_pt_write_handler),
+
+	/* UFFD fault due to S1PTW with AF. Note how they are all write faults. */
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_read64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_cas, uffd_pt_write_handler,
+			__AF_LSE_TEST_ARGS),
+	/*
+	 * Can't test the AF case for address translation insts (D5.4.11) as
+	 * it's IMPDEF whether that marks the AF.
+	 */
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_ld_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_write64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_st_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_dc_zva, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec, uffd_pt_write_handler),
+
+	/* UFFD faults due to an access and its S1PTW. */
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_read64,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_cas,
+			uffd_test_read_handler, uffd_pt_write_handler,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_ld_preidx,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_write64,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_dc_zva,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_st_preidx,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_exec,
+			uffd_test_read_handler, uffd_pt_write_handler),
+
+	/* UFFD faults due to an access and its S1PTW with AF. */
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_read64,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_cas,
+			uffd_test_read_handler, uffd_pt_write_handler,
+			__AF_LSE_TEST_ARGS),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_ld_preidx,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_write64,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_dc_zva,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_st_preidx,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec,
+			uffd_test_read_handler, uffd_pt_write_handler),
+
 	{ 0 },
 };
 
-- 
2.35.1.723.g4982287a31-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread
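
The test table in this patch relies on a small macro pattern: each entry is a designated initializer whose name is built by token pasting and stringification, extra fields are appended through __VA_ARGS__, and a zeroed entry terminates the array. A reduced sketch of the pattern follows; struct test_desc here is a cut-down stand-in, and TEST_ON_HOLE, count_tests() and find_test() are hypothetical names, not the ones from page_fault_test.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SNAME(s)	#s

struct test_desc {
	const char *name;
	void (*guest_test)(void);
	int expected_faults;
};

static void guest_read(void) {}
static void guest_write(void) {}

/* Paste the prefix and the function name, then stringify the result. */
#define TEST_ON_HOLE(fn, ...)				\
{							\
	.name		= SNAME(ON_HOLE ## _ ## fn),	\
	.guest_test	= fn,				\
	__VA_ARGS__					\
}

static struct test_desc tests[] = {
	TEST_ON_HOLE(guest_read),
	TEST_ON_HOLE(guest_write, .expected_faults = 1),
	{ 0 },	/* sentinel: a NULL name ends the table */
};

/* Walk the table until the sentinel, as for_each_test_and_guest_mode() does. */
static int count_tests(const struct test_desc *t)
{
	int n = 0;

	for (; t->name; t++)
		n++;
	return n;
}

static const struct test_desc *find_test(const struct test_desc *t,
					 const char *name)
{
	assert(name);
	for (; t->name; t++)
		if (strcmp(t->name, name) == 0)
			return t;
	return NULL;
}
```

Fields that an entry does not mention are zero-initialized, so a test only has to list what differs from the default. Invoking the macro with no variadic arguments, as the first entry does, is accepted by GCC and Clang in their default modes and is standard from C23 on.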

* [PATCH 08/11] KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test
@ 2022-03-11  6:02   ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, axelrasmussen, Ricardo Koller

Add some userfaultfd tests into page_fault_test. Punch holes into the
data and/or page-table memslots, perform some accesses, and check that
the faults are taken (or not taken) when expected.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 .../selftests/kvm/aarch64/page_fault_test.c   | 232 +++++++++++++++++-
 1 file changed, 229 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index 00477a4f10cb..99449eaddb2b 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -57,6 +57,8 @@ uint64_t pte_gpa;
 enum { PT, TEST, NR_MEMSLOTS};
 
 struct memslot_desc {
+	size_t paging_size;
+	char *data_copy;
 	void *hva;
 	uint64_t gpa;
 	uint64_t size;
@@ -78,6 +80,9 @@ struct memslot_desc {
 static struct event_cnt {
 	int aborts;
 	int fail_vcpu_runs;
+	int uffd_faults;
+	/* uffd_faults is incremented from multiple threads. */
+	pthread_mutex_t uffd_faults_mutex;
 } events;
 
 struct test_desc {
@@ -87,6 +92,8 @@ struct test_desc {
 	bool (*guest_prepare[PREPARE_FN_NR])(void);
 	void (*guest_test)(void);
 	void (*guest_test_check[CHECK_FN_NR])(void);
+	int (*uffd_pt_handler)(int mode, int uffd, struct uffd_msg *msg);
+	int (*uffd_test_handler)(int mode, int uffd, struct uffd_msg *msg);
 	void (*dabt_handler)(struct ex_regs *regs);
 	void (*iabt_handler)(struct ex_regs *regs);
 	uint32_t pt_memslot_flags;
@@ -305,6 +312,56 @@ static void no_iabt_handler(struct ex_regs *regs)
 	GUEST_ASSERT_1(false, regs->pc);
 }
 
+static int uffd_generic_handler(int uffd_mode, int uffd,
+		struct uffd_msg *msg, struct memslot_desc *memslot,
+		bool expect_write)
+{
+	uint64_t addr = msg->arg.pagefault.address;
+	uint64_t flags = msg->arg.pagefault.flags;
+	struct uffdio_copy copy;
+	int ret;
+
+	TEST_ASSERT(uffd_mode == UFFDIO_REGISTER_MODE_MISSING,
+			"The only expected UFFD mode is MISSING");
+	ASSERT_EQ(!!(flags & UFFD_PAGEFAULT_FLAG_WRITE), expect_write);
+	ASSERT_EQ(addr, (uint64_t)memslot->hva);
+
+	pr_debug("uffd fault: addr=%p write=%d\n",
+			(void *)addr, !!(flags & UFFD_PAGEFAULT_FLAG_WRITE));
+
+	copy.src = (uint64_t)memslot->data_copy;
+	copy.dst = addr;
+	copy.len = memslot->paging_size;
+	copy.mode = 0;
+
+	ret = ioctl(uffd, UFFDIO_COPY, &copy);
+	if (ret == -1) {
+		pr_info("Failed UFFDIO_COPY in 0x%lx with errno: %d\n",
+				addr, errno);
+		return ret;
+	}
+
+	pthread_mutex_lock(&events.uffd_faults_mutex);
+	events.uffd_faults += 1;
+	pthread_mutex_unlock(&events.uffd_faults_mutex);
+	return 0;
+}
+
+static int uffd_pt_write_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[PT], true);
+}
+
+static int uffd_test_write_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], true);
+}
+
+static int uffd_test_read_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], false);
+}
+
 static void punch_hole_in_memslot(struct kvm_vm *vm,
 		struct memslot_desc *memslot)
 {
@@ -314,11 +371,11 @@ static void punch_hole_in_memslot(struct kvm_vm *vm,
 	fd = vm_mem_region_get_src_fd(vm, memslot->idx);
 	if (fd != -1) {
 		ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-				0, memslot->size);
+				0, memslot->paging_size);
 		TEST_ASSERT(ret == 0, "fallocate failed, errno: %d\n", errno);
 	} else {
 		hva = addr_gpa2hva(vm, memslot->gpa);
-		ret = madvise(hva, memslot->size, MADV_DONTNEED);
+		ret = madvise(hva, memslot->paging_size, MADV_DONTNEED);
 		TEST_ASSERT(ret == 0, "madvise failed, errno: %d\n", errno);
 	}
 }
@@ -457,9 +514,60 @@ static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
 	virt_pg_map(vm, pte_gva, pte_gpa);
 }
 
+static void setup_uffd(enum vm_guest_mode mode, struct test_params *p,
+		struct uffd_desc **uffd)
+{
+	struct test_desc *test = p->test_desc;
+	uint64_t large_page_size = get_backing_src_pagesz(p->src_type);
+	int i;
+
+	/*
+	 * When creating the map, we might not only have created a pte page,
+	 * but also an intermediate level (pte_gpa != gpa[PT]). So, we
+	 * might need to demand page both.
+	 */
+	memslot[PT].paging_size = align_up(pte_gpa - memslot[PT].gpa,
+			large_page_size) + large_page_size;
+	memslot[TEST].paging_size = large_page_size;
+
+	for (i = 0; i < NR_MEMSLOTS; i++) {
+		memslot[i].data_copy = malloc(memslot[i].paging_size);
+		TEST_ASSERT(memslot[i].data_copy, "Failed malloc.");
+		memcpy(memslot[i].data_copy, memslot[i].hva,
+				memslot[i].paging_size);
+	}
+
+	uffd[PT] = NULL;
+	if (test->uffd_pt_handler)
+		uffd[PT] = uffd_setup_demand_paging(
+				UFFDIO_REGISTER_MODE_MISSING, 0,
+				memslot[PT].hva, memslot[PT].paging_size,
+				test->uffd_pt_handler);
+
+	uffd[TEST] = NULL;
+	if (test->uffd_test_handler)
+		uffd[TEST] = uffd_setup_demand_paging(
+				UFFDIO_REGISTER_MODE_MISSING, 0,
+				memslot[TEST].hva, memslot[TEST].paging_size,
+				test->uffd_test_handler);
+}
+
 static void check_event_counts(struct test_desc *test)
 {
 	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
+	ASSERT_EQ(test->expected_events.uffd_faults, events.uffd_faults);
+}
+
+static void free_uffd(struct test_desc *test, struct uffd_desc **uffd)
+{
+	int i;
+
+	if (test->uffd_pt_handler)
+		uffd_stop_demand_paging(uffd[PT]);
+	if (test->uffd_test_handler)
+		uffd_stop_demand_paging(uffd[TEST]);
+	for (i = 0; i < NR_MEMSLOTS; i++)
+		free(memslot[i].data_copy);
 }
 
 static void print_test_banner(enum vm_guest_mode mode, struct test_params *p)
@@ -517,6 +625,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct test_params *p = (struct test_params *)arg;
 	struct test_desc *test = p->test_desc;
 	struct kvm_vm *vm;
+	struct uffd_desc *uffd[NR_MEMSLOTS];
 	bool skip_test = false;
 
 	print_test_banner(mode, p);
@@ -528,7 +637,14 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	reset_event_counts();
 	setup_memslots(vm, mode, p);
 
+	/*
+	 * Set some code at memslot[TEST].hva for the guest to execute (only
+	 * applicable to the EXEC tests). This has to be done before
+	 * setup_uffd() as that function copies the memslot data for the uffd
+	 * handler.
+	 */
 	load_exec_code_for_test();
+	setup_uffd(mode, p, uffd);
 	setup_abort_handlers(vm, test);
 	setup_guest_args(vm, test);
 
@@ -542,7 +658,12 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	sync_stats_from_guest(vm);
 	ucall_uninit(vm);
 	kvm_vm_free(vm);
+	free_uffd(test, uffd);
 
+	/*
+	 * Make sure this is called after the uffd threads have exited (and
+	 * updated their respective event counters).
+	 */
 	if (!skip_test)
 		check_event_counts(test);
 }
@@ -625,6 +746,43 @@ int main(int argc, char *argv[])
 	__VA_ARGS__								\
 }
 
+#define TEST_ACCESS_ON_HOLE_UFFD(__a, __uffd_handler, ...)			\
+{										\
+	.name			= SNAME(ACCESS_ON_HOLE_UFFD ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_TEST,				\
+	.uffd_test_handler	= __uffd_handler,				\
+	.expected_events	= { .uffd_faults = 1, },			\
+	__VA_ARGS__								\
+}
+
+#define TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler, ...)			\
+{										\
+	.name			= SNAME(S1PTW_ON_HOLE_UFFD ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_PT,					\
+	.uffd_pt_handler	= __uffd_handler,				\
+	.expected_events	= { .uffd_faults = 1, },			\
+	__VA_ARGS__								\
+}
+
+#define TEST_S1PTW_ON_HOLE_UFFD_AF(__a, __uffd_handler)				\
+	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler, __AF_TEST_ARGS)
+
+#define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, ...)		\
+{										\
+	.name			= SNAME(ACCESS_S1PTW_ON_HOLE_UFFD ## _ ## __a),	\
+	.guest_test		= __a,						\
+	.mem_mark_cmd		= CMD_HOLE_PT | CMD_HOLE_TEST,			\
+	.uffd_pt_handler	= __ph,						\
+	.uffd_test_handler	= __th,						\
+	.expected_events	= { .uffd_faults = 2, },			\
+	__VA_ARGS__								\
+}
+
+#define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(__a, __th, __ph)			\
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, __AF_TEST_ARGS)
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -640,10 +798,78 @@ static struct test_desc tests[] = {
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_ld_preidx),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_write64),
-	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_at),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_dc_zva),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_st_preidx),
 
+	/* UFFD basic (sanity checks) */
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_read64, uffd_test_read_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_cas, uffd_test_read_handler,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_ld_preidx, uffd_test_read_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_write64, uffd_test_write_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_st_preidx, uffd_test_write_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_dc_zva, uffd_test_write_handler),
+	TEST_ACCESS_ON_HOLE_UFFD(guest_test_exec, uffd_test_read_handler),
+
+	/* UFFD fault due to S1PTW. Note how they are all write faults. */
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_read64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_cas, uffd_pt_write_handler,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_at, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_ld_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_write64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_dc_zva, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_st_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_exec, uffd_pt_write_handler),
+
+	/* UFFD fault due to S1PTW with AF. Note how they are all write faults. */
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_read64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD(guest_test_cas, uffd_pt_write_handler,
+			__AF_LSE_TEST_ARGS),
+	/*
+	 * Can't test the AF case for address translation insts (D5.4.11) as
+	 * it's IMPDEF whether that marks the AF.
+	 */
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_ld_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_write64, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_st_preidx, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_dc_zva, uffd_pt_write_handler),
+	TEST_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec, uffd_pt_write_handler),
+
+	/* UFFD faults due to an access and its S1PTW. */
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_read64,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_cas,
+			uffd_test_read_handler, uffd_pt_write_handler,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_ld_preidx,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_write64,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_dc_zva,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_st_preidx,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_exec,
+			uffd_test_read_handler, uffd_pt_write_handler),
+
+	/* UFFD faults due to an access and its S1PTW with AF. */
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_read64,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(guest_test_cas,
+			uffd_test_read_handler, uffd_pt_write_handler,
+			__AF_LSE_TEST_ARGS),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_ld_preidx,
+			uffd_test_read_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_write64,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_dc_zva,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_st_preidx,
+			uffd_test_write_handler, uffd_pt_write_handler),
+	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec,
+			uffd_test_read_handler, uffd_pt_write_handler),
+
 	{ 0 },
 };
 
-- 
2.35.1.723.g4982287a31-goog


* [PATCH 09/11] KVM: selftests: aarch64: Add dirty logging tests into page_fault_test
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add some dirty logging tests into page_fault_test. Mark the data and/or
page-table memslots for dirty logging, perform some accesses, and check
that the dirty log bits are set or clean when expected.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
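Note for readers: the dirty-log checks in this patch come down to fetching
the memslot's dirty bitmap and testing the bit for one page. A minimal
standalone sketch of that bookkeeping follows. It does not use KVM or the
selftest library: struct dirty_log and the dirty_log_*() helpers are
hypothetical names made up for illustration, and dirty_log_mark() only
models what the hypervisor would do on a guest write.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for a dirty log: one bit per page of a memslot. */
struct dirty_log {
	unsigned long *bmap;
	uint64_t nr_pages;
};

/* Allocate a zeroed bitmap covering size / page_size pages. */
static int dirty_log_init(struct dirty_log *log, uint64_t size,
		uint64_t page_size)
{
	uint64_t words;

	log->nr_pages = size / page_size;
	words = (log->nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD;
	log->bmap = calloc(words ? words : 1, sizeof(unsigned long));
	return log->bmap ? 0 : -1;
}

/* Models the hypervisor marking page pg_nr dirty after a guest write. */
static void dirty_log_mark(struct dirty_log *log, uint64_t pg_nr)
{
	if (pg_nr < log->nr_pages)
		log->bmap[pg_nr / BITS_PER_WORD] |=
			1UL << (pg_nr % BITS_PER_WORD);
}

/* Returns true if page pg_nr is marked dirty; out-of-range pages are clean. */
static bool dirty_log_test(const struct dirty_log *log, uint64_t pg_nr)
{
	if (pg_nr >= log->nr_pages)
		return false;
	return log->bmap[pg_nr / BITS_PER_WORD] &
		(1UL << (pg_nr % BITS_PER_WORD));
}

static void dirty_log_free(struct dirty_log *log)
{
	free(log->bmap);
	log->bmap = NULL;
}
```

With a 16-page log, marking page 0 makes dirty_log_test(&log, 0) true while
page 1 stays clean, which mirrors the set-or-clean expectation that the
tests assert on.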
 .../selftests/kvm/aarch64/page_fault_test.c   | 123 ++++++++++++++++++
 1 file changed, 123 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index 99449eaddb2b..b41da9317242 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -45,6 +45,11 @@
 #define CMD_SKIP_TEST				(-1LL)
 #define CMD_HOLE_PT				(1ULL << 2)
 #define CMD_HOLE_TEST				(1ULL << 3)
+#define CMD_RECREATE_PT_MEMSLOT_WR		(1ULL << 4)
+#define CMD_CHECK_WRITE_IN_DIRTY_LOG		(1ULL << 5)
+#define CMD_CHECK_S1PTW_WR_IN_DIRTY_LOG		(1ULL << 6)
+#define CMD_CHECK_NO_WRITE_IN_DIRTY_LOG		(1ULL << 7)
+#define CMD_SET_PTE_AF				(1ULL << 8)
 
 #define PREPARE_FN_NR				10
 #define CHECK_FN_NR				10
@@ -251,6 +256,21 @@ static void guest_check_pte_af(void)
 	GUEST_ASSERT_EQ(*((uint64_t *)pte_gva) & PTE_AF, PTE_AF);
 }
 
+static void guest_check_write_in_dirty_log(void)
+{
+	GUEST_SYNC(CMD_CHECK_WRITE_IN_DIRTY_LOG);
+}
+
+static void guest_check_no_write_in_dirty_log(void)
+{
+	GUEST_SYNC(CMD_CHECK_NO_WRITE_IN_DIRTY_LOG);
+}
+
+static void guest_check_s1ptw_wr_in_dirty_log(void)
+{
+	GUEST_SYNC(CMD_CHECK_S1PTW_WR_IN_DIRTY_LOG);
+}
+
 static void guest_test_exec(void)
 {
 	int (*code)(void) = (int (*)(void))test_exec_gva;
@@ -380,12 +400,34 @@ static void punch_hole_in_memslot(struct kvm_vm *vm,
 	}
 }
 
+static bool check_write_in_dirty_log(struct kvm_vm *vm,
+		struct memslot_desc *ms, uint64_t host_pg_nr)
+{
+	unsigned long *bmap;
+	bool first_page_dirty;
+
+	bmap = bitmap_zalloc(ms->size / getpagesize());
+	kvm_vm_get_dirty_log(vm, ms->idx, bmap);
+	first_page_dirty = test_bit(host_pg_nr, bmap);
+	free(bmap);
+	return first_page_dirty;
+}
+
 static void handle_cmd(struct kvm_vm *vm, int cmd)
 {
 	if (cmd & CMD_HOLE_PT)
 		punch_hole_in_memslot(vm, &memslot[PT]);
 	if (cmd & CMD_HOLE_TEST)
 		punch_hole_in_memslot(vm, &memslot[TEST]);
+	if (cmd & CMD_CHECK_WRITE_IN_DIRTY_LOG)
+		TEST_ASSERT(check_write_in_dirty_log(vm, &memslot[TEST], 0),
+				"Missing write in dirty log");
+	if (cmd & CMD_CHECK_NO_WRITE_IN_DIRTY_LOG)
+		TEST_ASSERT(!check_write_in_dirty_log(vm, &memslot[TEST], 0),
+				"Unexpected write in dirty log");
+	if (cmd & CMD_CHECK_S1PTW_WR_IN_DIRTY_LOG)
+		TEST_ASSERT(check_write_in_dirty_log(vm, &memslot[PT], 0),
+				"Missing s1ptw write in dirty log");
 }
 
 static void sync_stats_from_guest(struct kvm_vm *vm)
@@ -783,6 +825,56 @@ int main(int argc, char *argv[])
 #define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(__a, __th, __ph)			\
 	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, __AF_TEST_ARGS)
 
+#define	__TEST_ACCESS_DIRTY_LOG(__a, ...)					\
+{										\
+	.name			= SNAME(TEST_ACCESS_DIRTY_LOG ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_test 		= __a,						\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+#define __CHECK_WRITE_IN_DIRTY_LOG						\
+	.guest_test_check	= { guest_check_write_in_dirty_log, },
+
+#define __CHECK_NO_WRITE_IN_DIRTY_LOG						\
+	.guest_test_check	= { guest_check_no_write_in_dirty_log, },
+
+#define	TEST_WRITE_DIRTY_LOG(__a, ...)						\
+	__TEST_ACCESS_DIRTY_LOG(__a, __CHECK_WRITE_IN_DIRTY_LOG __VA_ARGS__)
+
+#define	TEST_NO_WRITE_DIRTY_LOG(__a, ...)					\
+	__TEST_ACCESS_DIRTY_LOG(__a, __CHECK_NO_WRITE_IN_DIRTY_LOG __VA_ARGS__)
+
+#define	__TEST_S1PTW_DIRTY_LOG(__a, ...)					\
+{										\
+	.name			= SNAME(S1PTW_AF_DIRTY_LOG ## _ ## __a),	\
+	.pt_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_test 		= __a,						\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+#define __CHECK_S1PTW_WR_IN_DIRTY_LOG						\
+	.guest_test_check	= { guest_check_s1ptw_wr_in_dirty_log, },
+
+#define	TEST_S1PTW_DIRTY_LOG(__a, ...)						\
+	__TEST_S1PTW_DIRTY_LOG(__a, __CHECK_S1PTW_WR_IN_DIRTY_LOG __VA_ARGS__)
+
+#define __AF_TEST_ARGS_FOR_DIRTY_LOG						\
+	.guest_prepare		= { guest_set_ha, guest_clear_pte_af, },	\
+	.guest_test_check       = { guest_check_s1ptw_wr_in_dirty_log, 		\
+				    guest_check_pte_af, },
+
+#define __AF_AND_LSE_ARGS_FOR_DIRTY_LOG						\
+	.guest_prepare		= { guest_set_ha, guest_clear_pte_af,		\
+				    guest_check_lse, },				\
+	.guest_test_check       = { guest_check_s1ptw_wr_in_dirty_log, 		\
+				    guest_check_pte_af, },
+
+#define	TEST_S1PTW_AF_DIRTY_LOG(__a, ...)					\
+	TEST_S1PTW_DIRTY_LOG(__a, __AF_TEST_ARGS_FOR_DIRTY_LOG)
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -793,6 +885,37 @@ static struct test_desc tests[] = {
 	TEST_HW_ACCESS_FLAG(guest_test_dc_zva),
 	TEST_HW_ACCESS_FLAG(guest_test_exec),
 
+	/* Dirty log basic checks. */
+	TEST_WRITE_DIRTY_LOG(guest_test_write64),
+	TEST_WRITE_DIRTY_LOG(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_DIRTY_LOG(guest_test_dc_zva),
+	TEST_WRITE_DIRTY_LOG(guest_test_st_preidx),
+	TEST_NO_WRITE_DIRTY_LOG(guest_test_read64),
+	TEST_NO_WRITE_DIRTY_LOG(guest_test_ld_preidx),
+	TEST_NO_WRITE_DIRTY_LOG(guest_test_at),
+	TEST_NO_WRITE_DIRTY_LOG(guest_test_exec),
+
+	/*
+	 * S1PTW on a PT (no AF) which is marked for dirty logging. Note that
+	 * this still shows up in the dirty log as a write.
+	 */
+	TEST_S1PTW_DIRTY_LOG(guest_test_write64),
+	TEST_S1PTW_DIRTY_LOG(guest_test_st_preidx),
+	TEST_S1PTW_DIRTY_LOG(guest_test_read64),
+	TEST_S1PTW_DIRTY_LOG(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_DIRTY_LOG(guest_test_ld_preidx),
+	TEST_S1PTW_DIRTY_LOG(guest_test_at),
+	TEST_S1PTW_DIRTY_LOG(guest_test_dc_zva),
+	TEST_S1PTW_DIRTY_LOG(guest_test_exec),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_write64),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_st_preidx),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_read64),
+	TEST_S1PTW_DIRTY_LOG(guest_test_cas, __AF_AND_LSE_ARGS_FOR_DIRTY_LOG),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_ld_preidx),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_at),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_dc_zva),
+	TEST_S1PTW_AF_DIRTY_LOG(guest_test_exec),
+
 	/* Accessing a hole shouldn't fault (more sanity checks). */
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_read64),
 	TEST_ACCESS_ON_HOLE_NO_FAULTS(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
-- 
2.35.1.723.g4982287a31-goog


* [PATCH 10/11] KVM: selftests: aarch64: Add readonly memslot tests into page_fault_test
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add some readonly memslot tests into page_fault_test. Mark the data
and/or page-table memslots as readonly, perform some accesses, and check
that the right fault is triggered when expected (e.g., a store with no
write-back should lead to an MMIO exit).

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
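Note for readers: the patch installs failing defaults for the MMIO and
failed-vcpu-run handlers when a test does not provide its own, then compares
the observed event counts against the expected ones. The following is a
rough sketch of that pattern in isolation, assuming simplified types:
enum exit_kind, run_events(), events_match() and the "unexpected" counter
are hypothetical and stand in for the real vcpu run loop and TEST_FAIL().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified counters; the patch's struct has more fields. */
struct event_cnt {
	int mmio_exits;
	int fail_vcpu_runs;
	int unexpected;
};

struct test_desc {
	void (*mmio_handler)(struct event_cnt *ev);
	void (*fail_vcpu_run_handler)(struct event_cnt *ev);
	struct event_cnt expected_events;
};

/* Hypothetical exit kinds standing in for real vcpu exit reasons. */
enum exit_kind { EXIT_MMIO, EXIT_RUN_FAILED, EXIT_DONE };

static void count_mmio_exit(struct event_cnt *ev) { ev->mmio_exits++; }
static void count_failed_run(struct event_cnt *ev) { ev->fail_vcpu_runs++; }
/* Stand-in for a default handler that would fail the test outright. */
static void flag_unexpected(struct event_cnt *ev) { ev->unexpected++; }

/* Tests that did not ask for an event get a handler that flags it. */
static void setup_default_handlers(struct test_desc *t)
{
	if (!t->mmio_handler)
		t->mmio_handler = flag_unexpected;
	if (!t->fail_vcpu_run_handler)
		t->fail_vcpu_run_handler = flag_unexpected;
}

/* Replay a scripted sequence of exits through the test's handlers. */
static struct event_cnt run_events(struct test_desc *t,
		const enum exit_kind *exits, size_t n)
{
	struct event_cnt ev = { 0 };
	size_t i;

	setup_default_handlers(t);
	for (i = 0; i < n; i++) {
		if (exits[i] == EXIT_MMIO) {
			t->mmio_handler(&ev);
		} else if (exits[i] == EXIT_RUN_FAILED) {
			/* A failed run ends the loop, as in the patch. */
			t->fail_vcpu_run_handler(&ev);
			break;
		} else {
			break;
		}
	}
	return ev;
}

/* Observed counts must equal the expected ones, with nothing unexpected. */
static bool events_match(const struct test_desc *t, const struct event_cnt *ev)
{
	return ev->unexpected == 0 &&
		ev->mmio_exits == t->expected_events.mmio_exits &&
		ev->fail_vcpu_runs == t->expected_events.fail_vcpu_runs;
}
```

A write-style test would set .mmio_handler = count_mmio_exit and expect one
MMIO exit, while a read-style test leaves both handlers unset so that any
MMIO exit is flagged as unexpected.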
 .../selftests/kvm/aarch64/page_fault_test.c   | 303 +++++++++++++++++-
 1 file changed, 300 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index b41da9317242..e6607f903bc1 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -84,6 +84,7 @@ struct memslot_desc {
 
 static struct event_cnt {
 	int aborts;
+	int mmio_exits;
 	int fail_vcpu_runs;
 	int uffd_faults;
 	/* uffd_faults is incremented from multiple threads. */
@@ -101,6 +102,8 @@ struct test_desc {
 	int (*uffd_test_handler)(int mode, int uffd, struct uffd_msg *msg);
 	void (*dabt_handler)(struct ex_regs *regs);
 	void (*iabt_handler)(struct ex_regs *regs);
+	void (*mmio_handler)(struct kvm_run *run);
+	void (*fail_vcpu_run_handler)(int ret);
 	uint32_t pt_memslot_flags;
 	uint32_t test_memslot_flags;
 	void (*guest_pre_run)(struct kvm_vm *vm);
@@ -322,6 +325,20 @@ static void guest_code(struct test_desc *test)
 	GUEST_DONE();
 }
 
+static void dabt_s1ptw_on_ro_memslot_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(read_sysreg(far_el1), GUEST_TEST_GVA);
+	events.aborts += 1;
+	GUEST_SYNC(CMD_RECREATE_PT_MEMSLOT_WR);
+}
+
+static void iabt_s1ptw_on_ro_memslot_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(regs->pc, GUEST_TEST_EXEC_GVA);
+	events.aborts += 1;
+	GUEST_SYNC(CMD_RECREATE_PT_MEMSLOT_WR);
+}
+
 static void no_dabt_handler(struct ex_regs *regs)
 {
 	GUEST_ASSERT_1(false, read_sysreg(far_el1));
@@ -400,6 +417,57 @@ static void punch_hole_in_memslot(struct kvm_vm *vm,
 	}
 }
 
+static int __memory_region_add(struct kvm_vm *vm, void *mem, uint32_t slot,
+		uint32_t size, uint64_t guest_addr,
+		uint32_t flags)
+{
+	struct kvm_userspace_memory_region region;
+	int ret;
+
+	region.slot = slot;
+	region.flags = flags;
+	region.guest_phys_addr = guest_addr;
+	region.memory_size = size;
+	region.userspace_addr = (uintptr_t) mem;
+	ret = ioctl(vm_get_fd(vm), KVM_SET_USER_MEMORY_REGION, &region);
+
+	return ret;
+}
+
+static void recreate_memslot(struct kvm_vm *vm, struct memslot_desc *ms,
+		uint32_t flags)
+{
+	__memory_region_add(vm, ms->hva, ms->idx, 0, ms->gpa, 0);
+	__memory_region_add(vm, ms->hva, ms->idx, ms->size, ms->gpa, flags);
+}
+
+static void clear_pte_accessflag(struct kvm_vm *vm)
+{
+	volatile uint64_t *pte_hva;
+
+	pte_hva = (uint64_t *)addr_gpa2hva(vm, pte_gpa);
+	*pte_hva &= ~PTE_AF;
+}
+
+static void mmio_on_test_gpa_handler(struct kvm_run *run)
+{
+	ASSERT_EQ(run->mmio.phys_addr, memslot[TEST].gpa);
+
+	memcpy(memslot[TEST].hva, run->mmio.data, run->mmio.len);
+	events.mmio_exits += 1;
+}
+
+static void mmio_no_handler(struct kvm_run *run)
+{
+	uint64_t data;
+
+	memcpy(&data, run->mmio.data, sizeof(data));
+	pr_debug("addr=%lld len=%d w=%d data=%lx\n",
+			run->mmio.phys_addr, run->mmio.len,
+			run->mmio.is_write, data);
+	TEST_FAIL("There was no MMIO exit expected.");
+}
+
 static bool check_write_in_dirty_log(struct kvm_vm *vm,
 		struct memslot_desc *ms, uint64_t host_pg_nr)
 {
@@ -419,6 +487,8 @@ static void handle_cmd(struct kvm_vm *vm, int cmd)
 		punch_hole_in_memslot(vm, &memslot[PT]);
 	if (cmd & CMD_HOLE_TEST)
 		punch_hole_in_memslot(vm, &memslot[TEST]);
+	if (cmd & CMD_RECREATE_PT_MEMSLOT_WR)
+		recreate_memslot(vm, &memslot[PT], 0);
 	if (cmd & CMD_CHECK_WRITE_IN_DIRTY_LOG)
 		TEST_ASSERT(check_write_in_dirty_log(vm, &memslot[TEST], 0),
 				"Missing write in dirty log");
@@ -442,6 +512,13 @@ void fail_vcpu_run_no_handler(int ret)
 	TEST_FAIL("Unexpected vcpu run failure\n");
 }
 
+void fail_vcpu_run_mmio_no_syndrome_handler(int ret)
+{
+	TEST_ASSERT(errno == ENOSYS, "The mmio handler in the kernel"
+			" should have returned not implemented.");
+	events.fail_vcpu_runs += 1;
+}
+
 static uint64_t get_total_guest_pages(enum vm_guest_mode mode,
 		struct test_params *p)
 {
@@ -594,10 +671,21 @@ static void setup_uffd(enum vm_guest_mode mode, struct test_params *p,
 				test->uffd_test_handler);
 }
 
+static void setup_default_handlers(struct test_desc *test)
+{
+	if (!test->mmio_handler)
+		test->mmio_handler = mmio_no_handler;
+
+	if (!test->fail_vcpu_run_handler)
+		test->fail_vcpu_run_handler = fail_vcpu_run_no_handler;
+}
+
 static void check_event_counts(struct test_desc *test)
 {
 	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
 	ASSERT_EQ(test->expected_events.uffd_faults, events.uffd_faults);
+	ASSERT_EQ(test->expected_events.mmio_exits, events.mmio_exits);
+	ASSERT_EQ(test->expected_events.fail_vcpu_runs, events.fail_vcpu_runs);
 }
 
 static void free_uffd(struct test_desc *test, struct uffd_desc **uffd)
@@ -629,12 +717,20 @@ static void reset_event_counts(void)
 
 static bool vcpu_run_loop(struct kvm_vm *vm, struct test_desc *test)
 {
+	struct kvm_run *run;
 	bool skip_test = false;
 	struct ucall uc;
-	int stage;
+	int stage, ret;
+
+	run = vcpu_state(vm, VCPU_ID);
 
 	for (stage = 0; ; stage++) {
-		vcpu_run(vm, VCPU_ID);
+		ret = _vcpu_run(vm, VCPU_ID);
+		if (ret) {
+			test->fail_vcpu_run_handler(ret);
+			pr_debug("Done.\n");
+			goto done;
+		}
 
 		switch (get_ucall(vm, VCPU_ID, &uc)) {
 		case UCALL_SYNC:
@@ -653,6 +749,10 @@ static bool vcpu_run_loop(struct kvm_vm *vm, struct test_desc *test)
 		case UCALL_DONE:
 			pr_debug("Done.\n");
 			goto done;
+		case UCALL_NONE:
+			if (run->exit_reason == KVM_EXIT_MMIO)
+				test->mmio_handler(run);
+			break;
 		default:
 			TEST_FAIL("Unknown ucall %lu", uc.cmd);
 		}
@@ -677,6 +777,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	ucall_init(vm, NULL);
 
 	reset_event_counts();
+	setup_abort_handlers(vm, test);
 	setup_memslots(vm, mode, p);
 
 	/*
@@ -687,7 +788,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	 */
 	load_exec_code_for_test();
 	setup_uffd(mode, p, uffd);
-	setup_abort_handlers(vm, test);
+	setup_default_handlers(test);
 	setup_guest_args(vm, test);
 
 	if (test->guest_pre_run)
@@ -875,6 +976,135 @@ int main(int argc, char *argv[])
 #define	TEST_S1PTW_AF_DIRTY_LOG(__a, ...)					\
 	TEST_S1PTW_DIRTY_LOG(__a, __AF_TEST_ARGS_FOR_DIRTY_LOG)
 
+#define	TEST_WRITE_ON_RO_MEMSLOT(__a, ...)					\
+{										\
+	.name			= SNAME(WRITE_ON_RO_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.mmio_handler		= mmio_on_test_gpa_handler,			\
+	.expected_events	= { .mmio_exits = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_READ_ON_RO_MEMSLOT(__a, ...)					\
+{										\
+	.name			= SNAME(READ_ON_RO_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+#define	TEST_CM_ON_RO_MEMSLOT(__a, ...)						\
+{										\
+	.name			= SNAME(CM_ON_RO_MEMSLOT ## _ ## __a),		\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .fail_vcpu_runs = 1, },			\
+	__VA_ARGS__								\
+}
+
+#define __AF_TEST_IN_RO_MEMSLOT_ARGS						\
+	.guest_pre_run		= clear_pte_accessflag,				\
+	.guest_prepare		= { guest_set_ha, },				\
+	.guest_test_check	= { guest_check_pte_af, }
+
+#define __AF_LSE_IN_RO_MEMSLOT_ARGS						\
+	.guest_pre_run		= clear_pte_accessflag,				\
+	.guest_prepare		= { guest_set_ha, guest_check_lse, },		\
+	.guest_test_check	= { guest_check_pte_af, }
+
+#define	TEST_WRITE_ON_RO_MEMSLOT_AF(__a)					\
+	TEST_WRITE_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_READ_ON_RO_MEMSLOT_AF(__a)						\
+	TEST_READ_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_CM_ON_RO_MEMSLOT_AF(__a)						\
+	TEST_CM_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_S1PTW_ON_RO_MEMSLOT_DATA(__a, ...)					\
+{										\
+	.name			= SNAME(S1PTW_ON_RO_MEMSLOT_DATA ## _ ## __a),	\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_S1PTW_ON_RO_MEMSLOT_EXEC(__a, ...)					\
+{										\
+	.name			= SNAME(S1PTW_ON_RO_MEMSLOT_EXEC ## _ ## __a),	\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.iabt_handler		= iabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(__a)					\
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_S1PTW_AF_ON_RO_MEMSLOT_EXEC(__a)					\
+	TEST_S1PTW_ON_RO_MEMSLOT_EXEC(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(WRITE_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.mmio_handler		= mmio_on_test_gpa_handler,			\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .mmio_exits = 1, .aborts = 1, },		\
+	__VA_ARGS__								\
+}
+
+#define	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(READ_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(CM_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .aborts = 1, .fail_vcpu_runs = 1 },		\
+	__VA_ARGS__								\
+}
+
+#define	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(EXEC_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.iabt_handler		= iabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+#define TEST_READ_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+#define TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+#define TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -993,6 +1223,73 @@ static struct test_desc tests[] = {
 	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec,
 			uffd_test_read_handler, uffd_pt_write_handler),
 
+	/* Access on readonly memslot (sanity check). */
+	TEST_WRITE_ON_RO_MEMSLOT(guest_test_write64),
+	TEST_READ_ON_RO_MEMSLOT(guest_test_read64),
+	TEST_READ_ON_RO_MEMSLOT(guest_test_ld_preidx),
+	TEST_READ_ON_RO_MEMSLOT(guest_test_exec),
+	/*
+	 * CM and ld/st with pre-indexing don't have any syndrome, so vcpu_run
+	 * just fails, which is expected.
+	 */
+	TEST_CM_ON_RO_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_ON_RO_MEMSLOT(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_CM_ON_RO_MEMSLOT(guest_test_st_preidx),
+
+	/* Access on readonly memslot w/ non-faulting S1PTW w/ AF. */
+	TEST_WRITE_ON_RO_MEMSLOT_AF(guest_test_write64),
+	TEST_READ_ON_RO_MEMSLOT_AF(guest_test_read64),
+	TEST_READ_ON_RO_MEMSLOT_AF(guest_test_ld_preidx),
+	TEST_CM_ON_RO_MEMSLOT(guest_test_cas, __AF_LSE_IN_RO_MEMSLOT_ARGS),
+	TEST_CM_ON_RO_MEMSLOT_AF(guest_test_dc_zva),
+	TEST_CM_ON_RO_MEMSLOT_AF(guest_test_st_preidx),
+	TEST_READ_ON_RO_MEMSLOT_AF(guest_test_exec),
+
+	/*
+	 * S1PTW without AF on a readonly memslot. Note that even though this
+	 * page table walk does not actually write the access flag, it is still
+	 * considered a write, and therefore there is a fault.
+	 */
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_write64),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_read64),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_ld_preidx),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_at),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_dc_zva),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_st_preidx),
+	TEST_S1PTW_ON_RO_MEMSLOT_EXEC(guest_test_exec),
+
+	/* S1PTW with AF on a readonly memslot. */
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_write64),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_read64),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_cas,
+			__AF_LSE_IN_RO_MEMSLOT_ARGS),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_ld_preidx),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_at),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_st_preidx),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_dc_zva),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_EXEC(guest_test_exec),
+
+	/* Access on a RO memslot with S1PTW also on a RO memslot. */
+	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(guest_test_write64),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(guest_test_ld_preidx),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(guest_test_read64),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_cas,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_st_preidx),
+	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(guest_test_exec),
+
+	/* Access on a RO memslot with S1PTW w/ AF also on a RO memslot. */
+	TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_write64),
+	TEST_READ_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_read64),
+	TEST_READ_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_ld_preidx),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_cas,
+			__AF_LSE_IN_RO_MEMSLOT_ARGS),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_st_preidx),
+	TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_exec),
+
 	{ 0 },
 };
 
-- 
2.35.1.723.g4982287a31-goog


* [PATCH 10/11] KVM: selftests: aarch64: Add readonly memslot tests into page_fault_test
@ 2022-03-11  6:02   ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, axelrasmussen, Ricardo Koller

Add some readonly memslot tests into page_fault_test. Mark the data
and/or page-table memslots as readonly, perform some accesses, and check
that the right fault is triggered when expected (e.g., a store with no
write-back should lead to an mmio exit).

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
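For review, a rough standalone model of the dispatch this patch adds to
vcpu_run_loop(): a failed run goes to fail_vcpu_run_handler, a KVM_EXIT_MMIO
goes to mmio_handler, handlers left unset get a failing default, and the
counters are compared at the end. Nothing here talks to KVM. enum exit_kind,
struct test_model, the count_*/unexpected_* helpers and the run_*_model()
functions are made up for this sketch and do not exist in the selftest; the
real defaults call TEST_FAIL() instead of bumping a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of one vcpu run, standing in for KVM_RUN results. */
enum exit_kind { EXIT_DONE, EXIT_MMIO, EXIT_RUN_FAILED };

struct event_cnt {
	int mmio_exits;
	int fail_vcpu_runs;
	int unexpected;		/* model only: the real test calls TEST_FAIL() */
};

struct test_model {
	void (*mmio_handler)(struct event_cnt *ev);
	void (*fail_vcpu_run_handler)(struct event_cnt *ev, int err);
	struct event_cnt expected;
};

static void count_mmio(struct event_cnt *ev)
{
	ev->mmio_exits += 1;
}

static void count_enosys_failure(struct event_cnt *ev, int err)
{
	/* The patch expects ENOSYS for accesses that carry no syndrome. */
	if (err != ENOSYS)
		ev->unexpected += 1;
	else
		ev->fail_vcpu_runs += 1;
}

static void unexpected_mmio(struct event_cnt *ev)
{
	ev->unexpected += 1;
}

static void unexpected_failure(struct event_cnt *ev, int err)
{
	(void)err;
	ev->unexpected += 1;
}

/* Walk a scripted list of exits and report whether the counters match. */
static bool run_model(struct test_model *t, const enum exit_kind *exits,
		      size_t n)
{
	struct event_cnt ev = { 0 };
	size_t i;

	assert(t != NULL && exits != NULL);

	/* Same idea as setup_default_handlers(): fill in what was left NULL. */
	if (!t->mmio_handler)
		t->mmio_handler = unexpected_mmio;
	if (!t->fail_vcpu_run_handler)
		t->fail_vcpu_run_handler = unexpected_failure;

	for (i = 0; i < n; i++) {
		if (exits[i] == EXIT_RUN_FAILED) {
			t->fail_vcpu_run_handler(&ev, ENOSYS);
			break;		/* a failed run ends the loop */
		}
		if (exits[i] == EXIT_MMIO)
			t->mmio_handler(&ev);
		if (exits[i] == EXIT_DONE)
			break;
	}

	return ev.unexpected == 0 &&
	       ev.mmio_exits == t->expected.mmio_exits &&
	       ev.fail_vcpu_runs == t->expected.fail_vcpu_runs;
}

/* A 64-bit store to a readonly memslot: one MMIO exit, then done. */
bool run_write_on_ro_model(void)
{
	struct test_model t = {
		.mmio_handler = count_mmio,
		.expected = { .mmio_exits = 1 },
	};
	const enum exit_kind exits[] = { EXIT_MMIO, EXIT_DONE };

	return run_model(&t, exits, 2);
}

/* dc zva on a readonly memslot: the run itself fails. */
bool run_cm_on_ro_model(void)
{
	struct test_model t = {
		.fail_vcpu_run_handler = count_enosys_failure,
		.expected = { .fail_vcpu_runs = 1 },
	};
	const enum exit_kind exits[] = { EXIT_RUN_FAILED };

	return run_model(&t, exits, 1);
}

/* A read test that takes a stray MMIO exit trips the default handler. */
bool run_read_with_stray_mmio_model(void)
{
	struct test_model t = { .expected = { 0 } };
	const enum exit_kind exits[] = { EXIT_MMIO, EXIT_DONE };

	return run_model(&t, exits, 2);
}
```

The first two helpers return true. The third returns false because the
default handler fires, which is the role mmio_no_handler plays in the test.
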
 .../selftests/kvm/aarch64/page_fault_test.c   | 303 +++++++++++++++++-
 1 file changed, 300 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index b41da9317242..e6607f903bc1 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -84,6 +84,7 @@ struct memslot_desc {
 
 static struct event_cnt {
 	int aborts;
+	int mmio_exits;
 	int fail_vcpu_runs;
 	int uffd_faults;
 	/* uffd_faults is incremented from multiple threads. */
@@ -101,6 +102,8 @@ struct test_desc {
 	int (*uffd_test_handler)(int mode, int uffd, struct uffd_msg *msg);
 	void (*dabt_handler)(struct ex_regs *regs);
 	void (*iabt_handler)(struct ex_regs *regs);
+	void (*mmio_handler)(struct kvm_run *run);
+	void (*fail_vcpu_run_handler)(int ret);
 	uint32_t pt_memslot_flags;
 	uint32_t test_memslot_flags;
 	void (*guest_pre_run)(struct kvm_vm *vm);
@@ -322,6 +325,20 @@ static void guest_code(struct test_desc *test)
 	GUEST_DONE();
 }
 
+static void dabt_s1ptw_on_ro_memslot_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(read_sysreg(far_el1), GUEST_TEST_GVA);
+	events.aborts += 1;
+	GUEST_SYNC(CMD_RECREATE_PT_MEMSLOT_WR);
+}
+
+static void iabt_s1ptw_on_ro_memslot_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(regs->pc, GUEST_TEST_EXEC_GVA);
+	events.aborts += 1;
+	GUEST_SYNC(CMD_RECREATE_PT_MEMSLOT_WR);
+}
+
 static void no_dabt_handler(struct ex_regs *regs)
 {
 	GUEST_ASSERT_1(false, read_sysreg(far_el1));
@@ -400,6 +417,57 @@ static void punch_hole_in_memslot(struct kvm_vm *vm,
 	}
 }
 
+static int __memory_region_add(struct kvm_vm *vm, void *mem, uint32_t slot,
+		uint32_t size, uint64_t guest_addr,
+		uint32_t flags)
+{
+	struct kvm_userspace_memory_region region;
+	int ret;
+
+	region.slot = slot;
+	region.flags = flags;
+	region.guest_phys_addr = guest_addr;
+	region.memory_size = size;
+	region.userspace_addr = (uintptr_t) mem;
+	ret = ioctl(vm_get_fd(vm), KVM_SET_USER_MEMORY_REGION, &region);
+
+	return ret;
+}
+
+static void recreate_memslot(struct kvm_vm *vm, struct memslot_desc *ms,
+		uint32_t flags)
+{
+	__memory_region_add(vm, ms->hva, ms->idx, 0, ms->gpa, 0);
+	__memory_region_add(vm, ms->hva, ms->idx, ms->size, ms->gpa, flags);
+}
+
+static void clear_pte_accessflag(struct kvm_vm *vm)
+{
+	volatile uint64_t *pte_hva;
+
+	pte_hva = (uint64_t *)addr_gpa2hva(vm, pte_gpa);
+	*pte_hva &= ~PTE_AF;
+}
+
+static void mmio_on_test_gpa_handler(struct kvm_run *run)
+{
+	ASSERT_EQ(run->mmio.phys_addr, memslot[TEST].gpa);
+
+	memcpy(memslot[TEST].hva, run->mmio.data, run->mmio.len);
+	events.mmio_exits += 1;
+}
+
+static void mmio_no_handler(struct kvm_run *run)
+{
+	uint64_t data;
+
+	memcpy(&data, run->mmio.data, sizeof(data));
+	pr_debug("addr=0x%llx len=%u w=%d data=0x%lx\n",
+			run->mmio.phys_addr, run->mmio.len,
+			run->mmio.is_write, data);
+	TEST_FAIL("There was no MMIO exit expected.");
+}
+
 static bool check_write_in_dirty_log(struct kvm_vm *vm,
 		struct memslot_desc *ms, uint64_t host_pg_nr)
 {
@@ -419,6 +487,8 @@ static void handle_cmd(struct kvm_vm *vm, int cmd)
 		punch_hole_in_memslot(vm, &memslot[PT]);
 	if (cmd & CMD_HOLE_TEST)
 		punch_hole_in_memslot(vm, &memslot[TEST]);
+	if (cmd & CMD_RECREATE_PT_MEMSLOT_WR)
+		recreate_memslot(vm, &memslot[PT], 0);
 	if (cmd & CMD_CHECK_WRITE_IN_DIRTY_LOG)
 		TEST_ASSERT(check_write_in_dirty_log(vm, &memslot[TEST], 0),
 				"Missing write in dirty log");
@@ -442,6 +512,13 @@ void fail_vcpu_run_no_handler(int ret)
 	TEST_FAIL("Unexpected vcpu run failure\n");
 }
 
+void fail_vcpu_run_mmio_no_syndrome_handler(int ret)
+{
+	TEST_ASSERT(errno == ENOSYS, "The mmio handler in the kernel"
+			" should have returned not implemented.");
+	events.fail_vcpu_runs += 1;
+}
+
 static uint64_t get_total_guest_pages(enum vm_guest_mode mode,
 		struct test_params *p)
 {
@@ -594,10 +671,21 @@ static void setup_uffd(enum vm_guest_mode mode, struct test_params *p,
 				test->uffd_test_handler);
 }
 
+static void setup_default_handlers(struct test_desc *test)
+{
+	if (!test->mmio_handler)
+		test->mmio_handler = mmio_no_handler;
+
+	if (!test->fail_vcpu_run_handler)
+		test->fail_vcpu_run_handler = fail_vcpu_run_no_handler;
+}
+
 static void check_event_counts(struct test_desc *test)
 {
 	ASSERT_EQ(test->expected_events.aborts,	events.aborts);
 	ASSERT_EQ(test->expected_events.uffd_faults, events.uffd_faults);
+	ASSERT_EQ(test->expected_events.mmio_exits, events.mmio_exits);
+	ASSERT_EQ(test->expected_events.fail_vcpu_runs, events.fail_vcpu_runs);
 }
 
 static void free_uffd(struct test_desc *test, struct uffd_desc **uffd)
@@ -629,12 +717,20 @@ static void reset_event_counts(void)
 
 static bool vcpu_run_loop(struct kvm_vm *vm, struct test_desc *test)
 {
+	struct kvm_run *run;
 	bool skip_test = false;
 	struct ucall uc;
-	int stage;
+	int stage, ret;
+
+	run = vcpu_state(vm, VCPU_ID);
 
 	for (stage = 0; ; stage++) {
-		vcpu_run(vm, VCPU_ID);
+		ret = _vcpu_run(vm, VCPU_ID);
+		if (ret) {
+			test->fail_vcpu_run_handler(ret);
+			pr_debug("Done.\n");
+			goto done;
+		}
 
 		switch (get_ucall(vm, VCPU_ID, &uc)) {
 		case UCALL_SYNC:
@@ -653,6 +749,10 @@ static bool vcpu_run_loop(struct kvm_vm *vm, struct test_desc *test)
 		case UCALL_DONE:
 			pr_debug("Done.\n");
 			goto done;
+		case UCALL_NONE:
+			if (run->exit_reason == KVM_EXIT_MMIO)
+				test->mmio_handler(run);
+			break;
 		default:
 			TEST_FAIL("Unknown ucall %lu", uc.cmd);
 		}
@@ -677,6 +777,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	ucall_init(vm, NULL);
 
 	reset_event_counts();
+	setup_abort_handlers(vm, test);
 	setup_memslots(vm, mode, p);
 
 	/*
@@ -687,7 +788,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	 */
 	load_exec_code_for_test();
 	setup_uffd(mode, p, uffd);
-	setup_abort_handlers(vm, test);
+	setup_default_handlers(test);
 	setup_guest_args(vm, test);
 
 	if (test->guest_pre_run)
@@ -875,6 +976,135 @@ int main(int argc, char *argv[])
 #define	TEST_S1PTW_AF_DIRTY_LOG(__a, ...)					\
 	TEST_S1PTW_DIRTY_LOG(__a, __AF_TEST_ARGS_FOR_DIRTY_LOG)
 
+#define	TEST_WRITE_ON_RO_MEMSLOT(__a, ...)					\
+{										\
+	.name			= SNAME(WRITE_ON_RO_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.mmio_handler		= mmio_on_test_gpa_handler,			\
+	.expected_events	= { .mmio_exits = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_READ_ON_RO_MEMSLOT(__a, ...)					\
+{										\
+	.name			= SNAME(READ_ON_RO_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.expected_events	= { 0 },					\
+	__VA_ARGS__								\
+}
+
+#define	TEST_CM_ON_RO_MEMSLOT(__a, ...)						\
+{										\
+	.name			= SNAME(CM_ON_RO_MEMSLOT ## _ ## __a),		\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .fail_vcpu_runs = 1, },			\
+	__VA_ARGS__								\
+}
+
+#define __AF_TEST_IN_RO_MEMSLOT_ARGS						\
+	.guest_pre_run		= clear_pte_accessflag,				\
+	.guest_prepare		= { guest_set_ha, },				\
+	.guest_test_check	= { guest_check_pte_af, }
+
+#define __AF_LSE_IN_RO_MEMSLOT_ARGS						\
+	.guest_pre_run		= clear_pte_accessflag,				\
+	.guest_prepare		= { guest_set_ha, guest_check_lse, },		\
+	.guest_test_check	= { guest_check_pte_af, }
+
+#define	TEST_WRITE_ON_RO_MEMSLOT_AF(__a)					\
+	TEST_WRITE_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_READ_ON_RO_MEMSLOT_AF(__a)						\
+	TEST_READ_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_CM_ON_RO_MEMSLOT_AF(__a)						\
+	TEST_CM_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_S1PTW_ON_RO_MEMSLOT_DATA(__a, ...)					\
+{										\
+	.name			= SNAME(S1PTW_ON_RO_MEMSLOT_DATA ## _ ## __a),	\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_S1PTW_ON_RO_MEMSLOT_EXEC(__a, ...)					\
+{										\
+	.name			= SNAME(S1PTW_ON_RO_MEMSLOT_EXEC ## _ ## __a),	\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.iabt_handler		= iabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(__a)					\
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_S1PTW_AF_ON_RO_MEMSLOT_EXEC(__a)					\
+	TEST_S1PTW_ON_RO_MEMSLOT_EXEC(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
+#define	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(WRITE_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.mmio_handler		= mmio_on_test_gpa_handler,			\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .mmio_exits = 1, .aborts = 1, },		\
+	__VA_ARGS__								\
+}
+
+#define	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(READ_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(CM_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.dabt_handler		= dabt_s1ptw_on_ro_memslot_handler,		\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .aborts = 1, .fail_vcpu_runs = 1 },		\
+	__VA_ARGS__								\
+}
+
+#define	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SCAT(EXEC_AND_S1PTW_ON_RO_MEMSLOT, __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY,				\
+	.pt_memslot_flags	= KVM_MEM_READONLY,				\
+	.guest_test 		= __a,						\
+	.iabt_handler		= iabt_s1ptw_on_ro_memslot_handler,		\
+	.expected_events	= { .aborts = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+#define TEST_READ_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+#define TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+#define TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
+	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -993,6 +1223,73 @@ static struct test_desc tests[] = {
 	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec,
 			uffd_test_read_handler, uffd_pt_write_handler),
 
+	/* Access on readonly memslot (sanity check). */
+	TEST_WRITE_ON_RO_MEMSLOT(guest_test_write64),
+	TEST_READ_ON_RO_MEMSLOT(guest_test_read64),
+	TEST_READ_ON_RO_MEMSLOT(guest_test_ld_preidx),
+	TEST_READ_ON_RO_MEMSLOT(guest_test_exec),
+	/*
+	 * CM and ld/st with pre-indexing don't have any syndrome, so
+	 * vcpu_run just fails, which is expected.
+	 */
+	TEST_CM_ON_RO_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_ON_RO_MEMSLOT(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_CM_ON_RO_MEMSLOT(guest_test_st_preidx),
+
+	/* Access on readonly memslot w/ non-faulting S1PTW w/ AF. */
+	TEST_WRITE_ON_RO_MEMSLOT_AF(guest_test_write64),
+	TEST_READ_ON_RO_MEMSLOT_AF(guest_test_read64),
+	TEST_READ_ON_RO_MEMSLOT_AF(guest_test_ld_preidx),
+	TEST_CM_ON_RO_MEMSLOT(guest_test_cas, __AF_LSE_IN_RO_MEMSLOT_ARGS),
+	TEST_CM_ON_RO_MEMSLOT_AF(guest_test_dc_zva),
+	TEST_CM_ON_RO_MEMSLOT_AF(guest_test_st_preidx),
+	TEST_READ_ON_RO_MEMSLOT_AF(guest_test_exec),
+
+	/*
+	 * S1PTW without AF on a readonly memslot. Note that even though this
+	 * page table walk does not actually write the access flag, it is still
+	 * considered a write, and therefore there is a fault.
+	 */
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_write64),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_read64),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_ld_preidx),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_cas, __PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_at),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_dc_zva),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_st_preidx),
+	TEST_S1PTW_ON_RO_MEMSLOT_EXEC(guest_test_exec),
+
+	/* S1PTW with AF on a readonly memslot. */
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_write64),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_read64),
+	TEST_S1PTW_ON_RO_MEMSLOT_DATA(guest_test_cas,
+			__AF_LSE_IN_RO_MEMSLOT_ARGS),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_ld_preidx),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_at),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_st_preidx),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_DATA(guest_test_dc_zva),
+	TEST_S1PTW_AF_ON_RO_MEMSLOT_EXEC(guest_test_exec),
+
+	/* Access on a RO memslot with S1PTW also on a RO memslot. */
+	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(guest_test_write64),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(guest_test_ld_preidx),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(guest_test_read64),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_cas,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_st_preidx),
+	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(guest_test_exec),
+
+	/* Access on a RO memslot with S1PTW w/ AF also on a RO memslot. */
+	TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_write64),
+	TEST_READ_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_read64),
+	TEST_READ_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_ld_preidx),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_cas,
+			__AF_LSE_IN_RO_MEMSLOT_ARGS),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_st_preidx),
+	TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_exec),
+
 	{ 0 },
 };
 
-- 
2.35.1.723.g4982287a31-goog



* [PATCH 11/11] KVM: selftests: aarch64: Add mix of tests into page_fault_test
  2022-03-11  6:01 ` Ricardo Koller
@ 2022-03-11  6:02   ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: maz, bgardon, pbonzini, axelrasmussen

Add a mix of tests to page_fault_test, such as stage 2 faults on
memslots marked for both userfaultfd and dirty logging.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
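For review, a small standalone model of what the readonly plus dirty-log
tests below expect: a guest store to a readonly memslot is reported to
userspace and leaves the dirty log untouched, while the same store to a
writable slot with dirty logging sets the page's bit. This is only a model
of the expected outcome, not of how KVM implements it. The MODEL_* names,
struct model_memslot, the 4 KiB page size and the 64-page slot are all
assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit positions as KVM_MEM_LOG_DIRTY_PAGES and KVM_MEM_READONLY. */
#define MODEL_MEM_LOG_DIRTY_PAGES	(1u << 0)
#define MODEL_MEM_READONLY		(1u << 1)

/* Assumed geometry, chosen so the bitmap fits in one 64-bit word. */
#define MODEL_PAGE_SIZE		4096u
#define MODEL_SLOT_PAGES	64u

enum write_result { WRITE_DONE, WRITE_MMIO_EXIT };

struct model_memslot {
	uint32_t flags;
	uint64_t dirty_bitmap;	/* one bit per page */
};

static enum write_result model_guest_write(struct model_memslot *ms,
					   uint64_t offset)
{
	uint64_t page = offset / MODEL_PAGE_SIZE;

	assert(page < MODEL_SLOT_PAGES);

	/* Readonly wins: the store goes to userspace, nothing is logged. */
	if (ms->flags & MODEL_MEM_READONLY)
		return WRITE_MMIO_EXIT;

	if (ms->flags & MODEL_MEM_LOG_DIRTY_PAGES)
		ms->dirty_bitmap |= UINT64_C(1) << page;

	return WRITE_DONE;
}

static bool model_page_is_dirty(const struct model_memslot *ms, uint64_t page)
{
	return (ms->dirty_bitmap >> page) & 1;
}

/* Readonly and dirty logging together: MMIO exit, clean log. */
bool model_ro_dirty_log_write_leaves_log_clean(void)
{
	struct model_memslot ms = {
		.flags = MODEL_MEM_READONLY | MODEL_MEM_LOG_DIRTY_PAGES,
	};

	return model_guest_write(&ms, 0) == WRITE_MMIO_EXIT &&
	       !model_page_is_dirty(&ms, 0);
}

/* Dirty logging only: the write lands and only its page is marked. */
bool model_dirty_log_write_sets_bit(void)
{
	struct model_memslot ms = { .flags = MODEL_MEM_LOG_DIRTY_PAGES };

	return model_guest_write(&ms, 3 * MODEL_PAGE_SIZE) == WRITE_DONE &&
	       model_page_is_dirty(&ms, 3) &&
	       !model_page_is_dirty(&ms, 0);
}
```

Both helpers return true. The first corresponds to the
guest_check_no_write_in_dirty_log check used by the
TEST_*_ON_RO_DIRTY_LOG_MEMSLOT descriptors.
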
 .../selftests/kvm/aarch64/page_fault_test.c   | 148 ++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index e6607f903bc1..f1a5bf081a5b 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -399,6 +399,12 @@ static int uffd_test_read_handler(int mode, int uffd, struct uffd_msg *msg)
 	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], false);
 }
 
+static int uffd_no_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	TEST_FAIL("There was no UFFD fault expected.");
+	return -1;
+}
+
 static void punch_hole_in_memslot(struct kvm_vm *vm,
 		struct memslot_desc *memslot)
 {
@@ -912,6 +918,30 @@ int main(int argc, char *argv[])
 #define TEST_S1PTW_ON_HOLE_UFFD_AF(__a, __uffd_handler)				\
 	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler, __AF_TEST_ARGS)
 
+#define __DIRTY_LOG_TEST							\
+	.test_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_test_check	= { guest_check_write_in_dirty_log, },		\
+
+#define __DIRTY_LOG_S1PTW_TEST							\
+	.pt_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_test_check	= { guest_check_s1ptw_wr_in_dirty_log, },	\
+
+#define TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(__a, __uffd_handler, ...)	\
+	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_TEST __VA_ARGS__)
+
+#define TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(__a, __uffd_handler, ...)		\
+	TEST_ACCESS_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_TEST __VA_ARGS__)
+
+#define TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(__a, __uffd_handler, ...)	\
+	TEST_ACCESS_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_S1PTW_TEST __VA_ARGS__)
+
+#define TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(__a, __uffd_handler, ...)		\
+	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_S1PTW_TEST __VA_ARGS__)
+
 #define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, ...)		\
 {										\
 	.name			= SNAME(ACCESS_S1PTW_ON_HOLE_UFFD ## _ ## __a),	\
@@ -1015,6 +1045,10 @@ int main(int argc, char *argv[])
 	.guest_prepare		= { guest_set_ha, guest_check_lse, },		\
 	.guest_test_check	= { guest_check_pte_af, }
 
+#define __NULL_UFFD_HANDLERS							\
+	.uffd_test_handler	= uffd_no_handler,				\
+	.uffd_pt_handler	= uffd_no_handler
+
 #define	TEST_WRITE_ON_RO_MEMSLOT_AF(__a)					\
 	TEST_WRITE_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
 
@@ -1105,6 +1139,37 @@ int main(int argc, char *argv[])
 #define TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
 	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
 
+#define TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+#define TEST_READ_AND_S1PTW_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+#define TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+#define TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+
+#define	TEST_WRITE_ON_RO_DIRTY_LOG_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SNAME(WRITE_ON_RO_DIRTY_LOG_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.guest_test		= __a,						\
+	.guest_test_check	= { guest_check_no_write_in_dirty_log, },	\
+	.mmio_handler		= mmio_on_test_gpa_handler,			\
+	.expected_events	= { .mmio_exits = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SNAME(CM_ON_RO_DIRTY_LOG_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.guest_test		= __a,						\
+	.guest_test_check	= { guest_check_no_write_in_dirty_log, },	\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .fail_vcpu_runs = 1, },			\
+	__VA_ARGS__								\
+}
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -1223,6 +1288,65 @@ static struct test_desc tests[] = {
 	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec,
 			uffd_test_read_handler, uffd_pt_write_handler),
 
+	/* Write into a memslot marked for both dirty logging and UFFD. */
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_write64,
+			uffd_test_write_handler),
+	/* Note that the cas uffd handler is for a read. */
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_cas,
+			uffd_test_read_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_dc_zva,
+			uffd_test_write_handler),
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_st_preidx,
+			uffd_test_write_handler),
+
+	/*
+	 * Access whose s1ptw faults on a hole that's marked for both dirty
+	 * logging and UFFD.
+	 */
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_read64,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_cas,
+			uffd_pt_write_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_ld_preidx,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_exec,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_write64,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_st_preidx,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_dc_zva,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_at,
+			uffd_pt_write_handler),
+
+	/*
+	 * Write on a memslot marked for dirty logging whose related s1ptw
+	 * is on a hole marked with UFFD.
+	 */
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_write64,
+			uffd_pt_write_handler),
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_cas,
+			uffd_pt_write_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_dc_zva,
+			uffd_pt_write_handler),
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_st_preidx,
+			uffd_pt_write_handler),
+
+	/*
+	 * Write on a memslot that's on a hole marked with UFFD, whose related
+	 * s1ptw is on a memslot marked for dirty logging.
+	 */
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_write64,
+			uffd_test_write_handler),
+	/* Note that the uffd handler is for a read. */
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_cas,
+			uffd_test_read_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_dc_zva,
+			uffd_test_write_handler),
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_st_preidx,
+			uffd_test_write_handler),
+
 	/* Access on readonly memslot (sanity check). */
 	TEST_WRITE_ON_RO_MEMSLOT(guest_test_write64),
 	TEST_READ_ON_RO_MEMSLOT(guest_test_read64),
@@ -1290,6 +1414,30 @@ static struct test_desc tests[] = {
 	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_st_preidx),
 	TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_exec),
 
+	/*
+	 * Access on a readonly memslot that also has dirty logging enabled.
+	 * There should be no write in the dirty log.
+	 */
+	TEST_WRITE_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_write64),
+	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_cas,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_st_preidx),
+
+	/*
+	 * Access on a RO memslot with S1PTW also on a RO memslot, while also
+	 * having those memslot regions marked for UFFD fault handling.  The
+	 * result is that UFFD fault handlers should not be called.
+	 */
+	TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_write64),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT_WITH_UFFD(guest_test_read64),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT_WITH_UFFD(guest_test_ld_preidx),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_cas,
+			__PREPARE_LSE_TEST_ARGS __NULL_UFFD_HANDLERS),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_dc_zva),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_st_preidx),
+	TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_exec),
+
 	{ 0 },
 };
 
-- 
2.35.1.723.g4982287a31-goog


* [PATCH 11/11] KVM: selftests: aarch64: Add mix of tests into page_fault_test
@ 2022-03-11  6:02   ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-11  6:02 UTC (permalink / raw)
  To: kvm, kvmarm, drjones
  Cc: pbonzini, maz, alexandru.elisei, eric.auger, oupton, reijiw,
	rananta, bgardon, axelrasmussen, Ricardo Koller

Add a mix of tests to page_fault_test, such as stage 2 faults on
memslots marked for both userfaultfd and dirty logging.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
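For review, a standalone sketch of the macro composition this patch relies
on: a base descriptor macro ends with __VA_ARGS__, and a fragment that ends
in a trailing comma can be placed directly in front of __VA_ARGS__, so the
derived macro works with or without extra initializers. struct desc, the
DESC_* macros, DIRTY_LOG_FRAGMENT and FLAG_DIRTY_LOG are invented for the
sketch; only the pattern is taken from the patch. Calling a variadic macro
with no extra arguments is accepted by GCC and Clang by default, which is
what the selftest also depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FLAG_DIRTY_LOG	(1u << 0)

struct desc {
	const char *name;
	unsigned int test_flags;
	unsigned int pt_flags;
	int expected_uffd_faults;
};

/* Base descriptor: extra designated initializers are appended last. */
#define DESC_ON_HOLE_UFFD(a, ...)					\
{									\
	.name			= "ON_HOLE_UFFD_" #a,			\
	.expected_uffd_faults	= 1,					\
	__VA_ARGS__							\
}

/* Ends with a comma so it can sit directly before __VA_ARGS__. */
#define DIRTY_LOG_FRAGMENT						\
	.test_flags		= FLAG_DIRTY_LOG,

#define DESC_WRITE_ON_DIRTY_LOG_AND_UFFD(a, ...)			\
	DESC_ON_HOLE_UFFD(a, DIRTY_LOG_FRAGMENT __VA_ARGS__)

/* No extras: the fragment's trailing comma closes the initializer list. */
static const struct desc plain =
	DESC_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_write64);

/* One extra initializer, appended after the fragment. */
static const struct desc with_extra =
	DESC_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_cas,
			.pt_flags = FLAG_DIRTY_LOG);

bool desc_plain_ok(void)
{
	return strcmp(plain.name, "ON_HOLE_UFFD_guest_write64") == 0 &&
	       plain.test_flags == FLAG_DIRTY_LOG &&
	       plain.pt_flags == 0 &&
	       plain.expected_uffd_faults == 1;
}

bool desc_with_extra_ok(void)
{
	return strcmp(with_extra.name, "ON_HOLE_UFFD_guest_cas") == 0 &&
	       with_extra.test_flags == FLAG_DIRTY_LOG &&
	       with_extra.pt_flags == FLAG_DIRTY_LOG &&
	       with_extra.expected_uffd_faults == 1;
}
```

The sketch deliberately avoids setting the same field twice. In C the last
designated initializer for a field wins, and GCC reports that case under
-Woverride-init.
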
 .../selftests/kvm/aarch64/page_fault_test.c   | 148 ++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index e6607f903bc1..f1a5bf081a5b 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -399,6 +399,12 @@ static int uffd_test_read_handler(int mode, int uffd, struct uffd_msg *msg)
 	return uffd_generic_handler(mode, uffd, msg, &memslot[TEST], false);
 }
 
+static int uffd_no_handler(int mode, int uffd, struct uffd_msg *msg)
+{
+	TEST_FAIL("There was no UFFD fault expected.");
+	return -1;
+}
+
 static void punch_hole_in_memslot(struct kvm_vm *vm,
 		struct memslot_desc *memslot)
 {
@@ -912,6 +918,30 @@ int main(int argc, char *argv[])
 #define TEST_S1PTW_ON_HOLE_UFFD_AF(__a, __uffd_handler)				\
 	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler, __AF_TEST_ARGS)
 
+#define __DIRTY_LOG_TEST							\
+	.test_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_test_check	= { guest_check_write_in_dirty_log, },		\
+
+#define __DIRTY_LOG_S1PTW_TEST							\
+	.pt_memslot_flags	= KVM_MEM_LOG_DIRTY_PAGES,			\
+	.guest_test_check	= { guest_check_s1ptw_wr_in_dirty_log, },	\
+
+#define TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(__a, __uffd_handler, ...)	\
+	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_TEST __VA_ARGS__)
+
+#define TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(__a, __uffd_handler, ...)		\
+	TEST_ACCESS_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_TEST __VA_ARGS__)
+
+#define TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(__a, __uffd_handler, ...)	\
+	TEST_ACCESS_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_S1PTW_TEST __VA_ARGS__)
+
+#define TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(__a, __uffd_handler, ...)		\
+	TEST_S1PTW_ON_HOLE_UFFD(__a, __uffd_handler,				\
+			__DIRTY_LOG_S1PTW_TEST __VA_ARGS__)
+
 #define TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD(__a, __th, __ph, ...)		\
 {										\
 	.name			= SNAME(ACCESS_S1PTW_ON_HOLE_UFFD ## _ ## __a),	\
@@ -1015,6 +1045,10 @@ int main(int argc, char *argv[])
 	.guest_prepare		= { guest_set_ha, guest_check_lse, },		\
 	.guest_test_check	= { guest_check_pte_af, }
 
+#define __NULL_UFFD_HANDLERS							\
+	.uffd_test_handler	= uffd_no_handler,				\
+	.uffd_pt_handler	= uffd_no_handler
+
 #define	TEST_WRITE_ON_RO_MEMSLOT_AF(__a)					\
 	TEST_WRITE_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
 
@@ -1105,6 +1139,37 @@ int main(int argc, char *argv[])
 #define TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(__a) 				\
 	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, __AF_TEST_IN_RO_MEMSLOT_ARGS)
 
+#define TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_WRITE_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+#define TEST_READ_AND_S1PTW_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+#define TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+#define TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(__a)			\
+	TEST_EXEC_AND_S1PTW_ON_RO_MEMSLOT(__a, __NULL_UFFD_HANDLERS)
+
+#define	TEST_WRITE_ON_RO_DIRTY_LOG_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SNAME(WRITE_ON_RO_DIRTY_LOG_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.guest_test		= __a,						\
+	.guest_test_check	= { guest_check_no_write_in_dirty_log, },	\
+	.mmio_handler		= mmio_on_test_gpa_handler,			\
+	.expected_events	= { .mmio_exits = 1, },				\
+	__VA_ARGS__								\
+}
+
+#define	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(__a, ...)				\
+{										\
+	.name			= SNAME(CM_ON_RO_DIRTY_LOG_MEMSLOT ## _ ## __a),	\
+	.test_memslot_flags	= KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES,	\
+	.guest_test		= __a,						\
+	.guest_test_check	= { guest_check_no_write_in_dirty_log, },	\
+	.fail_vcpu_run_handler	= fail_vcpu_run_mmio_no_syndrome_handler,	\
+	.expected_events	= { .fail_vcpu_runs = 1, },			\
+	__VA_ARGS__								\
+}
+
 static struct test_desc tests[] = {
 	/* Check that HW is setting the AF (sanity checks). */
 	TEST_HW_ACCESS_FLAG(guest_test_read64),
@@ -1223,6 +1288,65 @@ static struct test_desc tests[] = {
 	TEST_ACCESS_AND_S1PTW_ON_HOLE_UFFD_AF(guest_test_exec,
 			uffd_test_read_handler, uffd_pt_write_handler),
 
+	/* Write into a memslot marked for both dirty logging and UFFD. */
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_write64,
+			uffd_test_write_handler),
+	/* Note that the cas uffd handler is for a read. */
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_cas,
+			uffd_test_read_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_dc_zva,
+			uffd_test_write_handler),
+	TEST_WRITE_ON_DIRTY_LOG_AND_UFFD(guest_test_st_preidx,
+			uffd_test_write_handler),
+
+	/*
+	 * Access whose s1ptw faults on a hole that's marked for both dirty
+	 * logging and UFFD.
+	 */
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_read64,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_cas,
+			uffd_pt_write_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_ld_preidx,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_exec,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_write64,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_st_preidx,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_dc_zva,
+			uffd_pt_write_handler),
+	TEST_S1PTW_ON_DIRTY_LOG_AND_UFFD(guest_test_at,
+			uffd_pt_write_handler),
+
+	/*
+	 * Write on a memslot marked for dirty logging whose related s1ptw
+	 * is on a hole marked with UFFD.
+	 */
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_write64,
+			uffd_pt_write_handler),
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_cas,
+			uffd_pt_write_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_dc_zva,
+			uffd_pt_write_handler),
+	TEST_WRITE_DIRTY_LOG_AND_S1PTW_ON_UFFD(guest_test_st_preidx,
+			uffd_pt_write_handler),
+
+	/*
+	 * Write on a memslot that's on a hole marked with UFFD, whose related
+	 * s1ptw is on a memslot marked for dirty logging.
+	 */
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_write64,
+			uffd_test_write_handler),
+	/* Note that the uffd handler is for a read. */
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_cas,
+			uffd_test_read_handler, __PREPARE_LSE_TEST_ARGS),
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_dc_zva,
+			uffd_test_write_handler),
+	TEST_WRITE_UFFD_AND_S1PTW_ON_DIRTY_LOG(guest_test_st_preidx,
+			uffd_test_write_handler),
+
 	/* Access on readonly memslot (sanity check). */
 	TEST_WRITE_ON_RO_MEMSLOT(guest_test_write64),
 	TEST_READ_ON_RO_MEMSLOT(guest_test_read64),
@@ -1290,6 +1414,30 @@ static struct test_desc tests[] = {
 	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_st_preidx),
 	TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT(guest_test_exec),
 
+	/*
+	 * Access on a memslot marked as both readonly and for dirty logging.
+	 * There should be no write in the dirty log.
+	 */
+	TEST_WRITE_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_write64),
+	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_cas,
+			__PREPARE_LSE_TEST_ARGS),
+	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_dc_zva),
+	TEST_CM_ON_RO_DIRTY_LOG_MEMSLOT(guest_test_st_preidx),
+
+	/*
+	 * Access on a RO memslot with S1PTW also on a RO memslot, while also
+	 * having those memslot regions marked for UFFD fault handling.  The
+	 * result is that UFFD fault handlers should not be called.
+	 */
+	TEST_WRITE_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_write64),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT_WITH_UFFD(guest_test_read64),
+	TEST_READ_AND_S1PTW_ON_RO_MEMSLOT_WITH_UFFD(guest_test_ld_preidx),
+	TEST_CM_AND_S1PTW_ON_RO_MEMSLOT(guest_test_cas,
+			__PREPARE_LSE_TEST_ARGS __NULL_UFFD_HANDLERS),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_dc_zva),
+	TEST_CM_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_st_preidx),
+	TEST_EXEC_AND_S1PTW_AF_ON_RO_MEMSLOT_WITH_UFFD(guest_test_exec),
+
 	{ 0 },
 };
 
-- 
2.35.1.723.g4982287a31-goog



* Re: [PATCH 01/11] KVM: selftests: Add a userfaultfd library
  2022-03-11  6:01   ` Ricardo Koller
@ 2022-03-16 18:02     ` Ben Gardon
  -1 siblings, 0 replies; 42+ messages in thread
From: Ben Gardon @ 2022-03-16 18:02 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
>
> Move the generic userfaultfd code out of demand_paging_test.c into a
> common library, userfaultfd_util. This library consists of a setup and a
> stop function. The setup function starts a thread for handling page
> faults using the handler callback function. This setup returns a
> uffd_desc object which is then used in the stop function (to wait and
> destroy the threads).
>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |   2 +-
>  .../selftests/kvm/demand_paging_test.c        | 227 +++---------------
>  .../selftests/kvm/include/userfaultfd_util.h  |  46 ++++
>  .../selftests/kvm/lib/userfaultfd_util.c      | 196 +++++++++++++++
>  4 files changed, 272 insertions(+), 199 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
>  create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index 0e4926bc9a58..bc5f89b3700e 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -37,7 +37,7 @@ ifeq ($(ARCH),riscv)
>         UNAME_M := riscv
>  endif
>
> -LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
> +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c lib/userfaultfd_util.c
>  LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
>  LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
>  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
> index 6a719d065599..b3d457cecd68 100644
> --- a/tools/testing/selftests/kvm/demand_paging_test.c
> +++ b/tools/testing/selftests/kvm/demand_paging_test.c
> @@ -22,23 +22,13 @@
>  #include "test_util.h"
>  #include "perf_test_util.h"
>  #include "guest_modes.h"
> +#include "userfaultfd_util.h"
>
>  #ifdef __NR_userfaultfd
>
> -#ifdef PRINT_PER_PAGE_UPDATES
> -#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> -#else
> -#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> -#endif
> -
> -#ifdef PRINT_PER_VCPU_UPDATES
> -#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> -#else
> -#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> -#endif
> -
>  static int nr_vcpus = 1;
>  static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
> +
>  static size_t demand_paging_size;
>  static char *guest_data_prototype;
>
> @@ -69,9 +59,11 @@ static void vcpu_worker(struct perf_test_vcpu_args *vcpu_args)
>                        ts_diff.tv_sec, ts_diff.tv_nsec);
>  }
>
> -static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
> +static int handle_uffd_page_request(int uffd_mode, int uffd,
> +               struct uffd_msg *msg)
>  {
>         pid_t tid = syscall(__NR_gettid);
> +       uint64_t addr = msg->arg.pagefault.address;
>         struct timespec start;
>         struct timespec ts_diff;
>         int r;
> @@ -118,175 +110,32 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
>         return 0;
>  }
>
> -bool quit_uffd_thread;
> -
> -struct uffd_handler_args {
> +struct test_params {
>         int uffd_mode;
> -       int uffd;
> -       int pipefd;
> -       useconds_t delay;
> +       useconds_t uffd_delay;
> +       enum vm_mem_backing_src_type src_type;
> +       bool partition_vcpu_memory_access;
>  };
>
> -static void *uffd_handler_thread_fn(void *arg)
> +static void prefault_mem(void *alias, uint64_t len)
>  {
> -       struct uffd_handler_args *uffd_args = (struct uffd_handler_args *)arg;
> -       int uffd = uffd_args->uffd;
> -       int pipefd = uffd_args->pipefd;
> -       useconds_t delay = uffd_args->delay;
> -       int64_t pages = 0;
> -       struct timespec start;
> -       struct timespec ts_diff;
> -
> -       clock_gettime(CLOCK_MONOTONIC, &start);
> -       while (!quit_uffd_thread) {
> -               struct uffd_msg msg;
> -               struct pollfd pollfd[2];
> -               char tmp_chr;
> -               int r;
> -               uint64_t addr;
> -
> -               pollfd[0].fd = uffd;
> -               pollfd[0].events = POLLIN;
> -               pollfd[1].fd = pipefd;
> -               pollfd[1].events = POLLIN;
> -
> -               r = poll(pollfd, 2, -1);
> -               switch (r) {
> -               case -1:
> -                       pr_info("poll err");
> -                       continue;
> -               case 0:
> -                       continue;
> -               case 1:
> -                       break;
> -               default:
> -                       pr_info("Polling uffd returned %d", r);
> -                       return NULL;
> -               }
> -
> -               if (pollfd[0].revents & POLLERR) {
> -                       pr_info("uffd revents has POLLERR");
> -                       return NULL;
> -               }
> -
> -               if (pollfd[1].revents & POLLIN) {
> -                       r = read(pollfd[1].fd, &tmp_chr, 1);
> -                       TEST_ASSERT(r == 1,
> -                                   "Error reading pipefd in UFFD thread\n");
> -                       return NULL;
> -               }
> -
> -               if (!(pollfd[0].revents & POLLIN))
> -                       continue;
> -
> -               r = read(uffd, &msg, sizeof(msg));
> -               if (r == -1) {
> -                       if (errno == EAGAIN)
> -                               continue;
> -                       pr_info("Read of uffd got errno %d\n", errno);
> -                       return NULL;
> -               }
> -
> -               if (r != sizeof(msg)) {
> -                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> -                       return NULL;
> -               }
> -
> -               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> -                       continue;
> +       size_t p;
>
> -               if (delay)
> -                       usleep(delay);
> -               addr =  msg.arg.pagefault.address;
> -               r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
> -               if (r < 0)
> -                       return NULL;
> -               pages++;
> +       TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> +       for (p = 0; p < (len / demand_paging_size); ++p) {
> +               memcpy(alias + (p * demand_paging_size),
> +                      guest_data_prototype, demand_paging_size);
>         }
> -
> -       ts_diff = timespec_elapsed(start);
> -       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> -                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> -                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
> -
> -       return NULL;
>  }
>
> -static void setup_demand_paging(struct kvm_vm *vm,
> -                               pthread_t *uffd_handler_thread, int pipefd,
> -                               int uffd_mode, useconds_t uffd_delay,
> -                               struct uffd_handler_args *uffd_args,
> -                               void *hva, void *alias, uint64_t len)
> -{
> -       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> -       int uffd;
> -       struct uffdio_api uffdio_api;
> -       struct uffdio_register uffdio_register;
> -       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> -
> -       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> -                      is_minor ? "MINOR" : "MISSING",
> -                      is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
> -
> -       /* In order to get minor faults, prefault via the alias. */
> -       if (is_minor) {
> -               size_t p;
> -
> -               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> -
> -               TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> -               for (p = 0; p < (len / demand_paging_size); ++p) {
> -                       memcpy(alias + (p * demand_paging_size),
> -                              guest_data_prototype, demand_paging_size);
> -               }
> -       }
> -
> -       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> -       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> -
> -       uffdio_api.api = UFFD_API;
> -       uffdio_api.features = 0;
> -       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> -                   "ioctl UFFDIO_API failed: %" PRIu64,
> -                   (uint64_t)uffdio_api.api);
> -
> -       uffdio_register.range.start = (uint64_t)hva;
> -       uffdio_register.range.len = len;
> -       uffdio_register.mode = uffd_mode;
> -       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> -                   "ioctl UFFDIO_REGISTER failed");
> -       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> -                   expected_ioctls, "missing userfaultfd ioctls");
> -
> -       uffd_args->uffd_mode = uffd_mode;
> -       uffd_args->uffd = uffd;
> -       uffd_args->pipefd = pipefd;
> -       uffd_args->delay = uffd_delay;
> -       pthread_create(uffd_handler_thread, NULL, uffd_handler_thread_fn,
> -                      uffd_args);
> -
> -       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> -                      hva, hva + len);
> -}
> -
> -struct test_params {
> -       int uffd_mode;
> -       useconds_t uffd_delay;
> -       enum vm_mem_backing_src_type src_type;
> -       bool partition_vcpu_memory_access;
> -};
> -
>  static void run_test(enum vm_guest_mode mode, void *arg)
>  {
>         struct test_params *p = arg;
> -       pthread_t *uffd_handler_threads = NULL;
> -       struct uffd_handler_args *uffd_args = NULL;
> +       struct uffd_desc **uffd_descs = NULL;
>         struct timespec start;
>         struct timespec ts_diff;
> -       int *pipefds = NULL;
>         struct kvm_vm *vm;
>         int vcpu_id;
> -       int r;
>
>         vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
>                                  p->src_type, p->partition_vcpu_memory_access);
> @@ -299,15 +148,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>         memset(guest_data_prototype, 0xAB, demand_paging_size);
>
>         if (p->uffd_mode) {
> -               uffd_handler_threads =
> -                       malloc(nr_vcpus * sizeof(*uffd_handler_threads));
> -               TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
> -
> -               uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
> -               TEST_ASSERT(uffd_args, "Memory allocation failed");
> -
> -               pipefds = malloc(sizeof(int) * nr_vcpus * 2);
> -               TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
> +               uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
> +               TEST_ASSERT(uffd_descs, "Memory allocation failed");
>
>                 for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
>                         struct perf_test_vcpu_args *vcpu_args;
> @@ -320,19 +162,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>                         vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
>                         vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
>
> +                       prefault_mem(vcpu_alias,
> +                               vcpu_args->pages * perf_test_args.guest_page_size);
> +
>                         /*
>                          * Set up user fault fd to handle demand paging
>                          * requests.
>                          */
> -                       r = pipe2(&pipefds[vcpu_id * 2],
> -                                 O_CLOEXEC | O_NONBLOCK);
> -                       TEST_ASSERT(!r, "Failed to set up pipefd");
> -
> -                       setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
> -                                           pipefds[vcpu_id * 2], p->uffd_mode,
> -                                           p->uffd_delay, &uffd_args[vcpu_id],
> -                                           vcpu_hva, vcpu_alias,
> -                                           vcpu_args->pages * perf_test_args.guest_page_size);
> +                       uffd_descs[vcpu_id] = uffd_setup_demand_paging(
> +                               p->uffd_mode, p->uffd_delay, vcpu_hva,
> +                               vcpu_args->pages * perf_test_args.guest_page_size,
> +                               &handle_uffd_page_request);
>                 }
>         }
>
> @@ -347,15 +187,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>         pr_info("All vCPU threads joined\n");
>
>         if (p->uffd_mode) {
> -               char c;
> -
>                 /* Tell the user fault fd handler threads to quit */
> -               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> -                       r = write(pipefds[vcpu_id * 2 + 1], &c, 1);
> -                       TEST_ASSERT(r == 1, "Unable to write to pipefd");
> -
> -                       pthread_join(uffd_handler_threads[vcpu_id], NULL);
> -               }
> +               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
> +                       uffd_stop_demand_paging(uffd_descs[vcpu_id]);
>         }
>
>         pr_info("Total guest execution time: %ld.%.9lds\n",
> @@ -367,11 +201,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>         perf_test_destroy_vm(vm);
>
>         free(guest_data_prototype);
> -       if (p->uffd_mode) {
> -               free(uffd_handler_threads);
> -               free(uffd_args);
> -               free(pipefds);
> -       }
> +       if (p->uffd_mode)
> +               free(uffd_descs);
>  }
>
>  static void help(char *name)
> diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> new file mode 100644
> index 000000000000..7b294ce8147c
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> @@ -0,0 +1,46 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KVM userfaultfd util
> + * Adapted from demand_paging_test.c
> + *
> + * Copyright (C) 2018, Red Hat, Inc.
> + * Copyright (C) 2019, Google, Inc.
> + * Copyright (C) 2022, Google, Inc.
> + */
> +
> +#define _GNU_SOURCE /* for pipe2 */
> +
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <time.h>
> +#include <poll.h>
> +#include <pthread.h>
> +#include <linux/userfaultfd.h>
> +#include <sys/syscall.h>
> +
> +#include "kvm_util.h"
> +#include "test_util.h"
> +#include "perf_test_util.h"
> +
> +typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
> +
> +struct uffd_desc;

Do we gain anything from making this opaque? Given the 100+ patch
series Sean just sent out to expose the KVM util library functions,
I'd be inclined to just define the struct in the header file.

Otherwise, I'm really happy to see all this code factored out into its
own little library.
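
For illustration only, here is a minimal standalone sketch of the setup/stop
pattern with the descriptor struct fully defined rather than opaque. This is
not the code from the patch: the names worker_desc, worker_start and
worker_stop are made up, and an ordinary pipe stands in for the userfaultfd so
that it runs without KVM. It only shows the pipe-based wakeup and join.

```c
#define _GNU_SOURCE /* for pipe2 */
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical descriptor, fully defined as it could be in a header. */
struct worker_desc {
	int event_pipe[2];	/* stand-in for the uffd: [1] posts, [0] is polled */
	int stop_pipe[2];	/* [1] written by worker_stop(), [0] is polled */
	int handled;		/* bytes consumed by the thread */
	pthread_t thread;
};

static void *worker_fn(void *arg)
{
	struct worker_desc *d = arg;
	struct pollfd fds[2] = {
		{ .fd = d->event_pipe[0], .events = POLLIN },
		{ .fd = d->stop_pipe[0], .events = POLLIN },
	};
	char c;

	for (;;) {
		if (poll(fds, 2, -1) <= 0)
			continue;
		if (fds[1].revents & POLLIN) {
			/* Stop requested: drain what is already queued, then exit. */
			while (read(d->event_pipe[0], &c, 1) == 1)
				d->handled++;
			return NULL;
		}
		if ((fds[0].revents & POLLIN) &&
		    read(d->event_pipe[0], &c, 1) == 1)
			d->handled++;
	}
}

struct worker_desc *worker_start(void)
{
	struct worker_desc *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	/* Non-blocking pipes, so the final drain ends with EAGAIN. */
	if (pipe2(d->event_pipe, O_CLOEXEC | O_NONBLOCK))
		goto err_free;
	if (pipe2(d->stop_pipe, O_CLOEXEC | O_NONBLOCK))
		goto err_event;
	if (pthread_create(&d->thread, NULL, worker_fn, d))
		goto err_stop;
	return d;

err_stop:
	close(d->stop_pipe[0]);
	close(d->stop_pipe[1]);
err_event:
	close(d->event_pipe[0]);
	close(d->event_pipe[1]);
err_free:
	free(d);
	return NULL;
}

/* Wakes the thread, joins it, releases everything; returns events handled. */
int worker_stop(struct worker_desc *d)
{
	char c = 0;
	int handled;

	assert(d != NULL);
	if (write(d->stop_pipe[1], &c, 1) != 1)
		return -1;
	if (pthread_join(d->thread, NULL))
		return -1;
	handled = d->handled;
	close(d->event_pipe[0]);
	close(d->event_pipe[1]);
	close(d->stop_pipe[0]);
	close(d->stop_pipe[1]);
	free(d);
	return handled;
}
```

With the struct visible, a caller can post to d->event_pipe[1] directly and
read the count back from worker_stop(). The drain on stop is a choice made for
this sketch so that the count is deterministic; the thread in the patch simply
returns when the pipe becomes readable.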

> +
> +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> +               useconds_t uffd_delay, void *hva, uint64_t len,
> +               uffd_handler_t handler);
> +
> +void uffd_stop_demand_paging(struct uffd_desc *uffd);
> +
> +#ifdef PRINT_PER_PAGE_UPDATES
> +#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> +#else
> +#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> +#endif
> +
> +#ifdef PRINT_PER_VCPU_UPDATES
> +#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> +#else
> +#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> +#endif
> diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> new file mode 100644
> index 000000000000..5e0878878a69
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> @@ -0,0 +1,196 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KVM userfaultfd util
> + * Adapted from demand_paging_test.c
> + *
> + * Copyright (C) 2018, Red Hat, Inc.
> + * Copyright (C) 2019, Google, Inc.
> + * Copyright (C) 2022, Google, Inc.
> + */
> +
> +#define _GNU_SOURCE /* for pipe2 */
> +
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <time.h>
> +#include <poll.h>
> +#include <pthread.h>
> +#include <linux/userfaultfd.h>
> +#include <sys/syscall.h>
> +
> +#include "kvm_util.h"
> +#include "test_util.h"
> +#include "perf_test_util.h"
> +#include "userfaultfd_util.h"
> +
> +#ifdef __NR_userfaultfd
> +
> +struct uffd_desc {
> +       int uffd_mode;
> +       int uffd;
> +       int pipefds[2];
> +       useconds_t delay;
> +       uffd_handler_t handler;
> +       pthread_t thread;
> +};
> +
> +static void *uffd_handler_thread_fn(void *arg)
> +{
> +       struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
> +       int uffd = uffd_desc->uffd;
> +       int pipefd = uffd_desc->pipefds[0];
> +       useconds_t delay = uffd_desc->delay;
> +       int64_t pages = 0;
> +       struct timespec start;
> +       struct timespec ts_diff;
> +
> +       clock_gettime(CLOCK_MONOTONIC, &start);
> +       while (1) {
> +               struct uffd_msg msg;
> +               struct pollfd pollfd[2];
> +               char tmp_chr;
> +               int r;
> +
> +               pollfd[0].fd = uffd;
> +               pollfd[0].events = POLLIN;
> +               pollfd[1].fd = pipefd;
> +               pollfd[1].events = POLLIN;
> +
> +               r = poll(pollfd, 2, -1);
> +               switch (r) {
> +               case -1:
> +                       pr_info("poll err");
> +                       continue;
> +               case 0:
> +                       continue;
> +               case 1:
> +                       break;
> +               default:
> +                       pr_info("Polling uffd returned %d", r);
> +                       return NULL;
> +               }
> +
> +               if (pollfd[0].revents & POLLERR) {
> +                       pr_info("uffd revents has POLLERR");
> +                       return NULL;
> +               }
> +
> +               if (pollfd[1].revents & POLLIN) {
> +                       r = read(pollfd[1].fd, &tmp_chr, 1);
> +                       TEST_ASSERT(r == 1,
> +                                   "Error reading pipefd in UFFD thread\n");
> +                       return NULL;
> +               }
> +
> +               if (!(pollfd[0].revents & POLLIN))
> +                       continue;
> +
> +               r = read(uffd, &msg, sizeof(msg));
> +               if (r == -1) {
> +                       if (errno == EAGAIN)
> +                               continue;
> +                       pr_info("Read of uffd got errno %d\n", errno);
> +                       return NULL;
> +               }
> +
> +               if (r != sizeof(msg)) {
> +                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> +                       return NULL;
> +               }
> +
> +               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> +                       continue;
> +
> +               if (delay)
> +                       usleep(delay);
> +               r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
> +               if (r < 0)
> +                       return NULL;
> +               pages++;
> +       }
> +
> +       ts_diff = timespec_elapsed(start);
> +       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> +                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> +                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1000000000.0));
> +
> +       return NULL;
> +}
> +
> +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> +               useconds_t uffd_delay, void *hva, uint64_t len,
> +               uffd_handler_t handler)
> +{
> +       struct uffd_desc *uffd_desc;
> +       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> +       int uffd;
> +       struct uffdio_api uffdio_api;
> +       struct uffdio_register uffdio_register;
> +       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> +       int ret;
> +
> +       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> +                      is_minor ? "MINOR" : "MISSING",
> +                      is_minor ? "UFFDIO_CONTINUE" : "UFFDIO_COPY");
> +
> +       uffd_desc = malloc(sizeof(struct uffd_desc));
> +       TEST_ASSERT(uffd_desc, "malloc failed");
> +
> +       /* In order to get minor faults, prefault via the alias. */
> +       if (is_minor)
> +               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> +
> +       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> +       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> +
> +       uffdio_api.api = UFFD_API;
> +       uffdio_api.features = 0;
> +       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> +                   "ioctl UFFDIO_API failed: %" PRIu64,
> +                   (uint64_t)uffdio_api.api);
> +
> +       uffdio_register.range.start = (uint64_t)hva;
> +       uffdio_register.range.len = len;
> +       uffdio_register.mode = uffd_mode;
> +       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> +                   "ioctl UFFDIO_REGISTER failed");
> +       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> +                       expected_ioctls, "missing userfaultfd ioctls");
> +
> +       ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
> +       TEST_ASSERT(!ret, "Failed to set up pipefd");
> +
> +       uffd_desc->uffd_mode = uffd_mode;
> +       uffd_desc->uffd = uffd;
> +       uffd_desc->delay = uffd_delay;
> +       uffd_desc->handler = handler;
> +       pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
> +                      uffd_desc);
> +
> +       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> +                      hva, hva + len);
> +
> +       return uffd_desc;
> +}
> +
> +void uffd_stop_demand_paging(struct uffd_desc *uffd)
> +{
> +       char c = 0;
> +       int ret;
> +
> +       ret = write(uffd->pipefds[1], &c, 1);
> +       TEST_ASSERT(ret == 1, "Unable to write to pipefd");
> +
> +       ret = pthread_join(uffd->thread, NULL);
> +       TEST_ASSERT(ret == 0, "Pthread_join failed.");
> +
> +       close(uffd->uffd);
> +
> +       close(uffd->pipefds[1]);
> +       close(uffd->pipefds[0]);
> +
> +       free(uffd);
> +}
> +
> +#endif /* __NR_userfaultfd */
> --
> 2.35.1.723.g4982287a31-goog
>
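
The rate printed by PER_VCPU_DEBUG in the handler thread is pages divided by
elapsed seconds. A small standalone sketch of that conversion follows, assuming
a hypothetical helper named pages_per_sec() that is not part of the patch:

```c
#include <time.h>

/* Hypothetical helper: faults resolved per second over an elapsed interval. */
static double pages_per_sec(long pages, struct timespec elapsed)
{
	/* tv_nsec counts nanoseconds, so divide by 1e9 to get seconds. */
	double secs = (double)elapsed.tv_sec +
		      (double)elapsed.tv_nsec / 1000000000.0;

	return secs > 0.0 ? pages / secs : 0.0;
}
```

The zero check only guards against a division by zero on an empty interval.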


* Re: [PATCH 01/11] KVM: selftests: Add a userfaultfd library
@ 2022-03-16 18:02     ` Ben Gardon
  0 siblings, 0 replies; 42+ messages in thread
From: Ben Gardon @ 2022-03-16 18:02 UTC (permalink / raw)
  To: Ricardo Koller; +Cc: kvm, Marc Zyngier, Paolo Bonzini, Axel Rasmussen, kvmarm

On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
>
> Move the generic userfaultfd code out of demand_paging_test.c into a
> common library, userfaultfd_util. This library consists of a setup and a
> stop function. The setup function starts a thread for handling page
> faults using the handler callback function. This setup returns a
> uffd_desc object which is then used in the stop function (to wait and
> destroy the threads).
>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |   2 +-
>  .../selftests/kvm/demand_paging_test.c        | 227 +++---------------
>  .../selftests/kvm/include/userfaultfd_util.h  |  46 ++++
>  .../selftests/kvm/lib/userfaultfd_util.c      | 196 +++++++++++++++
>  4 files changed, 272 insertions(+), 199 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
>  create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index 0e4926bc9a58..bc5f89b3700e 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -37,7 +37,7 @@ ifeq ($(ARCH),riscv)
>         UNAME_M := riscv
>  endif
>
> -LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
> +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c lib/userfaultfd_util.c
>  LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
>  LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
>  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
> index 6a719d065599..b3d457cecd68 100644
> --- a/tools/testing/selftests/kvm/demand_paging_test.c
> +++ b/tools/testing/selftests/kvm/demand_paging_test.c
> @@ -22,23 +22,13 @@
>  #include "test_util.h"
>  #include "perf_test_util.h"
>  #include "guest_modes.h"
> +#include "userfaultfd_util.h"
>
>  #ifdef __NR_userfaultfd
>
> -#ifdef PRINT_PER_PAGE_UPDATES
> -#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> -#else
> -#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> -#endif
> -
> -#ifdef PRINT_PER_VCPU_UPDATES
> -#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> -#else
> -#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> -#endif
> -
>  static int nr_vcpus = 1;
>  static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
> +
>  static size_t demand_paging_size;
>  static char *guest_data_prototype;
>
> @@ -69,9 +59,11 @@ static void vcpu_worker(struct perf_test_vcpu_args *vcpu_args)
>                        ts_diff.tv_sec, ts_diff.tv_nsec);
>  }
>
> -static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
> +static int handle_uffd_page_request(int uffd_mode, int uffd,
> +               struct uffd_msg *msg)
>  {
>         pid_t tid = syscall(__NR_gettid);
> +       uint64_t addr = msg->arg.pagefault.address;
>         struct timespec start;
>         struct timespec ts_diff;
>         int r;
> @@ -118,175 +110,32 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
>         return 0;
>  }
>
> -bool quit_uffd_thread;
> -
> -struct uffd_handler_args {
> +struct test_params {
>         int uffd_mode;
> -       int uffd;
> -       int pipefd;
> -       useconds_t delay;
> +       useconds_t uffd_delay;
> +       enum vm_mem_backing_src_type src_type;
> +       bool partition_vcpu_memory_access;
>  };
>
> -static void *uffd_handler_thread_fn(void *arg)
> +static void prefault_mem(void *alias, uint64_t len)
>  {
> -       struct uffd_handler_args *uffd_args = (struct uffd_handler_args *)arg;
> -       int uffd = uffd_args->uffd;
> -       int pipefd = uffd_args->pipefd;
> -       useconds_t delay = uffd_args->delay;
> -       int64_t pages = 0;
> -       struct timespec start;
> -       struct timespec ts_diff;
> -
> -       clock_gettime(CLOCK_MONOTONIC, &start);
> -       while (!quit_uffd_thread) {
> -               struct uffd_msg msg;
> -               struct pollfd pollfd[2];
> -               char tmp_chr;
> -               int r;
> -               uint64_t addr;
> -
> -               pollfd[0].fd = uffd;
> -               pollfd[0].events = POLLIN;
> -               pollfd[1].fd = pipefd;
> -               pollfd[1].events = POLLIN;
> -
> -               r = poll(pollfd, 2, -1);
> -               switch (r) {
> -               case -1:
> -                       pr_info("poll err");
> -                       continue;
> -               case 0:
> -                       continue;
> -               case 1:
> -                       break;
> -               default:
> -                       pr_info("Polling uffd returned %d", r);
> -                       return NULL;
> -               }
> -
> -               if (pollfd[0].revents & POLLERR) {
> -                       pr_info("uffd revents has POLLERR");
> -                       return NULL;
> -               }
> -
> -               if (pollfd[1].revents & POLLIN) {
> -                       r = read(pollfd[1].fd, &tmp_chr, 1);
> -                       TEST_ASSERT(r == 1,
> -                                   "Error reading pipefd in UFFD thread\n");
> -                       return NULL;
> -               }
> -
> -               if (!(pollfd[0].revents & POLLIN))
> -                       continue;
> -
> -               r = read(uffd, &msg, sizeof(msg));
> -               if (r == -1) {
> -                       if (errno == EAGAIN)
> -                               continue;
> -                       pr_info("Read of uffd got errno %d\n", errno);
> -                       return NULL;
> -               }
> -
> -               if (r != sizeof(msg)) {
> -                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> -                       return NULL;
> -               }
> -
> -               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> -                       continue;
> +       size_t p;
>
> -               if (delay)
> -                       usleep(delay);
> -               addr =  msg.arg.pagefault.address;
> -               r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
> -               if (r < 0)
> -                       return NULL;
> -               pages++;
> +       TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> +       for (p = 0; p < (len / demand_paging_size); ++p) {
> +               memcpy(alias + (p * demand_paging_size),
> +                      guest_data_prototype, demand_paging_size);
>         }
> -
> -       ts_diff = timespec_elapsed(start);
> -       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> -                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> -                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
> -
> -       return NULL;
>  }
>
> -static void setup_demand_paging(struct kvm_vm *vm,
> -                               pthread_t *uffd_handler_thread, int pipefd,
> -                               int uffd_mode, useconds_t uffd_delay,
> -                               struct uffd_handler_args *uffd_args,
> -                               void *hva, void *alias, uint64_t len)
> -{
> -       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> -       int uffd;
> -       struct uffdio_api uffdio_api;
> -       struct uffdio_register uffdio_register;
> -       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> -
> -       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> -                      is_minor ? "MINOR" : "MISSING",
> -                      is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
> -
> -       /* In order to get minor faults, prefault via the alias. */
> -       if (is_minor) {
> -               size_t p;
> -
> -               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> -
> -               TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> -               for (p = 0; p < (len / demand_paging_size); ++p) {
> -                       memcpy(alias + (p * demand_paging_size),
> -                              guest_data_prototype, demand_paging_size);
> -               }
> -       }
> -
> -       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> -       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> -
> -       uffdio_api.api = UFFD_API;
> -       uffdio_api.features = 0;
> -       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> -                   "ioctl UFFDIO_API failed: %" PRIu64,
> -                   (uint64_t)uffdio_api.api);
> -
> -       uffdio_register.range.start = (uint64_t)hva;
> -       uffdio_register.range.len = len;
> -       uffdio_register.mode = uffd_mode;
> -       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> -                   "ioctl UFFDIO_REGISTER failed");
> -       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> -                   expected_ioctls, "missing userfaultfd ioctls");
> -
> -       uffd_args->uffd_mode = uffd_mode;
> -       uffd_args->uffd = uffd;
> -       uffd_args->pipefd = pipefd;
> -       uffd_args->delay = uffd_delay;
> -       pthread_create(uffd_handler_thread, NULL, uffd_handler_thread_fn,
> -                      uffd_args);
> -
> -       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> -                      hva, hva + len);
> -}
> -
> -struct test_params {
> -       int uffd_mode;
> -       useconds_t uffd_delay;
> -       enum vm_mem_backing_src_type src_type;
> -       bool partition_vcpu_memory_access;
> -};
> -
>  static void run_test(enum vm_guest_mode mode, void *arg)
>  {
>         struct test_params *p = arg;
> -       pthread_t *uffd_handler_threads = NULL;
> -       struct uffd_handler_args *uffd_args = NULL;
> +       struct uffd_desc **uffd_descs = NULL;
>         struct timespec start;
>         struct timespec ts_diff;
> -       int *pipefds = NULL;
>         struct kvm_vm *vm;
>         int vcpu_id;
> -       int r;
>
>         vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
>                                  p->src_type, p->partition_vcpu_memory_access);
> @@ -299,15 +148,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>         memset(guest_data_prototype, 0xAB, demand_paging_size);
>
>         if (p->uffd_mode) {
> -               uffd_handler_threads =
> -                       malloc(nr_vcpus * sizeof(*uffd_handler_threads));
> -               TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
> -
> -               uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
> -               TEST_ASSERT(uffd_args, "Memory allocation failed");
> -
> -               pipefds = malloc(sizeof(int) * nr_vcpus * 2);
> -               TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
> +               uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
> +               TEST_ASSERT(uffd_descs, "Memory allocation failed");
>
>                 for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
>                         struct perf_test_vcpu_args *vcpu_args;
> @@ -320,19 +162,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>                         vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
>                         vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
>
> +                       prefault_mem(vcpu_alias,
> +                               vcpu_args->pages * perf_test_args.guest_page_size);
> +
>                         /*
>                          * Set up user fault fd to handle demand paging
>                          * requests.
>                          */
> -                       r = pipe2(&pipefds[vcpu_id * 2],
> -                                 O_CLOEXEC | O_NONBLOCK);
> -                       TEST_ASSERT(!r, "Failed to set up pipefd");
> -
> -                       setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
> -                                           pipefds[vcpu_id * 2], p->uffd_mode,
> -                                           p->uffd_delay, &uffd_args[vcpu_id],
> -                                           vcpu_hva, vcpu_alias,
> -                                           vcpu_args->pages * perf_test_args.guest_page_size);
> +                       uffd_descs[vcpu_id] = uffd_setup_demand_paging(
> +                               p->uffd_mode, p->uffd_delay, vcpu_hva,
> +                               vcpu_args->pages * perf_test_args.guest_page_size,
> +                               &handle_uffd_page_request);
>                 }
>         }
>
> @@ -347,15 +187,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>         pr_info("All vCPU threads joined\n");
>
>         if (p->uffd_mode) {
> -               char c;
> -
>                 /* Tell the user fault fd handler threads to quit */
> -               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> -                       r = write(pipefds[vcpu_id * 2 + 1], &c, 1);
> -                       TEST_ASSERT(r == 1, "Unable to write to pipefd");
> -
> -                       pthread_join(uffd_handler_threads[vcpu_id], NULL);
> -               }
> +               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
> +                       uffd_stop_demand_paging(uffd_descs[vcpu_id]);
>         }
>
>         pr_info("Total guest execution time: %ld.%.9lds\n",
> @@ -367,11 +201,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>         perf_test_destroy_vm(vm);
>
>         free(guest_data_prototype);
> -       if (p->uffd_mode) {
> -               free(uffd_handler_threads);
> -               free(uffd_args);
> -               free(pipefds);
> -       }
> +       if (p->uffd_mode)
> +               free(uffd_descs);
>  }
>
>  static void help(char *name)
> diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> new file mode 100644
> index 000000000000..7b294ce8147c
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> @@ -0,0 +1,46 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KVM userfaultfd util
> + * Adapted from demand_paging_test.c
> + *
> + * Copyright (C) 2018, Red Hat, Inc.
> + * Copyright (C) 2019, Google, Inc.
> + * Copyright (C) 2022, Google, Inc.
> + */
> +
> +#define _GNU_SOURCE /* for pipe2 */
> +
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <time.h>
> +#include <poll.h>
> +#include <pthread.h>
> +#include <linux/userfaultfd.h>
> +#include <sys/syscall.h>
> +
> +#include "kvm_util.h"
> +#include "test_util.h"
> +#include "perf_test_util.h"
> +
> +typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
> +
> +struct uffd_desc;

Do we gain anything from making this opaque? Given the 100+ patch
series Sean just sent out to expose the KVM util library functions,
I'd be inclined to just define the struct in the header file.

Otherwise, I'm really happy to see all this code factored out into its
own little library.
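
For illustration, a minimal sketch of the non-opaque variant: the struct moves from lib/userfaultfd_util.c into the header unchanged. The field list is copied from the patch. The forward declaration of struct uffd_msg (instead of including <linux/userfaultfd.h>) and the uffd_desc_delay() helper are assumptions, added only to keep the snippet standalone.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

struct uffd_msg;	/* the real definition lives in <linux/userfaultfd.h> */

typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);

/* Previously private to lib/userfaultfd_util.c. */
struct uffd_desc {
	int uffd_mode;
	int uffd;
	int pipefds[2];
	useconds_t delay;
	uffd_handler_t handler;
	pthread_t thread;
};

/*
 * Hypothetical helper, not part of the patch: with the layout visible,
 * a test can read fields directly, e.g. to log the configured delay.
 */
static inline useconds_t uffd_desc_delay(const struct uffd_desc *desc)
{
	assert(desc != NULL);
	return desc->delay;
}
```

The trade-off is the usual one: callers can inspect or embed the descriptor, at the cost of coupling them to its layout.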

> +
> +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> +               useconds_t uffd_delay, void *hva, uint64_t len,
> +               uffd_handler_t handler);
> +
> +void uffd_stop_demand_paging(struct uffd_desc *uffd);
> +
> +#ifdef PRINT_PER_PAGE_UPDATES
> +#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> +#else
> +#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> +#endif
> +
> +#ifdef PRINT_PER_VCPU_UPDATES
> +#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> +#else
> +#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> +#endif
> diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> new file mode 100644
> index 000000000000..5e0878878a69
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> @@ -0,0 +1,196 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KVM userfaultfd util
> + * Adapted from demand_paging_test.c
> + *
> + * Copyright (C) 2018, Red Hat, Inc.
> + * Copyright (C) 2019, Google, Inc.
> + * Copyright (C) 2022, Google, Inc.
> + */
> +
> +#define _GNU_SOURCE /* for pipe2 */
> +
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <time.h>
> +#include <poll.h>
> +#include <pthread.h>
> +#include <linux/userfaultfd.h>
> +#include <sys/syscall.h>
> +
> +#include "kvm_util.h"
> +#include "test_util.h"
> +#include "perf_test_util.h"
> +#include "userfaultfd_util.h"
> +
> +#ifdef __NR_userfaultfd
> +
> +struct uffd_desc {
> +       int uffd_mode;
> +       int uffd;
> +       int pipefds[2];
> +       useconds_t delay;
> +       uffd_handler_t handler;
> +       pthread_t thread;
> +};
> +
> +static void *uffd_handler_thread_fn(void *arg)
> +{
> +       struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
> +       int uffd = uffd_desc->uffd;
> +       int pipefd = uffd_desc->pipefds[0];
> +       useconds_t delay = uffd_desc->delay;
> +       int64_t pages = 0;
> +       struct timespec start;
> +       struct timespec ts_diff;
> +
> +       clock_gettime(CLOCK_MONOTONIC, &start);
> +       while (1) {
> +               struct uffd_msg msg;
> +               struct pollfd pollfd[2];
> +               char tmp_chr;
> +               int r;
> +
> +               pollfd[0].fd = uffd;
> +               pollfd[0].events = POLLIN;
> +               pollfd[1].fd = pipefd;
> +               pollfd[1].events = POLLIN;
> +
> +               r = poll(pollfd, 2, -1);
> +               switch (r) {
> +               case -1:
> +                       pr_info("poll err");
> +                       continue;
> +               case 0:
> +                       continue;
> +               case 1:
> +                       break;
> +               default:
> +                       pr_info("Polling uffd returned %d", r);
> +                       return NULL;
> +               }
> +
> +               if (pollfd[0].revents & POLLERR) {
> +                       pr_info("uffd revents has POLLERR");
> +                       return NULL;
> +               }
> +
> +               if (pollfd[1].revents & POLLIN) {
> +                       r = read(pollfd[1].fd, &tmp_chr, 1);
> +                       TEST_ASSERT(r == 1,
> +                                   "Error reading pipefd in UFFD thread\n");
> +                       return NULL;
> +               }
> +
> +               if (!(pollfd[0].revents & POLLIN))
> +                       continue;
> +
> +               r = read(uffd, &msg, sizeof(msg));
> +               if (r == -1) {
> +                       if (errno == EAGAIN)
> +                               continue;
> +                       pr_info("Read of uffd got errno %d\n", errno);
> +                       return NULL;
> +               }
> +
> +               if (r != sizeof(msg)) {
> +                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> +                       return NULL;
> +               }
> +
> +               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> +                       continue;
> +
> +               if (delay)
> +                       usleep(delay);
> +               r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
> +               if (r < 0)
> +                       return NULL;
> +               pages++;
> +       }
> +
> +       ts_diff = timespec_elapsed(start);
> +       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> +                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> +                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1000000000.0));
> +
> +       return NULL;
> +}
> +
> +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> +               useconds_t uffd_delay, void *hva, uint64_t len,
> +               uffd_handler_t handler)
> +{
> +       struct uffd_desc *uffd_desc;
> +       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> +       int uffd;
> +       struct uffdio_api uffdio_api;
> +       struct uffdio_register uffdio_register;
> +       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> +       int ret;
> +
> +       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> +                      is_minor ? "MINOR" : "MISSING",
> +                      is_minor ? "UFFDIO_CONTINUE" : "UFFDIO_COPY");
> +
> +       uffd_desc = malloc(sizeof(struct uffd_desc));
> +       TEST_ASSERT(uffd_desc, "malloc failed");
> +
> +       /* For minor faults, the caller prefaults via the alias. */
> +       if (is_minor)
> +               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> +
> +       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> +       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> +
> +       uffdio_api.api = UFFD_API;
> +       uffdio_api.features = 0;
> +       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> +                   "ioctl UFFDIO_API failed: %" PRIu64,
> +                   (uint64_t)uffdio_api.api);
> +
> +       uffdio_register.range.start = (uint64_t)hva;
> +       uffdio_register.range.len = len;
> +       uffdio_register.mode = uffd_mode;
> +       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> +                   "ioctl UFFDIO_REGISTER failed");
> +       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> +                       expected_ioctls, "missing userfaultfd ioctls");
> +
> +       ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
> +       TEST_ASSERT(!ret, "Failed to set up pipefd");
> +
> +       uffd_desc->uffd_mode = uffd_mode;
> +       uffd_desc->uffd = uffd;
> +       uffd_desc->delay = uffd_delay;
> +       uffd_desc->handler = handler;
> +       pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
> +                      uffd_desc);
> +
> +       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> +                      hva, hva + len);
> +
> +       return uffd_desc;
> +}
> +
> +void uffd_stop_demand_paging(struct uffd_desc *uffd)
> +{
> +       char c = 0;
> +       int ret;
> +
> +       ret = write(uffd->pipefds[1], &c, 1);
> +       TEST_ASSERT(ret == 1, "Unable to write to pipefd");
> +
> +       ret = pthread_join(uffd->thread, NULL);
> +       TEST_ASSERT(ret == 0, "Pthread_join failed.");
> +
> +       close(uffd->uffd);
> +
> +       close(uffd->pipefds[1]);
> +       close(uffd->pipefds[0]);
> +
> +       free(uffd);
> +}
> +
> +#endif /* __NR_userfaultfd */
> --
> 2.35.1.723.g4982287a31-goog
>
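
The start/stop handshake in uffd_setup_demand_paging() and uffd_stop_demand_paging() can be sketched without a real userfaultfd: a worker polls a data fd plus the read end of a pipe, and the owner stops it by writing one byte and joining. This is a sketch under assumptions, not the library code. struct worker, worker_start() and worker_stop() are hypothetical names, an ordinary fd stands in for the uffd, and, unlike the patch, pending data is serviced before the quit byte so that the event count is deterministic.

```c
#define _GNU_SOURCE /* for pipe2 */
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for struct uffd_desc; data_fd plays the uffd's role. */
struct worker {
	int data_fd;
	int quit_pipe[2];
	int events_handled;
	pthread_t thread;
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	for (;;) {
		struct pollfd pfd[2] = {
			{ .fd = w->data_fd, .events = POLLIN },
			{ .fd = w->quit_pipe[0], .events = POLLIN },
		};
		char c;

		/* Block until either fd is readable; retry on EINTR or error. */
		if (poll(pfd, 2, -1) <= 0)
			continue;

		/* Unlike the patch, service pending data before the quit byte. */
		if ((pfd[0].revents & POLLIN) && read(w->data_fd, &c, 1) == 1)
			w->events_handled++;

		/* The quit byte is never read; worker_stop() closes the pipe. */
		if (pfd[1].revents & POLLIN)
			return NULL;
	}
}

static int worker_start(struct worker *w, int data_fd)
{
	w->data_fd = data_fd;
	w->events_handled = 0;
	if (pipe2(w->quit_pipe, O_CLOEXEC | O_NONBLOCK))
		return -1;
	if (pthread_create(&w->thread, NULL, worker_fn, w)) {
		close(w->quit_pipe[0]);
		close(w->quit_pipe[1]);
		return -1;
	}
	return 0;
}

static void worker_stop(struct worker *w)
{
	char c = 0;
	ssize_t n = write(w->quit_pipe[1], &c, 1);
	int ret;

	assert(n == 1);
	ret = pthread_join(w->thread, NULL);
	assert(ret == 0);
	close(w->quit_pipe[1]);
	close(w->quit_pipe[0]);
}
```

Joining before closing the pipe matters: the worker may still be inside poll() on the read end when the quit byte is written.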
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 04/11] KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  2022-03-11  6:02   ` Ricardo Koller
@ 2022-03-16 18:07     ` Ben Gardon
  -1 siblings, 0 replies; 42+ messages in thread
From: Ben Gardon @ 2022-03-16 18:07 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
>
> Add a library function to allocate a page-table physical page in a
> particular memslot.  The default behavior is to create new page-table
> pages in memslot 0.
>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>

This is very similar to one of the patches in the NX hugepages control
series I sent out last week; I guess we both had similar needs at the
same time.
Your solution introduces way less churn though, so it's probably
better. I might use this commit or wait for it to be merged before I
send out v2 of my NX control series.

In any case,
Reviewed-by: Ben Gardon <bgardon@google.com>

> ---
>  tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c          | 8 +++++++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index d6acec0858c0..c8dce12a9a52 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -307,6 +307,7 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
>  vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
>                               vm_paddr_t paddr_min, uint32_t memslot);
>  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
> +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot);
>
>  /*
>   * Create a VM with reasonable defaults
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 64ef245b73de..ae21564241c8 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2409,9 +2409,15 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
>  /* Arbitrary minimum physical address used for virtual translation tables. */
>  #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
>
> +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot)
> +{
> +       return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
> +                       pt_memslot);
> +}
> +
>  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm)
>  {
> -       return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0);
> +       return vm_alloc_page_table_in_memslot(vm, 0);
>  }
>
>  /*
> --
> 2.35.1.723.g4982287a31-goog
>
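
The delegation in the hunk above (the memslot-0 default becomes a thin wrapper around the parameterized function) can be sketched standalone. Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in: the real vm_phy_page_alloc() tracks free pages per region, whereas this sketch assumes a simple bump allocator per memslot.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t vm_paddr_t;

#define SKETCH_PAGE_SIZE		0x1000ULL
#define SKETCH_NR_MEMSLOTS		2
#define KVM_GUEST_PAGE_TABLE_MIN_PADDR	0x180000

/* Hypothetical stand-in for struct kvm_vm: one bump pointer per memslot. */
struct sketch_vm {
	vm_paddr_t next_free[SKETCH_NR_MEMSLOTS];
};

/* Return the first free page at or above paddr_min in the given memslot. */
static vm_paddr_t sketch_phy_page_alloc(struct sketch_vm *vm,
					vm_paddr_t paddr_min, uint32_t memslot)
{
	vm_paddr_t pa;

	assert(memslot < SKETCH_NR_MEMSLOTS);
	pa = vm->next_free[memslot];
	if (pa < paddr_min)
		pa = paddr_min;
	vm->next_free[memslot] = pa + SKETCH_PAGE_SIZE;
	return pa;
}

static vm_paddr_t sketch_alloc_page_table_in_memslot(struct sketch_vm *vm,
						     uint32_t pt_memslot)
{
	return sketch_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
				     pt_memslot);
}

/* Default behavior is unchanged: page tables still come from memslot 0. */
static vm_paddr_t sketch_alloc_page_table(struct sketch_vm *vm)
{
	return sketch_alloc_page_table_in_memslot(vm, 0);
}
```

Existing callers of the default keep working, while a test that wants page tables in a separate memslot can ask for one explicitly.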

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 02/11] KVM: selftests: Add vm_mem_region_get_src_fd library function
  2022-03-11  6:01   ` Ricardo Koller
@ 2022-03-16 18:08     ` Ben Gardon
  -1 siblings, 0 replies; 42+ messages in thread
From: Ben Gardon @ 2022-03-16 18:08 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
>
> Add a library function to get the backing source FD of a memslot.
>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>

This appears to be dead code as of this commit; I would recommend
merging it into the commit in which it's actually used.

> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
>  2 files changed, 24 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 4ed6aa049a91..d6acec0858c0 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -163,6 +163,7 @@ int _kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg);
>  void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
>  void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
>  void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
> +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
>  void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid);
>  vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
>  vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index d8cf851ab119..64ef245b73de 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -580,6 +580,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
>         return &region->region;
>  }
>
> +/*
> + * KVM Userspace Memory Get Backing Source FD
> + *
> + * Input Args:
> + *   vm - Virtual Machine
> + *   memslot - KVM memory slot ID
> + *
> + * Output Args: None
> + *
> + * Return:
> + *   Backing source file descriptor, -1 if the memslot is an anonymous region.
> + *
> + * Returns the backing source fd of a memslot, so tests can use it to punch
> + * holes or to set up permissions.
> + */
> +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
> +{
> +       struct userspace_mem_region *region;
> +
> +       region = memslot2region(vm, memslot);
> +       return region->fd;
> +}
> +
>  /*
>   * VCPU Find
>   *
> --
> 2.35.1.723.g4982287a31-goog
>
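
The doc comment above says tests can use the returned fd to punch holes. A minimal sketch of what that might look like, assuming a memfd stands in for the memslot's backing fd: punch_hole() and punch_hole_demo() are hypothetical names and not part of this series.

```c
#define _GNU_SOURCE /* for memfd_create, fallocate */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Punch a hole over [offset, offset + len) of a file-backed region while
 * keeping the file size. In a selftest, 'fd' would come from
 * vm_mem_region_get_src_fd(); an anonymous memslot yields -1 and has
 * nothing to punch.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
	assert(offset >= 0 && len > 0);
	if (fd < 0)
		return -1;
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}

/* Returns 0 if the punched page reads back as zero and the next is intact. */
static int punch_hole_demo(void)
{
	const size_t page = (size_t)sysconf(_SC_PAGESIZE);
	int fd = memfd_create("sketch", 0);
	int ret = -1;
	char *p;

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)(2 * page)))
		goto out_close;
	p = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	memset(p, 0xAB, 2 * page);
	if (punch_hole(fd, 0, (off_t)page))
		goto out_unmap;

	/* The first page is faulted back in as zeroes; the second is kept. */
	if (p[0] == 0 && (unsigned char)p[page] == 0xAB)
		ret = 0;
out_unmap:
	munmap(p, 2 * page);
out_close:
	close(fd);
	return ret;
}
```

After the hole is punched, the next access to that range faults again, which is the condition the userfaultfd tests in this series rely on.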

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 06/11] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  2022-03-11  6:02   ` Ricardo Koller
@ 2022-03-16 18:09     ` Ben Gardon
  -1 siblings, 0 replies; 42+ messages in thread
From: Ben Gardon @ 2022-03-16 18:09 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
>
> Deleting a memslot (when freeing a VM) is not closing the backing fd,
> nor is it unmapping the alias mapping. Fix by adding the missing close
> and munmap.
>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index ae21564241c8..c25c79f97695 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -702,6 +702,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
>         sparsebit_free(&region->unused_phy_pages);
>         ret = munmap(region->mmap_start, region->mmap_size);
>         TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
> +       if (region->fd >= 0) {
> +       /* There's an extra map if shared memory. */

Nit: indentation

> +               ret = munmap(region->mmap_alias, region->mmap_size);
> +               TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
> +               close(region->fd);
> +       }
>
>         free(region);
>  }
> --
> 2.35.1.723.g4982287a31-goog
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/11] KVM: selftests: Add a userfaultfd library
  2022-03-16 18:02     ` Ben Gardon
@ 2022-03-18 20:27       ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-18 20:27 UTC (permalink / raw)
  To: Ben Gardon
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Wed, Mar 16, 2022 at 12:02:18PM -0600, Ben Gardon wrote:
> On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
> >
> > Move the generic userfaultfd code out of demand_paging_test.c into a
> > common library, userfaultfd_util. This library consists of a setup and a
> > stop function. The setup function starts a thread for handling page
> > faults using the handler callback function. This setup returns a
> > uffd_desc object which is then used in the stop function (to wait and
> > destroy the threads).
> >
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile          |   2 +-
> >  .../selftests/kvm/demand_paging_test.c        | 227 +++---------------
> >  .../selftests/kvm/include/userfaultfd_util.h  |  46 ++++
> >  .../selftests/kvm/lib/userfaultfd_util.c      | 196 +++++++++++++++
> >  4 files changed, 272 insertions(+), 199 deletions(-)
> >  create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
> >  create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c
> >
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index 0e4926bc9a58..bc5f89b3700e 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -37,7 +37,7 @@ ifeq ($(ARCH),riscv)
> >         UNAME_M := riscv
> >  endif
> >
> > -LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
> > +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c lib/userfaultfd_util.c
> >  LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
> >  LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
> >  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> > diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
> > index 6a719d065599..b3d457cecd68 100644
> > --- a/tools/testing/selftests/kvm/demand_paging_test.c
> > +++ b/tools/testing/selftests/kvm/demand_paging_test.c
> > @@ -22,23 +22,13 @@
> >  #include "test_util.h"
> >  #include "perf_test_util.h"
> >  #include "guest_modes.h"
> > +#include "userfaultfd_util.h"
> >
> >  #ifdef __NR_userfaultfd
> >
> > -#ifdef PRINT_PER_PAGE_UPDATES
> > -#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> > -#else
> > -#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> > -#endif
> > -
> > -#ifdef PRINT_PER_VCPU_UPDATES
> > -#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> > -#else
> > -#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> > -#endif
> > -
> >  static int nr_vcpus = 1;
> >  static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
> > +
> >  static size_t demand_paging_size;
> >  static char *guest_data_prototype;
> >
> > @@ -69,9 +59,11 @@ static void vcpu_worker(struct perf_test_vcpu_args *vcpu_args)
> >                        ts_diff.tv_sec, ts_diff.tv_nsec);
> >  }
> >
> > -static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
> > +static int handle_uffd_page_request(int uffd_mode, int uffd,
> > +               struct uffd_msg *msg)
> >  {
> >         pid_t tid = syscall(__NR_gettid);
> > +       uint64_t addr = msg->arg.pagefault.address;
> >         struct timespec start;
> >         struct timespec ts_diff;
> >         int r;
> > @@ -118,175 +110,32 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
> >         return 0;
> >  }
> >
> > -bool quit_uffd_thread;
> > -
> > -struct uffd_handler_args {
> > +struct test_params {
> >         int uffd_mode;
> > -       int uffd;
> > -       int pipefd;
> > -       useconds_t delay;
> > +       useconds_t uffd_delay;
> > +       enum vm_mem_backing_src_type src_type;
> > +       bool partition_vcpu_memory_access;
> >  };
> >
> > -static void *uffd_handler_thread_fn(void *arg)
> > +static void prefault_mem(void *alias, uint64_t len)
> >  {
> > -       struct uffd_handler_args *uffd_args = (struct uffd_handler_args *)arg;
> > -       int uffd = uffd_args->uffd;
> > -       int pipefd = uffd_args->pipefd;
> > -       useconds_t delay = uffd_args->delay;
> > -       int64_t pages = 0;
> > -       struct timespec start;
> > -       struct timespec ts_diff;
> > -
> > -       clock_gettime(CLOCK_MONOTONIC, &start);
> > -       while (!quit_uffd_thread) {
> > -               struct uffd_msg msg;
> > -               struct pollfd pollfd[2];
> > -               char tmp_chr;
> > -               int r;
> > -               uint64_t addr;
> > -
> > -               pollfd[0].fd = uffd;
> > -               pollfd[0].events = POLLIN;
> > -               pollfd[1].fd = pipefd;
> > -               pollfd[1].events = POLLIN;
> > -
> > -               r = poll(pollfd, 2, -1);
> > -               switch (r) {
> > -               case -1:
> > -                       pr_info("poll err");
> > -                       continue;
> > -               case 0:
> > -                       continue;
> > -               case 1:
> > -                       break;
> > -               default:
> > -                       pr_info("Polling uffd returned %d", r);
> > -                       return NULL;
> > -               }
> > -
> > -               if (pollfd[0].revents & POLLERR) {
> > -                       pr_info("uffd revents has POLLERR");
> > -                       return NULL;
> > -               }
> > -
> > -               if (pollfd[1].revents & POLLIN) {
> > -                       r = read(pollfd[1].fd, &tmp_chr, 1);
> > -                       TEST_ASSERT(r == 1,
> > -                                   "Error reading pipefd in UFFD thread\n");
> > -                       return NULL;
> > -               }
> > -
> > -               if (!(pollfd[0].revents & POLLIN))
> > -                       continue;
> > -
> > -               r = read(uffd, &msg, sizeof(msg));
> > -               if (r == -1) {
> > -                       if (errno == EAGAIN)
> > -                               continue;
> > -                       pr_info("Read of uffd got errno %d\n", errno);
> > -                       return NULL;
> > -               }
> > -
> > -               if (r != sizeof(msg)) {
> > -                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> > -                       return NULL;
> > -               }
> > -
> > -               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> > -                       continue;
> > +       size_t p;
> >
> > -               if (delay)
> > -                       usleep(delay);
> > -               addr =  msg.arg.pagefault.address;
> > -               r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
> > -               if (r < 0)
> > -                       return NULL;
> > -               pages++;
> > +       TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> > +       for (p = 0; p < (len / demand_paging_size); ++p) {
> > +               memcpy(alias + (p * demand_paging_size),
> > +                      guest_data_prototype, demand_paging_size);
> >         }
> > -
> > -       ts_diff = timespec_elapsed(start);
> > -       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> > -                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> > -                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
> > -
> > -       return NULL;
> >  }
> >
> > -static void setup_demand_paging(struct kvm_vm *vm,
> > -                               pthread_t *uffd_handler_thread, int pipefd,
> > -                               int uffd_mode, useconds_t uffd_delay,
> > -                               struct uffd_handler_args *uffd_args,
> > -                               void *hva, void *alias, uint64_t len)
> > -{
> > -       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> > -       int uffd;
> > -       struct uffdio_api uffdio_api;
> > -       struct uffdio_register uffdio_register;
> > -       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> > -
> > -       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> > -                      is_minor ? "MINOR" : "MISSING",
> > -                      is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
> > -
> > -       /* In order to get minor faults, prefault via the alias. */
> > -       if (is_minor) {
> > -               size_t p;
> > -
> > -               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> > -
> > -               TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> > -               for (p = 0; p < (len / demand_paging_size); ++p) {
> > -                       memcpy(alias + (p * demand_paging_size),
> > -                              guest_data_prototype, demand_paging_size);
> > -               }
> > -       }
> > -
> > -       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> > -       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> > -
> > -       uffdio_api.api = UFFD_API;
> > -       uffdio_api.features = 0;
> > -       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> > -                   "ioctl UFFDIO_API failed: %" PRIu64,
> > -                   (uint64_t)uffdio_api.api);
> > -
> > -       uffdio_register.range.start = (uint64_t)hva;
> > -       uffdio_register.range.len = len;
> > -       uffdio_register.mode = uffd_mode;
> > -       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> > -                   "ioctl UFFDIO_REGISTER failed");
> > -       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> > -                   expected_ioctls, "missing userfaultfd ioctls");
> > -
> > -       uffd_args->uffd_mode = uffd_mode;
> > -       uffd_args->uffd = uffd;
> > -       uffd_args->pipefd = pipefd;
> > -       uffd_args->delay = uffd_delay;
> > -       pthread_create(uffd_handler_thread, NULL, uffd_handler_thread_fn,
> > -                      uffd_args);
> > -
> > -       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> > -                      hva, hva + len);
> > -}
> > -
> > -struct test_params {
> > -       int uffd_mode;
> > -       useconds_t uffd_delay;
> > -       enum vm_mem_backing_src_type src_type;
> > -       bool partition_vcpu_memory_access;
> > -};
> > -
> >  static void run_test(enum vm_guest_mode mode, void *arg)
> >  {
> >         struct test_params *p = arg;
> > -       pthread_t *uffd_handler_threads = NULL;
> > -       struct uffd_handler_args *uffd_args = NULL;
> > +       struct uffd_desc **uffd_descs = NULL;
> >         struct timespec start;
> >         struct timespec ts_diff;
> > -       int *pipefds = NULL;
> >         struct kvm_vm *vm;
> >         int vcpu_id;
> > -       int r;
> >
> >         vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
> >                                  p->src_type, p->partition_vcpu_memory_access);
> > @@ -299,15 +148,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >         memset(guest_data_prototype, 0xAB, demand_paging_size);
> >
> >         if (p->uffd_mode) {
> > -               uffd_handler_threads =
> > -                       malloc(nr_vcpus * sizeof(*uffd_handler_threads));
> > -               TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
> > -
> > -               uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
> > -               TEST_ASSERT(uffd_args, "Memory allocation failed");
> > -
> > -               pipefds = malloc(sizeof(int) * nr_vcpus * 2);
> > -               TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
> > +               uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
> > +               TEST_ASSERT(uffd_descs, "Memory allocation failed");
> >
> >                 for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> >                         struct perf_test_vcpu_args *vcpu_args;
> > @@ -320,19 +162,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >                         vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
> >                         vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
> >
> > +                       prefault_mem(vcpu_alias,
> > +                               vcpu_args->pages * perf_test_args.guest_page_size);
> > +
> >                         /*
> >                          * Set up user fault fd to handle demand paging
> >                          * requests.
> >                          */
> > -                       r = pipe2(&pipefds[vcpu_id * 2],
> > -                                 O_CLOEXEC | O_NONBLOCK);
> > -                       TEST_ASSERT(!r, "Failed to set up pipefd");
> > -
> > -                       setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
> > -                                           pipefds[vcpu_id * 2], p->uffd_mode,
> > -                                           p->uffd_delay, &uffd_args[vcpu_id],
> > -                                           vcpu_hva, vcpu_alias,
> > -                                           vcpu_args->pages * perf_test_args.guest_page_size);
> > +                       uffd_descs[vcpu_id] = uffd_setup_demand_paging(
> > +                               p->uffd_mode, p->uffd_delay, vcpu_hva,
> > +                               vcpu_args->pages * perf_test_args.guest_page_size,
> > +                               &handle_uffd_page_request);
> >                 }
> >         }
> >
> > @@ -347,15 +187,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >         pr_info("All vCPU threads joined\n");
> >
> >         if (p->uffd_mode) {
> > -               char c;
> > -
> >                 /* Tell the user fault fd handler threads to quit */
> > -               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> > -                       r = write(pipefds[vcpu_id * 2 + 1], &c, 1);
> > -                       TEST_ASSERT(r == 1, "Unable to write to pipefd");
> > -
> > -                       pthread_join(uffd_handler_threads[vcpu_id], NULL);
> > -               }
> > +               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
> > +                       uffd_stop_demand_paging(uffd_descs[vcpu_id]);
> >         }
> >
> >         pr_info("Total guest execution time: %ld.%.9lds\n",
> > @@ -367,11 +201,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >         perf_test_destroy_vm(vm);
> >
> >         free(guest_data_prototype);
> > -       if (p->uffd_mode) {
> > -               free(uffd_handler_threads);
> > -               free(uffd_args);
> > -               free(pipefds);
> > -       }
> > +       if (p->uffd_mode)
> > +               free(uffd_descs);
> >  }
> >
> >  static void help(char *name)
> > diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> > new file mode 100644
> > index 000000000000..7b294ce8147c
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> > @@ -0,0 +1,46 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * KVM userfaultfd util
> > + * Adapted from demand_paging_test.c
> > + *
> > + * Copyright (C) 2018, Red Hat, Inc.
> > + * Copyright (C) 2019, Google, Inc.
> > + * Copyright (C) 2022, Google, Inc.
> > + */
> > +
> > +#define _GNU_SOURCE /* for pipe2 */
> > +
> > +#include <inttypes.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <time.h>
> > +#include <poll.h>
> > +#include <pthread.h>
> > +#include <linux/userfaultfd.h>
> > +#include <sys/syscall.h>
> > +
> > +#include "kvm_util.h"
> > +#include "test_util.h"
> > +#include "perf_test_util.h"
> > +
> > +typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
> > +
> > +struct uffd_desc;
> 
> Do we gain anything from making this opaque? Given the 100+ patch
> series Sean just sent out to expose the KVM util library functions,
> I'd be inclined to just define the struct in the header file.

Ah, good point. Yes, I'd better change this.

> 
> Otherwise, I'm really happy to see all this code factored out into its
> own little library.
> 
> > +
> > +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> > +               useconds_t uffd_delay, void *hva, uint64_t len,
> > +               uffd_handler_t handler);
> > +
> > +void uffd_stop_demand_paging(struct uffd_desc *uffd);
> > +
> > +#ifdef PRINT_PER_PAGE_UPDATES
> > +#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> > +#else
> > +#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> > +#endif
> > +
> > +#ifdef PRINT_PER_VCPU_UPDATES
> > +#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> > +#else
> > +#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> > +#endif
> > diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> > new file mode 100644
> > index 000000000000..5e0878878a69
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> > @@ -0,0 +1,196 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * KVM userfaultfd util
> > + * Adapted from demand_paging_test.c
> > + *
> > + * Copyright (C) 2018, Red Hat, Inc.
> > + * Copyright (C) 2019, Google, Inc.
> > + * Copyright (C) 2022, Google, Inc.
> > + */
> > +
> > +#define _GNU_SOURCE /* for pipe2 */
> > +
> > +#include <inttypes.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <time.h>
> > +#include <poll.h>
> > +#include <pthread.h>
> > +#include <linux/userfaultfd.h>
> > +#include <sys/syscall.h>
> > +
> > +#include "kvm_util.h"
> > +#include "test_util.h"
> > +#include "perf_test_util.h"
> > +#include "userfaultfd_util.h"
> > +
> > +#ifdef __NR_userfaultfd
> > +
> > +struct uffd_desc {
> > +       int uffd_mode;
> > +       int uffd;
> > +       int pipefds[2];
> > +       useconds_t delay;
> > +       uffd_handler_t handler;
> > +       pthread_t thread;
> > +};
> > +
> > +static void *uffd_handler_thread_fn(void *arg)
> > +{
> > +       struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
> > +       int uffd = uffd_desc->uffd;
> > +       int pipefd = uffd_desc->pipefds[0];
> > +       useconds_t delay = uffd_desc->delay;
> > +       int64_t pages = 0;
> > +       struct timespec start;
> > +       struct timespec ts_diff;
> > +
> > +       clock_gettime(CLOCK_MONOTONIC, &start);
> > +       while (1) {
> > +               struct uffd_msg msg;
> > +               struct pollfd pollfd[2];
> > +               char tmp_chr;
> > +               int r;
> > +
> > +               pollfd[0].fd = uffd;
> > +               pollfd[0].events = POLLIN;
> > +               pollfd[1].fd = pipefd;
> > +               pollfd[1].events = POLLIN;
> > +
> > +               r = poll(pollfd, 2, -1);
> > +               switch (r) {
> > +               case -1:
> > +                       pr_info("poll err");
> > +                       continue;
> > +               case 0:
> > +                       continue;
> > +               case 1:
> > +                       break;
> > +               default:
> > +                       pr_info("Polling uffd returned %d", r);
> > +                       return NULL;
> > +               }
> > +
> > +               if (pollfd[0].revents & POLLERR) {
> > +                       pr_info("uffd revents has POLLERR");
> > +                       return NULL;
> > +               }
> > +
> > +               if (pollfd[1].revents & POLLIN) {
> > +                       r = read(pollfd[1].fd, &tmp_chr, 1);
> > +                       TEST_ASSERT(r == 1,
> > +                                   "Error reading pipefd in UFFD thread\n");
> > +                       return NULL;
> > +               }
> > +
> > +               if (!(pollfd[0].revents & POLLIN))
> > +                       continue;
> > +
> > +               r = read(uffd, &msg, sizeof(msg));
> > +               if (r == -1) {
> > +                       if (errno == EAGAIN)
> > +                               continue;
> > +                       pr_info("Read of uffd got errno %d\n", errno);
> > +                       return NULL;
> > +               }
> > +
> > +               if (r != sizeof(msg)) {
> > +                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> > +                       return NULL;
> > +               }
> > +
> > +               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> > +                       continue;
> > +
> > +               if (delay)
> > +                       usleep(delay);
> > +               r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
> > +               if (r < 0)
> > +                       return NULL;
> > +               pages++;
> > +       }
> > +
> > +       ts_diff = timespec_elapsed(start);
> > +       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> > +                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> > +                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1000000000.0));
> > +
> > +       return NULL;
> > +}
> > +
> > +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> > +               useconds_t uffd_delay, void *hva, uint64_t len,
> > +               uffd_handler_t handler)
> > +{
> > +       struct uffd_desc *uffd_desc;
> > +       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> > +       int uffd;
> > +       struct uffdio_api uffdio_api;
> > +       struct uffdio_register uffdio_register;
> > +       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> > +       int ret;
> > +
> > +       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> > +                      is_minor ? "MINOR" : "MISSING",
> > +                      is_minor ? "UFFDIO_CONTINUE" : "UFFDIO_COPY");
> > +
> > +       uffd_desc = malloc(sizeof(struct uffd_desc));
> > +       TEST_ASSERT(uffd_desc, "malloc failed");
> > +
> > +       /* Minor faults are resolved with UFFDIO_CONTINUE, not UFFDIO_COPY. */
> > +       if (is_minor)
> > +               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> > +
> > +       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> > +       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> > +
> > +       uffdio_api.api = UFFD_API;
> > +       uffdio_api.features = 0;
> > +       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> > +                   "ioctl UFFDIO_API failed: %" PRIu64,
> > +                   (uint64_t)uffdio_api.api);
> > +
> > +       uffdio_register.range.start = (uint64_t)hva;
> > +       uffdio_register.range.len = len;
> > +       uffdio_register.mode = uffd_mode;
> > +       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> > +                   "ioctl UFFDIO_REGISTER failed");
> > +       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> > +                       expected_ioctls, "missing userfaultfd ioctls");
> > +
> > +       ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
> > +       TEST_ASSERT(!ret, "Failed to set up pipefd");
> > +
> > +       uffd_desc->uffd_mode = uffd_mode;
> > +       uffd_desc->uffd = uffd;
> > +       uffd_desc->delay = uffd_delay;
> > +       uffd_desc->handler = handler;
> > +       pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
> > +                      uffd_desc);
> > +
> > +       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> > +                      hva, hva + len);
> > +
> > +       return uffd_desc;
> > +}
> > +
> > +void uffd_stop_demand_paging(struct uffd_desc *uffd)
> > +{
> > +       char c = 0;
> > +       int ret;
> > +
> > +       ret = write(uffd->pipefds[1], &c, 1);
> > +       TEST_ASSERT(ret == 1, "Unable to write to pipefd");
> > +
> > +       ret = pthread_join(uffd->thread, NULL);
> > +       TEST_ASSERT(ret == 0, "Pthread_join failed.");
> > +
> > +       close(uffd->uffd);
> > +
> > +       close(uffd->pipefds[1]);
> > +       close(uffd->pipefds[0]);
> > +
> > +       free(uffd);
> > +}
> > +
> > +#endif /* __NR_userfaultfd */
> > --
> > 2.35.1.723.g4982287a31-goog
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/11] KVM: selftests: Add a userfaultfd library
@ 2022-03-18 20:27       ` Ricardo Koller
  0 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-18 20:27 UTC (permalink / raw)
  To: Ben Gardon; +Cc: kvm, Marc Zyngier, Paolo Bonzini, Axel Rasmussen, kvmarm

On Wed, Mar 16, 2022 at 12:02:18PM -0600, Ben Gardon wrote:
> On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
> >
> > Move the generic userfaultfd code out of demand_paging_test.c into a
> > common library, userfaultfd_util. This library consists of a setup and a
> > stop function. The setup function starts a thread for handling page
> > faults using the handler callback function. This setup returns a
> > uffd_desc object which is then used in the stop function (to wait and
> > destroy the threads).
> >
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile          |   2 +-
> >  .../selftests/kvm/demand_paging_test.c        | 227 +++---------------
> >  .../selftests/kvm/include/userfaultfd_util.h  |  46 ++++
> >  .../selftests/kvm/lib/userfaultfd_util.c      | 196 +++++++++++++++
> >  4 files changed, 272 insertions(+), 199 deletions(-)
> >  create mode 100644 tools/testing/selftests/kvm/include/userfaultfd_util.h
> >  create mode 100644 tools/testing/selftests/kvm/lib/userfaultfd_util.c
> >
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index 0e4926bc9a58..bc5f89b3700e 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -37,7 +37,7 @@ ifeq ($(ARCH),riscv)
> >         UNAME_M := riscv
> >  endif
> >
> > -LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
> > +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c lib/userfaultfd_util.c
> >  LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
> >  LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
> >  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> > diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
> > index 6a719d065599..b3d457cecd68 100644
> > --- a/tools/testing/selftests/kvm/demand_paging_test.c
> > +++ b/tools/testing/selftests/kvm/demand_paging_test.c
> > @@ -22,23 +22,13 @@
> >  #include "test_util.h"
> >  #include "perf_test_util.h"
> >  #include "guest_modes.h"
> > +#include "userfaultfd_util.h"
> >
> >  #ifdef __NR_userfaultfd
> >
> > -#ifdef PRINT_PER_PAGE_UPDATES
> > -#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> > -#else
> > -#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> > -#endif
> > -
> > -#ifdef PRINT_PER_VCPU_UPDATES
> > -#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> > -#else
> > -#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> > -#endif
> > -
> >  static int nr_vcpus = 1;
> >  static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
> > +
> >  static size_t demand_paging_size;
> >  static char *guest_data_prototype;
> >
> > @@ -69,9 +59,11 @@ static void vcpu_worker(struct perf_test_vcpu_args *vcpu_args)
> >                        ts_diff.tv_sec, ts_diff.tv_nsec);
> >  }
> >
> > -static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
> > +static int handle_uffd_page_request(int uffd_mode, int uffd,
> > +               struct uffd_msg *msg)
> >  {
> >         pid_t tid = syscall(__NR_gettid);
> > +       uint64_t addr = msg->arg.pagefault.address;
> >         struct timespec start;
> >         struct timespec ts_diff;
> >         int r;
> > @@ -118,175 +110,32 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
> >         return 0;
> >  }
> >
> > -bool quit_uffd_thread;
> > -
> > -struct uffd_handler_args {
> > +struct test_params {
> >         int uffd_mode;
> > -       int uffd;
> > -       int pipefd;
> > -       useconds_t delay;
> > +       useconds_t uffd_delay;
> > +       enum vm_mem_backing_src_type src_type;
> > +       bool partition_vcpu_memory_access;
> >  };
> >
> > -static void *uffd_handler_thread_fn(void *arg)
> > +static void prefault_mem(void *alias, uint64_t len)
> >  {
> > -       struct uffd_handler_args *uffd_args = (struct uffd_handler_args *)arg;
> > -       int uffd = uffd_args->uffd;
> > -       int pipefd = uffd_args->pipefd;
> > -       useconds_t delay = uffd_args->delay;
> > -       int64_t pages = 0;
> > -       struct timespec start;
> > -       struct timespec ts_diff;
> > -
> > -       clock_gettime(CLOCK_MONOTONIC, &start);
> > -       while (!quit_uffd_thread) {
> > -               struct uffd_msg msg;
> > -               struct pollfd pollfd[2];
> > -               char tmp_chr;
> > -               int r;
> > -               uint64_t addr;
> > -
> > -               pollfd[0].fd = uffd;
> > -               pollfd[0].events = POLLIN;
> > -               pollfd[1].fd = pipefd;
> > -               pollfd[1].events = POLLIN;
> > -
> > -               r = poll(pollfd, 2, -1);
> > -               switch (r) {
> > -               case -1:
> > -                       pr_info("poll err");
> > -                       continue;
> > -               case 0:
> > -                       continue;
> > -               case 1:
> > -                       break;
> > -               default:
> > -                       pr_info("Polling uffd returned %d", r);
> > -                       return NULL;
> > -               }
> > -
> > -               if (pollfd[0].revents & POLLERR) {
> > -                       pr_info("uffd revents has POLLERR");
> > -                       return NULL;
> > -               }
> > -
> > -               if (pollfd[1].revents & POLLIN) {
> > -                       r = read(pollfd[1].fd, &tmp_chr, 1);
> > -                       TEST_ASSERT(r == 1,
> > -                                   "Error reading pipefd in UFFD thread\n");
> > -                       return NULL;
> > -               }
> > -
> > -               if (!(pollfd[0].revents & POLLIN))
> > -                       continue;
> > -
> > -               r = read(uffd, &msg, sizeof(msg));
> > -               if (r == -1) {
> > -                       if (errno == EAGAIN)
> > -                               continue;
> > -                       pr_info("Read of uffd got errno %d\n", errno);
> > -                       return NULL;
> > -               }
> > -
> > -               if (r != sizeof(msg)) {
> > -                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> > -                       return NULL;
> > -               }
> > -
> > -               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> > -                       continue;
> > +       size_t p;
> >
> > -               if (delay)
> > -                       usleep(delay);
> > -               addr =  msg.arg.pagefault.address;
> > -               r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
> > -               if (r < 0)
> > -                       return NULL;
> > -               pages++;
> > +       TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> > +       for (p = 0; p < (len / demand_paging_size); ++p) {
> > +               memcpy(alias + (p * demand_paging_size),
> > +                      guest_data_prototype, demand_paging_size);
> >         }
> > -
> > -       ts_diff = timespec_elapsed(start);
> > -       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> > -                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> > -                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
> > -
> > -       return NULL;
> >  }
> >
> > -static void setup_demand_paging(struct kvm_vm *vm,
> > -                               pthread_t *uffd_handler_thread, int pipefd,
> > -                               int uffd_mode, useconds_t uffd_delay,
> > -                               struct uffd_handler_args *uffd_args,
> > -                               void *hva, void *alias, uint64_t len)
> > -{
> > -       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> > -       int uffd;
> > -       struct uffdio_api uffdio_api;
> > -       struct uffdio_register uffdio_register;
> > -       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> > -
> > -       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> > -                      is_minor ? "MINOR" : "MISSING",
> > -                      is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
> > -
> > -       /* In order to get minor faults, prefault via the alias. */
> > -       if (is_minor) {
> > -               size_t p;
> > -
> > -               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> > -
> > -               TEST_ASSERT(alias != NULL, "Alias required for minor faults");
> > -               for (p = 0; p < (len / demand_paging_size); ++p) {
> > -                       memcpy(alias + (p * demand_paging_size),
> > -                              guest_data_prototype, demand_paging_size);
> > -               }
> > -       }
> > -
> > -       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> > -       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> > -
> > -       uffdio_api.api = UFFD_API;
> > -       uffdio_api.features = 0;
> > -       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> > -                   "ioctl UFFDIO_API failed: %" PRIu64,
> > -                   (uint64_t)uffdio_api.api);
> > -
> > -       uffdio_register.range.start = (uint64_t)hva;
> > -       uffdio_register.range.len = len;
> > -       uffdio_register.mode = uffd_mode;
> > -       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> > -                   "ioctl UFFDIO_REGISTER failed");
> > -       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> > -                   expected_ioctls, "missing userfaultfd ioctls");
> > -
> > -       uffd_args->uffd_mode = uffd_mode;
> > -       uffd_args->uffd = uffd;
> > -       uffd_args->pipefd = pipefd;
> > -       uffd_args->delay = uffd_delay;
> > -       pthread_create(uffd_handler_thread, NULL, uffd_handler_thread_fn,
> > -                      uffd_args);
> > -
> > -       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> > -                      hva, hva + len);
> > -}
> > -
> > -struct test_params {
> > -       int uffd_mode;
> > -       useconds_t uffd_delay;
> > -       enum vm_mem_backing_src_type src_type;
> > -       bool partition_vcpu_memory_access;
> > -};
> > -
> >  static void run_test(enum vm_guest_mode mode, void *arg)
> >  {
> >         struct test_params *p = arg;
> > -       pthread_t *uffd_handler_threads = NULL;
> > -       struct uffd_handler_args *uffd_args = NULL;
> > +       struct uffd_desc **uffd_descs = NULL;
> >         struct timespec start;
> >         struct timespec ts_diff;
> > -       int *pipefds = NULL;
> >         struct kvm_vm *vm;
> >         int vcpu_id;
> > -       int r;
> >
> >         vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
> >                                  p->src_type, p->partition_vcpu_memory_access);
> > @@ -299,15 +148,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >         memset(guest_data_prototype, 0xAB, demand_paging_size);
> >
> >         if (p->uffd_mode) {
> > -               uffd_handler_threads =
> > -                       malloc(nr_vcpus * sizeof(*uffd_handler_threads));
> > -               TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
> > -
> > -               uffd_args = malloc(nr_vcpus * sizeof(*uffd_args));
> > -               TEST_ASSERT(uffd_args, "Memory allocation failed");
> > -
> > -               pipefds = malloc(sizeof(int) * nr_vcpus * 2);
> > -               TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");
> > +               uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
> > +               TEST_ASSERT(uffd_descs, "Memory allocation failed");
> >
> >                 for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> >                         struct perf_test_vcpu_args *vcpu_args;
> > @@ -320,19 +162,17 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >                         vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
> >                         vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
> >
> > +                       prefault_mem(vcpu_alias,
> > +                               vcpu_args->pages * perf_test_args.guest_page_size);
> > +
> >                         /*
> >                          * Set up user fault fd to handle demand paging
> >                          * requests.
> >                          */
> > -                       r = pipe2(&pipefds[vcpu_id * 2],
> > -                                 O_CLOEXEC | O_NONBLOCK);
> > -                       TEST_ASSERT(!r, "Failed to set up pipefd");
> > -
> > -                       setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
> > -                                           pipefds[vcpu_id * 2], p->uffd_mode,
> > -                                           p->uffd_delay, &uffd_args[vcpu_id],
> > -                                           vcpu_hva, vcpu_alias,
> > -                                           vcpu_args->pages * perf_test_args.guest_page_size);
> > +                       uffd_descs[vcpu_id] = uffd_setup_demand_paging(
> > +                               p->uffd_mode, p->uffd_delay, vcpu_hva,
> > +                               vcpu_args->pages * perf_test_args.guest_page_size,
> > +                               &handle_uffd_page_request);
> >                 }
> >         }
> >
> > @@ -347,15 +187,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >         pr_info("All vCPU threads joined\n");
> >
> >         if (p->uffd_mode) {
> > -               char c;
> > -
> >                 /* Tell the user fault fd handler threads to quit */
> > -               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> > -                       r = write(pipefds[vcpu_id * 2 + 1], &c, 1);
> > -                       TEST_ASSERT(r == 1, "Unable to write to pipefd");
> > -
> > -                       pthread_join(uffd_handler_threads[vcpu_id], NULL);
> > -               }
> > +               for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
> > +                       uffd_stop_demand_paging(uffd_descs[vcpu_id]);
> >         }
> >
> >         pr_info("Total guest execution time: %ld.%.9lds\n",
> > @@ -367,11 +201,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >         perf_test_destroy_vm(vm);
> >
> >         free(guest_data_prototype);
> > -       if (p->uffd_mode) {
> > -               free(uffd_handler_threads);
> > -               free(uffd_args);
> > -               free(pipefds);
> > -       }
> > +       if (p->uffd_mode)
> > +               free(uffd_descs);
> >  }
> >
> >  static void help(char *name)
> > diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> > new file mode 100644
> > index 000000000000..7b294ce8147c
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
> > @@ -0,0 +1,46 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * KVM userfaultfd util
> > + * Adapted from demand_paging_test.c
> > + *
> > + * Copyright (C) 2018, Red Hat, Inc.
> > + * Copyright (C) 2019, Google, Inc.
> > + * Copyright (C) 2022, Google, Inc.
> > + */
> > +
> > +#define _GNU_SOURCE /* for pipe2 */
> > +
> > +#include <inttypes.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <time.h>
> > +#include <poll.h>
> > +#include <pthread.h>
> > +#include <linux/userfaultfd.h>
> > +#include <sys/syscall.h>
> > +
> > +#include "kvm_util.h"
> > +#include "test_util.h"
> > +#include "perf_test_util.h"
> > +
> > +typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
> > +
> > +struct uffd_desc;
> 
> Do we gain anything from making this opaque? Given the 100+ patch
> series Sean just sent out to expose the KVM util library functions,
> I'd be inclined to just define the struct in the header file.

Ah, good point. Yes, I'd better change this.
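
A rough sketch of what the header could then carry, assuming the fields stay
exactly as they are in lib/userfaultfd_util.c in this patch (nothing here is
final, and the comment wording is mine):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <unistd.h>              /* useconds_t */
#include <linux/userfaultfd.h>   /* struct uffd_msg */

typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);

/* Defined in the header (not opaque) so tests can read the fields directly. */
struct uffd_desc {
	int uffd_mode;           /* UFFDIO_REGISTER_MODE_* used at registration */
	int uffd;                /* the userfaultfd file descriptor */
	int pipefds[2];          /* [0] polled by the handler, [1] written to stop it */
	useconds_t delay;        /* optional delay before handling each fault */
	uffd_handler_t handler;  /* per-test fault callback */
	pthread_t thread;        /* handler thread */
};
```

With that, a test could look at e.g. desc->uffd directly instead of needing
accessor functions.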

> 
> Otherwise, I'm really happy to see all this code factored out into its
> own little library.
> 
> > +
> > +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> > +               useconds_t uffd_delay, void *hva, uint64_t len,
> > +               uffd_handler_t handler);
> > +
> > +void uffd_stop_demand_paging(struct uffd_desc *uffd);
> > +
> > +#ifdef PRINT_PER_PAGE_UPDATES
> > +#define PER_PAGE_DEBUG(...) printf(__VA_ARGS__)
> > +#else
> > +#define PER_PAGE_DEBUG(...) _no_printf(__VA_ARGS__)
> > +#endif
> > +
> > +#ifdef PRINT_PER_VCPU_UPDATES
> > +#define PER_VCPU_DEBUG(...) printf(__VA_ARGS__)
> > +#else
> > +#define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__)
> > +#endif
> > diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> > new file mode 100644
> > index 000000000000..5e0878878a69
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
> > @@ -0,0 +1,196 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * KVM userfaultfd util
> > + * Adapted from demand_paging_test.c
> > + *
> > + * Copyright (C) 2018, Red Hat, Inc.
> > + * Copyright (C) 2019, Google, Inc.
> > + * Copyright (C) 2022, Google, Inc.
> > + */
> > +
> > +#define _GNU_SOURCE /* for pipe2 */
> > +
> > +#include <inttypes.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <time.h>
> > +#include <poll.h>
> > +#include <pthread.h>
> > +#include <linux/userfaultfd.h>
> > +#include <sys/syscall.h>
> > +
> > +#include "kvm_util.h"
> > +#include "test_util.h"
> > +#include "perf_test_util.h"
> > +#include "userfaultfd_util.h"
> > +
> > +#ifdef __NR_userfaultfd
> > +
> > +struct uffd_desc {
> > +       int uffd_mode;
> > +       int uffd;
> > +       int pipefds[2];
> > +       useconds_t delay;
> > +       uffd_handler_t handler;
> > +       pthread_t thread;
> > +};
> > +
> > +static void *uffd_handler_thread_fn(void *arg)
> > +{
> > +       struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
> > +       int uffd = uffd_desc->uffd;
> > +       int pipefd = uffd_desc->pipefds[0];
> > +       useconds_t delay = uffd_desc->delay;
> > +       int64_t pages = 0;
> > +       struct timespec start;
> > +       struct timespec ts_diff;
> > +
> > +       clock_gettime(CLOCK_MONOTONIC, &start);
> > +       while (1) {
> > +               struct uffd_msg msg;
> > +               struct pollfd pollfd[2];
> > +               char tmp_chr;
> > +               int r;
> > +
> > +               pollfd[0].fd = uffd;
> > +               pollfd[0].events = POLLIN;
> > +               pollfd[1].fd = pipefd;
> > +               pollfd[1].events = POLLIN;
> > +
> > +               r = poll(pollfd, 2, -1);
> > +               switch (r) {
> > +               case -1:
> > +                       pr_info("poll err");
> > +                       continue;
> > +               case 0:
> > +                       continue;
> > +               case 1:
> > +                       break;
> > +               default:
> > +                       pr_info("Polling uffd returned %d", r);
> > +                       return NULL;
> > +               }
> > +
> > +               if (pollfd[0].revents & POLLERR) {
> > +                       pr_info("uffd revents has POLLERR");
> > +                       return NULL;
> > +               }
> > +
> > +               if (pollfd[1].revents & POLLIN) {
> > +                       r = read(pollfd[1].fd, &tmp_chr, 1);
> > +                       TEST_ASSERT(r == 1,
> > +                                   "Error reading pipefd in UFFD thread\n");
> > +                       return NULL;
> > +               }
> > +
> > +               if (!(pollfd[0].revents & POLLIN))
> > +                       continue;
> > +
> > +               r = read(uffd, &msg, sizeof(msg));
> > +               if (r == -1) {
> > +                       if (errno == EAGAIN)
> > +                               continue;
> > +                       pr_info("Read of uffd got errno %d\n", errno);
> > +                       return NULL;
> > +               }
> > +
> > +               if (r != sizeof(msg)) {
> > +                       pr_info("Read on uffd returned unexpected size: %d bytes", r);
> > +                       return NULL;
> > +               }
> > +
> > +               if (!(msg.event & UFFD_EVENT_PAGEFAULT))
> > +                       continue;
> > +
> > +               if (delay)
> > +                       usleep(delay);
> > +               r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
> > +               if (r < 0)
> > +                       return NULL;
> > +               pages++;
> > +       }
> > +
> > +       ts_diff = timespec_elapsed(start);
> > +       PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
> > +                      pages, ts_diff.tv_sec, ts_diff.tv_nsec,
> > +                      pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1000000000.0));
> > +
> > +       return NULL;
> > +}
> > +
> > +struct uffd_desc *uffd_setup_demand_paging(int uffd_mode,
> > +               useconds_t uffd_delay, void *hva, uint64_t len,
> > +               uffd_handler_t handler)
> > +{
> > +       struct uffd_desc *uffd_desc;
> > +       bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
> > +       int uffd;
> > +       struct uffdio_api uffdio_api;
> > +       struct uffdio_register uffdio_register;
> > +       uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
> > +       int ret;
> > +
> > +       PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
> > +                      is_minor ? "MINOR" : "MISSING",
> > +                      is_minor ? "UFFDIO_CONTINUE" : "UFFDIO_COPY");
> > +
> > +       uffd_desc = malloc(sizeof(struct uffd_desc));
> > +       TEST_ASSERT(uffd_desc, "malloc failed");
> > +
> > +       /* Minor faults are resolved with UFFDIO_CONTINUE, not UFFDIO_COPY. */
> > +       if (is_minor)
> > +               expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
> > +
> > +       uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
> > +       TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
> > +
> > +       uffdio_api.api = UFFD_API;
> > +       uffdio_api.features = 0;
> > +       TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
> > +                   "ioctl UFFDIO_API failed: %" PRIu64,
> > +                   (uint64_t)uffdio_api.api);
> > +
> > +       uffdio_register.range.start = (uint64_t)hva;
> > +       uffdio_register.range.len = len;
> > +       uffdio_register.mode = uffd_mode;
> > +       TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
> > +                   "ioctl UFFDIO_REGISTER failed");
> > +       TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
> > +                       expected_ioctls, "missing userfaultfd ioctls");
> > +
> > +       ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
> > +       TEST_ASSERT(!ret, "Failed to set up pipefd");
> > +
> > +       uffd_desc->uffd_mode = uffd_mode;
> > +       uffd_desc->uffd = uffd;
> > +       uffd_desc->delay = uffd_delay;
> > +       uffd_desc->handler = handler;
> > +       pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
> > +                      uffd_desc);
> > +
> > +       PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
> > +                      hva, hva + len);
> > +
> > +       return uffd_desc;
> > +}
> > +
> > +void uffd_stop_demand_paging(struct uffd_desc *uffd)
> > +{
> > +       char c = 0;
> > +       int ret;
> > +
> > +       ret = write(uffd->pipefds[1], &c, 1);
> > +       TEST_ASSERT(ret == 1, "Unable to write to pipefd");
> > +
> > +       ret = pthread_join(uffd->thread, NULL);
> > +       TEST_ASSERT(ret == 0, "Pthread_join failed.");
> > +
> > +       close(uffd->uffd);
> > +
> > +       close(uffd->pipefds[1]);
> > +       close(uffd->pipefds[0]);
> > +
> > +       free(uffd);
> > +}
> > +
> > +#endif /* __NR_userfaultfd */
> > --
> > 2.35.1.723.g4982287a31-goog
> >
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 02/11] KVM: selftests: Add vm_mem_region_get_src_fd library function
  2022-03-16 18:08     ` Ben Gardon
@ 2022-03-18 20:30       ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-18 20:30 UTC (permalink / raw)
  To: Ben Gardon
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Wed, Mar 16, 2022 at 12:08:23PM -0600, Ben Gardon wrote:
> On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
> >
> > Add a library function to get the backing source FD of a memslot.
> >
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> 
> This appears to be dead code as of this commit, would recommend
> merging it into the commit in which it's actually used.

I was trying to separate the lib changes (which are mostly arch independent)
from the actual test. Would it be better to move the commit so that it sits
right before the one that uses it? I could also add a note to the commit
message describing how it's going to be used.
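
For the usage, something along these lines, as a standalone sketch only:
punch_hole() and make_backing_fd() are made-up names, and the memfd is just a
stand-in for the fd that vm_mem_region_get_src_fd() would return for a
file-backed or shared memslot.

```c
#define _GNU_SOURCE
#include <fcntl.h>          /* fallocate() */
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <string.h>
#include <sys/mman.h>       /* memfd_create() */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a memslot's backing fd: a memfd of 'size' bytes
 * filled with 0xAB. In a real test the fd would come from
 * vm_mem_region_get_src_fd(vm, slot) instead.
 */
static int make_backing_fd(size_t size)
{
	unsigned char buf[4096];
	size_t off;
	int fd;

	fd = memfd_create("backing_src", MFD_CLOEXEC);
	if (fd < 0)
		return -1;

	memset(buf, 0xAB, sizeof(buf));
	for (off = 0; off < size; off += sizeof(buf)) {
		size_t n = size - off < sizeof(buf) ? size - off : sizeof(buf);

		if (pwrite(fd, buf, n, off) != (ssize_t)n) {
			close(fd);
			return -1;
		}
	}
	return fd;
}

/*
 * Drop the backing pages of [offset, offset + len) while keeping the file
 * size; the range reads back as zeroes afterwards.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}
```

The test would call punch_hole() on the fd of the memslot under test before
running the guest, so that the guest access misses in stage 2.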

> 
> > ---
> >  .../selftests/kvm/include/kvm_util_base.h     |  1 +
> >  tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
> >  2 files changed, 24 insertions(+)
> >
> > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > index 4ed6aa049a91..d6acec0858c0 100644
> > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > @@ -163,6 +163,7 @@ int _kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg);
> >  void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
> >  void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
> >  void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
> > +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
> >  void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid);
> >  vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
> >  vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index d8cf851ab119..64ef245b73de 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -580,6 +580,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
> >         return &region->region;
> >  }
> >
> > +/*
> > + * KVM Userspace Memory Get Backing Source FD
> > + *
> > + * Input Args:
> > + *   vm - Virtual Machine
> > + *   memslot - KVM memory slot ID
> > + *
> > + * Output Args: None
> > + *
> > + * Return:
> > + *   Backing source file descriptor, -1 if the memslot is an anonymous region.
> > + *
> > + * Returns the backing source fd of a memslot, so tests can use it to punch
> > + * holes, or to setup permissions.
> > + */
> > +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
> > +{
> > +       struct userspace_mem_region *region;
> > +
> > +       region = memslot2region(vm, memslot);
> > +       return region->fd;
> > +}
> > +
> >  /*
> >   * VCPU Find
> >   *
> > --
> > 2.35.1.723.g4982287a31-goog
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 04/11] KVM: selftests: Add vm_alloc_page_table_in_memslot library function
  2022-03-16 18:07     ` Ben Gardon
@ 2022-03-18 20:33       ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-18 20:33 UTC (permalink / raw)
  To: Ben Gardon
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Wed, Mar 16, 2022 at 12:07:21PM -0600, Ben Gardon wrote:
> On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
> >
> > Add a library function to allocate a page-table physical page in a
> > particular memslot.  The default behavior is to create new page-table
> > pages in memslot 0.
> >
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> 
> This is very similar to one of the patches in the NX hugepages control
> series I sent out last week, I guess we both had similar needs at the
> same time.
> Your solution introduces way less churn though, so it's probably
> better. I might use this commit or wait for it to be merged before I
> send out v2 of my NX control series.

Both options sound good to me. I'm glad it helps.

> 
> In any case,
> Reviewed-by: Ben Gardon <bgardon@google.com>
> 
> > ---
> >  tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
> >  tools/testing/selftests/kvm/lib/kvm_util.c          | 8 +++++++-
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > index d6acec0858c0..c8dce12a9a52 100644
> > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > @@ -307,6 +307,7 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
> >  vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
> >                               vm_paddr_t paddr_min, uint32_t memslot);
> >  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
> > +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot);
> >
> >  /*
> >   * Create a VM with reasonable defaults
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index 64ef245b73de..ae21564241c8 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -2409,9 +2409,15 @@ vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
> >  /* Arbitrary minimum physical address used for virtual translation tables. */
> >  #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
> >
> > +vm_paddr_t vm_alloc_page_table_in_memslot(struct kvm_vm *vm, uint32_t pt_memslot)
> > +{
> > +       return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR,
> > +                       pt_memslot);
> > +}
> > +
> >  vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm)
> >  {
> > -       return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0);
> > +       return vm_alloc_page_table_in_memslot(vm, 0);
> >  }
> >
> >  /*
> > --
> > 2.35.1.723.g4982287a31-goog
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 06/11] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete
  2022-03-16 18:09     ` Ben Gardon
@ 2022-03-18 20:33       ` Ricardo Koller
  -1 siblings, 0 replies; 42+ messages in thread
From: Ricardo Koller @ 2022-03-18 20:33 UTC (permalink / raw)
  To: Ben Gardon
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Wed, Mar 16, 2022 at 12:09:44PM -0600, Ben Gardon wrote:
> On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
> >
> > Deleting a memslot (when freeing a VM) does not close the backing fd,
> > nor does it unmap the alias mapping. Fix by adding the missing close
> > and munmap.
> >
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > ---
> >  tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index ae21564241c8..c25c79f97695 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -702,6 +702,12 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
> >         sparsebit_free(&region->unused_phy_pages);
> >         ret = munmap(region->mmap_start, region->mmap_size);
> >         TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
> > +       if (region->fd >= 0) {
> > +       /* There's an extra map if shared memory. */
> 
> Nit: indentation

Will fix in v2.

Thanks for the reviews!
> 
> > +               ret = munmap(region->mmap_alias, region->mmap_size);
> > +               TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
> > +               close(region->fd);
> > +       }
> >
> >         free(region);
> >  }
> > --
> > 2.35.1.723.g4982287a31-goog
> >
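
For readers who want to see the leak outside the selftest library, here is a minimal sketch of the same cleanup rule: an fd-backed shared region has a primary mapping, an alias mapping, and the fd itself, and all three have to be released. struct demo_region and the demo_ functions are hypothetical stand-ins, not the library's types; only memfd_create(), ftruncate(), mmap(), munmap() and close() are real calls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for the selftest's per-memslot bookkeeping. */
struct demo_region {
	int fd;            /* backing fd, -1 for anonymous memory */
	void *mmap_start;  /* primary mapping */
	void *mmap_alias;  /* second mapping of the same fd */
	size_t mmap_size;
};

/* Create a shared, fd-backed region with a primary and an alias mapping. */
static int demo_region_create(struct demo_region *r, size_t size)
{
	r->mmap_size = size;
	r->mmap_start = NULL;
	r->mmap_alias = NULL;

	r->fd = memfd_create("demo_region", 0);
	if (r->fd < 0)
		return -1;
	if (ftruncate(r->fd, (off_t)size) < 0)
		goto err_close;

	r->mmap_start = mmap(NULL, size, PROT_READ | PROT_WRITE,
			     MAP_SHARED, r->fd, 0);
	if (r->mmap_start == MAP_FAILED)
		goto err_close;

	r->mmap_alias = mmap(NULL, size, PROT_READ | PROT_WRITE,
			     MAP_SHARED, r->fd, 0);
	if (r->mmap_alias == MAP_FAILED)
		goto err_unmap;
	return 0;

err_unmap:
	munmap(r->mmap_start, size);
err_close:
	close(r->fd);
	r->fd = -1;
	return -1;
}

/* Release the region; returns 0 only if every step succeeded. */
static int demo_region_delete(struct demo_region *r)
{
	int ret = munmap(r->mmap_start, r->mmap_size);

	if (r->fd >= 0) {
		/* fd-backed regions carry an extra alias mapping. */
		ret |= munmap(r->mmap_alias, r->mmap_size);
		ret |= close(r->fd);
		r->fd = -1;
	}
	return ret ? -1 : 0;
}
```

Without the second munmap() and the close(), each deleted region would leave one mapping and one descriptor behind for the lifetime of the process.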

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 02/11] KVM: selftests: Add vm_mem_region_get_src_fd library function
  2022-03-18 20:30       ` Ricardo Koller
@ 2022-03-21 17:01         ` Ben Gardon
  -1 siblings, 0 replies; 42+ messages in thread
From: Ben Gardon @ 2022-03-21 17:01 UTC (permalink / raw)
  To: Ricardo Koller
  Cc: kvm, kvmarm, Andrew Jones, Paolo Bonzini, Marc Zyngier,
	Alexandru Elisei, Eric Auger, Oliver Upton, Reiji Watanabe,
	Raghavendra Rao Ananta, Axel Rasmussen

On Fri, Mar 18, 2022 at 1:30 PM Ricardo Koller <ricarkol@google.com> wrote:
>
> On Wed, Mar 16, 2022 at 12:08:23PM -0600, Ben Gardon wrote:
> > On Fri, Mar 11, 2022 at 12:02 AM Ricardo Koller <ricarkol@google.com> wrote:
> > >
> > > Add a library function to get the backing source FD of a memslot.
> > >
> > > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> >
> > This appears to be dead code as of this commit, would recommend
> > merging it into the commit in which it's actually used.
>
> I was trying to separate lib changes (which are mostly arch independent)
> from the actual test. Would moving the commit to be right before the one
> that uses it be better? And maybe adding a commit comment mentioning how
> it's going to be used?

Ah, that makes sense, I can see why you'd want to separate them.
Moving it right before the commit where it's used sounds fine to me.
Thanks!

>
>
> >
> > > ---
> > >  .../selftests/kvm/include/kvm_util_base.h     |  1 +
> > >  tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++++++
> > >  2 files changed, 24 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > > index 4ed6aa049a91..d6acec0858c0 100644
> > > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > > @@ -163,6 +163,7 @@ int _kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg);
> > >  void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
> > >  void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
> > >  void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
> > > +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot);
> > >  void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid);
> > >  vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
> > >  vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
> > > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > > index d8cf851ab119..64ef245b73de 100644
> > > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > > @@ -580,6 +580,29 @@ kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
> > >         return &region->region;
> > >  }
> > >
> > > +/*
> > > + * KVM Userspace Memory Get Backing Source FD
> > > + *
> > > + * Input Args:
> > > + *   vm - Virtual Machine
> > > + *   memslot - KVM memory slot ID
> > > + *
> > > + * Output Args: None
> > > + *
> > > + * Return:
> > > + *   Backing source file descriptor, -1 if the memslot is an anonymous region.
> > > + *
> > > + * Returns the backing source fd of a memslot, so tests can use it to punch
> > > > + * holes, or to set up permissions.
> > > + */
> > > +int vm_mem_region_get_src_fd(struct kvm_vm *vm, uint32_t memslot)
> > > +{
> > > +       struct userspace_mem_region *region;
> > > +
> > > +       region = memslot2region(vm, memslot);
> > > +       return region->fd;
> > > +}
> > > +
> > >  /*
> > >   * VCPU Find
> > >   *
> > > --
> > > 2.35.1.723.g4982287a31-goog
> > >

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2022-03-21 19:33 UTC | newest]

Thread overview: 42+ messages
2022-03-11  6:01 [PATCH 00/11] KVM: selftests: Add aarch64/page_fault_test Ricardo Koller
2022-03-11  6:01 ` Ricardo Koller
2022-03-11  6:01 ` [PATCH 01/11] KVM: selftests: Add a userfaultfd library Ricardo Koller
2022-03-11  6:01   ` Ricardo Koller
2022-03-16 18:02   ` Ben Gardon
2022-03-16 18:02     ` Ben Gardon
2022-03-18 20:27     ` Ricardo Koller
2022-03-18 20:27       ` Ricardo Koller
2022-03-11  6:01 ` [PATCH 02/11] KVM: selftests: Add vm_mem_region_get_src_fd library function Ricardo Koller
2022-03-11  6:01   ` Ricardo Koller
2022-03-16 18:08   ` Ben Gardon
2022-03-16 18:08     ` Ben Gardon
2022-03-18 20:30     ` Ricardo Koller
2022-03-18 20:30       ` Ricardo Koller
2022-03-21 17:01       ` Ben Gardon
2022-03-21 17:01         ` Ben Gardon
2022-03-11  6:01 ` [PATCH 03/11] KVM: selftests: aarch64: Add vm_get_pte_gpa " Ricardo Koller
2022-03-11  6:01   ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 04/11] KVM: selftests: Add vm_alloc_page_table_in_memslot " Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-16 18:07   ` Ben Gardon
2022-03-16 18:07     ` Ben Gardon
2022-03-18 20:33     ` Ricardo Koller
2022-03-18 20:33       ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 05/11] KVM: selftests: aarch64: Export _virt_pg_map with a pt_memslot arg Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 06/11] KVM: selftests: Add missing close and munmap in __vm_mem_region_delete Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-16 18:09   ` Ben Gardon
2022-03-16 18:09     ` Ben Gardon
2022-03-18 20:33     ` Ricardo Koller
2022-03-18 20:33       ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 07/11] KVM: selftests: aarch64: Add aarch64/page_fault_test Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 08/11] KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 09/11] KVM: selftests: aarch64: Add dirty logging " Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 10/11] KVM: selftests: aarch64: Add readonly memslot " Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
2022-03-11  6:02 ` [PATCH 11/11] KVM: selftests: aarch64: Add mix of " Ricardo Koller
2022-03-11  6:02   ` Ricardo Koller
