All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM
@ 2022-05-03 18:30 Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 01/11] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
                   ` (10 more replies)
  0 siblings, 11 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

Given the high cost of NX hugepages in terms of TLB performance, it may
be desirable to disable the mitigation on a per-VM basis. In the case of public
cloud providers with many VMs on a single host, some VMs may be more trusted
than others. In order to maximize performance on critical VMs, while still
providing some protection to the host from iTLB Multihit, allow the mitigation
to be selectively disabled.

Disabling NX hugepages on a VM is relatively straightforward, but I took this
as an opportunity to add some NX hugepages test coverage and clean up selftests
infrastructure a bit.

This series was tested with the new selftest and the rest of the KVM selftests
on an Intel Haswell machine.

Changelog:
v1->v2:
	Dropped the complicated memslot refactor in favor of Ricardo Koller's
	patch with a similar effect.
	Incorporated David Dunn's feedback and reviewed by tag: shortened waits
	to speed up test.
v2->v3:
	Incorporated a suggestion from David on how to build the NX huge pages
	test.
	Fixed a build breakage identified by David.
	Dropped the per-vm nx_huge_pages field in favor of simply checking the
	global + per-VM disable override.
	Documented the new capability
	Separated out the commit to test disabling NX huge pages
	Removed permission check when checking if the disable NX capability is
	supported.
	Added test coverage for the permission check.
v3->v4:
	Collected RB's from Jing and David
	Modified stat collection to reduce a memory allocation [David]
	Incorporated various improvments to the NX test [David]
	Changed the NX disable test to run by default [David]
	Removed some now unnecessary commits
	Dropped the code to dump KVM stats from the binary stats test, and
	factor out parts of the existing test to library functions instead.
	[David, Jing, Sean]
	Dropped the improvement to a debugging log message as it's no longer
	relevant to this series.
v4->v5:
	Incorporated cleanup suggestions from David and Sean
	Added a patch with style fixes for the binary stats test from Sean
	Added a restriction that NX huge pages can only be disabled before
	vCPUs are created [Sean]

v5->v6:
	Scooped up David's RBs
	Added a magic token to skip nx_huge_pages_test when not run via
	wrapper script [Sean]
	Made the call to nx_huge_pages_test in the wrapper script more
	robust [Sean]
	Incorportated various nits and comment / documentation suggestions from
	Sean.
	Improved negative testing of NX disable without reboot permissions. [Sean]

v6->v7:
	Collected Peter Xu's Reviewed-by tags
	Added stats metadata caching to kvm_util
	Misc NX test fixups

Ben Gardon (10):
  KVM: selftests: Remove dynamic memory allocation for stats header
  KVM: selftests: Read binary stats header in lib
  KVM: selftests: Read binary stats desc in lib
  KVM: selftests: Read binary stat data in lib
  KVM: selftests: Add NX huge pages test
  KVM: x86: Fix errant brace in KVM capability handling
  KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
  KVM: selftests: Factor out calculation of pages needed for a VM
  KVM: selftests: Test disabling NX hugepages on a VM
  KVM: selftests: Cache binary stats metadata for duration of test

Sean Christopherson (1):
  KVM: selftests: Clean up coding style in binary stats test

 Documentation/virt/kvm/api.rst                |  17 ++
 arch/x86/include/asm/kvm_host.h               |   2 +
 arch/x86/kvm/mmu.h                            |   8 +-
 arch/x86/kvm/mmu/spte.c                       |   7 +-
 arch/x86/kvm/mmu/spte.h                       |   3 +-
 arch/x86/kvm/mmu/tdp_mmu.c                    |   2 +-
 arch/x86/kvm/x86.c                            |  32 ++-
 include/uapi/linux/kvm.h                      |   1 +
 tools/testing/selftests/kvm/Makefile          |  10 +
 .../selftests/kvm/include/kvm_util_base.h     |  13 +
 .../selftests/kvm/kvm_binary_stats_test.c     | 142 +++++-----
 tools/testing/selftests/kvm/lib/kvm_util.c    | 262 ++++++++++++++++--
 .../selftests/kvm/lib/kvm_util_internal.h     |   5 +
 .../selftests/kvm/x86_64/nx_huge_pages_test.c | 222 +++++++++++++++
 .../kvm/x86_64/nx_huge_pages_test.sh          |  46 +++
 15 files changed, 678 insertions(+), 94 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
 create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v7 01/11] KVM: selftests: Remove dynamic memory allocation for stats header
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 02/11] KVM: selftests: Read binary stats header in lib Ben Gardon
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

There's no need to allocate dynamic memory for the stats header since
its size is known at compile time.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/kvm_binary_stats_test.c     | 58 +++++++++----------
 1 file changed, 27 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 17f65d514915..dad34d8a41fe 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -26,56 +26,53 @@ static void stats_test(int stats_fd)
 	int i;
 	size_t size_desc;
 	size_t size_data = 0;
-	struct kvm_stats_header *header;
+	struct kvm_stats_header header;
 	char *id;
 	struct kvm_stats_desc *stats_desc;
 	u64 *stats_data;
 	struct kvm_stats_desc *pdesc;
 
 	/* Read kvm stats header */
-	header = malloc(sizeof(*header));
-	TEST_ASSERT(header, "Allocate memory for stats header");
-
-	ret = read(stats_fd, header, sizeof(*header));
-	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
-	size_desc = sizeof(*stats_desc) + header->name_size;
+	ret = read(stats_fd, &header, sizeof(header));
+	TEST_ASSERT(ret == sizeof(header), "Read stats header");
+	size_desc = sizeof(*stats_desc) + header.name_size;
 
 	/* Read kvm stats id string */
-	id = malloc(header->name_size);
+	id = malloc(header.name_size);
 	TEST_ASSERT(id, "Allocate memory for id string");
-	ret = read(stats_fd, id, header->name_size);
-	TEST_ASSERT(ret == header->name_size, "Read id string");
+	ret = read(stats_fd, id, header.name_size);
+	TEST_ASSERT(ret == header.name_size, "Read id string");
 
 	/* Check id string, that should start with "kvm" */
-	TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header->name_size,
+	TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
 				"Invalid KVM stats type, id: %s", id);
 
 	/* Sanity check for other fields in header */
-	if (header->num_desc == 0) {
+	if (header.num_desc == 0) {
 		printf("No KVM stats defined!");
 		return;
 	}
 	/* Check overlap */
-	TEST_ASSERT(header->desc_offset > 0 && header->data_offset > 0
-			&& header->desc_offset >= sizeof(*header)
-			&& header->data_offset >= sizeof(*header),
+	TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
+			&& header.desc_offset >= sizeof(header)
+			&& header.data_offset >= sizeof(header),
 			"Invalid offset fields in header");
-	TEST_ASSERT(header->desc_offset > header->data_offset ||
-			(header->desc_offset + size_desc * header->num_desc <=
-							header->data_offset),
+	TEST_ASSERT(header.desc_offset > header.data_offset ||
+			(header.desc_offset + size_desc * header.num_desc <=
+							header.data_offset),
 			"Descriptor block is overlapped with data block");
 
 	/* Allocate memory for stats descriptors */
-	stats_desc = calloc(header->num_desc, size_desc);
+	stats_desc = calloc(header.num_desc, size_desc);
 	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
 	/* Read kvm stats descriptors */
 	ret = pread(stats_fd, stats_desc,
-			size_desc * header->num_desc, header->desc_offset);
-	TEST_ASSERT(ret == size_desc * header->num_desc,
+			size_desc * header.num_desc, header.desc_offset);
+	TEST_ASSERT(ret == size_desc * header.num_desc,
 			"Read KVM stats descriptors");
 
 	/* Sanity check for fields in descriptors */
-	for (i = 0; i < header->num_desc; ++i) {
+	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		/* Check type,unit,base boundaries */
 		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
@@ -104,7 +101,7 @@ static void stats_test(int stats_fd)
 			break;
 		}
 		/* Check name string */
-		TEST_ASSERT(strlen(pdesc->name) < header->name_size,
+		TEST_ASSERT(strlen(pdesc->name) < header.name_size,
 				"KVM stats name(%s) too long", pdesc->name);
 		/* Check size field, which should not be zero */
 		TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
@@ -124,14 +121,14 @@ static void stats_test(int stats_fd)
 		size_data += pdesc->size * sizeof(*stats_data);
 	}
 	/* Check overlap */
-	TEST_ASSERT(header->data_offset >= header->desc_offset
-		|| header->data_offset + size_data <= header->desc_offset,
+	TEST_ASSERT(header.data_offset >= header.desc_offset
+		|| header.data_offset + size_data <= header.desc_offset,
 		"Data block is overlapped with Descriptor block");
 	/* Check validity of all stats data size */
-	TEST_ASSERT(size_data >= header->num_desc * sizeof(*stats_data),
+	TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
 			"Data size is not correct");
 	/* Check stats offset */
-	for (i = 0; i < header->num_desc; ++i) {
+	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		TEST_ASSERT(pdesc->offset < size_data,
 			"Invalid offset (%u) for stats: %s",
@@ -142,15 +139,15 @@ static void stats_test(int stats_fd)
 	stats_data = malloc(size_data);
 	TEST_ASSERT(stats_data, "Allocate memory for stats data");
 	/* Read kvm stats data as a bulk */
-	ret = pread(stats_fd, stats_data, size_data, header->data_offset);
+	ret = pread(stats_fd, stats_data, size_data, header.data_offset);
 	TEST_ASSERT(ret == size_data, "Read KVM stats data");
 	/* Read kvm stats data one by one */
 	size_data = 0;
-	for (i = 0; i < header->num_desc; ++i) {
+	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		ret = pread(stats_fd, stats_data,
 				pdesc->size * sizeof(*stats_data),
-				header->data_offset + size_data);
+				header.data_offset + size_data);
 		TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
 				"Read data of KVM stats: %s", pdesc->name);
 		size_data += pdesc->size * sizeof(*stats_data);
@@ -159,7 +156,6 @@ static void stats_test(int stats_fd)
 	free(stats_data);
 	free(stats_desc);
 	free(id);
-	free(header);
 }
 
 
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 02/11] KVM: selftests: Read binary stats header in lib
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 01/11] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 03/11] KVM: selftests: Read binary stats desc " Ben Gardon
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

Move the code to read the binary stats header to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  1 +
 .../selftests/kvm/kvm_binary_stats_test.c     |  4 ++--
 tools/testing/selftests/kvm/lib/kvm_util.c    | 21 +++++++++++++++++++
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 92cef0ffb19e..749cded9b157 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -400,6 +400,7 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
 
 int vm_get_stats_fd(struct kvm_vm *vm);
 int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
+void read_stats_header(int stats_fd, struct kvm_stats_header *header);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index dad34d8a41fe..fb511b42a03e 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -33,8 +33,8 @@ static void stats_test(int stats_fd)
 	struct kvm_stats_desc *pdesc;
 
 	/* Read kvm stats header */
-	ret = read(stats_fd, &header, sizeof(header));
-	TEST_ASSERT(ret == sizeof(header), "Read stats header");
+	read_stats_header(stats_fd, &header);
+
 	size_desc = sizeof(*stats_desc) + header.name_size;
 
 	/* Read kvm stats id string */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1665a220abcb..1d75d41f92dc 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2556,3 +2556,24 @@ int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid)
 
 	return ioctl(vcpu->fd, KVM_GET_STATS_FD, NULL);
 }
+
+/*
+ * Read binary stats header
+ *
+ * Input Args:
+ *   stats_fd - the file descriptor for the binary stats file from which to read
+ *
+ * Output Args:
+ *   header - a binary stats metadata header to be filled with data
+ *
+ * Return: void
+ *
+ * Read a header for the binary stats interface.
+ */
+void read_stats_header(int stats_fd, struct kvm_stats_header *header)
+{
+	ssize_t ret;
+
+	ret = read(stats_fd, header, sizeof(*header));
+	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
+}
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 03/11] KVM: selftests: Read binary stats desc in lib
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 01/11] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 02/11] KVM: selftests: Read binary stats header in lib Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-05 17:08   ` Sean Christopherson
  2022-05-03 18:30 ` [PATCH v7 04/11] KVM: selftests: Clean up coding style in binary stats test Ben Gardon
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

Move the code to read the binary stats descriptors to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  2 +
 .../selftests/kvm/kvm_binary_stats_test.c     |  8 +---
 tools/testing/selftests/kvm/lib/kvm_util.c    | 38 +++++++++++++++++++
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 749cded9b157..fabe46ddc1b2 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -401,6 +401,8 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
 int vm_get_stats_fd(struct kvm_vm *vm);
 int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
 void read_stats_header(int stats_fd, struct kvm_stats_header *header);
+struct kvm_stats_desc *read_stats_desc(int stats_fd,
+				       struct kvm_stats_header *header);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index fb511b42a03e..b49fae45db1e 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -62,14 +62,8 @@ static void stats_test(int stats_fd)
 							header.data_offset),
 			"Descriptor block is overlapped with data block");
 
-	/* Allocate memory for stats descriptors */
-	stats_desc = calloc(header.num_desc, size_desc);
-	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
 	/* Read kvm stats descriptors */
-	ret = pread(stats_fd, stats_desc,
-			size_desc * header.num_desc, header.desc_offset);
-	TEST_ASSERT(ret == size_desc * header.num_desc,
-			"Read KVM stats descriptors");
+	stats_desc = read_stats_desc(stats_fd, &header);
 
 	/* Sanity check for fields in descriptors */
 	for (i = 0; i < header.num_desc; ++i) {
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1d75d41f92dc..12fa8cc88043 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2577,3 +2577,41 @@ void read_stats_header(int stats_fd, struct kvm_stats_header *header)
 	ret = read(stats_fd, header, sizeof(*header));
 	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
 }
+
+static ssize_t stats_descs_size(struct kvm_stats_header *header)
+{
+	return header->num_desc *
+	       (sizeof(struct kvm_stats_desc) + header->name_size);
+}
+
+/*
+ * Read binary stats descriptors
+ *
+ * Input Args:
+ *   stats_fd - the file descriptor for the binary stats file from which to read
+ *   header - the binary stats metadata header corresponding to the given FD
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   A pointer to a newly allocated series of stat descriptors.
+ *   Caller is responsible for freeing the returned kvm_stats_desc.
+ *
+ * Read the stats descriptors from the binary stats interface.
+ */
+struct kvm_stats_desc *read_stats_desc(int stats_fd,
+				       struct kvm_stats_header *header)
+{
+	struct kvm_stats_desc *stats_desc;
+	ssize_t ret;
+
+	stats_desc = malloc(stats_descs_size(header));
+	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
+
+	ret = pread(stats_fd, stats_desc, stats_descs_size(header),
+		    header->desc_offset);
+	TEST_ASSERT(ret == stats_descs_size(header),
+		    "Read KVM stats descriptors");
+
+	return stats_desc;
+}
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 04/11] KVM: selftests: Clean up coding style in binary stats test
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (2 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 03/11] KVM: selftests: Read binary stats desc " Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib Ben Gardon
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

From: Sean Christopherson <seanjc@google.com>

Fix a variety of code style violations and/or inconsistencies in the
binary stats test.  The 80 char limit is a soft limit and can and should
be ignored/violated if doing so improves the overall code readability.

Specifically, provide consistent indentation and don't split expressions
at arbitrary points just to honor the 80 char limit.

Opportunistically expand/add comments to call out the more subtle aspects
of the code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/kvm_binary_stats_test.c     | 91 ++++++++++++-------
 1 file changed, 56 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index b49fae45db1e..8b31f8fc7e08 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -35,47 +35,64 @@ static void stats_test(int stats_fd)
 	/* Read kvm stats header */
 	read_stats_header(stats_fd, &header);
 
+	/*
+	 * The base size of the descriptor is defined by KVM's ABI, but the
+	 * size of the name field is variable as far as KVM's ABI is concerned.
+	 * But, the size of name is constant for a given instance of KVM and
+	 * is provided by KVM in the overall stats header.
+	 */
 	size_desc = sizeof(*stats_desc) + header.name_size;
 
 	/* Read kvm stats id string */
 	id = malloc(header.name_size);
 	TEST_ASSERT(id, "Allocate memory for id string");
+
 	ret = read(stats_fd, id, header.name_size);
 	TEST_ASSERT(ret == header.name_size, "Read id string");
 
 	/* Check id string, that should start with "kvm" */
 	TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
-				"Invalid KVM stats type, id: %s", id);
+		    "Invalid KVM stats type, id: %s", id);
 
 	/* Sanity check for other fields in header */
 	if (header.num_desc == 0) {
 		printf("No KVM stats defined!");
 		return;
 	}
-	/* Check overlap */
-	TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
-			&& header.desc_offset >= sizeof(header)
-			&& header.data_offset >= sizeof(header),
-			"Invalid offset fields in header");
+	/*
+	 * The descriptor and data offsets must be valid, they must not overlap
+	 * the header, and the descriptor and data blocks must not overlap each
+	 * other.  Note, the data block is rechecked after its size is known.
+	 */
+	TEST_ASSERT(header.desc_offset && header.desc_offset >= sizeof(header) &&
+		    header.data_offset && header.data_offset >= sizeof(header),
+		    "Invalid offset fields in header");
+
 	TEST_ASSERT(header.desc_offset > header.data_offset ||
-			(header.desc_offset + size_desc * header.num_desc <=
-							header.data_offset),
-			"Descriptor block is overlapped with data block");
+		    (header.desc_offset + size_desc * header.num_desc <= header.data_offset),
+		    "Descriptor block is overlapped with data block");
 
 	/* Read kvm stats descriptors */
 	stats_desc = read_stats_desc(stats_fd, &header);
 
 	/* Sanity check for fields in descriptors */
 	for (i = 0; i < header.num_desc; ++i) {
+		/*
+		 * Note, size_desc includes the of the name field, which is
+		 * variable, i.e. this is NOT equivalent to &stats_desc[i].
+		 */
 		pdesc = (void *)stats_desc + i * size_desc;
-		/* Check type,unit,base boundaries */
-		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
-				<= KVM_STATS_TYPE_MAX, "Unknown KVM stats type");
-		TEST_ASSERT((pdesc->flags & KVM_STATS_UNIT_MASK)
-				<= KVM_STATS_UNIT_MAX, "Unknown KVM stats unit");
-		TEST_ASSERT((pdesc->flags & KVM_STATS_BASE_MASK)
-				<= KVM_STATS_BASE_MAX, "Unknown KVM stats base");
-		/* Check exponent for stats unit
+
+		/* Check type, unit, and base boundaries */
+		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK) <= KVM_STATS_TYPE_MAX,
+			    "Unknown KVM stats type");
+		TEST_ASSERT((pdesc->flags & KVM_STATS_UNIT_MASK) <= KVM_STATS_UNIT_MAX,
+			    "Unknown KVM stats unit");
+		TEST_ASSERT((pdesc->flags & KVM_STATS_BASE_MASK) <= KVM_STATS_BASE_MAX,
+			    "Unknown KVM stats base");
+
+		/*
+		 * Check exponent for stats unit
 		 * Exponent for counter should be greater than or equal to 0
 		 * Exponent for unit bytes should be greater than or equal to 0
 		 * Exponent for unit seconds should be less than or equal to 0
@@ -86,47 +103,51 @@ static void stats_test(int stats_fd)
 		case KVM_STATS_UNIT_NONE:
 		case KVM_STATS_UNIT_BYTES:
 		case KVM_STATS_UNIT_CYCLES:
-			TEST_ASSERT(pdesc->exponent >= 0,
-					"Unsupported KVM stats unit");
+			TEST_ASSERT(pdesc->exponent >= 0, "Unsupported KVM stats unit");
 			break;
 		case KVM_STATS_UNIT_SECONDS:
-			TEST_ASSERT(pdesc->exponent <= 0,
-					"Unsupported KVM stats unit");
+			TEST_ASSERT(pdesc->exponent <= 0, "Unsupported KVM stats unit");
 			break;
 		}
 		/* Check name string */
 		TEST_ASSERT(strlen(pdesc->name) < header.name_size,
-				"KVM stats name(%s) too long", pdesc->name);
+			    "KVM stats name(%s) too long", pdesc->name);
 		/* Check size field, which should not be zero */
-		TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
-				pdesc->name);
+		TEST_ASSERT(pdesc->size,
+			    "KVM descriptor(%s) with size of 0", pdesc->name);
 		/* Check bucket_size field */
 		switch (pdesc->flags & KVM_STATS_TYPE_MASK) {
 		case KVM_STATS_TYPE_LINEAR_HIST:
 			TEST_ASSERT(pdesc->bucket_size,
-			    "Bucket size of Linear Histogram stats (%s) is zero",
-			    pdesc->name);
+				    "Bucket size of Linear Histogram stats (%s) is zero",
+				    pdesc->name);
 			break;
 		default:
 			TEST_ASSERT(!pdesc->bucket_size,
-			    "Bucket size of stats (%s) is not zero",
-			    pdesc->name);
+				    "Bucket size of stats (%s) is not zero",
+				    pdesc->name);
 		}
 		size_data += pdesc->size * sizeof(*stats_data);
 	}
-	/* Check overlap */
-	TEST_ASSERT(header.data_offset >= header.desc_offset
-		|| header.data_offset + size_data <= header.desc_offset,
-		"Data block is overlapped with Descriptor block");
+
+	/*
+	 * Now that the size of the data block is known, verify the data block
+	 * doesn't overlap the descriptor block.
+	 */
+	TEST_ASSERT(header.data_offset >= header.desc_offset ||
+		    header.data_offset + size_data <= header.desc_offset,
+		    "Data block is overlapped with Descriptor block");
+
 	/* Check validity of all stats data size */
 	TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
-			"Data size is not correct");
+		    "Data size is not correct");
+
 	/* Check stats offset */
 	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		TEST_ASSERT(pdesc->offset < size_data,
-			"Invalid offset (%u) for stats: %s",
-			pdesc->offset, pdesc->name);
+			    "Invalid offset (%u) for stats: %s",
+			    pdesc->offset, pdesc->name);
 	}
 
 	/* Allocate memory for stats data */
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (3 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 04/11] KVM: selftests: Clean up coding style in binary stats test Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-05 18:06   ` Sean Christopherson
  2022-05-03 18:30 ` [PATCH v7 06/11] KVM: selftests: Add NX huge pages test Ben Gardon
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

Move the code to read the binary stats data to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  3 ++
 .../selftests/kvm/kvm_binary_stats_test.c     |  7 ++--
 tools/testing/selftests/kvm/lib/kvm_util.c    | 36 +++++++++++++++++++
 3 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index fabe46ddc1b2..2a3a4d9ed8e3 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -403,6 +403,9 @@ int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
 void read_stats_header(int stats_fd, struct kvm_stats_header *header);
 struct kvm_stats_desc *read_stats_desc(int stats_fd,
 				       struct kvm_stats_header *header);
+int read_stat_data(int stats_fd, struct kvm_stats_header *header,
+		   struct kvm_stats_desc *desc, uint64_t *data,
+		   ssize_t max_elements);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 8b31f8fc7e08..59677fae26e5 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -160,11 +160,8 @@ static void stats_test(int stats_fd)
 	size_data = 0;
 	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
-		ret = pread(stats_fd, stats_data,
-				pdesc->size * sizeof(*stats_data),
-				header.data_offset + size_data);
-		TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
-				"Read data of KVM stats: %s", pdesc->name);
+		read_stat_data(stats_fd, &header, pdesc, stats_data,
+			       pdesc->size);
 		size_data += pdesc->size * sizeof(*stats_data);
 	}
 
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 12fa8cc88043..ea4ab64e5997 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2615,3 +2615,39 @@ struct kvm_stats_desc *read_stats_desc(int stats_fd,
 
 	return stats_desc;
 }
+
+/*
+ * Read stat data for a particular stat
+ *
+ * Input Args:
+ *   stats_fd - the file descriptor for the binary stats file from which to read
+ *   header - the binary stats metadata header corresponding to the given FD
+ *   desc - the binary stat metadata for the particular stat to be read
+ *   max_elements - the maximum number of 8-byte values to read into data
+ *
+ * Output Args:
+ *   data - the buffer into which stat data should be read
+ *
+ * Return:
+ *   The number of data elements read into data or -ERRNO on error.
+ *
+ * Read the data values of a specified stat from the binary stats interface.
+ */
+int read_stat_data(int stats_fd, struct kvm_stats_header *header,
+		   struct kvm_stats_desc *desc, uint64_t *data,
+		   ssize_t max_elements)
+{
+	ssize_t size = min_t(ssize_t, desc->size, max_elements);
+	ssize_t ret;
+
+	ret = pread(stats_fd, data, size * sizeof(*data),
+		    header->data_offset + desc->offset);
+
+	/* ret from pread is in bytes. */
+	ret = ret / sizeof(*data);
+
+	TEST_ASSERT(ret == size,
+		    "Read data of KVM stats: %s", desc->name);
+
+	return ret;
+}
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 06/11] KVM: selftests: Add NX huge pages test
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (4 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-05 18:59   ` Sean Christopherson
  2022-05-03 18:30 ` [PATCH v7 07/11] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

There's currently no test coverage of NX hugepages in KVM selftests, so
add a basic test to ensure that the feature works as intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/Makefile          |  10 +
 .../selftests/kvm/include/kvm_util_base.h     |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  80 ++++++++
 .../selftests/kvm/x86_64/nx_huge_pages_test.c | 180 ++++++++++++++++++
 .../kvm/x86_64/nx_huge_pages_test.sh          |  36 ++++
 5 files changed, 307 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
 create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index af582d168621..9bb9bce4df37 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -43,6 +43,10 @@ LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handler
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
 LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
 
+# Non-compiled test targets
+TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
+
+# Compiled test targets
 TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
 TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
 TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
@@ -104,6 +108,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
 TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
 TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 
+# Compiled outputs used by test targets
+TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
+
 TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
@@ -142,7 +149,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
 TEST_GEN_PROGS_riscv += set_memory_region_test
 TEST_GEN_PROGS_riscv += kvm_binary_stats_test
 
+TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
 TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
+TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
 LIBKVM += $(LIBKVM_$(UNAME_M))
 
 INSTALL_HDR_PATH = $(top_srcdir)/usr
@@ -193,6 +202,7 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
 x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
 all: $(STATIC_LIBS)
 $(TEST_GEN_PROGS): $(STATIC_LIBS)
+$(TEST_GEN_PROGS_EXTENDED): $(STATIC_LIBS)
 
 cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
 cscope:
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 2a3a4d9ed8e3..001b55ae25f8 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -406,6 +406,7 @@ struct kvm_stats_desc *read_stats_desc(int stats_fd,
 int read_stat_data(int stats_fd, struct kvm_stats_header *header,
 		   struct kvm_stats_desc *desc, uint64_t *data,
 		   ssize_t max_elements);
+uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index ea4ab64e5997..6930ba298170 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2651,3 +2651,83 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
 
 	return ret;
 }
+
+/*
+ * Read the data of the named stat
+ *
+ * Input Args:
+ *   vm - the VM for which the stat should be read
+ *   stat_name - the name of the stat to read
+ *   max_elements - the maximum number of 8-byte values to read into data
+ *
+ * Output Args:
+ *   data - the buffer into which stat data should be read
+ *
+ * Return:
+ *   The number of data elements read into data or -ERRNO on error.
+ *
+ * Read the data values of a specified stat from the binary stats interface.
+ */
+static int __vm_get_stat(struct kvm_vm *vm, const char *stat_name,
+			 uint64_t *data, ssize_t max_elements)
+{
+	struct kvm_stats_desc *stats_desc;
+	struct kvm_stats_header header;
+	struct kvm_stats_desc *desc;
+	size_t size_desc;
+	int stats_fd;
+	int ret = -EINVAL;
+	int i;
+
+	stats_fd = vm_get_stats_fd(vm);
+
+	read_stats_header(stats_fd, &header);
+
+	stats_desc = read_stats_desc(stats_fd, &header);
+
+	size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
+
+	/* Read kvm stats data one by one */
+	for (i = 0; i < header.num_desc; ++i) {
+		desc = (void *)stats_desc + (i * size_desc);
+
+		if (strcmp(desc->name, stat_name))
+			continue;
+
+		ret = read_stat_data(stats_fd, &header, desc, data,
+				     max_elements);
+
+		break;
+	}
+
+	free(stats_desc);
+	close(stats_fd);
+	return ret;
+}
+
+/*
+ * Read the value of the named stat
+ *
+ * Input Args:
+ *   vm - the VM for which the stat should be read
+ *   stat_name - the name of the stat to read
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   The value of the stat
+ *
+ * Reads the value of the named stat through the binary stat interface. If
+ * the named stat has multiple data elements, only the first will be returned.
+ */
+uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
+{
+	uint64_t data;
+	int ret;
+
+	ret = __vm_get_stat(vm, stat_name, &data, 1);
+	TEST_ASSERT(ret == 1,
+		    "Stat %s expected to have 1 element, but %d returned",
+		    stat_name, ret);
+	return data;
+}
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
new file mode 100644
index 000000000000..238a6047791c
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/nx_huge_page_test.c
+ *
+ * Usage: to be run via nx_huge_page_test.sh, which does the necessary
+ * environment setup and teardown
+ *
+ * Copyright (C) 2022, Google LLC.
+ */
+
+#define _GNU_SOURCE
+
+#include <fcntl.h>
+#include <stdint.h>
+#include <time.h>
+
+#include <test_util.h>
+#include "kvm_util.h"
+
+#define HPAGE_SLOT		10
+#define HPAGE_GVA		(23*1024*1024)
+#define HPAGE_GPA		(10*1024*1024)
+#define HPAGE_SLOT_NPAGES	(512 * 3)
+#define PAGE_SIZE		4096
+
+/*
+ * Passed by nx_huge_pages_test.sh to provide an easy warning if this test is
+ * being run without it.
+ */
+#define MAGIC_TOKEN 887563923
+
+/*
+ * x86 opcode for the return instruction. Used to call into, and then
+ * immediately return from, memory backed with hugepages.
+ */
+#define RETURN_OPCODE 0xC3
+
+/*
+ * Exit the VM after each memory access so that the userspace component of the
+ * test can make assertions about the pages backing the VM.
+ *
+ * See the below for an explanation of how each access should affect the
+ * backing mappings.
+ */
+void guest_code(void)
+{
+	uint64_t hpage_1 = HPAGE_GVA;
+	uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
+	uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
+
+	READ_ONCE(*(uint64_t *)hpage_1);
+	GUEST_SYNC(1);
+
+	READ_ONCE(*(uint64_t *)hpage_2);
+	GUEST_SYNC(2);
+
+	((void (*)(void)) hpage_1)();
+	GUEST_SYNC(3);
+
+	((void (*)(void)) hpage_3)();
+	GUEST_SYNC(4);
+
+	READ_ONCE(*(uint64_t *)hpage_1);
+	GUEST_SYNC(5);
+
+	READ_ONCE(*(uint64_t *)hpage_3);
+	GUEST_SYNC(6);
+}
+
+static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
+{
+	int actual_pages_2m;
+
+	actual_pages_2m = vm_get_stat(vm, "pages_2m");
+
+	TEST_ASSERT(actual_pages_2m == expected_pages_2m,
+		    "Unexpected 2m page count. Expected %d, got %d",
+		    expected_pages_2m, actual_pages_2m);
+}
+
+static void check_split_count(struct kvm_vm *vm, int expected_splits)
+{
+	int actual_splits;
+
+	actual_splits = vm_get_stat(vm, "nx_lpage_splits");
+
+	TEST_ASSERT(actual_splits == expected_splits,
+		    "Unexpected NX huge page split count. Expected %d, got %d",
+		    expected_splits, actual_splits);
+}
+
+int main(int argc, char **argv)
+{
+	struct kvm_vm *vm;
+	struct timespec ts;
+	void *hva;
+
+	if (argc != 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {
+		printf("This test must be run through nx_huge_pages_test.sh");
+		return KSFT_SKIP;
+	}
+
+	vm = vm_create_default(0, 0, guest_code);
+
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
+				    HPAGE_GPA, HPAGE_SLOT,
+				    HPAGE_SLOT_NPAGES, 0);
+
+	virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
+
+	hva = addr_gpa2hva(vm, HPAGE_GPA);
+	memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
+
+	check_2m_page_count(vm, 0);
+	check_split_count(vm, 0);
+
+	/*
+	 * The guest code will first read from the first hugepage, resulting
+	 * in a huge page mapping being created.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 0);
+
+	/*
+	 * Then the guest code will read from the second hugepage, resulting
+	 * in another huge page mapping being created.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 2);
+	check_split_count(vm, 0);
+
+	/*
+	 * Next, the guest will execute from the first huge page, causing it
+	 * to be remapped at 4k.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 1);
+
+	/*
+	 * Executing from the third huge page (previously unaccessed) will
+	 * cause part to be mapped at 4k.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 2);
+
+	/* Reading from the first huge page again should have no effect. */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 2);
+
+	/*
+	 * Give recovery thread time to run. The wrapper script sets
+	 * recovery_period_ms to 100, so wait 5x that.
+	 */
+	ts.tv_sec = 0;
+	ts.tv_nsec = 500000000;
+	nanosleep(&ts, NULL);
+
+	/*
+	 * Now that the reclaimer has run, all the split pages should be gone.
+	 */
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 0);
+
+	/*
+	 * The 4k mapping on hpage 3 should have been removed, so check that
+	 * reading from it causes a huge page mapping to be installed.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 2);
+	check_split_count(vm, 0);
+
+	kvm_vm_free(vm);
+
+	return 0;
+}
+
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
new file mode 100755
index 000000000000..60bfed8181b9
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
@@ -0,0 +1,36 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only */
+#
+# Wrapper script which performs setup and cleanup for nx_huge_pages_test.
+# Makes use of root privileges to set up huge pages and KVM module parameters.
+#
+# tools/testing/selftests/kvm/nx_huge_page_test.sh
+# Copyright (C) 2022, Google LLC.
+
+set -e
+
+NX_HUGE_PAGES=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages)
+NX_HUGE_PAGES_RECOVERY_RATIO=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
+NX_HUGE_PAGES_RECOVERY_PERIOD=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
+HUGE_PAGES=$(sudo cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
+
+set +e
+
+(
+	set -e
+
+	sudo echo 1 > /sys/module/kvm/parameters/nx_huge_pages
+	sudo echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+	sudo echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+	sudo echo 3 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+	"$(dirname $0)"/nx_huge_pages_test 887563923
+)
+RET=$?
+
+sudo echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
+sudo echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+sudo echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+sudo echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+exit $RET
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 07/11] KVM: x86: Fix errant brace in KVM capability handling
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (5 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 06/11] KVM: selftests: Add NX huge pages test Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 08/11] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

The braces around the KVM_CAP_XSAVE2 block also surround the
KVM_CAP_PMU_CAPABILITY block, likely the result of a merge issue. Simply
move the curly brace back to where it belongs.

Fixes: ba7bb663f5547 ("KVM: x86: Provide per VM capability for disabling PMU virtualization")

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8ee8c91fa762..5ba9c7c8140a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4382,10 +4382,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		if (r < sizeof(struct kvm_xsave))
 			r = sizeof(struct kvm_xsave);
 		break;
+	}
 	case KVM_CAP_PMU_CAPABILITY:
 		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
 		break;
-	}
 	case KVM_CAP_DISABLE_QUIRKS2:
 		r = KVM_X86_VALID_QUIRKS;
 		break;
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 08/11] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (6 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 07/11] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 09/11] KVM: selftests: Factor out calculation of pages needed for a VM Ben Gardon
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

In some cases, the NX hugepage mitigation for iTLB multihit is not
needed for all guests on a host. Allow disabling the mitigation on a
per-VM basis to avoid the performance hit of NX hugepages on trusted
workloads.

In order to disable NX hugepages on a VM, ensure that the userspace
actor has permission to reboot the system. Since disabling NX hugepages
would allow a guest to crash the system, it is similar to reboot
permissions.

Ideally, KVM would require userspace to prove it has access to KVM's
nx_huge_pages module param, e.g. so that userspace can opt out without
needing full reboot permissions.  But getting access to the module param
file info is difficult because it is buried in layers of sysfs and module
glue. Requiring CAP_SYS_BOOT is sufficient for all known use cases.

Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 Documentation/virt/kvm/api.rst  | 17 +++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.h              |  8 ++++----
 arch/x86/kvm/mmu/spte.c         |  7 ++++---
 arch/x86/kvm/mmu/spte.h         |  3 ++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 arch/x86/kvm/x86.c              | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h        |  1 +
 8 files changed, 61 insertions(+), 9 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 23baf7fce038..ddda1f1440d6 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7879,6 +7879,23 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability.  Setting
 this capability will disable PMU virtualization for that VM.  Usermode
 should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
 
+8.36 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
+---------------------------
+
+:Capability KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
+:Architectures: x86
+:Type: vm
+:Parameters: arg[0] must be 0.
+:Returns 0 on success, -EPERM if the userspace process does not
+	 have CAP_SYS_BOOT, -EINVAL if args[0] is not 0 or any vCPUs have been
+	 created.
+
+This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
+
+The capability has no effect if the nx_huge_pages module parameter is not set.
+
+This capability may only be set before any vCPUs are created.
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c59fea4bdb6e..0a1bb01c3229 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1249,6 +1249,8 @@ struct kvm_arch {
 	 * the global KVM_MAX_VCPU_IDS may lead to significant memory waste.
 	 */
 	u32 max_vcpu_ids;
+
+	bool disable_nx_huge_pages;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 7e258cc94152..0ce84042c059 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -217,9 +217,9 @@ struct kvm_page_fault {
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 
 extern int nx_huge_pages;
-static inline bool is_nx_huge_page_enabled(void)
+static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
 {
-	return READ_ONCE(nx_huge_pages);
+	return READ_ONCE(nx_huge_pages) && !kvm->arch.disable_nx_huge_pages;
 }
 
 static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
@@ -235,8 +235,8 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		.user = err & PFERR_USER_MASK,
 		.prefetch = prefetch,
 		.is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
-		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(),
-
+		.nx_huge_page_workaround_enabled =
+			is_nx_huge_page_enabled(vcpu->kvm),
 		.max_level = KVM_MAX_HUGEPAGE_LEVEL,
 		.req_level = PG_LEVEL_4K,
 		.goal_level = PG_LEVEL_4K,
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 75c9e87d446a..ecea6cfc965d 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -117,7 +117,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 		spte |= spte_shadow_accessed_mask(spte);
 
 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
-	    is_nx_huge_page_enabled()) {
+	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &= ~ACC_EXEC_MASK;
 	}
 
@@ -216,7 +216,8 @@ static u64 make_spte_executable(u64 spte)
  * This is used during huge page splitting to build the SPTEs that make up the
  * new page table.
  */
-u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
+u64 make_huge_page_split_spte(struct kvm *kvm, u64 huge_spte, int huge_level,
+			      int index)
 {
 	u64 child_spte;
 	int child_level;
@@ -244,7 +245,7 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
 		 * When splitting to a 4K page, mark the page executable as the
 		 * NX hugepage mitigation no longer applies.
 		 */
-		if (is_nx_huge_page_enabled())
+		if (is_nx_huge_page_enabled(kvm))
 			child_spte = make_spte_executable(child_spte);
 	}
 
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index fbbab180395e..19eb5ea67d95 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -412,7 +412,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	       unsigned int pte_access, gfn_t gfn, kvm_pfn_t pfn,
 	       u64 old_spte, bool prefetch, bool can_unsync,
 	       bool host_writable, u64 *new_spte);
-u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index);
+u64 make_huge_page_split_spte(struct kvm *kvm, u64 huge_spte, int huge_level,
+			      int index);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 4fabb2cd0ba9..36f33a95a414 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1470,7 +1470,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
 	 * not been linked in yet and thus is not reachable from any other CPU.
 	 */
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++)
-		sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i);
+		sp->spt[i] = make_huge_page_split_spte(kvm, huge_spte, level, i);
 
 	/*
 	 * Replace the huge spte with a pointer to the populated lower level
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ba9c7c8140a..6ce2b18dcaa1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4286,6 +4286,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SYS_ATTRIBUTES:
 	case KVM_CAP_VAPIC:
 	case KVM_CAP_ENABLE_CAP:
+	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
@@ -6093,6 +6094,35 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 		mutex_unlock(&kvm->lock);
 		break;
+	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+		r = -EINVAL;
+
+		/*
+		 * Since the risk of disabling NX hugepages is a guest crashing
+		 * the system, ensure the userspace process has permission to
+		 * reboot the system.
+		 *
+		 * Note that unlike the reboot() syscall, the process must have
+		 * this capability in the root namespace because exposing
+		 * /dev/kvm into a container does not limit the scope of the
+		 * iTLB multihit bug to that container. In other words,
+		 * this must use capable(), not ns_capable().
+		 */
+		if (!capable(CAP_SYS_BOOT)) {
+			r = -EPERM;
+			break;
+		}
+
+		if (cap->args[0])
+			break;
+
+		mutex_lock(&kvm->lock);
+		if (!kvm->created_vcpus) {
+			kvm->arch.disable_nx_huge_pages = true;
+			r = 0;
+		}
+		mutex_unlock(&kvm->lock);
+		break;
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index e10d131edd80..3a09528a9a81 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1153,6 +1153,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DISABLE_QUIRKS2 213
 #define KVM_CAP_VM_TSC_CONTROL 214
 #define KVM_CAP_SYSTEM_EVENT_DATA 215
+#define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 216
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 09/11] KVM: selftests: Factor out calculation of pages needed for a VM
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (7 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 08/11] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
  2022-05-03 18:30 ` [PATCH v7 11/11] KVM: selftests: Cache binary stats metadata for duration of test Ben Gardon
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

Factor out the calculation of the number of pages needed for a VM to
make it easier to separate creating the VM and adding vCPUs.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  4 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 59 ++++++++++++++-----
 2 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 001b55ae25f8..1dac3c6607f1 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -312,6 +312,10 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
 			      vm_paddr_t paddr_min, uint32_t memslot);
 vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
 
+uint64_t vm_pages_needed(enum vm_guest_mode mode, uint32_t nr_vcpus,
+			 uint64_t slot0_mem_pages, uint64_t extra_mem_pages,
+			 uint32_t num_percpu_pages);
+
 /*
  * Create a VM with reasonable defaults
  *
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 6930ba298170..27ffd2537df6 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -377,7 +377,7 @@ struct kvm_vm *vm_create_without_vcpus(enum vm_guest_mode mode, uint64_t pages)
 }
 
 /*
- * VM Create with customized parameters
+ * Get the number of pages needed for a VM
  *
  * Input Args:
  *   mode - VM Mode (e.g. VM_MODE_P52V48_4K)
@@ -385,27 +385,17 @@ struct kvm_vm *vm_create_without_vcpus(enum vm_guest_mode mode, uint64_t pages)
  *   slot0_mem_pages - Slot0 physical memory size
  *   extra_mem_pages - Non-slot0 physical memory total size
  *   num_percpu_pages - Per-cpu physical memory pages
- *   guest_code - Guest entry point
- *   vcpuids - VCPU IDs
  *
  * Output Args: None
  *
  * Return:
- *   Pointer to opaque structure that describes the created VM.
- *
- * Creates a VM with the mode specified by mode (e.g. VM_MODE_P52V48_4K),
- * with customized slot0 memory size, at least 512 pages currently.
- * extra_mem_pages is only used to calculate the maximum page table size,
- * no real memory allocation for non-slot0 memory in this function.
+ *   The number of pages needed for the VM.
  */
-struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
-				    uint64_t slot0_mem_pages, uint64_t extra_mem_pages,
-				    uint32_t num_percpu_pages, void *guest_code,
-				    uint32_t vcpuids[])
+uint64_t vm_pages_needed(enum vm_guest_mode mode, uint32_t nr_vcpus,
+			 uint64_t slot0_mem_pages, uint64_t extra_mem_pages,
+			 uint32_t num_percpu_pages)
 {
 	uint64_t vcpu_pages, extra_pg_pages, pages;
-	struct kvm_vm *vm;
-	int i;
 
 	/* Force slot0 memory size not small than DEFAULT_GUEST_PHY_PAGES */
 	if (slot0_mem_pages < DEFAULT_GUEST_PHY_PAGES)
@@ -421,11 +411,48 @@ struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
 	extra_pg_pages = (slot0_mem_pages + extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
 	pages = slot0_mem_pages + vcpu_pages + extra_pg_pages;
 
+	pages = vm_adjust_num_guest_pages(mode, pages);
+
+	return pages;
+}
+
+/*
+ * VM Create with customized parameters
+ *
+ * Input Args:
+ *   mode - VM Mode (e.g. VM_MODE_P52V48_4K)
+ *   nr_vcpus - VCPU count
+ *   slot0_mem_pages - Slot0 physical memory size
+ *   extra_mem_pages - Non-slot0 physical memory total size
+ *   num_percpu_pages - Per-cpu physical memory pages
+ *   guest_code - Guest entry point
+ *   vcpuids - VCPU IDs
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   Pointer to opaque structure that describes the created VM.
+ *
+ * Creates a VM with the mode specified by mode (e.g. VM_MODE_P52V48_4K),
+ * with customized slot0 memory size, at least 512 pages currently.
+ * extra_mem_pages is only used to calculate the maximum page table size,
+ * no real memory allocation for non-slot0 memory in this function.
+ */
+struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
+				    uint64_t slot0_mem_pages, uint64_t extra_mem_pages,
+				    uint32_t num_percpu_pages, void *guest_code,
+				    uint32_t vcpuids[])
+{
+	uint64_t pages;
+	struct kvm_vm *vm;
+	int i;
+
 	TEST_ASSERT(nr_vcpus <= kvm_check_cap(KVM_CAP_MAX_VCPUS),
 		    "nr_vcpus = %d too large for host, max-vcpus = %d",
 		    nr_vcpus, kvm_check_cap(KVM_CAP_MAX_VCPUS));
 
-	pages = vm_adjust_num_guest_pages(mode, pages);
+	pages = vm_pages_needed(mode, nr_vcpus, slot0_mem_pages,
+				extra_mem_pages, num_percpu_pages);
 
 	vm = vm_create_without_vcpus(mode, pages);
 
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on a VM
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (8 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 09/11] KVM: selftests: Factor out calculation of pages needed for a VM Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  2022-05-03 18:34   ` Ben Gardon
  2022-05-05 19:14   ` Sean Christopherson
  2022-05-03 18:30 ` [PATCH v7 11/11] KVM: selftests: Cache binary stats metadata for duration of test Ben Gardon
  10 siblings, 2 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

Add an argument to the NX huge pages test to test disabling the feature
on a VM using the new capability.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  2 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 27 ++++++-
 .../selftests/kvm/x86_64/nx_huge_pages_test.c | 70 +++++++++++++++----
 .../kvm/x86_64/nx_huge_pages_test.sh          | 12 +++-
 4 files changed, 95 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 1dac3c6607f1..eee96189c1c4 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -414,4 +414,6 @@ uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name);
 
 uint32_t guest_get_vcpuid(void);
 
+int __vm_disable_nx_huge_pages(struct kvm_vm *vm);
+
 #endif /* SELFTEST_KVM_UTIL_BASE_H */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 27ffd2537df6..0ec7efc2900d 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -112,6 +112,11 @@ int vm_check_cap(struct kvm_vm *vm, long cap)
 	return ret;
 }
 
+static int __vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
+{
+	return ioctl(vm->fd, KVM_ENABLE_CAP, cap);
+}
+
 /* VM Enable Capability
  *
  * Input Args:
@@ -128,7 +133,7 @@ int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
 {
 	int ret;
 
-	ret = ioctl(vm->fd, KVM_ENABLE_CAP, cap);
+	ret = __vm_enable_cap(vm, cap);
 	TEST_ASSERT(ret == 0, "KVM_ENABLE_CAP IOCTL failed,\n"
 		"  rc: %i errno: %i", ret, errno);
 
@@ -2758,3 +2763,23 @@ uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
 		    stat_name, ret);
 	return data;
 }
+
+/* VM disable NX huge pages
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *
+ * Output Args: None
+ *
+ * Return: On success, 0 -ERRNO on failure.
+ *
+ * Disables NX huge pages for the VM.
+ */
+int __vm_disable_nx_huge_pages(struct kvm_vm *vm)
+{
+	struct kvm_enable_cap cap = { 0 };
+
+	cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES;
+	cap.args[0] = 0;
+	return __vm_enable_cap(vm, &cap);
+}
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
index 238a6047791c..1e7328dd33d2 100644
--- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -13,6 +13,8 @@
 #include <fcntl.h>
 #include <stdint.h>
 #include <time.h>
+#include <linux/reboot.h>
+#include <sys/syscall.h>
 
 #include <test_util.h>
 #include "kvm_util.h"
@@ -89,18 +91,36 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits)
 		    expected_splits, actual_splits);
 }
 
-int main(int argc, char **argv)
+void run_test(bool disable_nx_huge_pages, bool reboot_permissions)
 {
 	struct kvm_vm *vm;
 	struct timespec ts;
+	uint64_t pages;
 	void *hva;
-
-	if (argc != 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {
-		printf("This test must be run through nx_huge_pages_test.sh");
-		return KSFT_SKIP;
+	int r;
+
+	pages = vm_pages_needed(VM_MODE_DEFAULT, 1, DEFAULT_GUEST_PHY_PAGES,
+				0, 0);
+	vm = vm_create_without_vcpus(VM_MODE_DEFAULT, pages);
+
+	if (disable_nx_huge_pages) {
+		/*
+		 * Cannot run the test without NX huge pages if the kernel
+		 * does not support it.
+		 */
+		if (!kvm_check_cap(KVM_CAP_VM_DISABLE_NX_HUGE_PAGES))
+			return;
+
+		r = __vm_disable_nx_huge_pages(vm);
+		if (reboot_permissions) {
+			TEST_ASSERT(!r, "Disabling NX huge pages should succeed if process has reboot permissions");
+		} else {
+			TEST_ASSERT(r == -EPERM, "This process should not have permission to disable NX huge pages");
+			return;
+		}
 	}
 
-	vm = vm_create_default(0, 0, guest_code);
+	vm_vcpu_add_default(vm, 0, guest_code);
 
 	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
 				    HPAGE_GPA, HPAGE_SLOT,
@@ -133,23 +153,27 @@ int main(int argc, char **argv)
 	/*
 	 * Next, the guest will execute from the first huge page, causing it
 	 * to be remapped at 4k.
+	 *
+	 * If NX huge pages are disabled, this should have no effect.
 	 */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 1);
-	check_split_count(vm, 1);
+	check_2m_page_count(vm, disable_nx_huge_pages ? 2 : 1);
+	check_split_count(vm, disable_nx_huge_pages ? 0 : 1);
 
 	/*
 	 * Executing from the third huge page (previously unaccessed) will
 	 * cause part to be mapped at 4k.
+	 *
+	 * If NX huge pages are disabled, it should be mapped at 2M.
 	 */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 1);
-	check_split_count(vm, 2);
+	check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 1);
+	check_split_count(vm, disable_nx_huge_pages ? 0 : 2);
 
 	/* Reading from the first huge page again should have no effect. */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 1);
-	check_split_count(vm, 2);
+	check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 1);
+	check_split_count(vm, disable_nx_huge_pages ? 0 : 2);
 
 	/*
 	 * Give recovery thread time to run. The wrapper script sets
@@ -161,8 +185,11 @@ int main(int argc, char **argv)
 
 	/*
 	 * Now that the reclaimer has run, all the split pages should be gone.
+	 *
+	 * If NX huge pages are disabled, the relaimer will not run, so
+	 * nothing should change from here on.
 	 */
-	check_2m_page_count(vm, 1);
+	check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 1);
 	check_split_count(vm, 0);
 
 	/*
@@ -170,10 +197,25 @@ int main(int argc, char **argv)
 	 * reading from it causes a huge page mapping to be installed.
 	 */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 2);
+	check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 2);
 	check_split_count(vm, 0);
 
 	kvm_vm_free(vm);
+}
+
+int main(int argc, char **argv)
+{
+	bool reboot_permissions;
+
+	if (argc != 3 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {
+		printf("This test must be run through nx_huge_pages_test.sh");
+		return KSFT_SKIP;
+	}
+
+	reboot_permissions = strtol(argv[2], NULL, 0);
+
+	run_test(false, reboot_permissions);
+	run_test(true, reboot_permissions);
 
 	return 0;
 }
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
index 60bfed8181b9..c21c1f639141 100755
--- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
@@ -16,6 +16,8 @@ HUGE_PAGES=$(sudo cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
 
 set +e
 
+NXECUTABLE="$(dirname $0)/nx_huge_pages_test"
+
 (
 	set -e
 
@@ -24,7 +26,15 @@ set +e
 	sudo echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
 	sudo echo 3 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 
-	"$(dirname $0)"/nx_huge_pages_test 887563923
+	# Test with reboot permissions
+	sudo setcap cap_sys_boot+ep $NXECUTABLE
+	$NXECUTABLE 887563923 1
+
+	# Test without reboot permissions
+	if [ $(whoami) != "root" ] ; then
+		sudo setcap cap_sys_boot-ep $NXECUTABLE
+		$NXECUTABLE 887563923 0
+	fi
 )
 RET=$?
 
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v7 11/11] KVM: selftests: Cache binary stats metadata for duration of test
  2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (9 preceding siblings ...)
  2022-05-03 18:30 ` [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
@ 2022-05-03 18:30 ` Ben Gardon
  10 siblings, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:30 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid, Ben Gardon

In order to improve performance across multiple reads of VM stats, cache
the stats metadata in the VM struct.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c    | 31 ++++++++++---------
 .../selftests/kvm/lib/kvm_util_internal.h     |  5 +++
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 0ec7efc2900d..4bb012efcc88 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -736,6 +736,12 @@ void kvm_vm_free(struct kvm_vm *vmp)
 	if (vmp == NULL)
 		return;
 
+	/* Free cached stats metadata and close FD */
+	if (vmp->stats_fd) {
+		free(vmp->stats_desc);
+		close(vmp->stats_fd);
+	}
+
 	/* Free userspace_mem_regions. */
 	hash_for_each_safe(vmp->regions.slot_hash, ctr, node, region, slot_node)
 		__vm_mem_region_delete(vmp, region, false);
@@ -2703,37 +2709,32 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
 static int __vm_get_stat(struct kvm_vm *vm, const char *stat_name,
 			 uint64_t *data, ssize_t max_elements)
 {
-	struct kvm_stats_desc *stats_desc;
-	struct kvm_stats_header header;
 	struct kvm_stats_desc *desc;
 	size_t size_desc;
-	int stats_fd;
 	int ret = -EINVAL;
 	int i;
 
-	stats_fd = vm_get_stats_fd(vm);
-
-	read_stats_header(stats_fd, &header);
-
-	stats_desc = read_stats_desc(stats_fd, &header);
+	if (!vm->stats_fd) {
+		vm->stats_fd = vm_get_stats_fd(vm);
+		read_stats_header(vm->stats_fd, &vm->stats_header);
+		vm->stats_desc = read_stats_desc(vm->stats_fd, &vm->stats_header);
+	}
 
-	size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
+	size_desc = sizeof(struct kvm_stats_desc) + vm->stats_header.name_size;
 
 	/* Read kvm stats data one by one */
-	for (i = 0; i < header.num_desc; ++i) {
-		desc = (void *)stats_desc + (i * size_desc);
+	for (i = 0; i < vm->stats_header.num_desc; ++i) {
+		desc = (void *)vm->stats_desc + (i * size_desc);
 
 		if (strcmp(desc->name, stat_name))
 			continue;
 
-		ret = read_stat_data(stats_fd, &header, desc, data,
-				     max_elements);
+		ret = read_stat_data(vm->stats_fd, &vm->stats_header, desc,
+				     data, max_elements);
 
 		break;
 	}
 
-	free(stats_desc);
-	close(stats_fd);
 	return ret;
 }
 
diff --git a/tools/testing/selftests/kvm/lib/kvm_util_internal.h b/tools/testing/selftests/kvm/lib/kvm_util_internal.h
index a03febc24ba6..e753edd7b8d3 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util_internal.h
+++ b/tools/testing/selftests/kvm/lib/kvm_util_internal.h
@@ -67,6 +67,11 @@ struct kvm_vm {
 	vm_vaddr_t idt;
 	vm_vaddr_t handlers;
 	uint32_t dirty_ring_size;
+
+	/* Cache of information for binary stats interface */
+	int stats_fd;
+	struct kvm_stats_header stats_header;
+	struct kvm_stats_desc *stats_desc;
 };
 
 struct vcpu *vcpu_find(struct kvm_vm *vm, uint32_t vcpuid);
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on a VM
  2022-05-03 18:30 ` [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
@ 2022-05-03 18:34   ` Ben Gardon
  2022-05-05 19:14   ` Sean Christopherson
  1 sibling, 0 replies; 19+ messages in thread
From: Ben Gardon @ 2022-05-03 18:34 UTC (permalink / raw)
  To: LKML, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Tue, May 3, 2022 at 11:31 AM Ben Gardon <bgardon@google.com> wrote:
>
> Add an argument to the NX huge pages test to test disabling the feature
> on a VM using the new capability.
>
> Reviewed-by: David Matlack <dmatlack@google.com>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  2 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 27 ++++++-
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 70 +++++++++++++++----
>  .../kvm/x86_64/nx_huge_pages_test.sh          | 12 +++-
>  4 files changed, 95 insertions(+), 16 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 1dac3c6607f1..eee96189c1c4 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -414,4 +414,6 @@ uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name);
>
>  uint32_t guest_get_vcpuid(void);
>
> +int __vm_disable_nx_huge_pages(struct kvm_vm *vm);
> +
>  #endif /* SELFTEST_KVM_UTIL_BASE_H */
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 27ffd2537df6..0ec7efc2900d 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -112,6 +112,11 @@ int vm_check_cap(struct kvm_vm *vm, long cap)
>         return ret;
>  }
>
> +static int __vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
> +{
> +       return ioctl(vm->fd, KVM_ENABLE_CAP, cap);
> +}
> +
>  /* VM Enable Capability
>   *
>   * Input Args:
> @@ -128,7 +133,7 @@ int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
>  {
>         int ret;
>
> -       ret = ioctl(vm->fd, KVM_ENABLE_CAP, cap);
> +       ret = __vm_enable_cap(vm, cap);
>         TEST_ASSERT(ret == 0, "KVM_ENABLE_CAP IOCTL failed,\n"
>                 "  rc: %i errno: %i", ret, errno);
>
> @@ -2758,3 +2763,23 @@ uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
>                     stat_name, ret);
>         return data;
>  }
> +
> +/* VM disable NX huge pages
> + *
> + * Input Args:
> + *   vm - Virtual Machine
> + *
> + * Output Args: None
> + *
> + * Return: On success, 0 -ERRNO on failure.
> + *
> + * Disables NX huge pages for the VM.
> + */
> +int __vm_disable_nx_huge_pages(struct kvm_vm *vm)
> +{
> +       struct kvm_enable_cap cap = { 0 };
> +
> +       cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES;
> +       cap.args[0] = 0;
> +       return __vm_enable_cap(vm, &cap);
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> index 238a6047791c..1e7328dd33d2 100644
> --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -13,6 +13,8 @@
>  #include <fcntl.h>
>  #include <stdint.h>
>  #include <time.h>
> +#include <linux/reboot.h>
> +#include <sys/syscall.h>
>
>  #include <test_util.h>
>  #include "kvm_util.h"
> @@ -89,18 +91,36 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits)
>                     expected_splits, actual_splits);
>  }
>
> -int main(int argc, char **argv)
> +void run_test(bool disable_nx_huge_pages, bool reboot_permissions)
>  {
>         struct kvm_vm *vm;
>         struct timespec ts;
> +       uint64_t pages;
>         void *hva;
> -
> -       if (argc != 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {
> -               printf("This test must be run through nx_huge_pages_test.sh");
> -               return KSFT_SKIP;
> +       int r;
> +
> +       pages = vm_pages_needed(VM_MODE_DEFAULT, 1, DEFAULT_GUEST_PHY_PAGES,
> +                               0, 0);
> +       vm = vm_create_without_vcpus(VM_MODE_DEFAULT, pages);
> +
> +       if (disable_nx_huge_pages) {
> +               /*
> +                * Cannot run the test without NX huge pages if the kernel
> +                * does not support it.
> +                */
> +               if (!kvm_check_cap(KVM_CAP_VM_DISABLE_NX_HUGE_PAGES))
> +                       return;
> +
> +               r = __vm_disable_nx_huge_pages(vm);
> +               if (reboot_permissions) {
> +                       TEST_ASSERT(!r, "Disabling NX huge pages should succeed if process has reboot permissions");
> +               } else {
> +                       TEST_ASSERT(r == -EPERM, "This process should not have permission to disable NX huge pages");
> +                       return;
> +               }
>         }
>
> -       vm = vm_create_default(0, 0, guest_code);
> +       vm_vcpu_add_default(vm, 0, guest_code);
>
>         vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
>                                     HPAGE_GPA, HPAGE_SLOT,
> @@ -133,23 +153,27 @@ int main(int argc, char **argv)
>         /*
>          * Next, the guest will execute from the first huge page, causing it
>          * to be remapped at 4k.
> +        *
> +        * If NX huge pages are disabled, this should have no effect.
>          */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 1);
> -       check_split_count(vm, 1);
> +       check_2m_page_count(vm, disable_nx_huge_pages ? 2 : 1);
> +       check_split_count(vm, disable_nx_huge_pages ? 0 : 1);
>
>         /*
>          * Executing from the third huge page (previously unaccessed) will
>          * cause part to be mapped at 4k.
> +        *
> +        * If NX huge pages are disabled, it should be mapped at 2M.
>          */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 1);
> -       check_split_count(vm, 2);
> +       check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 1);
> +       check_split_count(vm, disable_nx_huge_pages ? 0 : 2);
>
>         /* Reading from the first huge page again should have no effect. */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 1);
> -       check_split_count(vm, 2);
> +       check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 1);
> +       check_split_count(vm, disable_nx_huge_pages ? 0 : 2);
>
>         /*
>          * Give recovery thread time to run. The wrapper script sets
> @@ -161,8 +185,11 @@ int main(int argc, char **argv)
>
>         /*
>          * Now that the reclaimer has run, all the split pages should be gone.
> +        *
> +        * If NX huge pages are disabled, the relaimer will not run, so
> +        * nothing should change from here on.
>          */
> -       check_2m_page_count(vm, 1);
> +       check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 1);
>         check_split_count(vm, 0);
>
>         /*
> @@ -170,10 +197,25 @@ int main(int argc, char **argv)
>          * reading from it causes a huge page mapping to be installed.
>          */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 2);
> +       check_2m_page_count(vm, disable_nx_huge_pages ? 3 : 2);
>         check_split_count(vm, 0);
>
>         kvm_vm_free(vm);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +       bool reboot_permissions;
> +
> +       if (argc != 3 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {
> +               printf("This test must be run through nx_huge_pages_test.sh");
> +               return KSFT_SKIP;
> +       }
> +
> +       reboot_permissions = strtol(argv[2], NULL, 0);
> +
> +       run_test(false, reboot_permissions);
> +       run_test(true, reboot_permissions);
>
>         return 0;
>  }
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> index 60bfed8181b9..c21c1f639141 100755
> --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -16,6 +16,8 @@ HUGE_PAGES=$(sudo cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
>
>  set +e
>
> +NXECUTABLE="$(dirname $0)/nx_huge_pages_test"
> +
>  (
>         set -e
>
> @@ -24,7 +26,15 @@ set +e
>         sudo echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
>         sudo echo 3 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>
> -       "$(dirname $0)"/nx_huge_pages_test 887563923
> +       # Test with reboot permissions
> +       sudo setcap cap_sys_boot+ep $NXECUTABLE
> +       $NXECUTABLE 887563923 1
> +
> +       # Test without reboot permissions
> +       if [ $(whoami) != "root" ] ; then
> +               sudo setcap cap_sys_boot-ep $NXECUTABLE
> +               $NXECUTABLE 887563923 0
> +       fi
>  )
>  RET=$?

I feel lazy and dumb for not testing this as a non-root user, but I've
been working on setting up an environment to do such testing for over
a week just to test this commit, with no success. Perhaps I am too
ensconced in Google's development tools. In any case, if anyone is
willing to run this test as a non-root user to make sure it doesn't
blow up, I'd appreciate it. I can confirm that the test works fine
when run as root.

>
> --
> 2.36.0.464.gb9c8b46e94-goog
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 03/11] KVM: selftests: Read binary stats desc in lib
  2022-05-03 18:30 ` [PATCH v7 03/11] KVM: selftests: Read binary stats desc " Ben Gardon
@ 2022-05-05 17:08   ` Sean Christopherson
  2022-05-05 17:13     ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2022-05-05 17:08 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Tue, May 03, 2022, Ben Gardon wrote:
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 1d75d41f92dc..12fa8cc88043 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2577,3 +2577,41 @@ void read_stats_header(int stats_fd, struct kvm_stats_header *header)
>  	ret = read(stats_fd, header, sizeof(*header));
>  	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
>  }
> +
> +static ssize_t stats_descs_size(struct kvm_stats_header *header)

Please spell out "descriptors", I find it difficult to visually differentiate
desc vs. descs.  I don't mind the short version for variable names, but for helpers
provided by the library I think the added clarity is worth the verbosity.

> +{
> +	return header->num_desc *
> +	       (sizeof(struct kvm_stats_desc) + header->name_size);
> +}

Aha!  This is the right patch to deal with the "variable" name_size.  Rather than
open code the adjustment for header->name_size, add a helper for _that_.  Then
read_stats_descriptors() can do:

	desc_size = get_stats_descriptor_size(header);
	total_size = header->num_desc * get_stats_descriptor_size(header);

	stats_desc = calloc(header->num_desc, desc_size);
	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");

	ret = pread(stats_fd, stats_desc, total_size, header->desc_offset);
	TEST_ASSERT(ret == total_size, "Read KVM stats descriptors");

Those helpers provide an even better place for the comments that my cleanup patch
adds.

> +
> +/*
> + * Read binary stats descriptors
> + *
> + * Input Args:
> + *   stats_fd - the file descriptor for the binary stats file from which to read
> + *   header - the binary stats metadata header corresponding to the given FD
> + *
> + * Output Args: None
> + *
> + * Return:
> + *   A pointer to a newly allocated series of stat descriptors.
> + *   Caller is responsible for freeing the returned kvm_stats_desc.
> + *
> + * Read the stats descriptors from the binary stats interface.
> + */
> +struct kvm_stats_desc *read_stats_desc(int stats_fd,

This be "read_stats_descriptors" (or read_stats_descs() if someone objects to the
verbose name) to make it clear this reads multiple desriptors.

E.g. this on top (compile tested only)

---
 .../selftests/kvm/include/kvm_util_base.h     | 26 +++++++++++++++++--
 .../selftests/kvm/kvm_binary_stats_test.c     |  7 ++---
 tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++---------
 3 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index fabe46ddc1b2..e31f7113a529 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -401,8 +401,30 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
 int vm_get_stats_fd(struct kvm_vm *vm);
 int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
 void read_stats_header(int stats_fd, struct kvm_stats_header *header);
-struct kvm_stats_desc *read_stats_desc(int stats_fd,
-				       struct kvm_stats_header *header);
+struct kvm_stats_desc *read_stats_descriptors(int stats_fd,
+					      struct kvm_stats_header *header);
+
+static inline ssize_t get_stats_descriptor_size(struct kvm_stats_header *header)
+{
+        /*
+         * The base size of the descriptor is defined by KVM's ABI, but the
+         * size of the name field is variable as far as KVM's ABI is concerned.
+         * But, the size of name is constant for a given instance of KVM and
+         * is provided by KVM in the overall stats header.
+         */
+	return sizeof(struct kvm_stats_desc) + header->name_size;
+}
+
+static inline struct kvm_stats_desc *get_stats_descriptor(struct kvm_stats_desc *stats,
+							  int index,
+							  struct kvm_stats_header *header)
+{
+	/*
+	 * Note, size_desc includes the of the name field, which is
+         * variable, i.e. this is NOT equivalent to &stats_desc[i].
+	 */
+	return (void *)stats + index * get_stats_descriptor_size(header);
+}

 uint32_t guest_get_vcpuid(void);

diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index b49fae45db1e..6252e3d6e964 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -35,7 +35,7 @@ static void stats_test(int stats_fd)
 	/* Read kvm stats header */
 	read_stats_header(stats_fd, &header);

-	size_desc = sizeof(*stats_desc) + header.name_size;
+	size_desc = get_stats_descriptor_size(&header);

 	/* Read kvm stats id string */
 	id = malloc(header.name_size);
@@ -63,11 +63,12 @@ static void stats_test(int stats_fd)
 			"Descriptor block is overlapped with data block");

 	/* Read kvm stats descriptors */
-	stats_desc = read_stats_desc(stats_fd, &header);
+	stats_desc = read_stats_descriptors(stats_fd, &header);

 	/* Sanity check for fields in descriptors */
 	for (i = 0; i < header.num_desc; ++i) {
-		pdesc = (void *)stats_desc + i * size_desc;
+		pdesc = get_stats_descriptor(stats_desc, i, &header);
+
 		/* Check type,unit,base boundaries */
 		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
 				<= KVM_STATS_TYPE_MAX, "Unknown KVM stats type");
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 12fa8cc88043..4374c553de1f 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2578,12 +2578,6 @@ void read_stats_header(int stats_fd, struct kvm_stats_header *header)
 	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
 }

-static ssize_t stats_descs_size(struct kvm_stats_header *header)
-{
-	return header->num_desc *
-	       (sizeof(struct kvm_stats_desc) + header->name_size);
-}
-
 /*
  * Read binary stats descriptors
  *
@@ -2599,19 +2593,20 @@ static ssize_t stats_descs_size(struct kvm_stats_header *header)
  *
  * Read the stats descriptors from the binary stats interface.
  */
-struct kvm_stats_desc *read_stats_desc(int stats_fd,
-				       struct kvm_stats_header *header)
+struct kvm_stats_desc *read_stats_descriptors(int stats_fd,
+					      struct kvm_stats_header *header)
 {
+	ssize_t ret, desc_size, total_size;
 	struct kvm_stats_desc *stats_desc;
-	ssize_t ret;

-	stats_desc = malloc(stats_descs_size(header));
+	desc_size = get_stats_descriptor_size(header);
+	total_size = header->num_desc * get_stats_descriptor_size(header);
+
+	stats_desc = calloc(header->num_desc, desc_size);
 	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");

-	ret = pread(stats_fd, stats_desc, stats_descs_size(header),
-		    header->desc_offset);
-	TEST_ASSERT(ret == stats_descs_size(header),
-		    "Read KVM stats descriptors");
+	ret = pread(stats_fd, stats_desc, total_size, header->desc_offset);
+	TEST_ASSERT(ret == total_size, "Read KVM stats descriptors");

 	return stats_desc;
 }

base-commit: 6d8fd8c4f5fa1da6fa63da1d127b2804e79b1092
--


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 03/11] KVM: selftests: Read binary stats desc in lib
  2022-05-05 17:08   ` Sean Christopherson
@ 2022-05-05 17:13     ` Sean Christopherson
  0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2022-05-05 17:13 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Thu, May 05, 2022, Sean Christopherson wrote:
> @@ -63,11 +63,12 @@ static void stats_test(int stats_fd)
>  			"Descriptor block is overlapped with data block");
> 
>  	/* Read kvm stats descriptors */
> -	stats_desc = read_stats_desc(stats_fd, &header);
> +	stats_desc = read_stats_descriptors(stats_fd, &header);
> 
>  	/* Sanity check for fields in descriptors */
>  	for (i = 0; i < header.num_desc; ++i) {
> -		pdesc = (void *)stats_desc + i * size_desc;
> +		pdesc = get_stats_descriptor(stats_desc, i, &header);
> +
>  		/* Check type,unit,base boundaries */
>  		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
>  				<= KVM_STATS_TYPE_MAX, "Unknown KVM stats type");

Drat, I missed two instances!  And more on top...

---
 tools/testing/selftests/kvm/kvm_binary_stats_test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 6252e3d6e964..6b5ce270b890 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -124,7 +124,7 @@ static void stats_test(int stats_fd)
 			"Data size is not correct");
 	/* Check stats offset */
 	for (i = 0; i < header.num_desc; ++i) {
-		pdesc = (void *)stats_desc + i * size_desc;
+		pdesc = get_stats_descriptor(stats_desc, i, &header);
 		TEST_ASSERT(pdesc->offset < size_data,
 			"Invalid offset (%u) for stats: %s",
 			pdesc->offset, pdesc->name);
@@ -139,7 +139,7 @@ static void stats_test(int stats_fd)
 	/* Read kvm stats data one by one */
 	size_data = 0;
 	for (i = 0; i < header.num_desc; ++i) {
-		pdesc = (void *)stats_desc + i * size_desc;
+		pdesc = get_stats_descriptor(stats_desc, i, &header);
 		ret = pread(stats_fd, stats_data,
 				pdesc->size * sizeof(*stats_data),
 				header.data_offset + size_data);

base-commit: 84185927b3e27502a70685848adffbe56a804d98
--


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib
  2022-05-03 18:30 ` [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib Ben Gardon
@ 2022-05-05 18:06   ` Sean Christopherson
  2022-05-05 18:38     ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2022-05-05 18:06 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Tue, May 03, 2022, Ben Gardon wrote:
> Move the code to read the binary stats data to the KVM selftests
> library. It will be re-used by other tests to check KVM behavior.
> 
> Reviewed-by: David Matlack <dmatlack@google.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  3 ++
>  .../selftests/kvm/kvm_binary_stats_test.c     |  7 ++--
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 36 +++++++++++++++++++
>  3 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index fabe46ddc1b2..2a3a4d9ed8e3 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -403,6 +403,9 @@ int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
>  void read_stats_header(int stats_fd, struct kvm_stats_header *header);
>  struct kvm_stats_desc *read_stats_desc(int stats_fd,
>  				       struct kvm_stats_header *header);
> +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> +		   struct kvm_stats_desc *desc, uint64_t *data,
> +		   ssize_t max_elements);
>  
>  uint32_t guest_get_vcpuid(void);
>  
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index 8b31f8fc7e08..59677fae26e5 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -160,11 +160,8 @@ static void stats_test(int stats_fd)
>  	size_data = 0;
>  	for (i = 0; i < header.num_desc; ++i) {
>  		pdesc = (void *)stats_desc + i * size_desc;
> -		ret = pread(stats_fd, stats_data,
> -				pdesc->size * sizeof(*stats_data),
> -				header.data_offset + size_data);
> -		TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
> -				"Read data of KVM stats: %s", pdesc->name);
> +		read_stat_data(stats_fd, &header, pdesc, stats_data,
> +			       pdesc->size);
>  		size_data += pdesc->size * sizeof(*stats_data);

Not your code, but updating size_data is pointless and confusing.  It's especially
confusing as of this patch because it ignores the return read_stat_data().  I vote
to opportunistically delete this code.

>  	}
>  
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 12fa8cc88043..ea4ab64e5997 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2615,3 +2615,39 @@ struct kvm_stats_desc *read_stats_desc(int stats_fd,
>  
>  	return stats_desc;
>  }
> +
> +/*
> + * Read stat data for a particular stat
> + *
> + * Input Args:
> + *   stats_fd - the file descriptor for the binary stats file from which to read
> + *   header - the binary stats metadata header corresponding to the given FD
> + *   desc - the binary stat metadata for the particular stat to be read
> + *   max_elements - the maximum number of 8-byte values to read into data
> + *
> + * Output Args:
> + *   data - the buffer into which stat data should be read
> + *
> + * Return:
> + *   The number of data elements read into data or -ERRNO on error.

This is a lie, it can never return -ERRNO.  Well, unless the caller is mean and
passes in -EINVAL for max_elements I guess.

> + *
> + * Read the data values of a specified stat from the binary stats interface.
> + */
> +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> +		   struct kvm_stats_desc *desc, uint64_t *data,
> +		   ssize_t max_elements)

Uber nit, @max_elements should be unsigned size_t, only the return from pread()
is signed, the input is unsigned.

> +{
> +	ssize_t size = min_t(ssize_t, desc->size, max_elements);

Not your fault (I blame struct kvm_stats_desc), but "nr_elements" would be far
more appropriate than "size".  And that frees up "size" to be the actual size,
which eliminates the division.

> +	ssize_t ret;
> +
> +	ret = pread(stats_fd, data, size * sizeof(*data),
> +		    header->data_offset + desc->offset);
> +
> +	/* ret from pread is in bytes. */
> +	ret = ret / sizeof(*data);
> +
> +	TEST_ASSERT(ret == size,
> +		    "Read data of KVM stats: %s", desc->name);

It'd be very helpful to print the expected vs. actual.

> +
> +	return ret;

Eww.  I really, really hate code that asserts on a value and then returns that
same value.  E.g. looking at just the declaration of read_stat_data() and the
change in stats_test(), I genuinely thought this patch dropped the assert.  The
assert in vm_get_stat() also added to the confusion (I was reviewing that patch,
not this one).

Rather than return the number of entries read, just assert that the number of
elements to be read is non-zero, then vm_get_stat() doesn't need to assert because
it'll be impossible to read anything but one entry without asserting.

void read_stat_data(int stats_fd, struct kvm_stats_header *header,
		    struct kvm_stats_desc *desc, uint64_t *data,
		    size_t max_elements)
{
	size_t nr_elements = min_t(size_t, desc->size, max_elements);
	size_t size = nr_elements * sizeof(*data);
	ssize_t ret;

	TEST_ASSERT(size, "No elements in stat '%s'", desc->name);

	ret = pread(stats_fd, data, size, header->data_offset + desc->offset);

	TEST_ASSERT(ret == size,
		    "pread() failed on stat '%s', wanted %lu bytes, got %ld",
		    desc->name, size, ret);
}


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib
  2022-05-05 18:06   ` Sean Christopherson
@ 2022-05-05 18:38     ` Sean Christopherson
  0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2022-05-05 18:38 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Thu, May 05, 2022, Sean Christopherson wrote:
> On Tue, May 03, 2022, Ben Gardon wrote:
> Eww.  I really, really hate code that asserts on a value and then returns that
> same value.  E.g. looking at just the declaration of read_stat_data() and the
> change in stats_test(), I genuinely thought this patch dropped the assert.  The
> assert in vm_get_stat() also added to the confusion (I was reviewing that patch,
> not this one).
> 
> Rather than return the number of entries read, just assert that the number of
> elements to be read is non-zero, then vm_get_stat() doesn't need to assert because
> it'll be impossible to read anything but one entry without asserting.

Ah, and __vm_get_stat() can do:

	for (i = 0; i < vm->stats_header.num_desc; ++i) {
		desc = get_stats_descriptor(vm->stats_desc, i, &vm->stats_header);

		if (strcmp(desc->name, stat_name))
			continue;

		read_stat_data(vm->stats_fd, &vm->stats_header, desc, data,
			       max_elements);
		return;
	}

	TEST_FAIL("Stat '%s' does not exist\n", stat_name);

> 
> void read_stat_data(int stats_fd, struct kvm_stats_header *header,
> 		    struct kvm_stats_desc *desc, uint64_t *data,
> 		    size_t max_elements)
> {
> 	size_t nr_elements = min_t(size_t, desc->size, max_elements);
> 	size_t size = nr_elements * sizeof(*data);
> 	ssize_t ret;
> 
> 	TEST_ASSERT(size, "No elements in stat '%s'", desc->name);
> 
> 	ret = pread(stats_fd, data, size, header->data_offset + desc->offset);
> 
> 	TEST_ASSERT(ret == size,
> 		    "pread() failed on stat '%s', wanted %lu bytes, got %ld",
> 		    desc->name, size, ret);

Related to not printing a raw EINVAL (similar to above), it might be worth special
casing the errno path, e.g.

	TEST_ASSERT(ret >= 0, "pread() failed on stat '%s', errno: %i (%s)",
		    desc->name, errno, strerror(errno));
	TEST_ASSERT(ret == size,
		    "pread() on stat '%s' read %ld bytes, wanted %lu bytes",
		    desc->name, size, ret);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 06/11] KVM: selftests: Add NX huge pages test
  2022-05-03 18:30 ` [PATCH v7 06/11] KVM: selftests: Add NX huge pages test Ben Gardon
@ 2022-05-05 18:59   ` Sean Christopherson
  0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2022-05-05 18:59 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Tue, May 03, 2022, Ben Gardon wrote:
> There's currently no test coverage of NX hugepages in KVM selftests, so
> add a basic test to ensure that the feature works as intended.

Please describe how the test actually validates the feature, here and/or as a
file comment.  Doesn't have to be super detailed, but people shouldn't have to
read the code just to understand that the test uses stats to verify the KVM is
(or isn't) splitting pages as expected.

> Reviewed-by: David Matlack <dmatlack@google.com>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |  10 +
>  .../selftests/kvm/include/kvm_util_base.h     |   1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    |  80 ++++++++
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 180 ++++++++++++++++++
>  .../kvm/x86_64/nx_huge_pages_test.sh          |  36 ++++

Test needs to be added to .gitignore.

>  5 files changed, 307 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
>  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

...

> + * Input Args:
> + *   vm - the VM for which the stat should be read
> + *   stat_name - the name of the stat to read
> + *   max_elements - the maximum number of 8-byte values to read into data
> + *
> + * Output Args:
> + *   data - the buffer into which stat data should be read
> + *
> + * Return:
> + *   The number of data elements read into data or -ERRNO on error.
> + *
> + * Read the data values of a specified stat from the binary stats interface.
> + */
> +static int __vm_get_stat(struct kvm_vm *vm, const char *stat_name,
> +			 uint64_t *data, ssize_t max_elements)
> +{
> +	struct kvm_stats_desc *stats_desc;
> +	struct kvm_stats_header header;
> +	struct kvm_stats_desc *desc;
> +	size_t size_desc;
> +	int stats_fd;
> +	int ret = -EINVAL;
> +	int i;
> +
> +	stats_fd = vm_get_stats_fd(vm);
> +
> +	read_stats_header(stats_fd, &header);
> +
> +	stats_desc = read_stats_desc(stats_fd, &header);
> +
> +	size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
> +
> +	/* Read kvm stats data one by one */

Bad copy+paste.  This doesn't read every stat, it reads only the specified stat.

> +	for (i = 0; i < header.num_desc; ++i) {
> +		desc = (void *)stats_desc + (i * size_desc);

		desc = get_stats_descriptor(vm->stats_desc, i, vm->stats_header);

> +
> +		if (strcmp(desc->name, stat_name))
> +			continue;
> +
> +		ret = read_stat_data(stats_fd, &header, desc, data,
> +				     max_elements);
> +
> +		break;
> +	}
> +
> +	free(stats_desc);
> +	close(stats_fd);
> +	return ret;
> +}
> +
> +/*
> + * Read the value of the named stat
> + *
> + * Input Args:
> + *   vm - the VM for which the stat should be read
> + *   stat_name - the name of the stat to read
> + *
> + * Output Args: None
> + *
> + * Return:
> + *   The value of the stat
> + *
> + * Reads the value of the named stat through the binary stat interface. If
> + * the named stat has multiple data elements, only the first will be returned.
> + */
> +uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)

One read_stat_data() doesn't return an int, this can be inlined.  Yeah, it'll
expose __vm_get_stat(), but IMO there's no point in having __vm_get_stat() unless
it's exposed.  At that point, drop the function comment and defer to the inner
helper for that detailed info (or even drop it entirely).

> +{
> +	uint64_t data;
> +	int ret;
> +
> +	ret = __vm_get_stat(vm, stat_name, &data, 1);
> +	TEST_ASSERT(ret == 1,
> +		    "Stat %s expected to have 1 element, but %d returned",
> +		    stat_name, ret);
> +	return data;
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> new file mode 100644
> index 000000000000..238a6047791c
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * tools/testing/selftests/kvm/nx_huge_page_test.c
> + *
> + * Usage: to be run via nx_huge_page_test.sh, which does the necessary
> + * environment setup and teardown
> + *
> + * Copyright (C) 2022, Google LLC.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <fcntl.h>
> +#include <stdint.h>
> +#include <time.h>
> +
> +#include <test_util.h>
> +#include "kvm_util.h"
> +
> +#define HPAGE_SLOT		10
> +#define HPAGE_GVA		(23*1024*1024)
> +#define HPAGE_GPA		(10*1024*1024)

Is there any special meaning behind the addresses?  If not, IMO it's safer to
define the GPA to be 4gb, that way there's no chance of this colliding with
memslot0 created by the framework.  10mb (if my math isn't terrible) isn't all
that high.  And unless the GPA and GVA need to be different for some reason, just
identity map them.

> +#define HPAGE_SLOT_NPAGES	(512 * 3)
> +#define PAGE_SIZE		4096

commit e852be8b148e ("kvm: selftests: introduce and use more page size-related constants")
in kvm/master defines PAGE_SIZE.  I bring that up, because rather than open code
the "512" math in multiple places, I would much rather do something like

#define PAGE_SIZE_2MB		(PAGE_SIZE * 512)

or whatever, and then use that.

> +
> +/*
> + * Passed by nx_huge_pages_test.sh to provide an easy warning if this test is
> + * being run without it.
> + */
> +#define MAGIC_TOKEN 887563923
> +
> +/*
> + * x86 opcode for the return instruction. Used to call into, and then
> + * immediately return from, memory backed with hugepages.
> + */
> +#define RETURN_OPCODE 0xC3
> +
> +/*
> + * Exit the VM after each memory access so that the userspace component of the
> + * test can make assertions about the pages backing the VM.
> + *
> + * See the below for an explanation of how each access should affect the
> + * backing mappings.
> + */
> +void guest_code(void)
> +{
> +	uint64_t hpage_1 = HPAGE_GVA;
> +	uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
> +	uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
> +
> +	READ_ONCE(*(uint64_t *)hpage_1);
> +	GUEST_SYNC(1);
> +
> +	READ_ONCE(*(uint64_t *)hpage_2);
> +	GUEST_SYNC(2);
> +
> +	((void (*)(void)) hpage_1)();

LOL, nice.  It'd be very, very helpful for readers to add a helper for this, e.g.

static guest_do_CALL(void *target)
{
	((void (*)(void)) target)();
}

and then the usage should be a little more obvious.

	guest_do_CALL(hpage_1);

> +	GUEST_SYNC(3);
> +
> +	((void (*)(void)) hpage_3)();
> +	GUEST_SYNC(4);
> +
> +	READ_ONCE(*(uint64_t *)hpage_1);
> +	GUEST_SYNC(5);
> +
> +	READ_ONCE(*(uint64_t *)hpage_3);
> +	GUEST_SYNC(6);
> +}

...

> +int main(int argc, char **argv)
> +{
> +       struct kvm_vm *vm;
> +       struct timespec ts;
> +       void *hva;
> +
> +       if (argc != 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {

Since this will take multiple params, I think it makes senes to skip on:

	if (argc < 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {

And the error out on the remaining params.

> +               printf("This test must be run through nx_huge_pages_test.sh");

Needs a newline at the end.  Even better, just use print_skip().  And I strongly
prefer that the skip message complain about not getting the correct magic token,
not about the script.  It's totally valid to run the test without the script.
I think it's a good idea to provide a redirect to the script, but yell about the
magic token first.

> +               return KSFT_SKIP;
> +       }

...

> +	hva = addr_gpa2hva(vm, HPAGE_GPA);
> +	memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);

Heh, no need to set the entire page, only the first byte needs to be written.  If
you're going for paranoia, write 0xcc to the entire slot and then write only the
first byte to RET.

> +	/*
> +	 * Give recovery thread time to run. The wrapper script sets
> +	 * recovery_period_ms to 100, so wait 5x that.
> +	 */
> +	ts.tv_sec = 0;
> +	ts.tv_nsec = 500000000;
> +	nanosleep(&ts, NULL);

Please pass in the configured nx_huge_pages_recovery_period_ms, or alternatively
read it from within the test.  I assume the test will fail if the period is too
low, so some sanity checks are likely in order.  It'll be unfortunate if someone
changes the script and forgets to update the test.

> +
> +	/*
> +	 * Now that the reclaimer has run, all the split pages should be gone.
> +	 */
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 0);
> +
> +	/*
> +	 * The 4k mapping on hpage 3 should have been removed, so check that
> +	 * reading from it causes a huge page mapping to be installed.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 2);
> +	check_split_count(vm, 0);
> +
> +	kvm_vm_free(vm);
> +
> +	return 0;
> +}
> +
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> new file mode 100755
> index 000000000000..60bfed8181b9
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -0,0 +1,36 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0-only */
> +#
> +# Wrapper script which performs setup and cleanup for nx_huge_pages_test.
> +# Makes use of root privileges to set up huge pages and KVM module parameters.
> +#
> +# tools/testing/selftests/kvm/nx_huge_page_test.sh
> +# Copyright (C) 2022, Google LLC.
> +
> +set -e
> +
> +NX_HUGE_PAGES=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages)
> +NX_HUGE_PAGES_RECOVERY_RATIO=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> +NX_HUGE_PAGES_RECOVERY_PERIOD=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> +HUGE_PAGES=$(sudo cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)

I think we can omit sudo on these.  Since we're assuming sysfs is mounted at the
normal path, we can also likely assume it's mounted with normal permissions too,
i.e. read-only for non-root users.

> +
> +set +e
> +
> +(
> +	set -e
> +
> +	sudo echo 1 > /sys/module/kvm/parameters/nx_huge_pages

"sudo echo" doesn't work, the redirection is done by the shell, not echo itself
(which is run with sudo).  This needs to be e.g.

	echo 1 | sudo tee -a /sys/module/kvm/parameters/nx_huge_pages > /dev/null

> +	sudo echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +	sudo echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +	sudo echo 3 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

So, what happens if the user is running something else that needs huge pages?  Feels
like the script should read the value and add three, not just blindly write '3'.

> +
> +	"$(dirname $0)"/nx_huge_pages_test 887563923
> +)
> +RET=$?
> +
> +sudo echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> +sudo echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +sudo echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +sudo echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +exit $RET
> -- 
> 2.36.0.464.gb9c8b46e94-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on a VM
  2022-05-03 18:30 ` [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
  2022-05-03 18:34   ` Ben Gardon
@ 2022-05-05 19:14   ` Sean Christopherson
  1 sibling, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2022-05-05 19:14 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid

On Tue, May 03, 2022, Ben Gardon wrote:
> +	if (disable_nx_huge_pages) {
> +		/*
> +		 * Cannot run the test without NX huge pages if the kernel
> +		 * does not support it.
> +		 */
> +		if (!kvm_check_cap(KVM_CAP_VM_DISABLE_NX_HUGE_PAGES))
> +			return;
> +
> +		r = __vm_disable_nx_huge_pages(vm);
> +		if (reboot_permissions) {
> +			TEST_ASSERT(!r, "Disabling NX huge pages should succeed if process has reboot permissions");
> +		} else {
> +			TEST_ASSERT(r == -EPERM, "This process should not have permission to disable NX huge pages");

This is wrong, the return value on ioctl() failure is -1, the error code is
in errno and it's a positive value.

LOL, but it passes because EPERM == 1, hilarious.  To avoid confusion:

			TEST_ASSERT(r == -1 && errno == EPERM,
				    "This process should not have permission to disable NX huge pages");

> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> index 60bfed8181b9..c21c1f639141 100755
> --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -16,6 +16,8 @@ HUGE_PAGES=$(sudo cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
>  
>  set +e
>  
> +NXECUTABLE="$(dirname $0)/nx_huge_pages_test"
> +
>  (
>  	set -e
>  
> @@ -24,7 +26,15 @@ set +e
>  	sudo echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
>  	sudo echo 3 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>  
> -	"$(dirname $0)"/nx_huge_pages_test 887563923
> +	# Test with reboot permissions
> +	sudo setcap cap_sys_boot+ep $NXECUTABLE

This leaves cap_sys_boot set on the executable if the script is run as root.

Probably this?  It's moderately user friendly without going too crazy on error
handling.

	# Test with reboot permissions
	if [ $(whoami) != "root" ] ; then
		sudo setcap cap_sys_boot+ep $NXECUTABLE
	fi
	$NXECUTABLE 887563923 1

	# Test without reboot permissions
	if [ $(whoami) != "root" ] ; then
		sudo setcap cap_sys_boot-ep $NXECUTABLE
		$NXECUTABLE 887563923 0
	fi

> +	$NXECUTABLE 887563923 1
> +
> +	# Test without reboot permissions
> +	if [ $(whoami) != "root" ] ; then
> +		sudo setcap cap_sys_boot-ep $NXECUTABLE
> +		$NXECUTABLE 887563923 0

I would much prefer a proper flag, not a magic 0 vs. 1.  

> +	fi
>  )
>  RET=$?
>  
> -- 
> 2.36.0.464.gb9c8b46e94-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-05-05 19:14 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
2022-05-03 18:30 ` [PATCH v7 01/11] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
2022-05-03 18:30 ` [PATCH v7 02/11] KVM: selftests: Read binary stats header in lib Ben Gardon
2022-05-03 18:30 ` [PATCH v7 03/11] KVM: selftests: Read binary stats desc " Ben Gardon
2022-05-05 17:08   ` Sean Christopherson
2022-05-05 17:13     ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 04/11] KVM: selftests: Clean up coding style in binary stats test Ben Gardon
2022-05-03 18:30 ` [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib Ben Gardon
2022-05-05 18:06   ` Sean Christopherson
2022-05-05 18:38     ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 06/11] KVM: selftests: Add NX huge pages test Ben Gardon
2022-05-05 18:59   ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 07/11] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
2022-05-03 18:30 ` [PATCH v7 08/11] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
2022-05-03 18:30 ` [PATCH v7 09/11] KVM: selftests: Factor out calculation of pages needed for a VM Ben Gardon
2022-05-03 18:30 ` [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
2022-05-03 18:34   ` Ben Gardon
2022-05-05 19:14   ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 11/11] KVM: selftests: Cache binary stats metadata for duration of test Ben Gardon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.