All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM
@ 2022-04-11 21:10 Ben Gardon
  2022-04-11 21:10 ` [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Given the high cost of NX hugepages in terms of TLB performance, it may
be desirable to disable the mitigation on a per-VM basis. In the case of public
cloud providers with many VMs on a single host, some VMs may be more trusted
than others. In order to maximize performance on critical VMs, while still
providing some protection to the host from iTLB Multihit, allow the mitigation
to be selectively disabled.

Disabling NX hugepages on a VM is relatively straightforward, but I took this
as an opportunity to add some NX hugepages test coverage and clean up selftests
infrastructure a bit.

This series was tested with the new selftest and the rest of the KVM selftests
on an Intel Haswell machine.

The following tests failed, but I do not believe that has anything to do with
this series:
	userspace_io_test
	vmx_nested_tsc_scaling_test
	vmx_preemption_timer_test

Changelog:
v1->v2:
	Dropped the complicated memslot refactor in favor of Ricardo Koller's
	patch with a similar effect.
	Incorporated David Dunn's feedback and reviewed by tag: shortened waits
	to speed up test.
v2->v3:
	Incorporated a suggestion from David on how to build the NX huge pages
	test.
	Fixed a build breakage identified by David.
	Dropped the per-vm nx_huge_pages field in favor of simply checking the
	global + per-VM disable override.
	Documented the new capability
	Separated out the commit to test disabling NX huge pages
	Removed permission check when checking if the disable NX capability is
	supported.
	Added test coverage for the permission check.
v3->v4:
	Collected RB's from Jing and David
	Modified stat collection to reduce a memory allocation [David]
	Incorporated various improvments to the NX test [David]
	Changed the NX disable test to run by default [David]
	Removed some now unnecessary commits
	Dropped the code to dump KVM stats from the binary stats test, and
	factor out parts of the existing test to library functions instead.
	[David, Jing, Sean]
	Dropped the improvement to a debugging log message as it's no longer
	relevant to this series.

Ben Gardon (10):
  KVM: selftests: Remove dynamic memory allocation for stats header
  KVM: selftests: Read binary stats header in lib
  KVM: selftests: Read binary stats desc in lib
  KVM: selftests: Read binary stat data in lib
  KVM: selftests: Add NX huge pages test
  KVM: x86/MMU: Factor out updating NX hugepages state for a VM
  KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
  KVM: x86: Fix errant brace in KVM capability handling
  KVM: x86/MMU: Require reboot permission to disable NX hugepages
  KVM: selftests: Test disabling NX hugepages on a VM

 Documentation/virt/kvm/api.rst                |  13 ++
 arch/x86/include/asm/kvm_host.h               |   2 +
 arch/x86/kvm/mmu.h                            |  10 +-
 arch/x86/kvm/mmu/mmu.c                        |  17 +-
 arch/x86/kvm/mmu/spte.c                       |   7 +-
 arch/x86/kvm/mmu/spte.h                       |   3 +-
 arch/x86/kvm/mmu/tdp_mmu.c                    |   3 +-
 arch/x86/kvm/x86.c                            |  17 +-
 include/uapi/linux/kvm.h                      |   1 +
 tools/testing/selftests/kvm/Makefile          |  10 +
 .../selftests/kvm/include/kvm_util_base.h     |  11 +
 .../selftests/kvm/kvm_binary_stats_test.c     |  75 +++----
 tools/testing/selftests/kvm/lib/kvm_util.c    | 125 ++++++++++-
 .../selftests/kvm/x86_64/nx_huge_pages_test.c | 198 ++++++++++++++++++
 .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
 15 files changed, 453 insertions(+), 64 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
 create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 21:52   ` David Matlack
  2022-04-11 21:10 ` [PATCH v4 02/10] KVM: selftests: Read binary stats header in lib Ben Gardon
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

There's no need to allocate dynamic memory for the stats header since
its size is known at compile time.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/kvm_binary_stats_test.c     | 58 +++++++++----------
 1 file changed, 27 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 17f65d514915..dad34d8a41fe 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -26,56 +26,53 @@ static void stats_test(int stats_fd)
 	int i;
 	size_t size_desc;
 	size_t size_data = 0;
-	struct kvm_stats_header *header;
+	struct kvm_stats_header header;
 	char *id;
 	struct kvm_stats_desc *stats_desc;
 	u64 *stats_data;
 	struct kvm_stats_desc *pdesc;
 
 	/* Read kvm stats header */
-	header = malloc(sizeof(*header));
-	TEST_ASSERT(header, "Allocate memory for stats header");
-
-	ret = read(stats_fd, header, sizeof(*header));
-	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
-	size_desc = sizeof(*stats_desc) + header->name_size;
+	ret = read(stats_fd, &header, sizeof(header));
+	TEST_ASSERT(ret == sizeof(header), "Read stats header");
+	size_desc = sizeof(*stats_desc) + header.name_size;
 
 	/* Read kvm stats id string */
-	id = malloc(header->name_size);
+	id = malloc(header.name_size);
 	TEST_ASSERT(id, "Allocate memory for id string");
-	ret = read(stats_fd, id, header->name_size);
-	TEST_ASSERT(ret == header->name_size, "Read id string");
+	ret = read(stats_fd, id, header.name_size);
+	TEST_ASSERT(ret == header.name_size, "Read id string");
 
 	/* Check id string, that should start with "kvm" */
-	TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header->name_size,
+	TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
 				"Invalid KVM stats type, id: %s", id);
 
 	/* Sanity check for other fields in header */
-	if (header->num_desc == 0) {
+	if (header.num_desc == 0) {
 		printf("No KVM stats defined!");
 		return;
 	}
 	/* Check overlap */
-	TEST_ASSERT(header->desc_offset > 0 && header->data_offset > 0
-			&& header->desc_offset >= sizeof(*header)
-			&& header->data_offset >= sizeof(*header),
+	TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
+			&& header.desc_offset >= sizeof(header)
+			&& header.data_offset >= sizeof(header),
 			"Invalid offset fields in header");
-	TEST_ASSERT(header->desc_offset > header->data_offset ||
-			(header->desc_offset + size_desc * header->num_desc <=
-							header->data_offset),
+	TEST_ASSERT(header.desc_offset > header.data_offset ||
+			(header.desc_offset + size_desc * header.num_desc <=
+							header.data_offset),
 			"Descriptor block is overlapped with data block");
 
 	/* Allocate memory for stats descriptors */
-	stats_desc = calloc(header->num_desc, size_desc);
+	stats_desc = calloc(header.num_desc, size_desc);
 	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
 	/* Read kvm stats descriptors */
 	ret = pread(stats_fd, stats_desc,
-			size_desc * header->num_desc, header->desc_offset);
-	TEST_ASSERT(ret == size_desc * header->num_desc,
+			size_desc * header.num_desc, header.desc_offset);
+	TEST_ASSERT(ret == size_desc * header.num_desc,
 			"Read KVM stats descriptors");
 
 	/* Sanity check for fields in descriptors */
-	for (i = 0; i < header->num_desc; ++i) {
+	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		/* Check type,unit,base boundaries */
 		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
@@ -104,7 +101,7 @@ static void stats_test(int stats_fd)
 			break;
 		}
 		/* Check name string */
-		TEST_ASSERT(strlen(pdesc->name) < header->name_size,
+		TEST_ASSERT(strlen(pdesc->name) < header.name_size,
 				"KVM stats name(%s) too long", pdesc->name);
 		/* Check size field, which should not be zero */
 		TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
@@ -124,14 +121,14 @@ static void stats_test(int stats_fd)
 		size_data += pdesc->size * sizeof(*stats_data);
 	}
 	/* Check overlap */
-	TEST_ASSERT(header->data_offset >= header->desc_offset
-		|| header->data_offset + size_data <= header->desc_offset,
+	TEST_ASSERT(header.data_offset >= header.desc_offset
+		|| header.data_offset + size_data <= header.desc_offset,
 		"Data block is overlapped with Descriptor block");
 	/* Check validity of all stats data size */
-	TEST_ASSERT(size_data >= header->num_desc * sizeof(*stats_data),
+	TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
 			"Data size is not correct");
 	/* Check stats offset */
-	for (i = 0; i < header->num_desc; ++i) {
+	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		TEST_ASSERT(pdesc->offset < size_data,
 			"Invalid offset (%u) for stats: %s",
@@ -142,15 +139,15 @@ static void stats_test(int stats_fd)
 	stats_data = malloc(size_data);
 	TEST_ASSERT(stats_data, "Allocate memory for stats data");
 	/* Read kvm stats data as a bulk */
-	ret = pread(stats_fd, stats_data, size_data, header->data_offset);
+	ret = pread(stats_fd, stats_data, size_data, header.data_offset);
 	TEST_ASSERT(ret == size_data, "Read KVM stats data");
 	/* Read kvm stats data one by one */
 	size_data = 0;
-	for (i = 0; i < header->num_desc; ++i) {
+	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		ret = pread(stats_fd, stats_data,
 				pdesc->size * sizeof(*stats_data),
-				header->data_offset + size_data);
+				header.data_offset + size_data);
 		TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
 				"Read data of KVM stats: %s", pdesc->name);
 		size_data += pdesc->size * sizeof(*stats_data);
@@ -159,7 +156,6 @@ static void stats_test(int stats_fd)
 	free(stats_data);
 	free(stats_desc);
 	free(id);
-	free(header);
 }
 
 
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 02/10] KVM: selftests: Read binary stats header in lib
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
  2022-04-11 21:10 ` [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 21:55   ` David Matlack
  2022-04-11 21:10 ` [PATCH v4 03/10] KVM: selftests: Read binary stats desc " Ben Gardon
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Move the code to read the binary stats header to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
 tools/testing/selftests/kvm/kvm_binary_stats_test.c | 4 ++--
 tools/testing/selftests/kvm/lib/kvm_util.c          | 8 ++++++++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 92cef0ffb19e..5ba3132f3110 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -400,6 +400,7 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
 
 int vm_get_stats_fd(struct kvm_vm *vm);
 int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
+void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index dad34d8a41fe..22c22a90f15a 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -33,8 +33,8 @@ static void stats_test(int stats_fd)
 	struct kvm_stats_desc *pdesc;
 
 	/* Read kvm stats header */
-	ret = read(stats_fd, &header, sizeof(header));
-	TEST_ASSERT(ret == sizeof(header), "Read stats header");
+	read_vm_stats_header(stats_fd, &header);
+
 	size_desc = sizeof(*stats_desc) + header.name_size;
 
 	/* Read kvm stats id string */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1665a220abcb..0caf28e324ed 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2556,3 +2556,11 @@ int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid)
 
 	return ioctl(vcpu->fd, KVM_GET_STATS_FD, NULL);
 }
+
+void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header)
+{
+	ssize_t ret;
+
+	ret = read(stats_fd, header, sizeof(*header));
+	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
+}
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
  2022-04-11 21:10 ` [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
  2022-04-11 21:10 ` [PATCH v4 02/10] KVM: selftests: Read binary stats header in lib Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 22:01   ` David Matlack
  2022-04-12  0:54   ` Mingwei Zhang
  2022-04-11 21:10 ` [PATCH v4 04/10] KVM: selftests: Read binary stat data " Ben Gardon
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Move the code to read the binary stats descriptors to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  4 +++
 .../selftests/kvm/kvm_binary_stats_test.c     |  9 ++----
 tools/testing/selftests/kvm/lib/kvm_util.c    | 29 +++++++++++++++++++
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 5ba3132f3110..c5f34551ff76 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -401,6 +401,10 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
 int vm_get_stats_fd(struct kvm_vm *vm);
 int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
 void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header);
+struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
+					  struct kvm_stats_header *header);
+void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
+			struct kvm_stats_desc *stats_desc);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 22c22a90f15a..e4795bad7db6 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -62,14 +62,9 @@ static void stats_test(int stats_fd)
 							header.data_offset),
 			"Descriptor block is overlapped with data block");
 
-	/* Allocate memory for stats descriptors */
-	stats_desc = calloc(header.num_desc, size_desc);
-	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
 	/* Read kvm stats descriptors */
-	ret = pread(stats_fd, stats_desc,
-			size_desc * header.num_desc, header.desc_offset);
-	TEST_ASSERT(ret == size_desc * header.num_desc,
-			"Read KVM stats descriptors");
+	stats_desc = alloc_vm_stats_desc(stats_fd, &header);
+	read_vm_stats_desc(stats_fd, &header, stats_desc);
 
 	/* Sanity check for fields in descriptors */
 	for (i = 0; i < header.num_desc; ++i) {
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 0caf28e324ed..e3ae26fbef03 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2564,3 +2564,32 @@ void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header)
 	ret = read(stats_fd, header, sizeof(*header));
 	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
 }
+
+static ssize_t stats_descs_size(struct kvm_stats_header *header)
+{
+	return header->num_desc *
+	       (sizeof(struct kvm_stats_desc) + header->name_size);
+}
+
+/* Caller is responsible for freeing the returned kvm_stats_desc. */
+struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
+					  struct kvm_stats_header *header)
+{
+	struct kvm_stats_desc *stats_desc;
+
+	stats_desc = malloc(stats_descs_size(header));
+	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
+
+	return stats_desc;
+}
+
+void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
+			struct kvm_stats_desc *stats_desc)
+{
+	ssize_t ret;
+
+	ret = pread(stats_fd, stats_desc, stats_descs_size(header),
+		    header->desc_offset);
+	TEST_ASSERT(ret == stats_descs_size(header),
+		    "Read KVM stats descriptors");
+}
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 04/10] KVM: selftests: Read binary stat data in lib
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (2 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 03/10] KVM: selftests: Read binary stats desc " Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 22:14   ` David Matlack
  2022-04-12  1:25   ` Mingwei Zhang
  2022-04-11 21:10 ` [PATCH v4 05/10] KVM: selftests: Add NX huge pages test Ben Gardon
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Move the code to read the binary stats data to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  3 +++
 .../selftests/kvm/kvm_binary_stats_test.c     | 20 +++++-------------
 tools/testing/selftests/kvm/lib/kvm_util.c    | 21 +++++++++++++++++++
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index c5f34551ff76..b2684cfc2cb1 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -405,6 +405,9 @@ struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
 					  struct kvm_stats_header *header);
 void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
 			struct kvm_stats_desc *stats_desc);
+int read_stat_data(int stats_fd, struct kvm_stats_header *header,
+		   struct kvm_stats_desc *desc, uint64_t *data,
+		   ssize_t max_elements);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index e4795bad7db6..97b180249ba0 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -20,6 +20,8 @@
 #include "asm/kvm.h"
 #include "linux/kvm.h"
 
+#define STAT_MAX_ELEMENTS 1000
+
 static void stats_test(int stats_fd)
 {
 	ssize_t ret;
@@ -29,7 +31,7 @@ static void stats_test(int stats_fd)
 	struct kvm_stats_header header;
 	char *id;
 	struct kvm_stats_desc *stats_desc;
-	u64 *stats_data;
+	u64 stats_data[STAT_MAX_ELEMENTS];
 	struct kvm_stats_desc *pdesc;
 
 	/* Read kvm stats header */
@@ -130,25 +132,13 @@ static void stats_test(int stats_fd)
 			pdesc->offset, pdesc->name);
 	}
 
-	/* Allocate memory for stats data */
-	stats_data = malloc(size_data);
-	TEST_ASSERT(stats_data, "Allocate memory for stats data");
-	/* Read kvm stats data as a bulk */
-	ret = pread(stats_fd, stats_data, size_data, header.data_offset);
-	TEST_ASSERT(ret == size_data, "Read KVM stats data");
 	/* Read kvm stats data one by one */
-	size_data = 0;
 	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
-		ret = pread(stats_fd, stats_data,
-				pdesc->size * sizeof(*stats_data),
-				header.data_offset + size_data);
-		TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
-				"Read data of KVM stats: %s", pdesc->name);
-		size_data += pdesc->size * sizeof(*stats_data);
+		read_stat_data(stats_fd, &header, pdesc, stats_data,
+			       ARRAY_SIZE(stats_data));
 	}
 
-	free(stats_data);
 	free(stats_desc);
 	free(id);
 }
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index e3ae26fbef03..64e2085f1129 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2593,3 +2593,24 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
 	TEST_ASSERT(ret == stats_descs_size(header),
 		    "Read KVM stats descriptors");
 }
+
+int read_stat_data(int stats_fd, struct kvm_stats_header *header,
+		   struct kvm_stats_desc *desc, uint64_t *data,
+		   ssize_t max_elements)
+{
+	ssize_t ret;
+
+	TEST_ASSERT(desc->size <= max_elements,
+		    "Max data elements should be at least as large as stat data");
+
+	ret = pread(stats_fd, data, desc->size * sizeof(*data),
+		    header->data_offset + desc->offset);
+
+	/* ret from pread is in bytes. */
+	ret = ret / sizeof(*data);
+
+	TEST_ASSERT(ret == desc->size,
+		    "Read data of KVM stats: %s", desc->name);
+
+	return ret;
+}
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 05/10] KVM: selftests: Add NX huge pages test
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (3 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 04/10] KVM: selftests: Read binary stat data " Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 22:27   ` David Matlack
  2022-04-12  1:32   ` Mingwei Zhang
  2022-04-11 21:10 ` [PATCH v4 06/10] KVM: x86/MMU: Factor out updating NX hugepages state for a VM Ben Gardon
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

There's currently no test coverage of NX hugepages in KVM selftests, so
add a basic test to ensure that the feature works as intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/Makefile          |  10 ++
 .../selftests/kvm/include/kvm_util_base.h     |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  48 ++++++
 .../selftests/kvm/x86_64/nx_huge_pages_test.c | 163 ++++++++++++++++++
 .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
 5 files changed, 247 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
 create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index af582d168621..9bb9bce4df37 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -43,6 +43,10 @@ LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handler
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
 LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
 
+# Non-compiled test targets
+TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
+
+# Compiled test targets
 TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
 TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
 TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
@@ -104,6 +108,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
 TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
 TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 
+# Compiled outputs used by test targets
+TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
+
 TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
@@ -142,7 +149,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
 TEST_GEN_PROGS_riscv += set_memory_region_test
 TEST_GEN_PROGS_riscv += kvm_binary_stats_test
 
+TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
 TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
+TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
 LIBKVM += $(LIBKVM_$(UNAME_M))
 
 INSTALL_HDR_PATH = $(top_srcdir)/usr
@@ -193,6 +202,7 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
 x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
 all: $(STATIC_LIBS)
 $(TEST_GEN_PROGS): $(STATIC_LIBS)
+$(TEST_GEN_PROGS_EXTENDED): $(STATIC_LIBS)
 
 cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
 cscope:
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index b2684cfc2cb1..f9c2ac0a5b97 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -408,6 +408,7 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
 int read_stat_data(int stats_fd, struct kvm_stats_header *header,
 		   struct kvm_stats_desc *desc, uint64_t *data,
 		   ssize_t max_elements);
+uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
 
 uint32_t guest_get_vcpuid(void);
 
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 64e2085f1129..833c7e63d62d 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2614,3 +2614,51 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
 
 	return ret;
 }
+
+static int vm_get_stat_data(struct kvm_vm *vm, const char *stat_name,
+			    uint64_t *data, ssize_t max_elements)
+{
+	struct kvm_stats_desc *stats_desc;
+	struct kvm_stats_header header;
+	struct kvm_stats_desc *desc;
+	size_t size_desc;
+	int stats_fd;
+	int ret = -EINVAL;
+	int i;
+
+	stats_fd = vm_get_stats_fd(vm);
+
+	read_vm_stats_header(stats_fd, &header);
+
+	stats_desc = alloc_vm_stats_desc(stats_fd, &header);
+	read_vm_stats_desc(stats_fd, &header, stats_desc);
+
+	size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
+
+	/* Read kvm stats data one by one */
+	for (i = 0; i < header.num_desc; ++i) {
+		desc = (void *)stats_desc + (i * size_desc);
+
+		if (strcmp(desc->name, stat_name))
+			continue;
+
+		ret = read_stat_data(stats_fd, &header, desc, data,
+				     max_elements);
+	}
+
+	free(stats_desc);
+	close(stats_fd);
+	return ret;
+}
+
+uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
+{
+	uint64_t data;
+	int ret;
+
+	ret = vm_get_stat_data(vm, stat_name, &data, 1);
+	TEST_ASSERT(ret == 1,
+		    "Stat %s expected to have 1 element, but %d returned",
+		    stat_name, ret);
+	return data;
+}
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
new file mode 100644
index 000000000000..3f21726b22c7
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/nx_huge_page_test.c
+ *
+ * Usage: to be run via nx_huge_page_test.sh, which does the necessary
+ * environment setup and teardown
+ *
+ * Copyright (C) 2022, Google LLC.
+ */
+
+#define _GNU_SOURCE
+
+#include <fcntl.h>
+#include <stdint.h>
+#include <time.h>
+
+#include <test_util.h>
+#include "kvm_util.h"
+
+#define HPAGE_SLOT		10
+#define HPAGE_GVA		(23*1024*1024)
+#define HPAGE_GPA		(10*1024*1024)
+#define HPAGE_SLOT_NPAGES	(512 * 3)
+#define PAGE_SIZE		4096
+
+/*
+ * When writing to guest memory, write the opcode for the `ret` instruction so
+ * that subsequent iteractions can exercise instruction fetch by calling the
+ * memory.
+ */
+#define RETURN_OPCODE 0xC3
+
+void guest_code(void)
+{
+	uint64_t hpage_1 = HPAGE_GVA;
+	uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
+	uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
+
+	READ_ONCE(*(uint64_t *)hpage_1);
+	GUEST_SYNC(1);
+
+	READ_ONCE(*(uint64_t *)hpage_2);
+	GUEST_SYNC(2);
+
+	((void (*)(void)) hpage_1)();
+	GUEST_SYNC(3);
+
+	((void (*)(void)) hpage_3)();
+	GUEST_SYNC(4);
+
+	READ_ONCE(*(uint64_t *)hpage_1);
+	GUEST_SYNC(5);
+
+	READ_ONCE(*(uint64_t *)hpage_3);
+	GUEST_SYNC(6);
+}
+
+static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
+{
+	int actual_pages_2m;
+
+	actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
+
+	TEST_ASSERT(actual_pages_2m == expected_pages_2m,
+		    "Unexpected 2m page count. Expected %d, got %d",
+		    expected_pages_2m, actual_pages_2m);
+}
+
+static void check_split_count(struct kvm_vm *vm, int expected_splits)
+{
+	int actual_splits;
+
+	actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
+
+	TEST_ASSERT(actual_splits == expected_splits,
+		    "Unexpected nx lpage split count. Expected %d, got %d",
+		    expected_splits, actual_splits);
+}
+
+int main(int argc, char **argv)
+{
+	struct kvm_vm *vm;
+	struct timespec ts;
+	void *hva;
+
+	vm = vm_create_default(0, 0, guest_code);
+
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
+				    HPAGE_GPA, HPAGE_SLOT,
+				    HPAGE_SLOT_NPAGES, 0);
+
+	virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
+
+	hva = addr_gpa2hva(vm, HPAGE_GPA);
+	memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
+
+	check_2m_page_count(vm, 0);
+	check_split_count(vm, 0);
+
+	/*
+	 * The guest code will first read from the first hugepage, resulting
+	 * in a huge page mapping being created.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 0);
+
+	/*
+	 * Then the guest code will read from the second hugepage, resulting
+	 * in another huge page mapping being created.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 2);
+	check_split_count(vm, 0);
+
+	/*
+	 * Next, the guest will execute from the first huge page, causing it
+	 * to be remapped at 4k.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 1);
+
+	/*
+	 * Executing from the third huge page (previously unaccessed) will
+	 * cause part to be mapped at 4k.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 2);
+
+	/* Reading from the first huge page again should have no effect. */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 2);
+
+	/*
+	 * Give recovery thread time to run. The wrapper script sets
+	 * recovery_period_ms to 100, so wait 5x that.
+	 */
+	ts.tv_sec = 0;
+	ts.tv_nsec = 500000000;
+	nanosleep(&ts, NULL);
+
+	/*
+	 * Now that the reclaimer has run, all the split pages should be gone.
+	 */
+	check_2m_page_count(vm, 1);
+	check_split_count(vm, 0);
+
+	/*
+	 * The 4k mapping on hpage 3 should have been removed, so check that
+	 * reading from it causes a huge page mapping to be installed.
+	 */
+	vcpu_run(vm, 0);
+	check_2m_page_count(vm, 2);
+	check_split_count(vm, 0);
+
+	kvm_vm_free(vm);
+
+	return 0;
+}
+
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
new file mode 100755
index 000000000000..19fc95723fcb
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
@@ -0,0 +1,25 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only */
+
+# tools/testing/selftests/kvm/nx_huge_page_test.sh
+# Copyright (C) 2022, Google LLC.
+
+NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
+NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
+NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
+HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
+
+echo 1 > /sys/module/kvm/parameters/nx_huge_pages
+echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+./nx_huge_pages_test
+RET=$?
+
+echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
+echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+exit $RET
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 06/10] KVM: x86/MMU: Factor out updating NX hugepages state for a VM
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (4 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 05/10] KVM: selftests: Add NX huge pages test Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 21:10 ` [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Factor out the code to update the NX hugepages state for an individual
VM. This will be expanded in future commits to allow per-VM control of
Nx hugepages.

No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 69a30d6d1e2b..caaa610b7878 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6144,6 +6144,15 @@ static void __set_nx_huge_pages(bool val)
 	nx_huge_pages = itlb_multihit_kvm_mitigation = val;
 }
 
+static void kvm_update_nx_huge_pages(struct kvm *kvm)
+{
+	mutex_lock(&kvm->slots_lock);
+	kvm_mmu_zap_all_fast(kvm);
+	mutex_unlock(&kvm->slots_lock);
+
+	wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+}
+
 static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
 {
 	bool old_val = nx_huge_pages;
@@ -6166,13 +6175,9 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
 
 		mutex_lock(&kvm_lock);
 
-		list_for_each_entry(kvm, &vm_list, vm_list) {
-			mutex_lock(&kvm->slots_lock);
-			kvm_mmu_zap_all_fast(kvm);
-			mutex_unlock(&kvm->slots_lock);
+		list_for_each_entry(kvm, &vm_list, vm_list)
+			kvm_update_nx_huge_pages(kvm);
 
-			wake_up_process(kvm->arch.nx_lpage_recovery_thread);
-		}
 		mutex_unlock(&kvm_lock);
 	}
 
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (5 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 06/10] KVM: x86/MMU: Factor out updating NX hugepages state for a VM Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-12 17:54   ` Sean Christopherson
  2022-04-11 21:10 ` [PATCH v4 08/10] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

In some cases, the NX hugepage mitigation for iTLB multihit is not
needed for all guests on a host. Allow disabling the mitigation on a
per-VM basis to avoid the performance hit of NX hugepages on trusted
workloads.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 Documentation/virt/kvm/api.rst  | 11 +++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.h              | 10 ++++++----
 arch/x86/kvm/mmu/mmu.c          |  2 +-
 arch/x86/kvm/mmu/spte.c         |  7 ++++---
 arch/x86/kvm/mmu/spte.h         |  3 ++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  3 ++-
 arch/x86/kvm/x86.c              |  6 ++++++
 include/uapi/linux/kvm.h        |  1 +
 9 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 72183ae628f7..31fb002632bb 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7855,6 +7855,17 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability.  Setting
 this capability will disable PMU virtualization for that VM.  Usermode
 should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
 
+8.36 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
+---------------------------
+
+:Capability KVM_CAP_PMU_CAPABILITY
+:Architectures: x86
+:Type: vm
+
+This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
+
+The capability has no effect if the nx_huge_pages module parameter is not set.
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c20f715f009..b8ab4fa7d4b2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1240,6 +1240,8 @@ struct kvm_arch {
 	hpa_t	hv_root_tdp;
 	spinlock_t hv_root_tdp_lock;
 #endif
+
+	bool disable_nx_huge_pages;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 671cfeccf04e..20d12e5b0040 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -173,10 +173,12 @@ struct kvm_page_fault {
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 
 extern int nx_huge_pages;
-static inline bool is_nx_huge_page_enabled(void)
+static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
 {
-	return READ_ONCE(nx_huge_pages);
+	return READ_ONCE(nx_huge_pages) &&
+	       !kvm->arch.disable_nx_huge_pages;
 }
+void kvm_update_nx_huge_pages(struct kvm *kvm);
 
 static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 					u32 err, bool prefetch)
@@ -191,8 +193,8 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		.user = err & PFERR_USER_MASK,
 		.prefetch = prefetch,
 		.is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
-		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(),
-
+		.nx_huge_page_workaround_enabled =
+			is_nx_huge_page_enabled(vcpu->kvm),
 		.max_level = KVM_MAX_HUGEPAGE_LEVEL,
 		.req_level = PG_LEVEL_4K,
 		.goal_level = PG_LEVEL_4K,
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index caaa610b7878..149f353105f4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6144,7 +6144,7 @@ static void __set_nx_huge_pages(bool val)
 	nx_huge_pages = itlb_multihit_kvm_mitigation = val;
 }
 
-static void kvm_update_nx_huge_pages(struct kvm *kvm)
+void kvm_update_nx_huge_pages(struct kvm *kvm)
 {
 	mutex_lock(&kvm->slots_lock);
 	kvm_mmu_zap_all_fast(kvm);
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 4739b53c9734..877ad30bc7ad 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -116,7 +116,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 		spte |= spte_shadow_accessed_mask(spte);
 
 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
-	    is_nx_huge_page_enabled()) {
+	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &= ~ACC_EXEC_MASK;
 	}
 
@@ -215,7 +215,8 @@ static u64 make_spte_executable(u64 spte)
  * This is used during huge page splitting to build the SPTEs that make up the
  * new page table.
  */
-u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
+u64 make_huge_page_split_spte(struct kvm *kvm, u64 huge_spte, int huge_level,
+			      int index)
 {
 	u64 child_spte;
 	int child_level;
@@ -243,7 +244,7 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
 		 * When splitting to a 4K page, mark the page executable as the
 		 * NX hugepage mitigation no longer applies.
 		 */
-		if (is_nx_huge_page_enabled())
+		if (is_nx_huge_page_enabled(kvm))
 			child_spte = make_spte_executable(child_spte);
 	}
 
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 73f12615416f..e4142caff4b1 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -415,7 +415,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	       unsigned int pte_access, gfn_t gfn, kvm_pfn_t pfn,
 	       u64 old_spte, bool prefetch, bool can_unsync,
 	       bool host_writable, u64 *new_spte);
-u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index);
+u64 make_huge_page_split_spte(struct kvm *kvm, u64 huge_spte, int huge_level,
+			      int index);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 566548a3efa7..03aa1e0f60e2 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1469,7 +1469,8 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
 	 * not been linked in yet and thus is not reachable from any other CPU.
 	 */
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++)
-		sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i);
+		sp->spt[i] = make_huge_page_split_spte(kvm, huge_spte,
+						       level, i);
 
 	/*
 	 * Replace the huge spte with a pointer to the populated lower level
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab336f7c82e4..b810ea45f965 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4286,6 +4286,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SYS_ATTRIBUTES:
 	case KVM_CAP_VAPIC:
 	case KVM_CAP_ENABLE_CAP:
+	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
@@ -6079,6 +6080,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 		mutex_unlock(&kvm->lock);
 		break;
+	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+		kvm->arch.disable_nx_huge_pages = true;
+		kvm_update_nx_huge_pages(kvm);
+		r = 0;
+		break;
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index dd1d8167e71f..7155488164bd 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1148,6 +1148,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PMU_CAPABILITY 212
 #define KVM_CAP_DISABLE_QUIRKS2 213
 #define KVM_CAP_VM_TSC_CONTROL 214
+#define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 215
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 08/10] KVM: x86: Fix errant brace in KVM capability handling
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (6 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 21:10 ` [PATCH v4 09/10] KVM: x86/MMU: Require reboot permission to disable NX hugepages Ben Gardon
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

The braces around the KVM_CAP_XSAVE2 block also surround the
KVM_CAP_PMU_CAPABILITY block, likely the result of a merge issue. Simply
move the curly brace back to where it belongs.

Fixes: ba7bb663f5547 ("KVM: x86: Provide per VM capability for disabling PMU virtualization")
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b810ea45f965..de1d211f8aa3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4383,10 +4383,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		if (r < sizeof(struct kvm_xsave))
 			r = sizeof(struct kvm_xsave);
 		break;
+	}
 	case KVM_CAP_PMU_CAPABILITY:
 		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
 		break;
-	}
 	case KVM_CAP_DISABLE_QUIRKS2:
 		r = KVM_X86_VALID_QUIRKS;
 		break;
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 09/10] KVM: x86/MMU: Require reboot permission to disable NX hugepages
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (7 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 08/10] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-12 18:08   ` Sean Christopherson
  2022-04-11 21:10 ` [PATCH v4 10/10] KVM: selftests: Test disabling NX hugepages on a VM Ben Gardon
  2022-04-11 21:15 ` [PATCH v4 00/10] KVM: x86: Add a cap to disable " Ben Gardon
  10 siblings, 1 reply; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Ensure that the userspace actor attempting to disable NX hugepages has
permission to reboot the system. Since disabling NX hugepages would
allow a guest to crash the system, it is similar to reboot permissions.

This approach is the simplest permission gating, but passing a file
descriptor opened for write for the module parameter would also work
well and be more precise.
The latter approach was suggested by Sean Christopherson.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 Documentation/virt/kvm/api.rst | 2 ++
 arch/x86/kvm/x86.c             | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 31fb002632bb..021452a9fa91 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7861,6 +7861,8 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
 :Capability KVM_CAP_PMU_CAPABILITY
 :Architectures: x86
 :Type: vm
+:Returns 0 on success, -EPERM if the userspace process does not
+	 have CAP_SYS_BOOT
 
 This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index de1d211f8aa3..8d3d6c48c5ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6081,6 +6081,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		mutex_unlock(&kvm->lock);
 		break;
 	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+		/*
+		 * Since the risk of disabling NX hugepages is a guest crashing
+		 * the system, ensure the userspace process has permission to
+		 * reboot the system.
+		 */
+		if (!capable(CAP_SYS_BOOT)) {
+			r = -EPERM;
+			break;
+		}
 		kvm->arch.disable_nx_huge_pages = true;
 		kvm_update_nx_huge_pages(kvm);
 		r = 0;
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 10/10] KVM: selftests: Test disabling NX hugepages on a VM
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (8 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 09/10] KVM: x86/MMU: Require reboot permission to disable NX hugepages Ben Gardon
@ 2022-04-11 21:10 ` Ben Gardon
  2022-04-11 22:37   ` David Matlack
  2022-04-11 21:15 ` [PATCH v4 00/10] KVM: x86: Add a cap to disable " Ben Gardon
  10 siblings, 1 reply; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:10 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang, Ben Gardon

Add an argument to the NX huge pages test to test disabling the feature
on a VM using the new capability.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/include/kvm_util_base.h     |  2 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 19 ++++++-
 .../selftests/kvm/x86_64/nx_huge_pages_test.c | 53 +++++++++++++++----
 3 files changed, 64 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index f9c2ac0a5b97..15f24be6d93f 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -412,4 +412,6 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
 
 uint32_t guest_get_vcpuid(void);
 
+int vm_disable_nx_huge_pages(struct kvm_vm *vm);
+
 #endif /* SELFTEST_KVM_UTIL_BASE_H */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 833c7e63d62d..5fa5608eef03 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -112,6 +112,15 @@ int vm_check_cap(struct kvm_vm *vm, long cap)
 	return ret;
 }
 
+static int __vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
+{
+	int ret;
+
+	ret = ioctl(vm->fd, KVM_ENABLE_CAP, cap);
+
+	return ret;
+}
+
 /* VM Enable Capability
  *
  * Input Args:
@@ -128,7 +137,7 @@ int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
 {
 	int ret;
 
-	ret = ioctl(vm->fd, KVM_ENABLE_CAP, cap);
+	ret = __vm_enable_cap(vm, cap);
 	TEST_ASSERT(ret == 0, "KVM_ENABLE_CAP IOCTL failed,\n"
 		"  rc: %i errno: %i", ret, errno);
 
@@ -2662,3 +2671,11 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
 		    stat_name, ret);
 	return data;
 }
+
+int vm_disable_nx_huge_pages(struct kvm_vm *vm)
+{
+	struct kvm_enable_cap cap = { 0 };
+
+	cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES;
+	return __vm_enable_cap(vm, &cap);
+}
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
index 3f21726b22c7..f8edf7910950 100644
--- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -13,6 +13,8 @@
 #include <fcntl.h>
 #include <stdint.h>
 #include <time.h>
+#include <linux/reboot.h>
+#include <sys/syscall.h>
 
 #include <test_util.h>
 #include "kvm_util.h"
@@ -77,14 +79,41 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits)
 		    expected_splits, actual_splits);
 }
 
-int main(int argc, char **argv)
+void run_test(bool disable_nx)
 {
 	struct kvm_vm *vm;
 	struct timespec ts;
 	void *hva;
+	int r;
 
 	vm = vm_create_default(0, 0, guest_code);
 
+	if (disable_nx) {
+		kvm_check_cap(KVM_CAP_VM_DISABLE_NX_HUGE_PAGES);
+
+		/*
+		 * Check if this process has the reboot permissions needed to
+		 * disable NX huge pages on a VM.
+		 *
+		 * The reboot call below will never have any effect because
+		 * the magic values are not set correctly, however the
+		 * permission check is done before the magic value check.
+		 */
+		r = syscall(SYS_reboot, 0, 0, 0, NULL);
+		if (errno == EPERM) {
+			r = vm_disable_nx_huge_pages(vm);
+			TEST_ASSERT(r == EPERM,
+				    "This process should not have permission to disable NX huge pages");
+			return;
+		}
+
+		TEST_ASSERT(errno == EINVAL,
+			    "Reboot syscall should fail with -EINVAL");
+
+		r = vm_disable_nx_huge_pages(vm);
+		TEST_ASSERT(!r, "Disabling NX huge pages should not fail if process has reboot permissions");
+	}
+
 	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
 				    HPAGE_GPA, HPAGE_SLOT,
 				    HPAGE_SLOT_NPAGES, 0);
@@ -118,21 +147,21 @@ int main(int argc, char **argv)
 	 * to be remapped at 4k.
 	 */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 1);
-	check_split_count(vm, 1);
+	check_2m_page_count(vm, disable_nx ? 2 : 1);
+	check_split_count(vm, disable_nx ? 0 : 1);
 
 	/*
 	 * Executing from the third huge page (previously unaccessed) will
 	 * cause part to be mapped at 4k.
 	 */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 1);
-	check_split_count(vm, 2);
+	check_2m_page_count(vm, disable_nx ? 3 : 1);
+	check_split_count(vm, disable_nx ? 0 : 2);
 
 	/* Reading from the first huge page again should have no effect. */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 1);
-	check_split_count(vm, 2);
+	check_2m_page_count(vm, disable_nx ? 3 : 1);
+	check_split_count(vm, disable_nx ? 0 : 2);
 
 	/*
 	 * Give recovery thread time to run. The wrapper script sets
@@ -145,7 +174,7 @@ int main(int argc, char **argv)
 	/*
 	 * Now that the reclaimer has run, all the split pages should be gone.
 	 */
-	check_2m_page_count(vm, 1);
+	check_2m_page_count(vm, disable_nx ? 3 : 1);
 	check_split_count(vm, 0);
 
 	/*
@@ -153,10 +182,16 @@ int main(int argc, char **argv)
 	 * reading from it causes a huge page mapping to be installed.
 	 */
 	vcpu_run(vm, 0);
-	check_2m_page_count(vm, 2);
+	check_2m_page_count(vm, disable_nx ? 3 : 2);
 	check_split_count(vm, 0);
 
 	kvm_vm_free(vm);
+}
+
+int main(int argc, char **argv)
+{
+	run_test(false);
+	run_test(true);
 
 	return 0;
 }
-- 
2.35.1.1178.g4f1659d476-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM
  2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
                   ` (9 preceding siblings ...)
  2022-04-11 21:10 ` [PATCH v4 10/10] KVM: selftests: Test disabling NX hugepages on a VM Ben Gardon
@ 2022-04-11 21:15 ` Ben Gardon
  10 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-11 21:15 UTC (permalink / raw)
  To: LKML, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> Given the high cost of NX hugepages in terms of TLB performance, it may
> be desirable to disable the mitigation on a per-VM basis. In the case of public
> cloud providers with many VMs on a single host, some VMs may be more trusted
> than others. In order to maximize performance on critical VMs, while still
> providing some protection to the host from iTLB Multihit, allow the mitigation
> to be selectively disabled.
>
> Disabling NX hugepages on a VM is relatively straightforward, but I took this
> as an opportunity to add some NX hugepages test coverage and clean up selftests
> infrastructure a bit.
>
> This series was tested with the new selftest and the rest of the KVM selftests
> on an Intel Haswell machine.
>
> The following tests failed, but I do not believe that has anything to do with
> this series:
>         userspace_io_test
>         vmx_nested_tsc_scaling_test
>         vmx_preemption_timer_test
>
> Changelog:
> v1->v2:
>         Dropped the complicated memslot refactor in favor of Ricardo Koller's
>         patch with a similar effect.
>         Incorporated David Dunn's feedback and reviewed by tag: shortened waits
>         to speed up test.
> v2->v3:
>         Incorporated a suggestion from David on how to build the NX huge pages
>         test.
>         Fixed a build breakage identified by David.
>         Dropped the per-vm nx_huge_pages field in favor of simply checking the
>         global + per-VM disable override.
>         Documented the new capability
>         Separated out the commit to test disabling NX huge pages
>         Removed permission check when checking if the disable NX capability is
>         supported.
>         Added test coverage for the permission check.
> v3->v4:
>         Collected RB's from Jing and David
>         Modified stat collection to reduce a memory allocation [David]
>         Incorporated various improvments to the NX test [David]
>         Changed the NX disable test to run by default [David]
>         Removed some now unnecessary commits
>         Dropped the code to dump KVM stats from the binary stats test, and
>         factor out parts of the existing test to library functions instead.
>         [David, Jing, Sean]
>         Dropped the improvement to a debugging log message as it's no longer
>         relevant to this series.
>
> Ben Gardon (10):
>   KVM: selftests: Remove dynamic memory allocation for stats header
>   KVM: selftests: Read binary stats header in lib
>   KVM: selftests: Read binary stats desc in lib
>   KVM: selftests: Read binary stat data in lib
>   KVM: selftests: Add NX huge pages test
>   KVM: x86/MMU: Factor out updating NX hugepages state for a VM
>   KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
>   KVM: x86: Fix errant brace in KVM capability handling
>   KVM: x86/MMU: Require reboot permission to disable NX hugepages
>   KVM: selftests: Test disabling NX hugepages on a VM

Paolo, I'm hoping we can get patches 1-5 in relatively quickly, and I
don't mind merging them first and then iterating on the later commits.
David Matlack and Mingwei Zhang both have series depending on those
test commits, so getting them finalized will help move all three of
our series along.

David, Mingwei, Sean, when you have time, I'd appreciate your reviews
on the first five patches in this series.

>
>  Documentation/virt/kvm/api.rst                |  13 ++
>  arch/x86/include/asm/kvm_host.h               |   2 +
>  arch/x86/kvm/mmu.h                            |  10 +-
>  arch/x86/kvm/mmu/mmu.c                        |  17 +-
>  arch/x86/kvm/mmu/spte.c                       |   7 +-
>  arch/x86/kvm/mmu/spte.h                       |   3 +-
>  arch/x86/kvm/mmu/tdp_mmu.c                    |   3 +-
>  arch/x86/kvm/x86.c                            |  17 +-
>  include/uapi/linux/kvm.h                      |   1 +
>  tools/testing/selftests/kvm/Makefile          |  10 +
>  .../selftests/kvm/include/kvm_util_base.h     |  11 +
>  .../selftests/kvm/kvm_binary_stats_test.c     |  75 +++----
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 125 ++++++++++-
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 198 ++++++++++++++++++
>  .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
>  15 files changed, 453 insertions(+), 64 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
>  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
>
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header
  2022-04-11 21:10 ` [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
@ 2022-04-11 21:52   ` David Matlack
  2022-04-11 22:50     ` Mingwei Zhang
  0 siblings, 1 reply; 31+ messages in thread
From: David Matlack @ 2022-04-11 21:52 UTC (permalink / raw)
  To: Ben Gardon
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> There's no need to allocate dynamic memory for the stats header since
> its size is known at compile time.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  .../selftests/kvm/kvm_binary_stats_test.c     | 58 +++++++++----------
>  1 file changed, 27 insertions(+), 31 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index 17f65d514915..dad34d8a41fe 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -26,56 +26,53 @@ static void stats_test(int stats_fd)
>         int i;
>         size_t size_desc;
>         size_t size_data = 0;
> -       struct kvm_stats_header *header;
> +       struct kvm_stats_header header;
>         char *id;
>         struct kvm_stats_desc *stats_desc;
>         u64 *stats_data;
>         struct kvm_stats_desc *pdesc;
>
>         /* Read kvm stats header */
> -       header = malloc(sizeof(*header));
> -       TEST_ASSERT(header, "Allocate memory for stats header");
> -
> -       ret = read(stats_fd, header, sizeof(*header));
> -       TEST_ASSERT(ret == sizeof(*header), "Read stats header");
> -       size_desc = sizeof(*stats_desc) + header->name_size;
> +       ret = read(stats_fd, &header, sizeof(header));
> +       TEST_ASSERT(ret == sizeof(header), "Read stats header");
> +       size_desc = sizeof(*stats_desc) + header.name_size;
>
>         /* Read kvm stats id string */
> -       id = malloc(header->name_size);
> +       id = malloc(header.name_size);
>         TEST_ASSERT(id, "Allocate memory for id string");
> -       ret = read(stats_fd, id, header->name_size);
> -       TEST_ASSERT(ret == header->name_size, "Read id string");
> +       ret = read(stats_fd, id, header.name_size);
> +       TEST_ASSERT(ret == header.name_size, "Read id string");
>
>         /* Check id string, that should start with "kvm" */
> -       TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header->name_size,
> +       TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
>                                 "Invalid KVM stats type, id: %s", id);
>
>         /* Sanity check for other fields in header */
> -       if (header->num_desc == 0) {
> +       if (header.num_desc == 0) {
>                 printf("No KVM stats defined!");
>                 return;
>         }
>         /* Check overlap */
> -       TEST_ASSERT(header->desc_offset > 0 && header->data_offset > 0
> -                       && header->desc_offset >= sizeof(*header)
> -                       && header->data_offset >= sizeof(*header),
> +       TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
> +                       && header.desc_offset >= sizeof(header)
> +                       && header.data_offset >= sizeof(header),
>                         "Invalid offset fields in header");
> -       TEST_ASSERT(header->desc_offset > header->data_offset ||
> -                       (header->desc_offset + size_desc * header->num_desc <=
> -                                                       header->data_offset),
> +       TEST_ASSERT(header.desc_offset > header.data_offset ||
> +                       (header.desc_offset + size_desc * header.num_desc <=
> +                                                       header.data_offset),
>                         "Descriptor block is overlapped with data block");
>
>         /* Allocate memory for stats descriptors */
> -       stats_desc = calloc(header->num_desc, size_desc);
> +       stats_desc = calloc(header.num_desc, size_desc);
>         TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
>         /* Read kvm stats descriptors */
>         ret = pread(stats_fd, stats_desc,
> -                       size_desc * header->num_desc, header->desc_offset);
> -       TEST_ASSERT(ret == size_desc * header->num_desc,
> +                       size_desc * header.num_desc, header.desc_offset);
> +       TEST_ASSERT(ret == size_desc * header.num_desc,
>                         "Read KVM stats descriptors");
>
>         /* Sanity check for fields in descriptors */
> -       for (i = 0; i < header->num_desc; ++i) {
> +       for (i = 0; i < header.num_desc; ++i) {
>                 pdesc = (void *)stats_desc + i * size_desc;
>                 /* Check type,unit,base boundaries */
>                 TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
> @@ -104,7 +101,7 @@ static void stats_test(int stats_fd)
>                         break;
>                 }
>                 /* Check name string */
> -               TEST_ASSERT(strlen(pdesc->name) < header->name_size,
> +               TEST_ASSERT(strlen(pdesc->name) < header.name_size,
>                                 "KVM stats name(%s) too long", pdesc->name);
>                 /* Check size field, which should not be zero */
>                 TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
> @@ -124,14 +121,14 @@ static void stats_test(int stats_fd)
>                 size_data += pdesc->size * sizeof(*stats_data);
>         }
>         /* Check overlap */
> -       TEST_ASSERT(header->data_offset >= header->desc_offset
> -               || header->data_offset + size_data <= header->desc_offset,
> +       TEST_ASSERT(header.data_offset >= header.desc_offset
> +               || header.data_offset + size_data <= header.desc_offset,
>                 "Data block is overlapped with Descriptor block");
>         /* Check validity of all stats data size */
> -       TEST_ASSERT(size_data >= header->num_desc * sizeof(*stats_data),
> +       TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
>                         "Data size is not correct");
>         /* Check stats offset */
> -       for (i = 0; i < header->num_desc; ++i) {
> +       for (i = 0; i < header.num_desc; ++i) {
>                 pdesc = (void *)stats_desc + i * size_desc;
>                 TEST_ASSERT(pdesc->offset < size_data,
>                         "Invalid offset (%u) for stats: %s",
> @@ -142,15 +139,15 @@ static void stats_test(int stats_fd)
>         stats_data = malloc(size_data);
>         TEST_ASSERT(stats_data, "Allocate memory for stats data");
>         /* Read kvm stats data as a bulk */
> -       ret = pread(stats_fd, stats_data, size_data, header->data_offset);
> +       ret = pread(stats_fd, stats_data, size_data, header.data_offset);
>         TEST_ASSERT(ret == size_data, "Read KVM stats data");
>         /* Read kvm stats data one by one */
>         size_data = 0;
> -       for (i = 0; i < header->num_desc; ++i) {
> +       for (i = 0; i < header.num_desc; ++i) {
>                 pdesc = (void *)stats_desc + i * size_desc;
>                 ret = pread(stats_fd, stats_data,
>                                 pdesc->size * sizeof(*stats_data),
> -                               header->data_offset + size_data);
> +                               header.data_offset + size_data);
>                 TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
>                                 "Read data of KVM stats: %s", pdesc->name);
>                 size_data += pdesc->size * sizeof(*stats_data);
> @@ -159,7 +156,6 @@ static void stats_test(int stats_fd)
>         free(stats_data);
>         free(stats_desc);
>         free(id);
> -       free(header);
>  }
>
>
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 02/10] KVM: selftests: Read binary stats header in lib
  2022-04-11 21:10 ` [PATCH v4 02/10] KVM: selftests: Read binary stats header in lib Ben Gardon
@ 2022-04-11 21:55   ` David Matlack
  0 siblings, 0 replies; 31+ messages in thread
From: David Matlack @ 2022-04-11 21:55 UTC (permalink / raw)
  To: Ben Gardon
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> Move the code to read the binary stats header to the KVM selftests
> library. It will be re-used by other tests to check KVM behavior.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/include/kvm_util_base.h | 1 +
>  tools/testing/selftests/kvm/kvm_binary_stats_test.c | 4 ++--
>  tools/testing/selftests/kvm/lib/kvm_util.c          | 8 ++++++++
>  3 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 92cef0ffb19e..5ba3132f3110 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -400,6 +400,7 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
>
>  int vm_get_stats_fd(struct kvm_vm *vm);
>  int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
> +void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header);
>
>  uint32_t guest_get_vcpuid(void);
>
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index dad34d8a41fe..22c22a90f15a 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -33,8 +33,8 @@ static void stats_test(int stats_fd)
>         struct kvm_stats_desc *pdesc;
>
>         /* Read kvm stats header */
> -       ret = read(stats_fd, &header, sizeof(header));
> -       TEST_ASSERT(ret == sizeof(header), "Read stats header");
> +       read_vm_stats_header(stats_fd, &header);

stats_test() is used to test both VM and vCPU stats. The only
difference is whether stats_fd is the VM stats fd or a vCPU stats fd.

Given that, please rename read_vm_stats_header() to something more
generic (since it can also be used for reading the vCPU stats header).
e.g. read_stats_header().

> +
>         size_desc = sizeof(*stats_desc) + header.name_size;
>
>         /* Read kvm stats id string */
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 1665a220abcb..0caf28e324ed 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2556,3 +2556,11 @@ int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid)
>
>         return ioctl(vcpu->fd, KVM_GET_STATS_FD, NULL);
>  }
> +
> +void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header)
> +{
> +       ssize_t ret;
> +
> +       ret = read(stats_fd, header, sizeof(*header));
> +       TEST_ASSERT(ret == sizeof(*header), "Read stats header");
> +}
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-11 21:10 ` [PATCH v4 03/10] KVM: selftests: Read binary stats desc " Ben Gardon
@ 2022-04-11 22:01   ` David Matlack
  2022-04-12  0:54   ` Mingwei Zhang
  1 sibling, 0 replies; 31+ messages in thread
From: David Matlack @ 2022-04-11 22:01 UTC (permalink / raw)
  To: Ben Gardon
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> Move the code to read the binary stats descriptors to the KVM selftests
> library. It will be re-used by other tests to check KVM behavior.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  4 +++
>  .../selftests/kvm/kvm_binary_stats_test.c     |  9 ++----
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 29 +++++++++++++++++++
>  3 files changed, 35 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 5ba3132f3110..c5f34551ff76 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -401,6 +401,10 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
>  int vm_get_stats_fd(struct kvm_vm *vm);
>  int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
>  void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header);
> +struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> +                                         struct kvm_stats_header *header);
> +void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> +                       struct kvm_stats_desc *stats_desc);

Same feedback here as the previous patch about getting rid of "vm_" in
the function names.

>
>  uint32_t guest_get_vcpuid(void);
>
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index 22c22a90f15a..e4795bad7db6 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -62,14 +62,9 @@ static void stats_test(int stats_fd)
>                                                         header.data_offset),
>                         "Descriptor block is overlapped with data block");
>
> -       /* Allocate memory for stats descriptors */
> -       stats_desc = calloc(header.num_desc, size_desc);
> -       TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
>         /* Read kvm stats descriptors */
> -       ret = pread(stats_fd, stats_desc,
> -                       size_desc * header.num_desc, header.desc_offset);
> -       TEST_ASSERT(ret == size_desc * header.num_desc,
> -                       "Read KVM stats descriptors");
> +       stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> +       read_vm_stats_desc(stats_fd, &header, stats_desc);
>
>         /* Sanity check for fields in descriptors */
>         for (i = 0; i < header.num_desc; ++i) {
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 0caf28e324ed..e3ae26fbef03 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2564,3 +2564,32 @@ void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header)
>         ret = read(stats_fd, header, sizeof(*header));
>         TEST_ASSERT(ret == sizeof(*header), "Read stats header");
>  }
> +
> +static ssize_t stats_descs_size(struct kvm_stats_header *header)
> +{
> +       return header->num_desc *
> +              (sizeof(struct kvm_stats_desc) + header->name_size);
> +}
> +
> +/* Caller is responsible for freeing the returned kvm_stats_desc. */
> +struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> +                                         struct kvm_stats_header *header)
> +{
> +       struct kvm_stats_desc *stats_desc;
> +
> +       stats_desc = malloc(stats_descs_size(header));
> +       TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
> +
> +       return stats_desc;
> +}
> +
> +void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> +                       struct kvm_stats_desc *stats_desc)

Reading the descriptors only needs to be done once per fd. So from an
API point of view I don't think there's value in creating separate
functions for the allocation and the read; just combine them in one.

> +{
> +       ssize_t ret;
> +
> +       ret = pread(stats_fd, stats_desc, stats_descs_size(header),
> +                   header->desc_offset);
> +       TEST_ASSERT(ret == stats_descs_size(header),
> +                   "Read KVM stats descriptors");
> +}
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 04/10] KVM: selftests: Read binary stat data in lib
  2022-04-11 21:10 ` [PATCH v4 04/10] KVM: selftests: Read binary stat data " Ben Gardon
@ 2022-04-11 22:14   ` David Matlack
  2022-04-12 19:58     ` Ben Gardon
  2022-04-12  1:25   ` Mingwei Zhang
  1 sibling, 1 reply; 31+ messages in thread
From: David Matlack @ 2022-04-11 22:14 UTC (permalink / raw)
  To: Ben Gardon
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> Move the code to read the binary stats data to the KVM selftests
> library. It will be re-used by other tests to check KVM behavior.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  3 +++
>  .../selftests/kvm/kvm_binary_stats_test.c     | 20 +++++-------------
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 21 +++++++++++++++++++
>  3 files changed, 29 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index c5f34551ff76..b2684cfc2cb1 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -405,6 +405,9 @@ struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
>                                           struct kvm_stats_header *header);
>  void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
>                         struct kvm_stats_desc *stats_desc);
> +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> +                  struct kvm_stats_desc *desc, uint64_t *data,
> +                  ssize_t max_elements);
>
>  uint32_t guest_get_vcpuid(void);
>
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index e4795bad7db6..97b180249ba0 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -20,6 +20,8 @@
>  #include "asm/kvm.h"
>  #include "linux/kvm.h"
>
> +#define STAT_MAX_ELEMENTS 1000
> +
>  static void stats_test(int stats_fd)
>  {
>         ssize_t ret;
> @@ -29,7 +31,7 @@ static void stats_test(int stats_fd)
>         struct kvm_stats_header header;
>         char *id;
>         struct kvm_stats_desc *stats_desc;
> -       u64 *stats_data;
> +       u64 stats_data[STAT_MAX_ELEMENTS];

What is the benefit of changing stats_data to a stack allocation with
a fixed limit?

>         struct kvm_stats_desc *pdesc;
>
>         /* Read kvm stats header */
> @@ -130,25 +132,13 @@ static void stats_test(int stats_fd)
>                         pdesc->offset, pdesc->name);
>         }
>
> -       /* Allocate memory for stats data */
> -       stats_data = malloc(size_data);
> -       TEST_ASSERT(stats_data, "Allocate memory for stats data");
> -       /* Read kvm stats data as a bulk */
> -       ret = pread(stats_fd, stats_data, size_data, header.data_offset);
> -       TEST_ASSERT(ret == size_data, "Read KVM stats data");
>         /* Read kvm stats data one by one */
> -       size_data = 0;
>         for (i = 0; i < header.num_desc; ++i) {
>                 pdesc = (void *)stats_desc + i * size_desc;
> -               ret = pread(stats_fd, stats_data,
> -                               pdesc->size * sizeof(*stats_data),
> -                               header.data_offset + size_data);
> -               TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
> -                               "Read data of KVM stats: %s", pdesc->name);
> -               size_data += pdesc->size * sizeof(*stats_data);
> +               read_stat_data(stats_fd, &header, pdesc, stats_data,
> +                              ARRAY_SIZE(stats_data));
>         }
>
> -       free(stats_data);
>         free(stats_desc);
>         free(id);
>  }
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index e3ae26fbef03..64e2085f1129 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2593,3 +2593,24 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
>         TEST_ASSERT(ret == stats_descs_size(header),
>                     "Read KVM stats descriptors");
>  }
> +
> +int read_stat_data(int stats_fd, struct kvm_stats_header *header,

I would like to keep up the practice of adding docstrings to functions
in kvm_util. Can you add docstring comments for this function and the
other kvm_util functions introduced by this series?

> +                  struct kvm_stats_desc *desc, uint64_t *data,
> +                  ssize_t max_elements)
> +{
> +       ssize_t ret;
> +
> +       TEST_ASSERT(desc->size <= max_elements,
> +                   "Max data elements should be at least as large as stat data");

What is the reason for this assertion? Callers are required to read
all the data elements of a given stat?

> +
> +       ret = pread(stats_fd, data, desc->size * sizeof(*data),
> +                   header->data_offset + desc->offset);
> +
> +       /* ret from pread is in bytes. */
> +       ret = ret / sizeof(*data);
> +
> +       TEST_ASSERT(ret == desc->size,
> +                   "Read data of KVM stats: %s", desc->name);

Won't this assertion fail when called from kvm_binary_stats_test.c?
kvm_binary_stats_test.c looks like it reads all the stat data at once,
which means ret will be the total number of stat data points, and
desc->size will be the number of stat data points in the first stat.

> +
> +       return ret;
> +}
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 05/10] KVM: selftests: Add NX huge pages test
  2022-04-11 21:10 ` [PATCH v4 05/10] KVM: selftests: Add NX huge pages test Ben Gardon
@ 2022-04-11 22:27   ` David Matlack
  2022-04-12 22:11     ` Ben Gardon
  2022-04-12  1:32   ` Mingwei Zhang
  1 sibling, 1 reply; 31+ messages in thread
From: David Matlack @ 2022-04-11 22:27 UTC (permalink / raw)
  To: Ben Gardon
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> There's currently no test coverage of NX hugepages in KVM selftests, so
> add a basic test to ensure that the feature works as intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |  10 ++
>  .../selftests/kvm/include/kvm_util_base.h     |   1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    |  48 ++++++
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 163 ++++++++++++++++++
>  .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
>  5 files changed, 247 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
>  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index af582d168621..9bb9bce4df37 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -43,6 +43,10 @@ LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handler
>  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
>  LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
>
> +# Non-compiled test targets
> +TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
> +
> +# Compiled test targets
>  TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
>  TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
>  TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
> @@ -104,6 +108,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
>  TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
>  TEST_GEN_PROGS_x86_64 += system_counter_offset_test
>
> +# Compiled outputs used by test targets
> +TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
> +
>  TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
>  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
>  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
> @@ -142,7 +149,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
>  TEST_GEN_PROGS_riscv += set_memory_region_test
>  TEST_GEN_PROGS_riscv += kvm_binary_stats_test
>
> +TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
>  TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
> +TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
>  LIBKVM += $(LIBKVM_$(UNAME_M))
>
>  INSTALL_HDR_PATH = $(top_srcdir)/usr
> @@ -193,6 +202,7 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
>  x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
>  all: $(STATIC_LIBS)
>  $(TEST_GEN_PROGS): $(STATIC_LIBS)
> +$(TEST_GEN_PROGS_EXTENDED): $(STATIC_LIBS)
>
>  cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
>  cscope:
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index b2684cfc2cb1..f9c2ac0a5b97 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -408,6 +408,7 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
>  int read_stat_data(int stats_fd, struct kvm_stats_header *header,
>                    struct kvm_stats_desc *desc, uint64_t *data,
>                    ssize_t max_elements);
> +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
>
>  uint32_t guest_get_vcpuid(void);
>
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 64e2085f1129..833c7e63d62d 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2614,3 +2614,51 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
>
>         return ret;
>  }
> +
> +static int vm_get_stat_data(struct kvm_vm *vm, const char *stat_name,
> +                           uint64_t *data, ssize_t max_elements)
> +{
> +       struct kvm_stats_desc *stats_desc;
> +       struct kvm_stats_header header;
> +       struct kvm_stats_desc *desc;
> +       size_t size_desc;
> +       int stats_fd;
> +       int ret = -EINVAL;
> +       int i;
> +
> +       stats_fd = vm_get_stats_fd(vm);
> +
> +       read_vm_stats_header(stats_fd, &header);
> +
> +       stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> +       read_vm_stats_desc(stats_fd, &header, stats_desc);

This is a fair bit of redundant work to do when reading every stat.
Reading stats in selftests is probably not going to be
performance-senstive, but it should be pretty easy to move everything
above to VM initialization and storing the outputs in struct kvm_vm
for access during this function.

> +
> +       size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
> +
> +       /* Read kvm stats data one by one */
> +       for (i = 0; i < header.num_desc; ++i) {
> +               desc = (void *)stats_desc + (i * size_desc);
> +
> +               if (strcmp(desc->name, stat_name))
> +                       continue;
> +
> +               ret = read_stat_data(stats_fd, &header, desc, data,
> +                                    max_elements);
> +       }
> +
> +       free(stats_desc);
> +       close(stats_fd);
> +       return ret;
> +}
> +
> +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)

nit: I'd prefer the simpler "vm_get_stat()". The function signature
already makes it clear we're reading one stat value. And when we add
more support for more complicated stats (e.g.
vm_get_histogram_stat()), I think "vm_get_stat()" will still work for
reading single value stats.

> +{
> +       uint64_t data;
> +       int ret;
> +
> +       ret = vm_get_stat_data(vm, stat_name, &data, 1);
> +       TEST_ASSERT(ret == 1,
> +                   "Stat %s expected to have 1 element, but %d returned",
> +                   stat_name, ret);
> +       return data;
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> new file mode 100644
> index 000000000000..3f21726b22c7
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -0,0 +1,163 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * tools/testing/selftests/kvm/nx_huge_page_test.c
> + *
> + * Usage: to be run via nx_huge_page_test.sh, which does the necessary
> + * environment setup and teardown
> + *
> + * Copyright (C) 2022, Google LLC.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <fcntl.h>
> +#include <stdint.h>
> +#include <time.h>
> +
> +#include <test_util.h>
> +#include "kvm_util.h"
> +
> +#define HPAGE_SLOT             10
> +#define HPAGE_GVA              (23*1024*1024)
> +#define HPAGE_GPA              (10*1024*1024)
> +#define HPAGE_SLOT_NPAGES      (512 * 3)
> +#define PAGE_SIZE              4096
> +
> +/*
> + * When writing to guest memory, write the opcode for the `ret` instruction so
> + * that subsequent iteractions can exercise instruction fetch by calling the
> + * memory.

I think this comment needs to be reworded to better fit this test.


> + */
> +#define RETURN_OPCODE 0xC3
> +
> +void guest_code(void)
> +{
> +       uint64_t hpage_1 = HPAGE_GVA;
> +       uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
> +       uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
> +
> +       READ_ONCE(*(uint64_t *)hpage_1);
> +       GUEST_SYNC(1);
> +
> +       READ_ONCE(*(uint64_t *)hpage_2);
> +       GUEST_SYNC(2);
> +
> +       ((void (*)(void)) hpage_1)();
> +       GUEST_SYNC(3);
> +
> +       ((void (*)(void)) hpage_3)();
> +       GUEST_SYNC(4);
> +
> +       READ_ONCE(*(uint64_t *)hpage_1);
> +       GUEST_SYNC(5);
> +
> +       READ_ONCE(*(uint64_t *)hpage_3);
> +       GUEST_SYNC(6);
> +}
> +
> +static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
> +{
> +       int actual_pages_2m;
> +
> +       actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
> +
> +       TEST_ASSERT(actual_pages_2m == expected_pages_2m,
> +                   "Unexpected 2m page count. Expected %d, got %d",
> +                   expected_pages_2m, actual_pages_2m);
> +}
> +
> +static void check_split_count(struct kvm_vm *vm, int expected_splits)
> +{
> +       int actual_splits;
> +
> +       actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
> +
> +       TEST_ASSERT(actual_splits == expected_splits,
> +                   "Unexpected nx lpage split count. Expected %d, got %d",
> +                   expected_splits, actual_splits);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +       struct kvm_vm *vm;
> +       struct timespec ts;
> +       void *hva;
> +
> +       vm = vm_create_default(0, 0, guest_code);
> +
> +       vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> +                                   HPAGE_GPA, HPAGE_SLOT,
> +                                   HPAGE_SLOT_NPAGES, 0);
> +
> +       virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
> +
> +       hva = addr_gpa2hva(vm, HPAGE_GPA);
> +       memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
> +
> +       check_2m_page_count(vm, 0);
> +       check_split_count(vm, 0);
> +
> +       /*
> +        * The guest code will first read from the first hugepage, resulting
> +        * in a huge page mapping being created.
> +        */
> +       vcpu_run(vm, 0);
> +       check_2m_page_count(vm, 1);
> +       check_split_count(vm, 0);
> +
> +       /*
> +        * Then the guest code will read from the second hugepage, resulting
> +        * in another huge page mapping being created.
> +        */
> +       vcpu_run(vm, 0);
> +       check_2m_page_count(vm, 2);
> +       check_split_count(vm, 0);
> +
> +       /*
> +        * Next, the guest will execute from the first huge page, causing it
> +        * to be remapped at 4k.
> +        */
> +       vcpu_run(vm, 0);
> +       check_2m_page_count(vm, 1);
> +       check_split_count(vm, 1);
> +
> +       /*
> +        * Executing from the third huge page (previously unaccessed) will
> +        * cause part to be mapped at 4k.
> +        */
> +       vcpu_run(vm, 0);
> +       check_2m_page_count(vm, 1);
> +       check_split_count(vm, 2);
> +
> +       /* Reading from the first huge page again should have no effect. */
> +       vcpu_run(vm, 0);
> +       check_2m_page_count(vm, 1);
> +       check_split_count(vm, 2);
> +
> +       /*
> +        * Give recovery thread time to run. The wrapper script sets
> +        * recovery_period_ms to 100, so wait 5x that.
> +        */
> +       ts.tv_sec = 0;
> +       ts.tv_nsec = 500000000;
> +       nanosleep(&ts, NULL);
> +
> +       /*
> +        * Now that the reclaimer has run, all the split pages should be gone.
> +        */
> +       check_2m_page_count(vm, 1);
> +       check_split_count(vm, 0);
> +
> +       /*
> +        * The 4k mapping on hpage 3 should have been removed, so check that
> +        * reading from it causes a huge page mapping to be installed.
> +        */
> +       vcpu_run(vm, 0);
> +       check_2m_page_count(vm, 2);
> +       check_split_count(vm, 0);
> +
> +       kvm_vm_free(vm);
> +
> +       return 0;
> +}
> +
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> new file mode 100755
> index 000000000000..19fc95723fcb
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -0,0 +1,25 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0-only */
> +
> +# tools/testing/selftests/kvm/nx_huge_page_test.sh
> +# Copyright (C) 2022, Google LLC.
> +
> +NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
> +NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> +NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> +HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
> +
> +echo 1 > /sys/module/kvm/parameters/nx_huge_pages
> +echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +./nx_huge_pages_test
> +RET=$?
> +
> +echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> +echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +exit $RET
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 10/10] KVM: selftests: Test disabling NX hugepages on a VM
  2022-04-11 21:10 ` [PATCH v4 10/10] KVM: selftests: Test disabling NX hugepages on a VM Ben Gardon
@ 2022-04-11 22:37   ` David Matlack
  0 siblings, 0 replies; 31+ messages in thread
From: David Matlack @ 2022-04-11 22:37 UTC (permalink / raw)
  To: Ben Gardon
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
>
> Add an argument to the NX huge pages test to test disabling the feature
> on a VM using the new capability.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  2 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 19 ++++++-
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 53 +++++++++++++++----
>  3 files changed, 64 insertions(+), 10 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index f9c2ac0a5b97..15f24be6d93f 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -412,4 +412,6 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
>
>  uint32_t guest_get_vcpuid(void);
>
> +int vm_disable_nx_huge_pages(struct kvm_vm *vm);
> +
>  #endif /* SELFTEST_KVM_UTIL_BASE_H */
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 833c7e63d62d..5fa5608eef03 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -112,6 +112,15 @@ int vm_check_cap(struct kvm_vm *vm, long cap)
>         return ret;
>  }
>
> +static int __vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
> +{
> +       int ret;
> +
> +       ret = ioctl(vm->fd, KVM_ENABLE_CAP, cap);
> +
> +       return ret;
> +}
> +
>  /* VM Enable Capability
>   *
>   * Input Args:
> @@ -128,7 +137,7 @@ int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap)
>  {
>         int ret;
>
> -       ret = ioctl(vm->fd, KVM_ENABLE_CAP, cap);
> +       ret = __vm_enable_cap(vm, cap);
>         TEST_ASSERT(ret == 0, "KVM_ENABLE_CAP IOCTL failed,\n"
>                 "  rc: %i errno: %i", ret, errno);
>
> @@ -2662,3 +2671,11 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
>                     stat_name, ret);
>         return data;
>  }
> +
> +int vm_disable_nx_huge_pages(struct kvm_vm *vm)
> +{
> +       struct kvm_enable_cap cap = { 0 };
> +
> +       cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES;
> +       return __vm_enable_cap(vm, &cap);
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> index 3f21726b22c7..f8edf7910950 100644
> --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -13,6 +13,8 @@
>  #include <fcntl.h>
>  #include <stdint.h>
>  #include <time.h>
> +#include <linux/reboot.h>
> +#include <sys/syscall.h>
>
>  #include <test_util.h>
>  #include "kvm_util.h"
> @@ -77,14 +79,41 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits)
>                     expected_splits, actual_splits);
>  }
>
> -int main(int argc, char **argv)
> +void run_test(bool disable_nx)
>  {
>         struct kvm_vm *vm;
>         struct timespec ts;
>         void *hva;
> +       int r;
>
>         vm = vm_create_default(0, 0, guest_code);
>
> +       if (disable_nx) {
> +               kvm_check_cap(KVM_CAP_VM_DISABLE_NX_HUGE_PAGES);
> +
> +               /*
> +                * Check if this process has the reboot permissions needed to
> +                * disable NX huge pages on a VM.
> +                *
> +                * The reboot call below will never have any effect because
> +                * the magic values are not set correctly, however the
> +                * permission check is done before the magic value check.
> +                */
> +               r = syscall(SYS_reboot, 0, 0, 0, NULL);
> +               if (errno == EPERM) {

Should this be:

if (r && errno == EPERM) {

?

Otherwise errno might contain a stale value.

> +                       r = vm_disable_nx_huge_pages(vm);
> +                       TEST_ASSERT(r == EPERM,

TEST_ASSERT(r && errno == EPERM,

> +                                   "This process should not have permission to disable NX huge pages");
> +                       return;
> +               }
> +
> +               TEST_ASSERT(errno == EINVAL,

r && errno == EINVAL ?

> +                           "Reboot syscall should fail with -EINVAL");
> +
> +               r = vm_disable_nx_huge_pages(vm);
> +               TEST_ASSERT(!r, "Disabling NX huge pages should not fail if process has reboot permissions");

nit: s/not fail/succeed/

> +       }
> +
>         vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
>                                     HPAGE_GPA, HPAGE_SLOT,
>                                     HPAGE_SLOT_NPAGES, 0);
> @@ -118,21 +147,21 @@ int main(int argc, char **argv)
>          * to be remapped at 4k.
>          */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 1);
> -       check_split_count(vm, 1);
> +       check_2m_page_count(vm, disable_nx ? 2 : 1);
> +       check_split_count(vm, disable_nx ? 0 : 1);
>
>         /*
>          * Executing from the third huge page (previously unaccessed) will
>          * cause part to be mapped at 4k.
>          */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 1);
> -       check_split_count(vm, 2);
> +       check_2m_page_count(vm, disable_nx ? 3 : 1);
> +       check_split_count(vm, disable_nx ? 0 : 2);
>
>         /* Reading from the first huge page again should have no effect. */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 1);
> -       check_split_count(vm, 2);
> +       check_2m_page_count(vm, disable_nx ? 3 : 1);
> +       check_split_count(vm, disable_nx ? 0 : 2);
>
>         /*
>          * Give recovery thread time to run. The wrapper script sets
> @@ -145,7 +174,7 @@ int main(int argc, char **argv)
>         /*
>          * Now that the reclaimer has run, all the split pages should be gone.
>          */
> -       check_2m_page_count(vm, 1);
> +       check_2m_page_count(vm, disable_nx ? 3 : 1);
>         check_split_count(vm, 0);
>
>         /*
> @@ -153,10 +182,16 @@ int main(int argc, char **argv)
>          * reading from it causes a huge page mapping to be installed.
>          */
>         vcpu_run(vm, 0);
> -       check_2m_page_count(vm, 2);
> +       check_2m_page_count(vm, disable_nx ? 3 : 2);
>         check_split_count(vm, 0);
>
>         kvm_vm_free(vm);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +       run_test(false);
> +       run_test(true);
>
>         return 0;
>  }
> --
> 2.35.1.1178.g4f1659d476-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header
  2022-04-11 21:52   ` David Matlack
@ 2022-04-11 22:50     ` Mingwei Zhang
  0 siblings, 0 replies; 31+ messages in thread
From: Mingwei Zhang @ 2022-04-11 22:50 UTC (permalink / raw)
  To: David Matlack
  Cc: Ben Gardon, LKML, kvm list, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Peter Shier, David Dunn, Junaid Shahid,
	Jim Mattson, Jing Zhang

On Mon, Apr 11, 2022, David Matlack wrote:
> On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
> >
> > There's no need to allocate dynamic memory for the stats header since
> > its size is known at compile time.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> 
> Reviewed-by: David Matlack <dmatlack@google.com>
> 
Reviewed-by: Mingwei Zhang <mizhang@google.com>
> > ---
> >  .../selftests/kvm/kvm_binary_stats_test.c     | 58 +++++++++----------
> >  1 file changed, 27 insertions(+), 31 deletions(-)
> >
> > diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > index 17f65d514915..dad34d8a41fe 100644
> > --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > @@ -26,56 +26,53 @@ static void stats_test(int stats_fd)
> >         int i;
> >         size_t size_desc;
> >         size_t size_data = 0;
> > -       struct kvm_stats_header *header;
> > +       struct kvm_stats_header header;
> >         char *id;
> >         struct kvm_stats_desc *stats_desc;
> >         u64 *stats_data;
> >         struct kvm_stats_desc *pdesc;
> >
> >         /* Read kvm stats header */
> > -       header = malloc(sizeof(*header));
> > -       TEST_ASSERT(header, "Allocate memory for stats header");
> > -
> > -       ret = read(stats_fd, header, sizeof(*header));
> > -       TEST_ASSERT(ret == sizeof(*header), "Read stats header");
> > -       size_desc = sizeof(*stats_desc) + header->name_size;
> > +       ret = read(stats_fd, &header, sizeof(header));
> > +       TEST_ASSERT(ret == sizeof(header), "Read stats header");
> > +       size_desc = sizeof(*stats_desc) + header.name_size;
> >
> >         /* Read kvm stats id string */
> > -       id = malloc(header->name_size);
> > +       id = malloc(header.name_size);
> >         TEST_ASSERT(id, "Allocate memory for id string");
> > -       ret = read(stats_fd, id, header->name_size);
> > -       TEST_ASSERT(ret == header->name_size, "Read id string");
> > +       ret = read(stats_fd, id, header.name_size);
> > +       TEST_ASSERT(ret == header.name_size, "Read id string");
> >
> >         /* Check id string, that should start with "kvm" */
> > -       TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header->name_size,
> > +       TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
> >                                 "Invalid KVM stats type, id: %s", id);
> >
> >         /* Sanity check for other fields in header */
> > -       if (header->num_desc == 0) {
> > +       if (header.num_desc == 0) {
> >                 printf("No KVM stats defined!");
> >                 return;
> >         }
> >         /* Check overlap */
> > -       TEST_ASSERT(header->desc_offset > 0 && header->data_offset > 0
> > -                       && header->desc_offset >= sizeof(*header)
> > -                       && header->data_offset >= sizeof(*header),
> > +       TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
> > +                       && header.desc_offset >= sizeof(header)
> > +                       && header.data_offset >= sizeof(header),
> >                         "Invalid offset fields in header");
> > -       TEST_ASSERT(header->desc_offset > header->data_offset ||
> > -                       (header->desc_offset + size_desc * header->num_desc <=
> > -                                                       header->data_offset),
> > +       TEST_ASSERT(header.desc_offset > header.data_offset ||
> > +                       (header.desc_offset + size_desc * header.num_desc <=
> > +                                                       header.data_offset),
> >                         "Descriptor block is overlapped with data block");
> >
> >         /* Allocate memory for stats descriptors */
> > -       stats_desc = calloc(header->num_desc, size_desc);
> > +       stats_desc = calloc(header.num_desc, size_desc);
> >         TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
> >         /* Read kvm stats descriptors */
> >         ret = pread(stats_fd, stats_desc,
> > -                       size_desc * header->num_desc, header->desc_offset);
> > -       TEST_ASSERT(ret == size_desc * header->num_desc,
> > +                       size_desc * header.num_desc, header.desc_offset);
> > +       TEST_ASSERT(ret == size_desc * header.num_desc,
> >                         "Read KVM stats descriptors");
> >
> >         /* Sanity check for fields in descriptors */
> > -       for (i = 0; i < header->num_desc; ++i) {
> > +       for (i = 0; i < header.num_desc; ++i) {
> >                 pdesc = (void *)stats_desc + i * size_desc;
> >                 /* Check type,unit,base boundaries */
> >                 TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
> > @@ -104,7 +101,7 @@ static void stats_test(int stats_fd)
> >                         break;
> >                 }
> >                 /* Check name string */
> > -               TEST_ASSERT(strlen(pdesc->name) < header->name_size,
> > +               TEST_ASSERT(strlen(pdesc->name) < header.name_size,
> >                                 "KVM stats name(%s) too long", pdesc->name);
> >                 /* Check size field, which should not be zero */
> >                 TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
> > @@ -124,14 +121,14 @@ static void stats_test(int stats_fd)
> >                 size_data += pdesc->size * sizeof(*stats_data);
> >         }
> >         /* Check overlap */
> > -       TEST_ASSERT(header->data_offset >= header->desc_offset
> > -               || header->data_offset + size_data <= header->desc_offset,
> > +       TEST_ASSERT(header.data_offset >= header.desc_offset
> > +               || header.data_offset + size_data <= header.desc_offset,
> >                 "Data block is overlapped with Descriptor block");
> >         /* Check validity of all stats data size */
> > -       TEST_ASSERT(size_data >= header->num_desc * sizeof(*stats_data),
> > +       TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
> >                         "Data size is not correct");
> >         /* Check stats offset */
> > -       for (i = 0; i < header->num_desc; ++i) {
> > +       for (i = 0; i < header.num_desc; ++i) {
> >                 pdesc = (void *)stats_desc + i * size_desc;
> >                 TEST_ASSERT(pdesc->offset < size_data,
> >                         "Invalid offset (%u) for stats: %s",
> > @@ -142,15 +139,15 @@ static void stats_test(int stats_fd)
> >         stats_data = malloc(size_data);
> >         TEST_ASSERT(stats_data, "Allocate memory for stats data");
> >         /* Read kvm stats data as a bulk */
> > -       ret = pread(stats_fd, stats_data, size_data, header->data_offset);
> > +       ret = pread(stats_fd, stats_data, size_data, header.data_offset);
> >         TEST_ASSERT(ret == size_data, "Read KVM stats data");
> >         /* Read kvm stats data one by one */
> >         size_data = 0;
> > -       for (i = 0; i < header->num_desc; ++i) {
> > +       for (i = 0; i < header.num_desc; ++i) {
> >                 pdesc = (void *)stats_desc + i * size_desc;
> >                 ret = pread(stats_fd, stats_data,
> >                                 pdesc->size * sizeof(*stats_data),
> > -                               header->data_offset + size_data);
> > +                               header.data_offset + size_data);
> >                 TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
> >                                 "Read data of KVM stats: %s", pdesc->name);
> >                 size_data += pdesc->size * sizeof(*stats_data);
> > @@ -159,7 +156,6 @@ static void stats_test(int stats_fd)
> >         free(stats_data);
> >         free(stats_desc);
> >         free(id);
> > -       free(header);
> >  }
> >
> >
> > --
> > 2.35.1.1178.g4f1659d476-goog
> >

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-11 21:10 ` [PATCH v4 03/10] KVM: selftests: Read binary stats desc " Ben Gardon
  2022-04-11 22:01   ` David Matlack
@ 2022-04-12  0:54   ` Mingwei Zhang
  2022-04-12 18:56     ` Ben Gardon
  1 sibling, 1 reply; 31+ messages in thread
From: Mingwei Zhang @ 2022-04-12  0:54 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	David Matlack, Jing Zhang

On Mon, Apr 11, 2022, Ben Gardon wrote:
> Move the code to read the binary stats descriptors to the KVM selftests
> library. It will be re-used by other tests to check KVM behavior.
> 
> No functional change intended.
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  4 +++
>  .../selftests/kvm/kvm_binary_stats_test.c     |  9 ++----
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 29 +++++++++++++++++++
>  3 files changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 5ba3132f3110..c5f34551ff76 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -401,6 +401,10 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
>  int vm_get_stats_fd(struct kvm_vm *vm);
>  int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
>  void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header);
> +struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> +					  struct kvm_stats_header *header);
> +void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> +			struct kvm_stats_desc *stats_desc);
>  
>  uint32_t guest_get_vcpuid(void);
>  
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index 22c22a90f15a..e4795bad7db6 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -62,14 +62,9 @@ static void stats_test(int stats_fd)
>  							header.data_offset),
>  			"Descriptor block is overlapped with data block");
>  
> -	/* Allocate memory for stats descriptors */
> -	stats_desc = calloc(header.num_desc, size_desc);
> -	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
>  	/* Read kvm stats descriptors */
> -	ret = pread(stats_fd, stats_desc,
> -			size_desc * header.num_desc, header.desc_offset);
> -	TEST_ASSERT(ret == size_desc * header.num_desc,
> -			"Read KVM stats descriptors");
> +	stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> +	read_vm_stats_desc(stats_fd, &header, stats_desc);
>  
>  	/* Sanity check for fields in descriptors */
>  	for (i = 0; i < header.num_desc; ++i) {
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 0caf28e324ed..e3ae26fbef03 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2564,3 +2564,32 @@ void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header)
>  	ret = read(stats_fd, header, sizeof(*header));
>  	TEST_ASSERT(ret == sizeof(*header), "Read stats header");
>  }
> +
> +static ssize_t stats_descs_size(struct kvm_stats_header *header)
> +{
> +	return header->num_desc *
> +	       (sizeof(struct kvm_stats_desc) + header->name_size);
> +}
I was very confused on header->name_size. So this field means the
maximum string size of a stats name, right? Can we update the comments
in the kvm.h to specify that? By reading the comments, I don't really
feel this is how we should use this field.

hmm, if that is true, isn't this field a compile time value? Why do we
have to get it at runtime?

> +
> +/* Caller is responsible for freeing the returned kvm_stats_desc. */
> +struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> +					  struct kvm_stats_header *header)
> +{
> +	struct kvm_stats_desc *stats_desc;
> +
> +	stats_desc = malloc(stats_descs_size(header));
> +	TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
> +
> +	return stats_desc;
> +}
> +
> +void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> +			struct kvm_stats_desc *stats_desc)
> +{
> +	ssize_t ret;
> +
> +	ret = pread(stats_fd, stats_desc, stats_descs_size(header),
> +		    header->desc_offset);
> +	TEST_ASSERT(ret == stats_descs_size(header),
> +		    "Read KVM stats descriptors");
> +}
> -- 
> 2.35.1.1178.g4f1659d476-goog
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 04/10] KVM: selftests: Read binary stat data in lib
  2022-04-11 21:10 ` [PATCH v4 04/10] KVM: selftests: Read binary stat data " Ben Gardon
  2022-04-11 22:14   ` David Matlack
@ 2022-04-12  1:25   ` Mingwei Zhang
  1 sibling, 0 replies; 31+ messages in thread
From: Mingwei Zhang @ 2022-04-12  1:25 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	David Matlack, Jing Zhang

On Mon, Apr 11, 2022, Ben Gardon wrote:
> Move the code to read the binary stats data to the KVM selftests
> library. It will be re-used by other tests to check KVM behavior.
> 
> No functional change intended.
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>

Reviewed-by: Mingwei Zhang <mizhang@google.com>
> ---
>  .../selftests/kvm/include/kvm_util_base.h     |  3 +++
>  .../selftests/kvm/kvm_binary_stats_test.c     | 20 +++++-------------
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 21 +++++++++++++++++++
>  3 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index c5f34551ff76..b2684cfc2cb1 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -405,6 +405,9 @@ struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
>  					  struct kvm_stats_header *header);
>  void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
>  			struct kvm_stats_desc *stats_desc);
> +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> +		   struct kvm_stats_desc *desc, uint64_t *data,
> +		   ssize_t max_elements);
>  
>  uint32_t guest_get_vcpuid(void);
>  
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index e4795bad7db6..97b180249ba0 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -20,6 +20,8 @@
>  #include "asm/kvm.h"
>  #include "linux/kvm.h"
>  
> +#define STAT_MAX_ELEMENTS 1000
> +
>  static void stats_test(int stats_fd)
>  {
>  	ssize_t ret;
> @@ -29,7 +31,7 @@ static void stats_test(int stats_fd)
>  	struct kvm_stats_header header;
>  	char *id;
>  	struct kvm_stats_desc *stats_desc;
> -	u64 *stats_data;
> +	u64 stats_data[STAT_MAX_ELEMENTS];
>  	struct kvm_stats_desc *pdesc;
>  
>  	/* Read kvm stats header */
> @@ -130,25 +132,13 @@ static void stats_test(int stats_fd)
>  			pdesc->offset, pdesc->name);
>  	}
>  
> -	/* Allocate memory for stats data */
> -	stats_data = malloc(size_data);
> -	TEST_ASSERT(stats_data, "Allocate memory for stats data");
> -	/* Read kvm stats data as a bulk */
> -	ret = pread(stats_fd, stats_data, size_data, header.data_offset);
> -	TEST_ASSERT(ret == size_data, "Read KVM stats data");
>  	/* Read kvm stats data one by one */
> -	size_data = 0;
>  	for (i = 0; i < header.num_desc; ++i) {
>  		pdesc = (void *)stats_desc + i * size_desc;
> -		ret = pread(stats_fd, stats_data,
> -				pdesc->size * sizeof(*stats_data),
> -				header.data_offset + size_data);
> -		TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
> -				"Read data of KVM stats: %s", pdesc->name);
> -		size_data += pdesc->size * sizeof(*stats_data);
> +		read_stat_data(stats_fd, &header, pdesc, stats_data,
> +			       ARRAY_SIZE(stats_data));
>  	}
>  
> -	free(stats_data);
>  	free(stats_desc);
>  	free(id);
>  }
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index e3ae26fbef03..64e2085f1129 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2593,3 +2593,24 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
>  	TEST_ASSERT(ret == stats_descs_size(header),
>  		    "Read KVM stats descriptors");
>  }
> +
> +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> +		   struct kvm_stats_desc *desc, uint64_t *data,
> +		   ssize_t max_elements)
> +{
> +	ssize_t ret;
> +
> +	TEST_ASSERT(desc->size <= max_elements,
> +		    "Max data elements should be at least as large as stat data");
> +
> +	ret = pread(stats_fd, data, desc->size * sizeof(*data),
> +		    header->data_offset + desc->offset);

hmm, by checking the comments in kvm.h, I cannot infer desc->offset
should be added with header->data_offset. So, could you add the
following commit into your series. I feel that we need to improve
readability a little bit. Existing comments for kvm_stats_desc is not
quite helpful for users.

From: Mingwei Zhang <mizhang@google.com>
Date: Tue, 12 Apr 2022 01:09:10 +0000
Subject: [PATCH] KVM: stats: improve readability of kvm_stats_header and
 kvm_stats_desc

Some comments on these data structures are not quite helpful from user
perspective. For instance, kvm_stats_header->name_size was shared across
all stats. This might be a little bit counter-intuitive, since each stat
should have a different name length. So it has to be the maximum length of
the stats string. We cannot infer that by just reading the comments without
going through the implementation. In addition, kvm_stat_desc->offset is a
relative offset base on kvm_stats_header->data_offset. It is better to call
it out clearly in comments.

Update the comments of the fields in these two data structures.

Cc: Jing Zhang <zhangjingos@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
---
 include/uapi/linux/kvm.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5c0d7c77e9bd..986728b5b3fb 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1987,7 +1987,7 @@ struct kvm_dirty_gfn {
 /**
  * struct kvm_stats_header - Header of per vm/vcpu binary statistics data.
  * @flags: Some extra information for header, always 0 for now.
- * @name_size: The size in bytes of the memory which contains statistics
+ * @name_size: The maximum size in bytes of the memory which contains statistics
  *             name string including trailing '\0'. The memory is allocated
  *             at the send of statistics descriptor.
  * @num_desc: The number of statistics the vm or vcpu has.
@@ -2042,7 +2042,8 @@ struct kvm_stats_header {
  * @size: The number of data items for this stats.
  *        Every data item is of type __u64.
  * @offset: The offset of the stats to the start of stat structure in
- *          structure kvm or kvm_vcpu.
+ *          structure kvm or kvm_vcpu. When using it, add its value with
+ *          kvm_stats_header->data_offset.
  * @bucket_size: A parameter value used for histogram stats. It is only used
  *		for linear histogram stats, specifying the size of the bucket;
  * @name: The name string for the stats. Its size is indicated by the

base-commit: ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
--
2.35.1.1178.g4f1659d476-goog
> +
> +	/* ret from pread is in bytes. */
> +	ret = ret / sizeof(*data);
> +
> +	TEST_ASSERT(ret == desc->size,
> +		    "Read data of KVM stats: %s", desc->name);
> +
> +	return ret;
> +}
> -- 
> 2.35.1.1178.g4f1659d476-goog
> 

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 05/10] KVM: selftests: Add NX huge pages test
  2022-04-11 21:10 ` [PATCH v4 05/10] KVM: selftests: Add NX huge pages test Ben Gardon
  2022-04-11 22:27   ` David Matlack
@ 2022-04-12  1:32   ` Mingwei Zhang
  2022-04-12 21:51     ` Ben Gardon
  1 sibling, 1 reply; 31+ messages in thread
From: Mingwei Zhang @ 2022-04-12  1:32 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	David Matlack, Jing Zhang

On Mon, Apr 11, 2022, Ben Gardon wrote:
> There's currently no test coverage of NX hugepages in KVM selftests, so
> add a basic test to ensure that the feature works as intended.
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |  10 ++
>  .../selftests/kvm/include/kvm_util_base.h     |   1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    |  48 ++++++
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 163 ++++++++++++++++++
>  .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
>  5 files changed, 247 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
>  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> 
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index af582d168621..9bb9bce4df37 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -43,6 +43,10 @@ LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handler
>  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
>  LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
>  
> +# Non-compiled test targets
> +TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
> +
> +# Compiled test targets
>  TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
>  TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
>  TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
> @@ -104,6 +108,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
>  TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
>  TEST_GEN_PROGS_x86_64 += system_counter_offset_test
>  
> +# Compiled outputs used by test targets
> +TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
> +
>  TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
>  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
>  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
> @@ -142,7 +149,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
>  TEST_GEN_PROGS_riscv += set_memory_region_test
>  TEST_GEN_PROGS_riscv += kvm_binary_stats_test
>  
> +TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
>  TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
> +TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
>  LIBKVM += $(LIBKVM_$(UNAME_M))
>  
>  INSTALL_HDR_PATH = $(top_srcdir)/usr
> @@ -193,6 +202,7 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
>  x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
>  all: $(STATIC_LIBS)
>  $(TEST_GEN_PROGS): $(STATIC_LIBS)
> +$(TEST_GEN_PROGS_EXTENDED): $(STATIC_LIBS)
>  
>  cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
>  cscope:
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index b2684cfc2cb1..f9c2ac0a5b97 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> @@ -408,6 +408,7 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
>  int read_stat_data(int stats_fd, struct kvm_stats_header *header,
>  		   struct kvm_stats_desc *desc, uint64_t *data,
>  		   ssize_t max_elements);
> +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
>  
>  uint32_t guest_get_vcpuid(void);
>  
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 64e2085f1129..833c7e63d62d 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2614,3 +2614,51 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
>  
>  	return ret;
>  }
> +
> +static int vm_get_stat_data(struct kvm_vm *vm, const char *stat_name,
> +			    uint64_t *data, ssize_t max_elements)
> +{
> +	struct kvm_stats_desc *stats_desc;
> +	struct kvm_stats_header header;
> +	struct kvm_stats_desc *desc;
> +	size_t size_desc;
> +	int stats_fd;
> +	int ret = -EINVAL;
> +	int i;
> +
> +	stats_fd = vm_get_stats_fd(vm);
> +
> +	read_vm_stats_header(stats_fd, &header);
> +
> +	stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> +	read_vm_stats_desc(stats_fd, &header, stats_desc);
> +
> +	size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
> +
> +	/* Read kvm stats data one by one */
> +	for (i = 0; i < header.num_desc; ++i) {
> +		desc = (void *)stats_desc + (i * size_desc);
> +
> +		if (strcmp(desc->name, stat_name))
> +			continue;
> +
> +		ret = read_stat_data(stats_fd, &header, desc, data,
> +				     max_elements);
> +	}
> +
> +	free(stats_desc);
> +	close(stats_fd);
> +	return ret;
> +}

I could be wrong, but it seems this function is quite generic. Why not
putting it into kvm_util.c?
> +
> +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
> +{
> +	uint64_t data;
> +	int ret;
> +
> +	ret = vm_get_stat_data(vm, stat_name, &data, 1);
> +	TEST_ASSERT(ret == 1,
> +		    "Stat %s expected to have 1 element, but %d returned",
> +		    stat_name, ret);
> +	return data;
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> new file mode 100644
> index 000000000000..3f21726b22c7
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -0,0 +1,163 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * tools/testing/selftests/kvm/nx_huge_page_test.c
> + *
> + * Usage: to be run via nx_huge_page_test.sh, which does the necessary
> + * environment setup and teardown
> + *
> + * Copyright (C) 2022, Google LLC.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <fcntl.h>
> +#include <stdint.h>
> +#include <time.h>
> +
> +#include <test_util.h>
> +#include "kvm_util.h"
> +
> +#define HPAGE_SLOT		10
> +#define HPAGE_GVA		(23*1024*1024)
> +#define HPAGE_GPA		(10*1024*1024)
> +#define HPAGE_SLOT_NPAGES	(512 * 3)
> +#define PAGE_SIZE		4096
> +
> +/*
> + * When writing to guest memory, write the opcode for the `ret` instruction so
> + * that subsequent iteractions can exercise instruction fetch by calling the
> + * memory.
> + */
> +#define RETURN_OPCODE 0xC3
> +
> +void guest_code(void)
> +{
> +	uint64_t hpage_1 = HPAGE_GVA;
> +	uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
> +	uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
> +
> +	READ_ONCE(*(uint64_t *)hpage_1);
> +	GUEST_SYNC(1);
> +
> +	READ_ONCE(*(uint64_t *)hpage_2);
> +	GUEST_SYNC(2);
> +
> +	((void (*)(void)) hpage_1)();
> +	GUEST_SYNC(3);
> +
> +	((void (*)(void)) hpage_3)();
> +	GUEST_SYNC(4);
> +
> +	READ_ONCE(*(uint64_t *)hpage_1);
> +	GUEST_SYNC(5);
> +
> +	READ_ONCE(*(uint64_t *)hpage_3);
> +	GUEST_SYNC(6);
> +}
> +
> +static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
> +{
> +	int actual_pages_2m;
> +
> +	actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
> +
> +	TEST_ASSERT(actual_pages_2m == expected_pages_2m,
> +		    "Unexpected 2m page count. Expected %d, got %d",
> +		    expected_pages_2m, actual_pages_2m);
> +}
> +
> +static void check_split_count(struct kvm_vm *vm, int expected_splits)
> +{
> +	int actual_splits;
> +
> +	actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
> +
> +	TEST_ASSERT(actual_splits == expected_splits,
> +		    "Unexpected nx lpage split count. Expected %d, got %d",
> +		    expected_splits, actual_splits);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	struct kvm_vm *vm;
> +	struct timespec ts;
> +	void *hva;
> +
> +	vm = vm_create_default(0, 0, guest_code);
> +
> +	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> +				    HPAGE_GPA, HPAGE_SLOT,
> +				    HPAGE_SLOT_NPAGES, 0);
> +
> +	virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
> +
> +	hva = addr_gpa2hva(vm, HPAGE_GPA);
> +	memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
> +
> +	check_2m_page_count(vm, 0);
> +	check_split_count(vm, 0);
> +
> +	/*
> +	 * The guest code will first read from the first hugepage, resulting
> +	 * in a huge page mapping being created.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 0);
> +
> +	/*
> +	 * Then the guest code will read from the second hugepage, resulting
> +	 * in another huge page mapping being created.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 2);
> +	check_split_count(vm, 0);
> +
> +	/*
> +	 * Next, the guest will execute from the first huge page, causing it
> +	 * to be remapped at 4k.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 1);
> +
> +	/*
> +	 * Executing from the third huge page (previously unaccessed) will
> +	 * cause part to be mapped at 4k.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 2);
> +
> +	/* Reading from the first huge page again should have no effect. */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 2);
> +
> +	/*
> +	 * Give recovery thread time to run. The wrapper script sets
> +	 * recovery_period_ms to 100, so wait 5x that.
> +	 */
> +	ts.tv_sec = 0;
> +	ts.tv_nsec = 500000000;
> +	nanosleep(&ts, NULL);
> +
> +	/*
> +	 * Now that the reclaimer has run, all the split pages should be gone.
> +	 */
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 0);
> +
> +	/*
> +	 * The 4k mapping on hpage 3 should have been removed, so check that
> +	 * reading from it causes a huge page mapping to be installed.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 2);
> +	check_split_count(vm, 0);
> +
> +	kvm_vm_free(vm);
> +
> +	return 0;
> +}
> +
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> new file mode 100755
> index 000000000000..19fc95723fcb
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -0,0 +1,25 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0-only */
> +
> +# tools/testing/selftests/kvm/nx_huge_page_test.sh
> +# Copyright (C) 2022, Google LLC.
> +
> +NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
> +NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> +NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> +HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
> +
> +echo 1 > /sys/module/kvm/parameters/nx_huge_pages
> +echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +./nx_huge_pages_test
> +RET=$?
> +
> +echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> +echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +exit $RET
> -- 
> 2.35.1.1178.g4f1659d476-goog
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
  2022-04-11 21:10 ` [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
@ 2022-04-12 17:54   ` Sean Christopherson
  0 siblings, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2022-04-12 17:54 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022, Ben Gardon wrote:
> @@ -6079,6 +6080,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		}
>  		mutex_unlock(&kvm->lock);
>  		break;
> +	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> +		kvm->arch.disable_nx_huge_pages = true;

It's probably worth requiring cap->args[0] to be zero, KVM has been burned too many
times by lack of extensibility.

> +		kvm_update_nx_huge_pages(kvm);

Is there actually a use case to do this while the VM is running?  Given that this
is a one-way control, i.e. userspace can't re-enable the mitigation, me thinks the
answer is no.  And logically, I don't see why userspace would suddenly decide to
trust the guest at some random point in time.

So, require this to be done before vCPUs are created, then there's no need to
zap SPTEs because there can't be any SPTEs to zap.  Then the previous patch also
goes away.  Or to be really draconian, disallow the cap if memslots have been
created, though I think created_vcpus will be sufficient now and in the future.

We can always lift the restriction if someone has a use case for toggling this
while the VM is running, but we can't do the reverse.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 09/10] KVM: x86/MMU: Require reboot permission to disable NX hugepages
  2022-04-11 21:10 ` [PATCH v4 09/10] KVM: x86/MMU: Require reboot permission to disable NX hugepages Ben Gardon
@ 2022-04-12 18:08   ` Sean Christopherson
  0 siblings, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2022-04-12 18:08 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022, Ben Gardon wrote:
> Ensure that the userspace actor attempting to disable NX hugepages has
> permission to reboot the system. Since disabling NX hugepages would
> allow a guest to crash the system, it is similar to reboot permissions.
> 
> This approach is the simplest permission gating, but passing a file
> descriptor opened for write for the module parameter would also work
> well and be more precise.
> The latter approach was suggested by Sean Christopherson.

State _why_ the latter approach wasn't chosen, vague hand waving about this
being simpler doesn't help the reader understand how unbelievably painful it would
be to actually get at the module param (I looked briefy, it'd be beyond ugly).
We can still hand wave a bit, but there should at least be a hint as to why
option A was chosen instead of option B.

It'd also be helpful to call out what is lost by requiring CAP_SYS_BOOT, because
again "precision" is rather vague.  The important aspect of loss of precision
is that a userspace process can't be given access to _just_ the NX module param
to opt out of the workaround, it needs full reboot permissions.

E.g.

  Ideally, KVM would require userspace to prove it has access to KVM's
  nx_huge_pages module param, e.g. so that userspace can opt out without
  needing full reboot permissions.  But getting access to the module param
  file info is a mess since it's buried in layers of sysfs and module
  glue, and requiring CAP_SYS_BOOT is sufficient for all known use cases.

> Suggested-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  Documentation/virt/kvm/api.rst | 2 ++
>  arch/x86/kvm/x86.c             | 9 +++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 31fb002632bb..021452a9fa91 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7861,6 +7861,8 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>  :Capability KVM_CAP_PMU_CAPABILITY
>  :Architectures: x86
>  :Type: vm
> +:Returns 0 on success, -EPERM if the userspace process does not
> +	 have CAP_SYS_BOOT
>  
>  This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index de1d211f8aa3..8d3d6c48c5ec 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6081,6 +6081,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		mutex_unlock(&kvm->lock);
>  		break;
>  	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> +		/*
> +		 * Since the risk of disabling NX hugepages is a guest crashing
> +		 * the system, ensure the userspace process has permission to
> +		 * reboot the system.
> +		 */
> +		if (!capable(CAP_SYS_BOOT)) {
> +			r = -EPERM;
> +			break;
> +		}

This needs to go with the introduction of the cap.  There is zero reason to create
a window where the kernel is vulnerable.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-12  0:54   ` Mingwei Zhang
@ 2022-04-12 18:56     ` Ben Gardon
  2022-04-12 19:02       ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Ben Gardon @ 2022-04-12 18:56 UTC (permalink / raw)
  To: Mingwei Zhang
  Cc: LKML, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	David Matlack, Jing Zhang

On Mon, Apr 11, 2022 at 5:55 PM Mingwei Zhang <mizhang@google.com> wrote:
>
> On Mon, Apr 11, 2022, Ben Gardon wrote:
> > Move the code to read the binary stats descriptors to the KVM selftests
> > library. It will be re-used by other tests to check KVM behavior.
> >
> > No functional change intended.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  .../selftests/kvm/include/kvm_util_base.h     |  4 +++
> >  .../selftests/kvm/kvm_binary_stats_test.c     |  9 ++----
> >  tools/testing/selftests/kvm/lib/kvm_util.c    | 29 +++++++++++++++++++
> >  3 files changed, 35 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > index 5ba3132f3110..c5f34551ff76 100644
> > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > @@ -401,6 +401,10 @@ void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
> >  int vm_get_stats_fd(struct kvm_vm *vm);
> >  int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
> >  void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header);
> > +struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> > +                                       struct kvm_stats_header *header);
> > +void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> > +                     struct kvm_stats_desc *stats_desc);
> >
> >  uint32_t guest_get_vcpuid(void);
> >
> > diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > index 22c22a90f15a..e4795bad7db6 100644
> > --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > @@ -62,14 +62,9 @@ static void stats_test(int stats_fd)
> >                                                       header.data_offset),
> >                       "Descriptor block is overlapped with data block");
> >
> > -     /* Allocate memory for stats descriptors */
> > -     stats_desc = calloc(header.num_desc, size_desc);
> > -     TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
> >       /* Read kvm stats descriptors */
> > -     ret = pread(stats_fd, stats_desc,
> > -                     size_desc * header.num_desc, header.desc_offset);
> > -     TEST_ASSERT(ret == size_desc * header.num_desc,
> > -                     "Read KVM stats descriptors");
> > +     stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> > +     read_vm_stats_desc(stats_fd, &header, stats_desc);
> >
> >       /* Sanity check for fields in descriptors */
> >       for (i = 0; i < header.num_desc; ++i) {
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index 0caf28e324ed..e3ae26fbef03 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -2564,3 +2564,32 @@ void read_vm_stats_header(int stats_fd, struct kvm_stats_header *header)
> >       ret = read(stats_fd, header, sizeof(*header));
> >       TEST_ASSERT(ret == sizeof(*header), "Read stats header");
> >  }
> > +
> > +static ssize_t stats_descs_size(struct kvm_stats_header *header)
> > +{
> > +     return header->num_desc *
> > +            (sizeof(struct kvm_stats_desc) + header->name_size);
> > +}
> I was very confused on header->name_size. So this field means the
> maximum string size of a stats name, right? Can we update the comments
> in the kvm.h to specify that? By reading the comments, I don't really
> feel this is how we should use this field.

I believe that's right. I agree the documentation on that was a little
confusing.

>
> hmm, if that is true, isn't this field a compile time value? Why do we
> have to get it at runtime?

It's compile time for the kernel but not for the userspace binaries
which ultimately consume the stats.
We could cheat in this selftest perhaps, but then it wouldn't be as
good of a test.

>
> > +
> > +/* Caller is responsible for freeing the returned kvm_stats_desc. */
> > +struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> > +                                       struct kvm_stats_header *header)
> > +{
> > +     struct kvm_stats_desc *stats_desc;
> > +
> > +     stats_desc = malloc(stats_descs_size(header));
> > +     TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
> > +
> > +     return stats_desc;
> > +}
> > +
> > +void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> > +                     struct kvm_stats_desc *stats_desc)
> > +{
> > +     ssize_t ret;
> > +
> > +     ret = pread(stats_fd, stats_desc, stats_descs_size(header),
> > +                 header->desc_offset);
> > +     TEST_ASSERT(ret == stats_descs_size(header),
> > +                 "Read KVM stats descriptors");
> > +}
> > --
> > 2.35.1.1178.g4f1659d476-goog
> >

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-12 18:56     ` Ben Gardon
@ 2022-04-12 19:02       ` Sean Christopherson
  2022-04-12 20:02         ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2022-04-12 19:02 UTC (permalink / raw)
  To: Ben Gardon
  Cc: Mingwei Zhang, LKML, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Jing Zhang

On Tue, Apr 12, 2022, Ben Gardon wrote:
> On Mon, Apr 11, 2022 at 5:55 PM Mingwei Zhang <mizhang@google.com> wrote:
> > I was very confused on header->name_size. So this field means the
> > maximum string size of a stats name, right? Can we update the comments
> > in the kvm.h to specify that? By reading the comments, I don't really
> > feel this is how we should use this field.
> 
> I believe that's right. I agree the documentation on that was a little
> confusing.

Heh, a little.  I got tripped up looking at this too.  Give me a few minutes and
I'll attach a cleanup patch to add comments and fix the myriad style issues.

This whole file is painful to look at.  Aside from violating preferred kernel
style, it's horribly consistent with itself.  Well, except for the 80 char limit,
to which it has a fanatical devotion.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 04/10] KVM: selftests: Read binary stat data in lib
  2022-04-11 22:14   ` David Matlack
@ 2022-04-12 19:58     ` Ben Gardon
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-12 19:58 UTC (permalink / raw)
  To: David Matlack
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 3:15 PM David Matlack <dmatlack@google.com> wrote:
>
> On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
> >
> > Move the code to read the binary stats data to the KVM selftests
> > library. It will be re-used by other tests to check KVM behavior.
> >
> > No functional change intended.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  .../selftests/kvm/include/kvm_util_base.h     |  3 +++
> >  .../selftests/kvm/kvm_binary_stats_test.c     | 20 +++++-------------
> >  tools/testing/selftests/kvm/lib/kvm_util.c    | 21 +++++++++++++++++++
> >  3 files changed, 29 insertions(+), 15 deletions(-)
> >
> > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > index c5f34551ff76..b2684cfc2cb1 100644
> > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > @@ -405,6 +405,9 @@ struct kvm_stats_desc *alloc_vm_stats_desc(int stats_fd,
> >                                           struct kvm_stats_header *header);
> >  void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> >                         struct kvm_stats_desc *stats_desc);
> > +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> > +                  struct kvm_stats_desc *desc, uint64_t *data,
> > +                  ssize_t max_elements);
> >
> >  uint32_t guest_get_vcpuid(void);
> >
> > diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > index e4795bad7db6..97b180249ba0 100644
> > --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> > @@ -20,6 +20,8 @@
> >  #include "asm/kvm.h"
> >  #include "linux/kvm.h"
> >
> > +#define STAT_MAX_ELEMENTS 1000
> > +
> >  static void stats_test(int stats_fd)
> >  {
> >         ssize_t ret;
> > @@ -29,7 +31,7 @@ static void stats_test(int stats_fd)
> >         struct kvm_stats_header header;
> >         char *id;
> >         struct kvm_stats_desc *stats_desc;
> > -       u64 *stats_data;
> > +       u64 stats_data[STAT_MAX_ELEMENTS];
>
> What is the benefit of changing stats_data to a stack allocation with
> a fixed limit?

There isn't really a benefit. Will remove.

>
> >         struct kvm_stats_desc *pdesc;
> >
> >         /* Read kvm stats header */
> > @@ -130,25 +132,13 @@ static void stats_test(int stats_fd)
> >                         pdesc->offset, pdesc->name);
> >         }
> >
> > -       /* Allocate memory for stats data */
> > -       stats_data = malloc(size_data);
> > -       TEST_ASSERT(stats_data, "Allocate memory for stats data");
> > -       /* Read kvm stats data as a bulk */
> > -       ret = pread(stats_fd, stats_data, size_data, header.data_offset);
> > -       TEST_ASSERT(ret == size_data, "Read KVM stats data");
> >         /* Read kvm stats data one by one */
> > -       size_data = 0;
> >         for (i = 0; i < header.num_desc; ++i) {
> >                 pdesc = (void *)stats_desc + i * size_desc;
> > -               ret = pread(stats_fd, stats_data,
> > -                               pdesc->size * sizeof(*stats_data),
> > -                               header.data_offset + size_data);
> > -               TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
> > -                               "Read data of KVM stats: %s", pdesc->name);
> > -               size_data += pdesc->size * sizeof(*stats_data);
> > +               read_stat_data(stats_fd, &header, pdesc, stats_data,
> > +                              ARRAY_SIZE(stats_data));
> >         }
> >
> > -       free(stats_data);
> >         free(stats_desc);
> >         free(id);
> >  }
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index e3ae26fbef03..64e2085f1129 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -2593,3 +2593,24 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> >         TEST_ASSERT(ret == stats_descs_size(header),
> >                     "Read KVM stats descriptors");
> >  }
> > +
> > +int read_stat_data(int stats_fd, struct kvm_stats_header *header,
>
> I would like to keep up the practice of adding docstrings to functions
> in kvm_util. Can you add docstring comments for this function and the
> other kvm_util functions introduced by this series?

Will do.

>
> > +                  struct kvm_stats_desc *desc, uint64_t *data,
> > +                  ssize_t max_elements)
> > +{
> > +       ssize_t ret;
> > +
> > +       TEST_ASSERT(desc->size <= max_elements,
> > +                   "Max data elements should be at least as large as stat data");
>
> What is the reason for this assertion? Callers are required to read
> all the data elements of a given stat?

Yeah, that was the idea, but it doesn't seem very useful. I'll remove it.

>
> > +
> > +       ret = pread(stats_fd, data, desc->size * sizeof(*data),
> > +                   header->data_offset + desc->offset);
> > +
> > +       /* ret from pread is in bytes. */
> > +       ret = ret / sizeof(*data);
> > +
> > +       TEST_ASSERT(ret == desc->size,
> > +                   "Read data of KVM stats: %s", desc->name);
>
> Won't this assertion fail when called from kvm_binary_stats_test.c?
> kvm_binary_stats_test.c looks like it reads all the stat data at once,
> which means ret will be the total number of stat data points, and
> desc->size will be the number of stat data points in the first stat.

Hmmm it shouldn't. I think we're just reading one stat at at time.

>
> > +
> > +       return ret;
> > +}
> > --
> > 2.35.1.1178.g4f1659d476-goog
> >

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-12 19:02       ` Sean Christopherson
@ 2022-04-12 20:02         ` Sean Christopherson
  2022-04-12 22:12           ` Ben Gardon
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2022-04-12 20:02 UTC (permalink / raw)
  To: Ben Gardon
  Cc: Mingwei Zhang, LKML, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Jing Zhang

On Tue, Apr 12, 2022, Sean Christopherson wrote:
> On Tue, Apr 12, 2022, Ben Gardon wrote:
> > On Mon, Apr 11, 2022 at 5:55 PM Mingwei Zhang <mizhang@google.com> wrote:
> > > I was very confused on header->name_size. So this field means the
> > > maximum string size of a stats name, right? Can we update the comments
> > > in the kvm.h to specify that? By reading the comments, I don't really
> > > feel this is how we should use this field.
> > 
> > I believe that's right. I agree the documentation on that was a little
> > confusing.
> 
> Heh, a little.  I got tripped up looking at this too.  Give me a few minutes and
> I'll attach a cleanup patch to add comments and fix the myriad style issues.
> 
> This whole file is painful to look at.  Aside from violating preferred kernel
> style, it's horribly consistent with itself.  Well, except for the 80 char limit,
> to which it has a fanatical devotion.

If you send a new version of this series, can you add this on top?

From: Sean Christopherson <seanjc@google.com>
Date: Tue, 12 Apr 2022 11:49:59 -0700
Subject: [PATCH] KVM: selftests: Clean up coding style in binary stats test

Fix a variety of code style violations and/or inconsistencies in the
binary stats test.  The 80 char limit is a soft limit and can and should
be ignored/violated if doing so improves the overall code readability.

Specifically, provide consistent indentation and don't split expressions
at arbitrary points just to honor the 80 char limit.

Opportunistically expand/add comments to call out the more subtle aspects
of the code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/kvm_binary_stats_test.c     | 91 ++++++++++++-------
 1 file changed, 56 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 97b180249ba0..944ed52e3f07 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -37,32 +37,42 @@ static void stats_test(int stats_fd)
 	/* Read kvm stats header */
 	read_vm_stats_header(stats_fd, &header);

+	/*
+	 * The base size of the descriptor is defined by KVM's ABI, but the
+	 * size of the name field is variable as far as KVM's ABI is concerned.
+	 * But, the size of name is constant for a given instance of KVM and
+	 * is provided by KVM in the overall stats header.
+	 */
 	size_desc = sizeof(*stats_desc) + header.name_size;

 	/* Read kvm stats id string */
 	id = malloc(header.name_size);
 	TEST_ASSERT(id, "Allocate memory for id string");
+
 	ret = read(stats_fd, id, header.name_size);
 	TEST_ASSERT(ret == header.name_size, "Read id string");

 	/* Check id string, that should start with "kvm" */
 	TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
-				"Invalid KVM stats type, id: %s", id);
+		    "Invalid KVM stats type, id: %s", id);

 	/* Sanity check for other fields in header */
 	if (header.num_desc == 0) {
 		printf("No KVM stats defined!");
 		return;
 	}
-	/* Check overlap */
-	TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
-			&& header.desc_offset >= sizeof(header)
-			&& header.data_offset >= sizeof(header),
-			"Invalid offset fields in header");
+	/*
+	 * The descriptor and data offsets must be valid, they must not overlap
+	 * the header, and the descriptor and data blocks must not overlap each
+	 * other.  Note, the data block is rechecked after its size is known.
+	 */
+	TEST_ASSERT(header.desc_offset && header.desc_offset >= sizeof(header) &&
+		    header.data_offset && header.data_offset >= sizeof(header),
+		    "Invalid offset fields in header");
+
 	TEST_ASSERT(header.desc_offset > header.data_offset ||
-			(header.desc_offset + size_desc * header.num_desc <=
-							header.data_offset),
-			"Descriptor block is overlapped with data block");
+		    (header.desc_offset + size_desc * header.num_desc <= header.data_offset),
+		    "Descriptor block is overlapped with data block");

 	/* Read kvm stats descriptors */
 	stats_desc = alloc_vm_stats_desc(stats_fd, &header);
@@ -70,15 +80,22 @@ static void stats_test(int stats_fd)

 	/* Sanity check for fields in descriptors */
 	for (i = 0; i < header.num_desc; ++i) {
+		/*
+		 * Note, size_desc includes the of the name field, which is
+		 * variable, i.e. this is NOT equivalent to &stats_desc[i].
+		 */
 		pdesc = (void *)stats_desc + i * size_desc;
-		/* Check type,unit,base boundaries */
-		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
-				<= KVM_STATS_TYPE_MAX, "Unknown KVM stats type");
-		TEST_ASSERT((pdesc->flags & KVM_STATS_UNIT_MASK)
-				<= KVM_STATS_UNIT_MAX, "Unknown KVM stats unit");
-		TEST_ASSERT((pdesc->flags & KVM_STATS_BASE_MASK)
-				<= KVM_STATS_BASE_MAX, "Unknown KVM stats base");
-		/* Check exponent for stats unit
+
+		/* Check type, unit, and base boundaries */
+		TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK) <= KVM_STATS_TYPE_MAX,
+			    "Unknown KVM stats type");
+		TEST_ASSERT((pdesc->flags & KVM_STATS_UNIT_MASK) <= KVM_STATS_UNIT_MAX,
+			    "Unknown KVM stats unit");
+		TEST_ASSERT((pdesc->flags & KVM_STATS_BASE_MASK) <= KVM_STATS_BASE_MAX,
+			    "Unknown KVM stats base");
+
+		/*
+		 * Check exponent for stats unit
 		 * Exponent for counter should be greater than or equal to 0
 		 * Exponent for unit bytes should be greater than or equal to 0
 		 * Exponent for unit seconds should be less than or equal to 0
@@ -89,47 +106,51 @@ static void stats_test(int stats_fd)
 		case KVM_STATS_UNIT_NONE:
 		case KVM_STATS_UNIT_BYTES:
 		case KVM_STATS_UNIT_CYCLES:
-			TEST_ASSERT(pdesc->exponent >= 0,
-					"Unsupported KVM stats unit");
+			TEST_ASSERT(pdesc->exponent >= 0, "Unsupported KVM stats unit");
 			break;
 		case KVM_STATS_UNIT_SECONDS:
-			TEST_ASSERT(pdesc->exponent <= 0,
-					"Unsupported KVM stats unit");
+			TEST_ASSERT(pdesc->exponent <= 0, "Unsupported KVM stats unit");
 			break;
 		}
 		/* Check name string */
 		TEST_ASSERT(strlen(pdesc->name) < header.name_size,
-				"KVM stats name(%s) too long", pdesc->name);
+			    "KVM stats name(%s) too long", pdesc->name);
 		/* Check size field, which should not be zero */
-		TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
-				pdesc->name);
+		TEST_ASSERT(pdesc->size,
+			    "KVM descriptor(%s) with size of 0", pdesc->name);
 		/* Check bucket_size field */
 		switch (pdesc->flags & KVM_STATS_TYPE_MASK) {
 		case KVM_STATS_TYPE_LINEAR_HIST:
 			TEST_ASSERT(pdesc->bucket_size,
-			    "Bucket size of Linear Histogram stats (%s) is zero",
-			    pdesc->name);
+				    "Bucket size of Linear Histogram stats (%s) is zero",
+				    pdesc->name);
 			break;
 		default:
 			TEST_ASSERT(!pdesc->bucket_size,
-			    "Bucket size of stats (%s) is not zero",
-			    pdesc->name);
+				    "Bucket size of stats (%s) is not zero",
+				    pdesc->name);
 		}
 		size_data += pdesc->size * sizeof(*stats_data);
 	}
-	/* Check overlap */
-	TEST_ASSERT(header.data_offset >= header.desc_offset
-		|| header.data_offset + size_data <= header.desc_offset,
-		"Data block is overlapped with Descriptor block");
+
+	/*
+	 * Now that the size of the data block is known, verify the data block
+	 * doesn't overlap the descriptor block.
+	 */
+	TEST_ASSERT(header.data_offset >= header.desc_offset ||
+		    header.data_offset + size_data <= header.desc_offset,
+		    "Data block is overlapped with Descriptor block");
+
 	/* Check validity of all stats data size */
 	TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
-			"Data size is not correct");
+		    "Data size is not correct");
+
 	/* Check stats offset */
 	for (i = 0; i < header.num_desc; ++i) {
 		pdesc = (void *)stats_desc + i * size_desc;
 		TEST_ASSERT(pdesc->offset < size_data,
-			"Invalid offset (%u) for stats: %s",
-			pdesc->offset, pdesc->name);
+			    "Invalid offset (%u) for stats: %s",
+			    pdesc->offset, pdesc->name);
 	}

 	/* Read kvm stats data one by one */

base-commit: f9955a6aaa037a7a8198a817b9d272efcf10961a
--


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 05/10] KVM: selftests: Add NX huge pages test
  2022-04-12  1:32   ` Mingwei Zhang
@ 2022-04-12 21:51     ` Ben Gardon
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-12 21:51 UTC (permalink / raw)
  To: Mingwei Zhang
  Cc: LKML, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	David Matlack, Jing Zhang

On Mon, Apr 11, 2022 at 6:32 PM Mingwei Zhang <mizhang@google.com> wrote:
>
> On Mon, Apr 11, 2022, Ben Gardon wrote:
> > There's currently no test coverage of NX hugepages in KVM selftests, so
> > add a basic test to ensure that the feature works as intended.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile          |  10 ++
> >  .../selftests/kvm/include/kvm_util_base.h     |   1 +
> >  tools/testing/selftests/kvm/lib/kvm_util.c    |  48 ++++++
> >  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 163 ++++++++++++++++++
> >  .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
> >  5 files changed, 247 insertions(+)
> >  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> >  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> >
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index af582d168621..9bb9bce4df37 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -43,6 +43,10 @@ LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handler
> >  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> >  LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
> >
> > +# Non-compiled test targets
> > +TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
> > +
> > +# Compiled test targets
> >  TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
> >  TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
> >  TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
> > @@ -104,6 +108,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
> >  TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
> >  TEST_GEN_PROGS_x86_64 += system_counter_offset_test
> >
> > +# Compiled outputs used by test targets
> > +TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
> > +
> >  TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
> >  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
> >  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
> > @@ -142,7 +149,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
> >  TEST_GEN_PROGS_riscv += set_memory_region_test
> >  TEST_GEN_PROGS_riscv += kvm_binary_stats_test
> >
> > +TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
> >  TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
> > +TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
> >  LIBKVM += $(LIBKVM_$(UNAME_M))
> >
> >  INSTALL_HDR_PATH = $(top_srcdir)/usr
> > @@ -193,6 +202,7 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
> >  x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
> >  all: $(STATIC_LIBS)
> >  $(TEST_GEN_PROGS): $(STATIC_LIBS)
> > +$(TEST_GEN_PROGS_EXTENDED): $(STATIC_LIBS)
> >
> >  cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
> >  cscope:
> > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > index b2684cfc2cb1..f9c2ac0a5b97 100644
> > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > @@ -408,6 +408,7 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> >  int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> >                  struct kvm_stats_desc *desc, uint64_t *data,
> >                  ssize_t max_elements);
> > +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
> >
> >  uint32_t guest_get_vcpuid(void);
> >
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index 64e2085f1129..833c7e63d62d 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -2614,3 +2614,51 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> >
> >       return ret;
> >  }
> > +
> > +static int vm_get_stat_data(struct kvm_vm *vm, const char *stat_name,
> > +                         uint64_t *data, ssize_t max_elements)
> > +{
> > +     struct kvm_stats_desc *stats_desc;
> > +     struct kvm_stats_header header;
> > +     struct kvm_stats_desc *desc;
> > +     size_t size_desc;
> > +     int stats_fd;
> > +     int ret = -EINVAL;
> > +     int i;
> > +
> > +     stats_fd = vm_get_stats_fd(vm);
> > +
> > +     read_vm_stats_header(stats_fd, &header);
> > +
> > +     stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> > +     read_vm_stats_desc(stats_fd, &header, stats_desc);
> > +
> > +     size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
> > +
> > +     /* Read kvm stats data one by one */
> > +     for (i = 0; i < header.num_desc; ++i) {
> > +             desc = (void *)stats_desc + (i * size_desc);
> > +
> > +             if (strcmp(desc->name, stat_name))
> > +                     continue;
> > +
> > +             ret = read_stat_data(stats_fd, &header, desc, data,
> > +                                  max_elements);
> > +     }
> > +
> > +     free(stats_desc);
> > +     close(stats_fd);
> > +     return ret;
> > +}
>
> I could be wrong, but it seems this function is quite generic. Why not
> putting it into kvm_util.c?

I agree it should go in kvm_util.c. If you look above, you'll see that
it's actually already in that file.

> > +
> > +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
> > +{
> > +     uint64_t data;
> > +     int ret;
> > +
> > +     ret = vm_get_stat_data(vm, stat_name, &data, 1);
> > +     TEST_ASSERT(ret == 1,
> > +                 "Stat %s expected to have 1 element, but %d returned",
> > +                 stat_name, ret);
> > +     return data;
> > +}
> > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > new file mode 100644
> > index 000000000000..3f21726b22c7
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > @@ -0,0 +1,163 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * tools/testing/selftests/kvm/nx_huge_page_test.c
> > + *
> > + * Usage: to be run via nx_huge_page_test.sh, which does the necessary
> > + * environment setup and teardown
> > + *
> > + * Copyright (C) 2022, Google LLC.
> > + */
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include <fcntl.h>
> > +#include <stdint.h>
> > +#include <time.h>
> > +
> > +#include <test_util.h>
> > +#include "kvm_util.h"
> > +
> > +#define HPAGE_SLOT           10
> > +#define HPAGE_GVA            (23*1024*1024)
> > +#define HPAGE_GPA            (10*1024*1024)
> > +#define HPAGE_SLOT_NPAGES    (512 * 3)
> > +#define PAGE_SIZE            4096
> > +
> > +/*
> > + * When writing to guest memory, write the opcode for the `ret` instruction so
> > + * that subsequent iteractions can exercise instruction fetch by calling the
> > + * memory.
> > + */
> > +#define RETURN_OPCODE 0xC3
> > +
> > +void guest_code(void)
> > +{
> > +     uint64_t hpage_1 = HPAGE_GVA;
> > +     uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
> > +     uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
> > +
> > +     READ_ONCE(*(uint64_t *)hpage_1);
> > +     GUEST_SYNC(1);
> > +
> > +     READ_ONCE(*(uint64_t *)hpage_2);
> > +     GUEST_SYNC(2);
> > +
> > +     ((void (*)(void)) hpage_1)();
> > +     GUEST_SYNC(3);
> > +
> > +     ((void (*)(void)) hpage_3)();
> > +     GUEST_SYNC(4);
> > +
> > +     READ_ONCE(*(uint64_t *)hpage_1);
> > +     GUEST_SYNC(5);
> > +
> > +     READ_ONCE(*(uint64_t *)hpage_3);
> > +     GUEST_SYNC(6);
> > +}
> > +
> > +static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
> > +{
> > +     int actual_pages_2m;
> > +
> > +     actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
> > +
> > +     TEST_ASSERT(actual_pages_2m == expected_pages_2m,
> > +                 "Unexpected 2m page count. Expected %d, got %d",
> > +                 expected_pages_2m, actual_pages_2m);
> > +}
> > +
> > +static void check_split_count(struct kvm_vm *vm, int expected_splits)
> > +{
> > +     int actual_splits;
> > +
> > +     actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
> > +
> > +     TEST_ASSERT(actual_splits == expected_splits,
> > +                 "Unexpected nx lpage split count. Expected %d, got %d",
> > +                 expected_splits, actual_splits);
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > +     struct kvm_vm *vm;
> > +     struct timespec ts;
> > +     void *hva;
> > +
> > +     vm = vm_create_default(0, 0, guest_code);
> > +
> > +     vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> > +                                 HPAGE_GPA, HPAGE_SLOT,
> > +                                 HPAGE_SLOT_NPAGES, 0);
> > +
> > +     virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
> > +
> > +     hva = addr_gpa2hva(vm, HPAGE_GPA);
> > +     memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
> > +
> > +     check_2m_page_count(vm, 0);
> > +     check_split_count(vm, 0);
> > +
> > +     /*
> > +      * The guest code will first read from the first hugepage, resulting
> > +      * in a huge page mapping being created.
> > +      */
> > +     vcpu_run(vm, 0);
> > +     check_2m_page_count(vm, 1);
> > +     check_split_count(vm, 0);
> > +
> > +     /*
> > +      * Then the guest code will read from the second hugepage, resulting
> > +      * in another huge page mapping being created.
> > +      */
> > +     vcpu_run(vm, 0);
> > +     check_2m_page_count(vm, 2);
> > +     check_split_count(vm, 0);
> > +
> > +     /*
> > +      * Next, the guest will execute from the first huge page, causing it
> > +      * to be remapped at 4k.
> > +      */
> > +     vcpu_run(vm, 0);
> > +     check_2m_page_count(vm, 1);
> > +     check_split_count(vm, 1);
> > +
> > +     /*
> > +      * Executing from the third huge page (previously unaccessed) will
> > +      * cause part to be mapped at 4k.
> > +      */
> > +     vcpu_run(vm, 0);
> > +     check_2m_page_count(vm, 1);
> > +     check_split_count(vm, 2);
> > +
> > +     /* Reading from the first huge page again should have no effect. */
> > +     vcpu_run(vm, 0);
> > +     check_2m_page_count(vm, 1);
> > +     check_split_count(vm, 2);
> > +
> > +     /*
> > +      * Give recovery thread time to run. The wrapper script sets
> > +      * recovery_period_ms to 100, so wait 5x that.
> > +      */
> > +     ts.tv_sec = 0;
> > +     ts.tv_nsec = 500000000;
> > +     nanosleep(&ts, NULL);
> > +
> > +     /*
> > +      * Now that the reclaimer has run, all the split pages should be gone.
> > +      */
> > +     check_2m_page_count(vm, 1);
> > +     check_split_count(vm, 0);
> > +
> > +     /*
> > +      * The 4k mapping on hpage 3 should have been removed, so check that
> > +      * reading from it causes a huge page mapping to be installed.
> > +      */
> > +     vcpu_run(vm, 0);
> > +     check_2m_page_count(vm, 2);
> > +     check_split_count(vm, 0);
> > +
> > +     kvm_vm_free(vm);
> > +
> > +     return 0;
> > +}
> > +
> > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > new file mode 100755
> > index 000000000000..19fc95723fcb
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > @@ -0,0 +1,25 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +# tools/testing/selftests/kvm/nx_huge_page_test.sh
> > +# Copyright (C) 2022, Google LLC.
> > +
> > +NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
> > +NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> > +NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> > +HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
> > +
> > +echo 1 > /sys/module/kvm/parameters/nx_huge_pages
> > +echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> > +echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> > +echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > +
> > +./nx_huge_pages_test
> > +RET=$?
> > +
> > +echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> > +echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> > +echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> > +echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > +
> > +exit $RET
> > --
> > 2.35.1.1178.g4f1659d476-goog
> >

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 05/10] KVM: selftests: Add NX huge pages test
  2022-04-11 22:27   ` David Matlack
@ 2022-04-12 22:11     ` Ben Gardon
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-12 22:11 UTC (permalink / raw)
  To: David Matlack
  Cc: LKML, kvm list, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, David Dunn, Junaid Shahid, Jim Mattson,
	Mingwei Zhang, Jing Zhang

On Mon, Apr 11, 2022 at 3:28 PM David Matlack <dmatlack@google.com> wrote:
>
> On Mon, Apr 11, 2022 at 2:10 PM Ben Gardon <bgardon@google.com> wrote:
> >
> > There's currently no test coverage of NX hugepages in KVM selftests, so
> > add a basic test to ensure that the feature works as intended.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile          |  10 ++
> >  .../selftests/kvm/include/kvm_util_base.h     |   1 +
> >  tools/testing/selftests/kvm/lib/kvm_util.c    |  48 ++++++
> >  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 163 ++++++++++++++++++
> >  .../kvm/x86_64/nx_huge_pages_test.sh          |  25 +++
> >  5 files changed, 247 insertions(+)
> >  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> >  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> >
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index af582d168621..9bb9bce4df37 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -43,6 +43,10 @@ LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handler
> >  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> >  LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
> >
> > +# Non-compiled test targets
> > +TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
> > +
> > +# Compiled test targets
> >  TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
> >  TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
> >  TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
> > @@ -104,6 +108,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
> >  TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
> >  TEST_GEN_PROGS_x86_64 += system_counter_offset_test
> >
> > +# Compiled outputs used by test targets
> > +TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
> > +
> >  TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
> >  TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
> >  TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
> > @@ -142,7 +149,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
> >  TEST_GEN_PROGS_riscv += set_memory_region_test
> >  TEST_GEN_PROGS_riscv += kvm_binary_stats_test
> >
> > +TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
> >  TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
> > +TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
> >  LIBKVM += $(LIBKVM_$(UNAME_M))
> >
> >  INSTALL_HDR_PATH = $(top_srcdir)/usr
> > @@ -193,6 +202,7 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
> >  x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
> >  all: $(STATIC_LIBS)
> >  $(TEST_GEN_PROGS): $(STATIC_LIBS)
> > +$(TEST_GEN_PROGS_EXTENDED): $(STATIC_LIBS)
> >
> >  cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
> >  cscope:
> > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > index b2684cfc2cb1..f9c2ac0a5b97 100644
> > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
> > @@ -408,6 +408,7 @@ void read_vm_stats_desc(int stats_fd, struct kvm_stats_header *header,
> >  int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> >                    struct kvm_stats_desc *desc, uint64_t *data,
> >                    ssize_t max_elements);
> > +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
> >
> >  uint32_t guest_get_vcpuid(void);
> >
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index 64e2085f1129..833c7e63d62d 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -2614,3 +2614,51 @@ int read_stat_data(int stats_fd, struct kvm_stats_header *header,
> >
> >         return ret;
> >  }
> > +
> > +static int vm_get_stat_data(struct kvm_vm *vm, const char *stat_name,
> > +                           uint64_t *data, ssize_t max_elements)
> > +{
> > +       struct kvm_stats_desc *stats_desc;
> > +       struct kvm_stats_header header;
> > +       struct kvm_stats_desc *desc;
> > +       size_t size_desc;
> > +       int stats_fd;
> > +       int ret = -EINVAL;
> > +       int i;
> > +
> > +       stats_fd = vm_get_stats_fd(vm);
> > +
> > +       read_vm_stats_header(stats_fd, &header);
> > +
> > +       stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> > +       read_vm_stats_desc(stats_fd, &header, stats_desc);
>
> This is a fair bit of redundant work to do when reading every stat.
> Reading stats in selftests is probably not going to be
> performance-senstive, but it should be pretty easy to move everything
> above to VM initialization and storing the outputs in struct kvm_vm
> for access during this function.
>

That's true, but for now I'm just going to leave this as-is. If we
have a case where it is performance sensitive, we can look at
optimizing the stats collection for that case.

> > +
> > +       size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
> > +
> > +       /* Read kvm stats data one by one */
> > +       for (i = 0; i < header.num_desc; ++i) {
> > +               desc = (void *)stats_desc + (i * size_desc);
> > +
> > +               if (strcmp(desc->name, stat_name))
> > +                       continue;
> > +
> > +               ret = read_stat_data(stats_fd, &header, desc, data,
> > +                                    max_elements);
> > +       }
> > +
> > +       free(stats_desc);
> > +       close(stats_fd);
> > +       return ret;
> > +}
> > +
> > +uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
>
> nit: I'd prefer the simpler "vm_get_stat()". The function signature
> already makes it clear we're reading one stat value. And when we add
> more support for more complicated stats (e.g.
> vm_get_histogram_stat()), I think "vm_get_stat()" will still work for
> reading single value stats.

Will do.

>
> > +{
> > +       uint64_t data;
> > +       int ret;
> > +
> > +       ret = vm_get_stat_data(vm, stat_name, &data, 1);
> > +       TEST_ASSERT(ret == 1,
> > +                   "Stat %s expected to have 1 element, but %d returned",
> > +                   stat_name, ret);
> > +       return data;
> > +}
> > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > new file mode 100644
> > index 000000000000..3f21726b22c7
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > @@ -0,0 +1,163 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * tools/testing/selftests/kvm/nx_huge_page_test.c
> > + *
> > + * Usage: to be run via nx_huge_page_test.sh, which does the necessary
> > + * environment setup and teardown
> > + *
> > + * Copyright (C) 2022, Google LLC.
> > + */
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include <fcntl.h>
> > +#include <stdint.h>
> > +#include <time.h>
> > +
> > +#include <test_util.h>
> > +#include "kvm_util.h"
> > +
> > +#define HPAGE_SLOT             10
> > +#define HPAGE_GVA              (23*1024*1024)
> > +#define HPAGE_GPA              (10*1024*1024)
> > +#define HPAGE_SLOT_NPAGES      (512 * 3)
> > +#define PAGE_SIZE              4096
> > +
> > +/*
> > + * When writing to guest memory, write the opcode for the `ret` instruction so
> > + * that subsequent iteractions can exercise instruction fetch by calling the
> > + * memory.
>
> I think this comment needs to be reworded to better fit this test.
>

Woops, will do.

>
> > + */
> > +#define RETURN_OPCODE 0xC3
> > +
> > +void guest_code(void)
> > +{
> > +       uint64_t hpage_1 = HPAGE_GVA;
> > +       uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
> > +       uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
> > +
> > +       READ_ONCE(*(uint64_t *)hpage_1);
> > +       GUEST_SYNC(1);
> > +
> > +       READ_ONCE(*(uint64_t *)hpage_2);
> > +       GUEST_SYNC(2);
> > +
> > +       ((void (*)(void)) hpage_1)();
> > +       GUEST_SYNC(3);
> > +
> > +       ((void (*)(void)) hpage_3)();
> > +       GUEST_SYNC(4);
> > +
> > +       READ_ONCE(*(uint64_t *)hpage_1);
> > +       GUEST_SYNC(5);
> > +
> > +       READ_ONCE(*(uint64_t *)hpage_3);
> > +       GUEST_SYNC(6);
> > +}
> > +
> > +static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
> > +{
> > +       int actual_pages_2m;
> > +
> > +       actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
> > +
> > +       TEST_ASSERT(actual_pages_2m == expected_pages_2m,
> > +                   "Unexpected 2m page count. Expected %d, got %d",
> > +                   expected_pages_2m, actual_pages_2m);
> > +}
> > +
> > +static void check_split_count(struct kvm_vm *vm, int expected_splits)
> > +{
> > +       int actual_splits;
> > +
> > +       actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
> > +
> > +       TEST_ASSERT(actual_splits == expected_splits,
> > +                   "Unexpected nx lpage split count. Expected %d, got %d",
> > +                   expected_splits, actual_splits);
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > +       struct kvm_vm *vm;
> > +       struct timespec ts;
> > +       void *hva;
> > +
> > +       vm = vm_create_default(0, 0, guest_code);
> > +
> > +       vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> > +                                   HPAGE_GPA, HPAGE_SLOT,
> > +                                   HPAGE_SLOT_NPAGES, 0);
> > +
> > +       virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
> > +
> > +       hva = addr_gpa2hva(vm, HPAGE_GPA);
> > +       memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
> > +
> > +       check_2m_page_count(vm, 0);
> > +       check_split_count(vm, 0);
> > +
> > +       /*
> > +        * The guest code will first read from the first hugepage, resulting
> > +        * in a huge page mapping being created.
> > +        */
> > +       vcpu_run(vm, 0);
> > +       check_2m_page_count(vm, 1);
> > +       check_split_count(vm, 0);
> > +
> > +       /*
> > +        * Then the guest code will read from the second hugepage, resulting
> > +        * in another huge page mapping being created.
> > +        */
> > +       vcpu_run(vm, 0);
> > +       check_2m_page_count(vm, 2);
> > +       check_split_count(vm, 0);
> > +
> > +       /*
> > +        * Next, the guest will execute from the first huge page, causing it
> > +        * to be remapped at 4k.
> > +        */
> > +       vcpu_run(vm, 0);
> > +       check_2m_page_count(vm, 1);
> > +       check_split_count(vm, 1);
> > +
> > +       /*
> > +        * Executing from the third huge page (previously unaccessed) will
> > +        * cause part to be mapped at 4k.
> > +        */
> > +       vcpu_run(vm, 0);
> > +       check_2m_page_count(vm, 1);
> > +       check_split_count(vm, 2);
> > +
> > +       /* Reading from the first huge page again should have no effect. */
> > +       vcpu_run(vm, 0);
> > +       check_2m_page_count(vm, 1);
> > +       check_split_count(vm, 2);
> > +
> > +       /*
> > +        * Give recovery thread time to run. The wrapper script sets
> > +        * recovery_period_ms to 100, so wait 5x that.
> > +        */
> > +       ts.tv_sec = 0;
> > +       ts.tv_nsec = 500000000;
> > +       nanosleep(&ts, NULL);
> > +
> > +       /*
> > +        * Now that the reclaimer has run, all the split pages should be gone.
> > +        */
> > +       check_2m_page_count(vm, 1);
> > +       check_split_count(vm, 0);
> > +
> > +       /*
> > +        * The 4k mapping on hpage 3 should have been removed, so check that
> > +        * reading from it causes a huge page mapping to be installed.
> > +        */
> > +       vcpu_run(vm, 0);
> > +       check_2m_page_count(vm, 2);
> > +       check_split_count(vm, 0);
> > +
> > +       kvm_vm_free(vm);
> > +
> > +       return 0;
> > +}
> > +
> > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > new file mode 100755
> > index 000000000000..19fc95723fcb
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > @@ -0,0 +1,25 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +# tools/testing/selftests/kvm/nx_huge_page_test.sh
> > +# Copyright (C) 2022, Google LLC.
> > +
> > +NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
> > +NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> > +NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> > +HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
> > +
> > +echo 1 > /sys/module/kvm/parameters/nx_huge_pages
> > +echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> > +echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> > +echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > +
> > +./nx_huge_pages_test
> > +RET=$?
> > +
> > +echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> > +echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> > +echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> > +echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > +
> > +exit $RET
> > --
> > 2.35.1.1178.g4f1659d476-goog
> >

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 03/10] KVM: selftests: Read binary stats desc in lib
  2022-04-12 20:02         ` Sean Christopherson
@ 2022-04-12 22:12           ` Ben Gardon
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Gardon @ 2022-04-12 22:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Mingwei Zhang, LKML, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Dunn, Junaid Shahid, Jim Mattson, David Matlack,
	Jing Zhang

On Tue, Apr 12, 2022 at 1:03 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Apr 12, 2022, Sean Christopherson wrote:
> > On Tue, Apr 12, 2022, Ben Gardon wrote:
> > > On Mon, Apr 11, 2022 at 5:55 PM Mingwei Zhang <mizhang@google.com> wrote:
> > > > I was very confused on header->name_size. So this field means the
> > > > maximum string size of a stats name, right? Can we update the comments
> > > > in the kvm.h to specify that? By reading the comments, I don't really
> > > > feel this is how we should use this field.
> > >
> > > I believe that's right. I agree the documentation on that was a little
> > > confusing.
> >
> > Heh, a little.  I got tripped up looking at this too.  Give me a few minutes and
> > I'll attach a cleanup patch to add comments and fix the myriad style issues.
> >
> > This whole file is painful to look at.  Aside from violating preferred kernel
> > style, it's horribly consistent with itself.  Well, except for the 80 char limit,
> > to which it has a fanatical devotion.
>
> If you send a new version of this series, can you add this on top?

Will do.

>
> From: Sean Christopherson <seanjc@google.com>
> Date: Tue, 12 Apr 2022 11:49:59 -0700
> Subject: [PATCH] KVM: selftests: Clean up coding style in binary stats test
>
> Fix a variety of code style violations and/or inconsistencies in the
> binary stats test.  The 80 char limit is a soft limit and can and should
> be ignored/violated if doing so improves the overall code readability.
>
> Specifically, provide consistent indentation and don't split expressions
> at arbitrary points just to honor the 80 char limit.
>
> Opportunistically expand/add comments to call out the more subtle aspects
> of the code.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  .../selftests/kvm/kvm_binary_stats_test.c     | 91 ++++++++++++-------
>  1 file changed, 56 insertions(+), 35 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> index 97b180249ba0..944ed52e3f07 100644
> --- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> +++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
> @@ -37,32 +37,42 @@ static void stats_test(int stats_fd)
>         /* Read kvm stats header */
>         read_vm_stats_header(stats_fd, &header);
>
> +       /*
> +        * The base size of the descriptor is defined by KVM's ABI, but the
> +        * size of the name field is variable as far as KVM's ABI is concerned.
> +        * But, the size of name is constant for a given instance of KVM and
> +        * is provided by KVM in the overall stats header.
> +        */
>         size_desc = sizeof(*stats_desc) + header.name_size;
>
>         /* Read kvm stats id string */
>         id = malloc(header.name_size);
>         TEST_ASSERT(id, "Allocate memory for id string");
> +
>         ret = read(stats_fd, id, header.name_size);
>         TEST_ASSERT(ret == header.name_size, "Read id string");
>
>         /* Check id string, that should start with "kvm" */
>         TEST_ASSERT(!strncmp(id, "kvm", 3) && strlen(id) < header.name_size,
> -                               "Invalid KVM stats type, id: %s", id);
> +                   "Invalid KVM stats type, id: %s", id);
>
>         /* Sanity check for other fields in header */
>         if (header.num_desc == 0) {
>                 printf("No KVM stats defined!");
>                 return;
>         }
> -       /* Check overlap */
> -       TEST_ASSERT(header.desc_offset > 0 && header.data_offset > 0
> -                       && header.desc_offset >= sizeof(header)
> -                       && header.data_offset >= sizeof(header),
> -                       "Invalid offset fields in header");
> +       /*
> +        * The descriptor and data offsets must be valid, they must not overlap
> +        * the header, and the descriptor and data blocks must not overlap each
> +        * other.  Note, the data block is rechecked after its size is known.
> +        */
> +       TEST_ASSERT(header.desc_offset && header.desc_offset >= sizeof(header) &&
> +                   header.data_offset && header.data_offset >= sizeof(header),
> +                   "Invalid offset fields in header");
> +
>         TEST_ASSERT(header.desc_offset > header.data_offset ||
> -                       (header.desc_offset + size_desc * header.num_desc <=
> -                                                       header.data_offset),
> -                       "Descriptor block is overlapped with data block");
> +                   (header.desc_offset + size_desc * header.num_desc <= header.data_offset),
> +                   "Descriptor block is overlapped with data block");
>
>         /* Read kvm stats descriptors */
>         stats_desc = alloc_vm_stats_desc(stats_fd, &header);
> @@ -70,15 +80,22 @@ static void stats_test(int stats_fd)
>
>         /* Sanity check for fields in descriptors */
>         for (i = 0; i < header.num_desc; ++i) {
> +               /*
> +                * Note, size_desc includes the of the name field, which is
> +                * variable, i.e. this is NOT equivalent to &stats_desc[i].
> +                */
>                 pdesc = (void *)stats_desc + i * size_desc;
> -               /* Check type,unit,base boundaries */
> -               TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
> -                               <= KVM_STATS_TYPE_MAX, "Unknown KVM stats type");
> -               TEST_ASSERT((pdesc->flags & KVM_STATS_UNIT_MASK)
> -                               <= KVM_STATS_UNIT_MAX, "Unknown KVM stats unit");
> -               TEST_ASSERT((pdesc->flags & KVM_STATS_BASE_MASK)
> -                               <= KVM_STATS_BASE_MAX, "Unknown KVM stats base");
> -               /* Check exponent for stats unit
> +
> +               /* Check type, unit, and base boundaries */
> +               TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK) <= KVM_STATS_TYPE_MAX,
> +                           "Unknown KVM stats type");
> +               TEST_ASSERT((pdesc->flags & KVM_STATS_UNIT_MASK) <= KVM_STATS_UNIT_MAX,
> +                           "Unknown KVM stats unit");
> +               TEST_ASSERT((pdesc->flags & KVM_STATS_BASE_MASK) <= KVM_STATS_BASE_MAX,
> +                           "Unknown KVM stats base");
> +
> +               /*
> +                * Check exponent for stats unit
>                  * Exponent for counter should be greater than or equal to 0
>                  * Exponent for unit bytes should be greater than or equal to 0
>                  * Exponent for unit seconds should be less than or equal to 0
> @@ -89,47 +106,51 @@ static void stats_test(int stats_fd)
>                 case KVM_STATS_UNIT_NONE:
>                 case KVM_STATS_UNIT_BYTES:
>                 case KVM_STATS_UNIT_CYCLES:
> -                       TEST_ASSERT(pdesc->exponent >= 0,
> -                                       "Unsupported KVM stats unit");
> +                       TEST_ASSERT(pdesc->exponent >= 0, "Unsupported KVM stats unit");
>                         break;
>                 case KVM_STATS_UNIT_SECONDS:
> -                       TEST_ASSERT(pdesc->exponent <= 0,
> -                                       "Unsupported KVM stats unit");
> +                       TEST_ASSERT(pdesc->exponent <= 0, "Unsupported KVM stats unit");
>                         break;
>                 }
>                 /* Check name string */
>                 TEST_ASSERT(strlen(pdesc->name) < header.name_size,
> -                               "KVM stats name(%s) too long", pdesc->name);
> +                           "KVM stats name(%s) too long", pdesc->name);
>                 /* Check size field, which should not be zero */
> -               TEST_ASSERT(pdesc->size, "KVM descriptor(%s) with size of 0",
> -                               pdesc->name);
> +               TEST_ASSERT(pdesc->size,
> +                           "KVM descriptor(%s) with size of 0", pdesc->name);
>                 /* Check bucket_size field */
>                 switch (pdesc->flags & KVM_STATS_TYPE_MASK) {
>                 case KVM_STATS_TYPE_LINEAR_HIST:
>                         TEST_ASSERT(pdesc->bucket_size,
> -                           "Bucket size of Linear Histogram stats (%s) is zero",
> -                           pdesc->name);
> +                                   "Bucket size of Linear Histogram stats (%s) is zero",
> +                                   pdesc->name);
>                         break;
>                 default:
>                         TEST_ASSERT(!pdesc->bucket_size,
> -                           "Bucket size of stats (%s) is not zero",
> -                           pdesc->name);
> +                                   "Bucket size of stats (%s) is not zero",
> +                                   pdesc->name);
>                 }
>                 size_data += pdesc->size * sizeof(*stats_data);
>         }
> -       /* Check overlap */
> -       TEST_ASSERT(header.data_offset >= header.desc_offset
> -               || header.data_offset + size_data <= header.desc_offset,
> -               "Data block is overlapped with Descriptor block");
> +
> +       /*
> +        * Now that the size of the data block is known, verify the data block
> +        * doesn't overlap the descriptor block.
> +        */
> +       TEST_ASSERT(header.data_offset >= header.desc_offset ||
> +                   header.data_offset + size_data <= header.desc_offset,
> +                   "Data block is overlapped with Descriptor block");
> +
>         /* Check validity of all stats data size */
>         TEST_ASSERT(size_data >= header.num_desc * sizeof(*stats_data),
> -                       "Data size is not correct");
> +                   "Data size is not correct");
> +
>         /* Check stats offset */
>         for (i = 0; i < header.num_desc; ++i) {
>                 pdesc = (void *)stats_desc + i * size_desc;
>                 TEST_ASSERT(pdesc->offset < size_data,
> -                       "Invalid offset (%u) for stats: %s",
> -                       pdesc->offset, pdesc->name);
> +                           "Invalid offset (%u) for stats: %s",
> +                           pdesc->offset, pdesc->name);
>         }
>
>         /* Read kvm stats data one by one */
>
> base-commit: f9955a6aaa037a7a8198a817b9d272efcf10961a
> --
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-04-12 23:34 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-11 21:10 [PATCH v4 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
2022-04-11 21:10 ` [PATCH v4 01/10] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
2022-04-11 21:52   ` David Matlack
2022-04-11 22:50     ` Mingwei Zhang
2022-04-11 21:10 ` [PATCH v4 02/10] KVM: selftests: Read binary stats header in lib Ben Gardon
2022-04-11 21:55   ` David Matlack
2022-04-11 21:10 ` [PATCH v4 03/10] KVM: selftests: Read binary stats desc " Ben Gardon
2022-04-11 22:01   ` David Matlack
2022-04-12  0:54   ` Mingwei Zhang
2022-04-12 18:56     ` Ben Gardon
2022-04-12 19:02       ` Sean Christopherson
2022-04-12 20:02         ` Sean Christopherson
2022-04-12 22:12           ` Ben Gardon
2022-04-11 21:10 ` [PATCH v4 04/10] KVM: selftests: Read binary stat data " Ben Gardon
2022-04-11 22:14   ` David Matlack
2022-04-12 19:58     ` Ben Gardon
2022-04-12  1:25   ` Mingwei Zhang
2022-04-11 21:10 ` [PATCH v4 05/10] KVM: selftests: Add NX huge pages test Ben Gardon
2022-04-11 22:27   ` David Matlack
2022-04-12 22:11     ` Ben Gardon
2022-04-12  1:32   ` Mingwei Zhang
2022-04-12 21:51     ` Ben Gardon
2022-04-11 21:10 ` [PATCH v4 06/10] KVM: x86/MMU: Factor out updating NX hugepages state for a VM Ben Gardon
2022-04-11 21:10 ` [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
2022-04-12 17:54   ` Sean Christopherson
2022-04-11 21:10 ` [PATCH v4 08/10] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
2022-04-11 21:10 ` [PATCH v4 09/10] KVM: x86/MMU: Require reboot permission to disable NX hugepages Ben Gardon
2022-04-12 18:08   ` Sean Christopherson
2022-04-11 21:10 ` [PATCH v4 10/10] KVM: selftests: Test disabling NX hugepages on a VM Ben Gardon
2022-04-11 22:37   ` David Matlack
2022-04-11 21:15 ` [PATCH v4 00/10] KVM: x86: Add a cap to disable " Ben Gardon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.