All of lore.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest
@ 2021-11-10 21:19 Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 01/14] x86: cleanup handling of 16-byte GDT descriptors Aaron Lewis
                   ` (14 more replies)
  0 siblings, 15 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

The motivation behind this change is to test the routing logic when an
exception occurs in an L2 guest and ensure the exception goes to the
correct place.  For example, if an exception occurs in L2, does L1 want
to get involved, or L0, or do niether of them care about it and leave
it to L2 to handle.  Test that the exception doesn't end up going to L1
When L1 didn't ask for it.  This was occurring before commit 18712c13709d
("KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF")
fixed the issue.  Without that fix, running
vmx_pf_exception_test_reduced_maxphyaddr with allow_smaller_maxphyaddr=Y
would have resulted in the test failing with the following error:

x86/vmx_tests.c:10698: assert failed: false: Unexpected exit to L1,
exit_reason: VMX_EXC_NMI (0x0)

This series only tests the routing logic for #PFs.  A future
series will address other exceptions, however, getting #PF testing in
place is a big enough chunk that the other exceptions will be submitted
seperately (in a future series).

This series is dependant on Paolo's changes (inlcuded). Without them,
running ac_test_run() on one of the userspace test fails.  Of note:  the
commit ("x86: get rid of ring0stacktop") has been updated to include a fix
for a compiler error to get it building on clang.

This series is also dependant on the commit ("x86: Look up the PTEs rather
than assuming them").  This was sent out for review seperately, however,
it is needed to get ac_test_run() running on a different cr3 than the one
access_test runs on, so it is included here as well.  This is also v2 of
that commit.  While preparing this series a review came in, so I just
included the changes here.

Paolo Bonzini (9):
  x86: cleanup handling of 16-byte GDT descriptors
  x86: fix call to set_gdt_entry
  unify field names and definitions for GDT descriptors
  replace tss_descr global with a function
  x86: Move IDT to desc.c
  x86: unify name of 32-bit and 64-bit GDT
  x86: get rid of ring0stacktop
  x86: Move 64-bit GDT and TSS to desc.c
  x86: Move 32-bit GDT and TSS to desc.c

Aaron Lewis (5):
  x86: Look up the PTEs rather than assuming them (v2)
  x86: Prepare access test for running in L2
  x86: Fix tabs in access.c
  x86: Clean up the global, page_table_levels, in access.c
  x86: Add tests to run ac_test_run() in an L2 guest

 lib/libcflat.h         |    1 +
 lib/x86/asm/setup.h    |    6 +
 lib/x86/desc.c         |  116 +++-
 lib/x86/desc.h         |   31 +-
 lib/x86/setup.c        |   49 ++
 lib/x86/usermode.c     |    9 +-
 lib/x86/vm.c           |   21 +
 lib/x86/vm.h           |    3 +
 x86/Makefile.common    |    4 +
 x86/Makefile.x86_64    |    2 +-
 x86/access.c           | 1447 ++++++++++++++++++++--------------------
 x86/access.h           |    9 +
 x86/access_test.c      |   20 +
 x86/cstart.S           |  115 +---
 x86/cstart64.S         |   98 +--
 x86/flat.lds           |    1 +
 x86/smap.c             |    2 +-
 x86/svm_tests.c        |   15 +-
 x86/taskswitch.c       |    4 +-
 x86/umip.c             |   19 +-
 x86/unittests.cfg      |   17 +-
 x86/vmware_backdoors.c |   22 +-
 x86/vmx.c              |   17 +-
 x86/vmx_tests.c        |   53 +-
 24 files changed, 1080 insertions(+), 1001 deletions(-)
 create mode 100644 lib/x86/asm/setup.h
 create mode 100644 x86/access.h
 create mode 100644 x86/access_test.c

-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 01/14] x86: cleanup handling of 16-byte GDT descriptors
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 02/14] x86: fix call to set_gdt_entry Aaron Lewis
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Look them up using a gdt_entry_t pointer, so that the address of
the descriptor is correct even for "odd" selectors (e.g. 0x98).
Rename the struct from segment_desc64 to system_desc64,
highlighting that it is only used in the case of S=0 (system
descriptor).  Rename the "limit" bitfield to "limit2", matching
the convention used for the various parts of the base field.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/desc.h         |  4 ++--
 x86/svm_tests.c        | 12 ++++++------
 x86/vmware_backdoors.c |  8 ++++----
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/x86/desc.h b/lib/x86/desc.h
index a6ffb38..1755486 100644
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -172,7 +172,7 @@ typedef struct {
 	u8 base_high;
 } gdt_entry_t;
 
-struct segment_desc64 {
+struct system_desc64 {
 	uint16_t limit1;
 	uint16_t base1;
 	uint8_t  base2;
@@ -183,7 +183,7 @@ struct segment_desc64 {
 			uint16_t s:1;
 			uint16_t dpl:2;
 			uint16_t p:1;
-			uint16_t limit:4;
+			uint16_t limit2:4;
 			uint16_t avl:1;
 			uint16_t l:1;
 			uint16_t db:1;
diff --git a/x86/svm_tests.c b/x86/svm_tests.c
index afdd359..2fdb0dc 100644
--- a/x86/svm_tests.c
+++ b/x86/svm_tests.c
@@ -1876,22 +1876,22 @@ static bool reg_corruption_check(struct svm_test *test)
 static void get_tss_entry(void *data)
 {
     struct descriptor_table_ptr gdt;
-    struct segment_desc64 *gdt_table;
-    struct segment_desc64 *tss_entry;
+    gdt_entry_t *gdt_table;
+    struct system_desc64 *tss_entry;
     u16 tr = 0;
 
     sgdt(&gdt);
     tr = str();
-    gdt_table = (struct segment_desc64 *) gdt.base;
-    tss_entry = &gdt_table[tr / sizeof(struct segment_desc64)];
-    *((struct segment_desc64 **)data) = tss_entry;
+    gdt_table = (gdt_entry_t *) gdt.base;
+    tss_entry = (struct system_desc64 *) &gdt_table[tr / 8];
+    *((struct system_desc64 **)data) = tss_entry;
 }
 
 static int orig_cpu_count;
 
 static void init_startup_prepare(struct svm_test *test)
 {
-    struct segment_desc64 *tss_entry;
+    struct system_desc64 *tss_entry;
     int i;
 
     on_cpu(1, get_tss_entry, &tss_entry);
diff --git a/x86/vmware_backdoors.c b/x86/vmware_backdoors.c
index b4902a9..b1433cd 100644
--- a/x86/vmware_backdoors.c
+++ b/x86/vmware_backdoors.c
@@ -133,8 +133,8 @@ struct fault_test vmware_backdoor_tests[] = {
 static void set_tss_ioperm(void)
 {
 	struct descriptor_table_ptr gdt;
-	struct segment_desc64 *gdt_table;
-	struct segment_desc64 *tss_entry;
+	gdt_entry_t *gdt_table;
+	struct system_desc64 *tss_entry;
 	u16 tr = 0;
 	tss64_t *tss;
 	unsigned char *ioperm_bitmap;
@@ -142,8 +142,8 @@ static void set_tss_ioperm(void)
 
 	sgdt(&gdt);
 	tr = str();
-	gdt_table = (struct segment_desc64 *) gdt.base;
-	tss_entry = &gdt_table[tr / sizeof(struct segment_desc64)];
+	gdt_table = (gdt_entry_t *) gdt.base;
+	tss_entry = (struct system_desc64 *) &gdt_table[tr / 8];
 	tss_base = ((uint64_t) tss_entry->base1 |
 			((uint64_t) tss_entry->base2 << 16) |
 			((uint64_t) tss_entry->base3 << 24) |
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 02/14] x86: fix call to set_gdt_entry
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 01/14] x86: cleanup handling of 16-byte GDT descriptors Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 03/14] unify field names and definitions for GDT descriptors Aaron Lewis
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

The low four bits of the fourth argument are unused, make them
zero in all the callers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/desc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index e7378c1..b691c9b 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -336,7 +336,7 @@ void setup_tss32(void)
 	tss_intr.ds = tss_intr.es = tss_intr.fs = tss_intr.ss = 0x10;
 	tss_intr.gs = read_gs();
 	tss_intr.iomap_base = (u16)desc_size;
-	set_gdt_entry(TSS_INTR, (u32)&tss_intr, desc_size - 1, 0x89, 0x0f);
+	set_gdt_entry(TSS_INTR, (u32)&tss_intr, desc_size - 1, 0x89, 0);
 }
 
 void set_intr_task_gate(int e, void *fn)
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 03/14] unify field names and definitions for GDT descriptors
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 01/14] x86: cleanup handling of 16-byte GDT descriptors Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 02/14] x86: fix call to set_gdt_entry Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 04/14] replace tss_descr global with a function Aaron Lewis
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Use the same names and definitions (apart from the high base field)
for GDT descriptors in both 32-bit and 64-bit code.  The next patch
will also reuse gdt_entry_t in the 16-byte struct definition, for now
some duplication remains.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/desc.c   | 18 +++++++-----------
 lib/x86/desc.h   | 28 +++++++++++++++++++++-------
 x86/taskswitch.c |  2 +-
 3 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index b691c9b..ba0db65 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -280,22 +280,18 @@ bool exception_rflags_rf(void)
 static char intr_alt_stack[4096];
 
 #ifndef __x86_64__
-void set_gdt_entry(int sel, u32 base,  u32 limit, u8 access, u8 gran)
+void set_gdt_entry(int sel, u32 base,  u32 limit, u8 type, u8 flags)
 {
 	int num = sel >> 3;
 
 	/* Setup the descriptor base address */
-	gdt32[num].base_low = (base & 0xFFFF);
-	gdt32[num].base_middle = (base >> 16) & 0xFF;
-	gdt32[num].base_high = (base >> 24) & 0xFF;
+	gdt32[num].base1 = (base & 0xFFFF);
+	gdt32[num].base2 = (base >> 16) & 0xFF;
+	gdt32[num].base3 = (base >> 24) & 0xFF;
 
-	/* Setup the descriptor limits */
-	gdt32[num].limit_low = (limit & 0xFFFF);
-	gdt32[num].granularity = ((limit >> 16) & 0x0F);
-
-	/* Finally, set up the granularity and access flags */
-	gdt32[num].granularity |= (gran & 0xF0);
-	gdt32[num].access = access;
+	/* Setup the descriptor limits, type and flags */
+	gdt32[num].limit1 = (limit & 0xFFFF);
+	gdt32[num].type_limit_flags = ((limit & 0xF0000) >> 8) | ((flags & 0xF0) << 8) | type;
 }
 
 void set_gdt_task_gate(u16 sel, u16 tss_sel)
diff --git a/lib/x86/desc.h b/lib/x86/desc.h
index 1755486..c339e0e 100644
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -164,14 +164,27 @@ typedef struct {
 } idt_entry_t;
 
 typedef struct {
-	u16 limit_low;
-	u16 base_low;
-	u8 base_middle;
-	u8 access;
-	u8 granularity;
-	u8 base_high;
-} gdt_entry_t;
+	uint16_t limit1;
+	uint16_t base1;
+	uint8_t  base2;
+	union {
+		uint16_t  type_limit_flags;      /* Type and limit flags */
+		struct {
+			uint16_t type:4;
+			uint16_t s:1;
+			uint16_t dpl:2;
+			uint16_t p:1;
+			uint16_t limit2:4;
+			uint16_t avl:1;
+			uint16_t l:1;
+			uint16_t db:1;
+			uint16_t g:1;
+		} __attribute__((__packed__));
+	} __attribute__((__packed__));
+	uint8_t  base3;
+} __attribute__((__packed__)) gdt_entry_t;
 
+#ifdef __x86_64__
 struct system_desc64 {
 	uint16_t limit1;
 	uint16_t base1;
@@ -194,6 +207,7 @@ struct system_desc64 {
 	uint32_t base4;
 	uint32_t zero;
 } __attribute__((__packed__));
+#endif
 
 #define DESC_BUSY ((uint64_t) 1 << 41)
 
diff --git a/x86/taskswitch.c b/x86/taskswitch.c
index 889831e..b6b3451 100644
--- a/x86/taskswitch.c
+++ b/x86/taskswitch.c
@@ -21,7 +21,7 @@ fault_handler(unsigned long error_code)
 
 	tss.eip += 2;
 
-	gdt32[TSS_MAIN / 8].access &= ~2;
+	gdt32[TSS_MAIN / 8].type &= ~2;
 
 	set_gdt_task_gate(TSS_RETURN, tss_intr.prev);
 }
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 04/14] replace tss_descr global with a function
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (2 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 03/14] unify field names and definitions for GDT descriptors Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 05/14] x86: Move IDT to desc.c Aaron Lewis
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

tss_descr is declared as a struct descriptor_table_ptr but it is actualy
pointing to an _entry_ in the GDT.  Also it is different per CPU, but
tss_descr does not recognize that.  Fix both by reusing the code
(already present e.g. in the vmware_backdoors test) that extracts
the base from the GDT entry; and also provide a helper to retrieve
the limit, which is needed in vmx.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/desc.c         | 32 ++++++++++++++++++++++++++++++++
 lib/x86/desc.h         | 25 ++++++-------------------
 x86/cstart64.S         |  1 -
 x86/svm_tests.c        | 15 +++------------
 x86/taskswitch.c       |  2 +-
 x86/vmware_backdoors.c | 22 ++++++----------------
 x86/vmx.c              |  9 +++++----
 7 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index ba0db65..94f0ddb 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -409,3 +409,35 @@ void __set_exception_jmpbuf(jmp_buf *addr)
 {
 	exception_jmpbuf = addr;
 }
+
+gdt_entry_t *get_tss_descr(void)
+{
+	struct descriptor_table_ptr gdt_ptr;
+	gdt_entry_t *gdt;
+
+	sgdt(&gdt_ptr);
+	gdt = (gdt_entry_t *)gdt_ptr.base;
+	return &gdt[str() / 8];
+}
+
+unsigned long get_gdt_entry_base(gdt_entry_t *entry)
+{
+	unsigned long base;
+	base = entry->base1 | ((u32)entry->base2 << 16) | ((u32)entry->base3 << 24);
+#ifdef __x86_64__
+	if (!entry->s) {
+		base |= (u64)((struct system_desc64 *)entry)->base4 << 32;
+	}
+#endif
+	return base;
+}
+
+unsigned long get_gdt_entry_limit(gdt_entry_t *entry)
+{
+	unsigned long limit;
+	limit = entry->limit1 | ((u32)entry->limit2 << 16);
+	if (entry->g) {
+		limit = (limit << 12) | 0xFFF;
+	}
+	return limit;
+}
diff --git a/lib/x86/desc.h b/lib/x86/desc.h
index c339e0e..51148d1 100644
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -186,30 +186,13 @@ typedef struct {
 
 #ifdef __x86_64__
 struct system_desc64 {
-	uint16_t limit1;
-	uint16_t base1;
-	uint8_t  base2;
-	union {
-		uint16_t  type_limit_flags;      /* Type and limit flags */
-		struct {
-			uint16_t type:4;
-			uint16_t s:1;
-			uint16_t dpl:2;
-			uint16_t p:1;
-			uint16_t limit2:4;
-			uint16_t avl:1;
-			uint16_t l:1;
-			uint16_t db:1;
-			uint16_t g:1;
-		} __attribute__((__packed__));
-	} __attribute__((__packed__));
-	uint8_t  base3;
+	gdt_entry_t common;
 	uint32_t base4;
 	uint32_t zero;
 } __attribute__((__packed__));
 #endif
 
-#define DESC_BUSY ((uint64_t) 1 << 41)
+#define DESC_BUSY 2
 
 extern idt_entry_t boot_idt[256];
 
@@ -253,4 +236,8 @@ static inline void *get_idt_addr(idt_entry_t *entry)
 	return (void *)addr;
 }
 
+extern gdt_entry_t *get_tss_descr(void);
+extern unsigned long get_gdt_entry_base(gdt_entry_t *entry);
+extern unsigned long get_gdt_entry_limit(gdt_entry_t *entry);
+
 #endif
diff --git a/x86/cstart64.S b/x86/cstart64.S
index 5c6ad38..cf38bae 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -4,7 +4,6 @@
 .globl boot_idt
 
 .globl idt_descr
-.globl tss_descr
 .globl gdt64_desc
 .globl online_cpus
 .globl cpu_online_count
diff --git a/x86/svm_tests.c b/x86/svm_tests.c
index 2fdb0dc..8ad6122 100644
--- a/x86/svm_tests.c
+++ b/x86/svm_tests.c
@@ -1875,23 +1875,14 @@ static bool reg_corruption_check(struct svm_test *test)
 
 static void get_tss_entry(void *data)
 {
-    struct descriptor_table_ptr gdt;
-    gdt_entry_t *gdt_table;
-    struct system_desc64 *tss_entry;
-    u16 tr = 0;
-
-    sgdt(&gdt);
-    tr = str();
-    gdt_table = (gdt_entry_t *) gdt.base;
-    tss_entry = (struct system_desc64 *) &gdt_table[tr / 8];
-    *((struct system_desc64 **)data) = tss_entry;
+    *((gdt_entry_t **)data) = get_tss_descr();
 }
 
 static int orig_cpu_count;
 
 static void init_startup_prepare(struct svm_test *test)
 {
-    struct system_desc64 *tss_entry;
+    gdt_entry_t *tss_entry;
     int i;
 
     on_cpu(1, get_tss_entry, &tss_entry);
@@ -1905,7 +1896,7 @@ static void init_startup_prepare(struct svm_test *test)
 
     --cpu_online_count;
 
-    *(uint64_t *)tss_entry &= ~DESC_BUSY;
+    tss_entry->type &= ~DESC_BUSY;
 
     apic_icr_write(APIC_DEST_PHYSICAL | APIC_DM_STARTUP, id_map[1]);
 
diff --git a/x86/taskswitch.c b/x86/taskswitch.c
index b6b3451..0fa818d 100644
--- a/x86/taskswitch.c
+++ b/x86/taskswitch.c
@@ -21,7 +21,7 @@ fault_handler(unsigned long error_code)
 
 	tss.eip += 2;
 
-	gdt32[TSS_MAIN / 8].type &= ~2;
+	gdt32[TSS_MAIN / 8].type &= ~DESC_BUSY;
 
 	set_gdt_task_gate(TSS_RETURN, tss_intr.prev);
 }
diff --git a/x86/vmware_backdoors.c b/x86/vmware_backdoors.c
index b1433cd..bc10020 100644
--- a/x86/vmware_backdoors.c
+++ b/x86/vmware_backdoors.c
@@ -132,23 +132,13 @@ struct fault_test vmware_backdoor_tests[] = {
  */
 static void set_tss_ioperm(void)
 {
-	struct descriptor_table_ptr gdt;
-	gdt_entry_t *gdt_table;
-	struct system_desc64 *tss_entry;
-	u16 tr = 0;
+	gdt_entry_t *tss_entry;
 	tss64_t *tss;
 	unsigned char *ioperm_bitmap;
-	uint64_t tss_base;
-
-	sgdt(&gdt);
-	tr = str();
-	gdt_table = (gdt_entry_t *) gdt.base;
-	tss_entry = (struct system_desc64 *) &gdt_table[tr / 8];
-	tss_base = ((uint64_t) tss_entry->base1 |
-			((uint64_t) tss_entry->base2 << 16) |
-			((uint64_t) tss_entry->base3 << 24) |
-			((uint64_t) tss_entry->base4 << 32));
-	tss = (tss64_t *)tss_base;
+	u16 tr = str();
+
+	tss_entry = get_tss_descr();
+	tss = (tss64_t *)get_gdt_entry_base(tss_entry);
 	tss->iomap_base = sizeof(*tss);
 	ioperm_bitmap = ((unsigned char *)tss+tss->iomap_base);
 
@@ -157,7 +147,7 @@ static void set_tss_ioperm(void)
 		1 << (RANDOM_IO_PORT % 8);
 	ioperm_bitmap[VMWARE_BACKDOOR_PORT / 8] |=
 		1 << (VMWARE_BACKDOOR_PORT % 8);
-	*(uint64_t *)tss_entry &= ~DESC_BUSY;
+	tss_entry->type &= ~DESC_BUSY;
 
 	/* Update TSS */
 	ltr(tr);
diff --git a/x86/vmx.c b/x86/vmx.c
index 20dc677..d45c6de 100644
--- a/x86/vmx.c
+++ b/x86/vmx.c
@@ -75,7 +75,6 @@ union vmx_ept_vpid  ept_vpid;
 
 extern struct descriptor_table_ptr gdt64_desc;
 extern struct descriptor_table_ptr idt_descr;
-extern struct descriptor_table_ptr tss_descr;
 extern void *vmx_return;
 extern void *entry_sysenter;
 extern void *guest_entry;
@@ -1275,7 +1274,7 @@ static void init_vmcs_host(void)
 	vmcs_write(HOST_SEL_FS, KERNEL_DS);
 	vmcs_write(HOST_SEL_GS, KERNEL_DS);
 	vmcs_write(HOST_SEL_TR, TSS_MAIN);
-	vmcs_write(HOST_BASE_TR, tss_descr.base);
+	vmcs_write(HOST_BASE_TR, get_gdt_entry_base(get_tss_descr()));
 	vmcs_write(HOST_BASE_GDTR, gdt64_desc.base);
 	vmcs_write(HOST_BASE_IDTR, idt_descr.base);
 	vmcs_write(HOST_BASE_FS, 0);
@@ -1291,6 +1290,8 @@ static void init_vmcs_host(void)
 
 static void init_vmcs_guest(void)
 {
+	gdt_entry_t *tss_descr = get_tss_descr();
+
 	/* 26.3 CHECKING AND LOADING GUEST STATE */
 	ulong guest_cr0, guest_cr4, guest_cr3;
 	/* 26.3.1.1 */
@@ -1331,7 +1332,7 @@ static void init_vmcs_guest(void)
 	vmcs_write(GUEST_BASE_DS, 0);
 	vmcs_write(GUEST_BASE_FS, 0);
 	vmcs_write(GUEST_BASE_GS, 0);
-	vmcs_write(GUEST_BASE_TR, tss_descr.base);
+	vmcs_write(GUEST_BASE_TR, get_gdt_entry_base(tss_descr));
 	vmcs_write(GUEST_BASE_LDTR, 0);
 
 	vmcs_write(GUEST_LIMIT_CS, 0xFFFFFFFF);
@@ -1341,7 +1342,7 @@ static void init_vmcs_guest(void)
 	vmcs_write(GUEST_LIMIT_FS, 0xFFFFFFFF);
 	vmcs_write(GUEST_LIMIT_GS, 0xFFFFFFFF);
 	vmcs_write(GUEST_LIMIT_LDTR, 0xffff);
-	vmcs_write(GUEST_LIMIT_TR, tss_descr.limit);
+	vmcs_write(GUEST_LIMIT_TR, get_gdt_entry_limit(tss_descr));
 
 	vmcs_write(GUEST_AR_CS, 0xa09b);
 	vmcs_write(GUEST_AR_DS, 0xc093);
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 05/14] x86: Move IDT to desc.c
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (3 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 04/14] replace tss_descr global with a function Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 06/14] x86: unify name of 32-bit and 64-bit GDT Aaron Lewis
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Move the IDT data structures from x86/cstart.S and x86/cstart64.S to
lib/x86/desc.c, so that the follow-up UEFI support commits can reuse
these definitions, without re-defining them in UEFI's boot up assembly
code.

Extracted by a patch by Zixuan Wang <zxwang42@gmail.com> and ported
to 32-bit too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/desc.c | 10 ++++++++++
 x86/cstart.S   | 11 -----------
 x86/cstart64.S | 14 --------------
 3 files changed, 10 insertions(+), 25 deletions(-)

diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index 94f0ddb..2ef5aad 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -3,6 +3,16 @@
 #include "processor.h"
 #include <setjmp.h>
 
+/* Boot-related data structures */
+
+/* IDT and IDT descriptor */
+idt_entry_t boot_idt[256] = {0};
+
+struct descriptor_table_ptr idt_descr = {
+	.limit = sizeof(boot_idt) - 1,
+	.base = (unsigned long)boot_idt,
+};
+
 #ifndef __x86_64__
 __attribute__((regparm(1)))
 #endif
diff --git a/x86/cstart.S b/x86/cstart.S
index bcf7218..4461c38 100644
--- a/x86/cstart.S
+++ b/x86/cstart.S
@@ -1,7 +1,6 @@
 
 #include "apic-defs.h"
 
-.globl boot_idt
 .global online_cpus
 
 ipi_vector = 0x20
@@ -28,12 +27,6 @@ i = 0
         i = i + 1
         .endr
 
-boot_idt:
-	.rept 256
-	.quad 0
-	.endr
-end_boot_idt:
-
 .globl gdt32
 gdt32:
 	.quad 0
@@ -78,10 +71,6 @@ tss:
         .endr
 tss_end:
 
-idt_descr:
-	.word end_boot_idt - boot_idt - 1
-	.long boot_idt
-
 .section .init
 
 .code32
diff --git a/x86/cstart64.S b/x86/cstart64.S
index cf38bae..b98a0d3 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -1,9 +1,6 @@
 
 #include "apic-defs.h"
 
-.globl boot_idt
-
-.globl idt_descr
 .globl gdt64_desc
 .globl online_cpus
 .globl cpu_online_count
@@ -50,13 +47,6 @@ ptl5:
 
 .align 4096
 
-boot_idt:
-	.rept 256
-	.quad 0
-	.quad 0
-	.endr
-end_boot_idt:
-
 gdt64_desc:
 	.word gdt64_end - gdt64 - 1
 	.quad gdt64
@@ -290,10 +280,6 @@ setup_5level_page_table:
 lvl5:
 	retq
 
-idt_descr:
-	.word end_boot_idt - boot_idt - 1
-	.quad boot_idt
-
 online_cpus:
 	.fill (max_cpus + 7) / 8, 1, 0
 
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 06/14] x86: unify name of 32-bit and 64-bit GDT
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (4 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 05/14] x86: Move IDT to desc.c Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 07/14] x86: get rid of ring0stacktop Aaron Lewis
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

There's no need to distinguish gdt32 and gdt64, since the same C functions
operate on both and selector numbers are mostly unified between 32-
and 64-bit versions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/desc.c   | 12 ++++++------
 lib/x86/desc.h   |  2 +-
 x86/cstart.S     | 20 ++++++++++----------
 x86/cstart64.S   | 17 +++++++++--------
 x86/taskswitch.c |  2 +-
 x86/vmx.c        |  8 ++++----
 x86/vmx_tests.c  |  4 ++--
 7 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index 2ef5aad..ac167d0 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -292,16 +292,16 @@ static char intr_alt_stack[4096];
 #ifndef __x86_64__
 void set_gdt_entry(int sel, u32 base,  u32 limit, u8 type, u8 flags)
 {
-	int num = sel >> 3;
+	gdt_entry_t *entry = &gdt[sel >> 3];
 
 	/* Setup the descriptor base address */
-	gdt32[num].base1 = (base & 0xFFFF);
-	gdt32[num].base2 = (base >> 16) & 0xFF;
-	gdt32[num].base3 = (base >> 24) & 0xFF;
+	entry->base1 = (base & 0xFFFF);
+	entry->base2 = (base >> 16) & 0xFF;
+	entry->base3 = (base >> 24) & 0xFF;
 
 	/* Setup the descriptor limits, type and flags */
-	gdt32[num].limit1 = (limit & 0xFFFF);
-	gdt32[num].type_limit_flags = ((limit & 0xF0000) >> 8) | ((flags & 0xF0) << 8) | type;
+	entry->limit1 = (limit & 0xFFFF);
+	entry->type_limit_flags = ((limit & 0xF0000) >> 8) | ((flags & 0xF0) << 8) | type;
 }
 
 void set_gdt_task_gate(u16 sel, u16 tss_sel)
diff --git a/lib/x86/desc.h b/lib/x86/desc.h
index 51148d1..c0817d8 100644
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -197,7 +197,6 @@ struct system_desc64 {
 extern idt_entry_t boot_idt[256];
 
 #ifndef __x86_64__
-extern gdt_entry_t gdt32[];
 extern tss32_t tss;
 extern tss32_t tss_intr;
 void set_gdt_task_gate(u16 tss_sel, u16 sel);
@@ -207,6 +206,7 @@ void setup_tss32(void);
 #else
 extern tss64_t tss;
 #endif
+extern gdt_entry_t gdt[];
 
 unsigned exception_vector(void);
 int write_cr4_checking(unsigned long val);
diff --git a/x86/cstart.S b/x86/cstart.S
index 4461c38..5e925d8 100644
--- a/x86/cstart.S
+++ b/x86/cstart.S
@@ -27,8 +27,8 @@ i = 0
         i = i + 1
         .endr
 
-.globl gdt32
-gdt32:
+.globl gdt
+gdt:
 	.quad 0
 	.quad 0x00cf9b000000ffff // flat 32-bit code segment
 	.quad 0x00cf93000000ffff // flat 32-bit data segment
@@ -55,7 +55,7 @@ percpu_descr:
         .rept max_cpus
         .quad 0x00cf93000000ffff // 32-bit data segment for perCPU area
         .endr
-gdt32_end:
+gdt_end:
 
 i = 0
 .globl tss
@@ -94,7 +94,7 @@ mb_cmdline = 16
 	mov %al, percpu_descr+4(,%ecx,8)
 	mov %ah, percpu_descr+7(,%ecx,8)
 
-	lea percpu_descr-gdt32(,%ecx,8), %eax
+	lea percpu_descr-gdt(,%ecx,8), %eax
 	mov %ax, %gs
 
 .endm
@@ -110,7 +110,7 @@ mb_cmdline = 16
 
 .globl start
 start:
-        lgdtl gdt32_descr
+        lgdtl gdt_descr
         setup_segments
         mov $stacktop, %esp
         setup_percpu_area
@@ -195,7 +195,7 @@ load_tss:
 	shr $16, %eax
 	mov %al, tss_descr+4(,%ebx,8)
 	mov %ah, tss_descr+7(,%ebx,8)
-	lea tss_descr-gdt32(,%ebx,8), %eax
+	lea tss_descr-gdt(,%ebx,8), %eax
 	ltr %ax
 	ret
 
@@ -224,11 +224,11 @@ sipi_entry:
 	mov %cr0, %eax
 	or $1, %eax
 	mov %eax, %cr0
-	lgdtl gdt32_descr - sipi_entry
+	lgdtl gdt_descr - sipi_entry
 	ljmpl $8, $ap_start32
 
-gdt32_descr:
-	.word gdt32_end - gdt32 - 1
-	.long gdt32
+gdt_descr:
+	.word gdt_end - gdt - 1
+	.long gdt
 
 sipi_end:
diff --git a/x86/cstart64.S b/x86/cstart64.S
index b98a0d3..46b9d9b 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -1,7 +1,8 @@
 
 #include "apic-defs.h"
 
-.globl gdt64_desc
+.globl gdt
+.globl gdt_descr
 .globl online_cpus
 .globl cpu_online_count
 
@@ -47,11 +48,11 @@ ptl5:
 
 .align 4096
 
-gdt64_desc:
-	.word gdt64_end - gdt64 - 1
-	.quad gdt64
+gdt_descr:
+	.word gdt_end - gdt - 1
+	.quad gdt
 
-gdt64:
+gdt:
 	.quad 0
 	.quad 0x00af9b000000ffff // 64-bit code segment
 	.quad 0x00cf93000000ffff // 32/64-bit data segment
@@ -75,7 +76,7 @@ tss_descr:
 	.quad 0x000089000000ffff // 64-bit avail tss
 	.quad 0                  // tss high addr
 	.endr
-gdt64_end:
+gdt_end:
 
 i = 0
 .globl tss
@@ -162,7 +163,7 @@ switch_to_5level:
 	jmpl $8, $lvl5
 
 prepare_64:
-	lgdt gdt64_desc
+	lgdt gdt_descr
 	setup_segments
 
 	xor %eax, %eax
@@ -300,7 +301,7 @@ load_tss:
 	mov %al, tss_descr+7(%rbx)
 	shr $8, %rax
 	mov %eax, tss_descr+8(%rbx)
-	lea tss_descr-gdt64(%rbx), %rax
+	lea tss_descr-gdt(%rbx), %rax
 	ltr %ax
 	ret
 
diff --git a/x86/taskswitch.c b/x86/taskswitch.c
index 0fa818d..1d6e6e2 100644
--- a/x86/taskswitch.c
+++ b/x86/taskswitch.c
@@ -21,7 +21,7 @@ fault_handler(unsigned long error_code)
 
 	tss.eip += 2;
 
-	gdt32[TSS_MAIN / 8].type &= ~DESC_BUSY;
+	gdt[TSS_MAIN / 8].type &= ~DESC_BUSY;
 
 	set_gdt_task_gate(TSS_RETURN, tss_intr.prev);
 }
diff --git a/x86/vmx.c b/x86/vmx.c
index d45c6de..7a2f7a3 100644
--- a/x86/vmx.c
+++ b/x86/vmx.c
@@ -73,7 +73,7 @@ union vmx_ctrl_msr ctrl_exit_rev;
 union vmx_ctrl_msr ctrl_enter_rev;
 union vmx_ept_vpid  ept_vpid;
 
-extern struct descriptor_table_ptr gdt64_desc;
+extern struct descriptor_table_ptr gdt_descr;
 extern struct descriptor_table_ptr idt_descr;
 extern void *vmx_return;
 extern void *entry_sysenter;
@@ -1275,7 +1275,7 @@ static void init_vmcs_host(void)
 	vmcs_write(HOST_SEL_GS, KERNEL_DS);
 	vmcs_write(HOST_SEL_TR, TSS_MAIN);
 	vmcs_write(HOST_BASE_TR, get_gdt_entry_base(get_tss_descr()));
-	vmcs_write(HOST_BASE_GDTR, gdt64_desc.base);
+	vmcs_write(HOST_BASE_GDTR, gdt_descr.base);
 	vmcs_write(HOST_BASE_IDTR, idt_descr.base);
 	vmcs_write(HOST_BASE_FS, 0);
 	vmcs_write(HOST_BASE_GS, 0);
@@ -1354,9 +1354,9 @@ static void init_vmcs_guest(void)
 	vmcs_write(GUEST_AR_TR, 0x8b);
 
 	/* 26.3.1.3 */
-	vmcs_write(GUEST_BASE_GDTR, gdt64_desc.base);
+	vmcs_write(GUEST_BASE_GDTR, gdt_descr.base);
 	vmcs_write(GUEST_BASE_IDTR, idt_descr.base);
-	vmcs_write(GUEST_LIMIT_GDTR, gdt64_desc.limit);
+	vmcs_write(GUEST_LIMIT_GDTR, gdt_descr.limit);
 	vmcs_write(GUEST_LIMIT_IDTR, idt_descr.limit);
 
 	/* 26.3.1.4 */
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index ac2b0b4..9ee6653 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -797,8 +797,8 @@ asm(
 	"insn_wbinvd: wbinvd;ret\n\t"
 	"insn_cpuid: mov $10, %eax; cpuid;ret\n\t"
 	"insn_invd: invd;ret\n\t"
-	"insn_sgdt: sgdt gdt64_desc;ret\n\t"
-	"insn_lgdt: lgdt gdt64_desc;ret\n\t"
+	"insn_sgdt: sgdt gdt_descr;ret\n\t"
+	"insn_lgdt: lgdt gdt_descr;ret\n\t"
 	"insn_sidt: sidt idt_descr;ret\n\t"
 	"insn_lidt: lidt idt_descr;ret\n\t"
 	"insn_sldt: sldt %ax;ret\n\t"
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 07/14] x86: get rid of ring0stacktop
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (5 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 06/14] x86: unify name of 32-bit and 64-bit GDT Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 08/14] x86: Move 64-bit GDT and TSS to desc.c Aaron Lewis
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

The ring3 switch code relied on a special stack page that was used
for the ring0 stack during the ring3 part of the test.  This special
stack page was used if an exception handler ran during the ring3 part
of the test.

This method is quite complex; it is easier to just use the same
stack for the "outer" part of the test and the exception handler.
To do so, store esp/rsp in the TSS just before doing the PUSH/IRET
sequence.  On 64-bit, the TSS can also be used to restore rsp after
coming back from ring3.

Unifying the three copies of the ring switching code is left as an
exercise to the reader.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[aaron: removed commas after (tss.rsp0) in usermode.c and umips.c]
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 lib/x86/usermode.c |  9 +++++----
 x86/access.c       | 16 ++++++++--------
 x86/cstart.S       |  6 +-----
 x86/cstart64.S     |  6 +-----
 x86/umip.c         | 19 ++++++++++++-------
 5 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c
index f032523..49b87b2 100644
--- a/lib/x86/usermode.c
+++ b/lib/x86/usermode.c
@@ -47,8 +47,8 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector,
 	}
 
 	asm volatile (
-			/* Backing Up Stack in rdi */
-			"mov %%rsp, %%rdi\n\t"
+			/* Prepare kernel SP for exception handlers */
+			"mov %%rsp, %[rsp0]\n\t"
 			/* Load user_ds to DS and ES */
 			"mov %[user_ds], %%ax\n\t"
 			"mov %%ax, %%ds\n\t"
@@ -92,9 +92,10 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector,
 			"int %[kernel_entry_vector]\n\t"
 			/* Kernel Mode */
 			"ret_to_kernel:\n\t"
-			"mov %%rdi, %%rsp\n\t"
+			"mov %[rsp0], %%rsp\n\t"
 			:
-			"+a"(rax)
+			"+a"(rax),
+			[rsp0]"=m"(tss.rsp0)
 			:
 			[arg1]"m"(arg1),
 			[arg2]"m"(arg2),
diff --git a/x86/access.c b/x86/access.c
index 4725bbd..49d31b1 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -711,7 +711,7 @@ static int ac_test_do_access(ac_test_t *at)
     }
 
     asm volatile ("mov $fixed1, %%rsi \n\t"
-		  "mov %%rsp, %%rdx \n\t"
+		  "mov %%rsp, %[rsp0] \n\t"
 		  "cmp $0, %[user] \n\t"
 		  "jz do_access \n\t"
 		  "push %%rax; mov %[user_ds], %%ax; mov %%ax, %%ds; pop %%rax  \n\t"
@@ -734,8 +734,14 @@ static int ac_test_do_access(ac_test_t *at)
 		  "done: \n"
 		  "fixed1: \n"
 		  "int %[kernel_entry_vector] \n\t"
+		  ".section .text.entry \n\t"
+		  "kernel_entry: \n\t"
+		  "mov %[rsp0], %%rsp \n\t"
+		  "jmp back_to_kernel \n\t"
+		  ".section .text \n\t"
 		  "back_to_kernel:"
-		  : [reg]"+r"(r), "+a"(fault), "=b"(e), "=&d"(rsp)
+		  : [reg]"+r"(r), "+a"(fault), "=b"(e), "=&d"(rsp),
+		    [rsp0]"=m"(tss.rsp0)
 		  : [addr]"r"(at->virt),
 		    [write]"r"(F(AC_ACCESS_WRITE)),
 		    [user]"r"(F(AC_ACCESS_USER)),
@@ -754,12 +760,6 @@ static int ac_test_do_access(ac_test_t *at)
 		  "iretq \n\t"
 		  ".section .text");
 
-    asm volatile (".section .text.entry \n\t"
-		  "kernel_entry: \n\t"
-		  "mov %rdx, %rsp \n\t"
-		  "jmp back_to_kernel \n\t"
-		  ".section .text");
-
     ac_test_check(at, &success, fault && !at->expected_fault,
                   "unexpected fault");
     ac_test_check(at, &success, !fault && at->expected_fault,
diff --git a/x86/cstart.S b/x86/cstart.S
index 5e925d8..e9100a4 100644
--- a/x86/cstart.S
+++ b/x86/cstart.S
@@ -13,10 +13,6 @@ max_cpus = MAX_TEST_CPUS
 	.align 16
 stacktop:
 
-	. = . + 4096
-	.align 16
-ring0stacktop:
-
 .data
 
 .align 4096
@@ -62,7 +58,7 @@ i = 0
 tss:
         .rept max_cpus
         .long 0
-        .long ring0stacktop - i * 4096
+        .long 0
         .long 16
         .quad 0, 0
         .quad 0, 0, 0, 0, 0, 0, 0, 0
diff --git a/x86/cstart64.S b/x86/cstart64.S
index 46b9d9b..18c7457 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -16,10 +16,6 @@ max_cpus = MAX_TEST_CPUS
 	.align 16
 stacktop:
 
-	. = . + 4096 * max_cpus
-	.align 16
-ring0stacktop:
-
 .data
 
 .align 4096
@@ -83,7 +79,7 @@ i = 0
 tss:
 	.rept max_cpus
 	.long 0
-	.quad ring0stacktop - i * 4096
+	.quad 0
 	.quad 0, 0
 	.quad 0, 0, 0, 0, 0, 0, 0, 0
 	.long 0, 0, 0
diff --git a/x86/umip.c b/x86/umip.c
index 0fc1f65..1936989 100644
--- a/x86/umip.c
+++ b/x86/umip.c
@@ -124,7 +124,7 @@ static noinline int do_ring3(void (*fn)(const char *), const char *arg)
 		  "mov %%dx, %%es\n\t"
 		  "mov %%dx, %%fs\n\t"
 		  "mov %%dx, %%gs\n\t"
-		  "mov %%" R "sp, %%" R "cx\n\t"
+		  "mov %%" R "sp, %[sp0]\n\t" /* kernel sp for exception handlers */
 		  "push" W " %%" R "dx \n\t"
 		  "lea %[user_stack_top], %%" R "dx \n\t"
 		  "push" W " %%" R "dx \n\t"
@@ -133,8 +133,6 @@ static noinline int do_ring3(void (*fn)(const char *), const char *arg)
 		  "push" W " $1f \n\t"
 		  "iret" W "\n"
 		  "1: \n\t"
-		  "push %%" R "cx\n\t"   /* save kernel SP */
-
 #ifndef __x86_64__
 		  "push %[arg]\n\t"
 #endif
@@ -142,13 +140,15 @@ static noinline int do_ring3(void (*fn)(const char *), const char *arg)
 #ifndef __x86_64__
 		  "pop %%ecx\n\t"
 #endif
-
-		  "pop %%" R "cx\n\t"
 		  "mov $1f, %%" R "dx\n\t"
 		  "int %[kernel_entry_vector]\n\t"
 		  ".section .text.entry \n\t"
 		  "kernel_entry: \n\t"
-		  "mov %%" R "cx, %%" R "sp \n\t"
+#ifdef __x86_64__
+		  "mov %[sp0], %%" R "sp\n\t"
+#else
+		  "add $(5 * " S "), %%esp\n\t"
+#endif
 		  "mov %[kernel_ds], %%cx\n\t"
 		  "mov %%cx, %%ds\n\t"
 		  "mov %%cx, %%es\n\t"
@@ -157,7 +157,12 @@ static noinline int do_ring3(void (*fn)(const char *), const char *arg)
 		  "jmp *%%" R "dx \n\t"
 		  ".section .text\n\t"
 		  "1:\n\t"
-		  : [ret] "=&a" (ret)
+		  : [ret] "=&a" (ret),
+#ifdef __x86_64__
+		    [sp0] "=m" (tss.rsp0)
+#else
+		    [sp0] "=m" (tss.esp0)
+#endif
 		  : [user_ds] "i" (USER_DS),
 		    [user_cs] "i" (USER_CS),
 		    [user_stack_top]"m"(user_stack[sizeof(user_stack) -
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 08/14] x86: Move 64-bit GDT and TSS to desc.c
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (6 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 07/14] x86: get rid of ring0stacktop Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 09/14] x86: Move 32-bit " Aaron Lewis
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis, Zixuan Wang

Move the GDT and TSS data structures from x86/cstart64.S to
lib/x86/desc.c, so that the follow-up UEFI support commits can reuse
these definitions, without re-defining them in UEFI's boot up assembly
code.

Signed-off-by: Zixuan Wang <zixuanwang@google.com>
Message-Id: <20211004204931.1537823-2-zxwang42@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/asm/setup.h |  6 ++++
 lib/x86/desc.c      | 38 +++++++++++++++++++++--
 lib/x86/desc.h      |  4 +--
 lib/x86/setup.c     | 25 +++++++++++++++
 lib/x86/usermode.c  |  2 +-
 x86/access.c        |  2 +-
 x86/cstart64.S      | 76 +++++----------------------------------------
 x86/umip.c          |  2 +-
 8 files changed, 79 insertions(+), 76 deletions(-)
 create mode 100644 lib/x86/asm/setup.h

diff --git a/lib/x86/asm/setup.h b/lib/x86/asm/setup.h
new file mode 100644
index 0000000..e3c3bfb
--- /dev/null
+++ b/lib/x86/asm/setup.h
@@ -0,0 +1,6 @@
+#ifndef _X86_ASM_SETUP_H_
+#define _X86_ASM_SETUP_H_
+
+unsigned long setup_tss(void);
+
+#endif /* _X86_ASM_SETUP_H_ */
diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index ac167d0..c185c01 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -2,6 +2,7 @@
 #include "desc.h"
 #include "processor.h"
 #include <setjmp.h>
+#include "apic-defs.h"
 
 /* Boot-related data structures */
 
@@ -13,6 +14,29 @@ struct descriptor_table_ptr idt_descr = {
 	.base = (unsigned long)boot_idt,
 };
 
+#ifdef __x86_64__
+/* GDT, TSS and descriptors */
+gdt_entry_t gdt[TSS_MAIN / 8 + MAX_TEST_CPUS * 2] = {
+	{     0, 0, 0, .type_limit_flags = 0x0000}, /* 0x00 null */
+	{0xffff, 0, 0, .type_limit_flags = 0xaf9b}, /* 0x08 64-bit code segment */
+	{0xffff, 0, 0, .type_limit_flags = 0xcf93}, /* 0x10 32/64-bit data segment */
+	{0xffff, 0, 0, .type_limit_flags = 0xaf1b}, /* 0x18 64-bit code segment, not present */
+	{0xffff, 0, 0, .type_limit_flags = 0xcf9b}, /* 0x20 32-bit code segment */
+	{0xffff, 0, 0, .type_limit_flags = 0x8f9b}, /* 0x28 16-bit code segment */
+	{0xffff, 0, 0, .type_limit_flags = 0x8f93}, /* 0x30 16-bit data segment */
+	{0xffff, 0, 0, .type_limit_flags = 0xcffb}, /* 0x38 32-bit code segment (user) */
+	{0xffff, 0, 0, .type_limit_flags = 0xcff3}, /* 0x40 32/64-bit data segment (user) */
+	{0xffff, 0, 0, .type_limit_flags = 0xaffb}, /* 0x48 64-bit code segment (user) */
+};
+
+tss64_t tss[MAX_TEST_CPUS] = {0};
+
+struct descriptor_table_ptr gdt_descr = {
+	.limit = sizeof(gdt) - 1,
+	.base = (unsigned long)gdt,
+};
+#endif
+
 #ifndef __x86_64__
 __attribute__((regparm(1)))
 #endif
@@ -289,8 +313,7 @@ bool exception_rflags_rf(void)
 
 static char intr_alt_stack[4096];
 
-#ifndef __x86_64__
-void set_gdt_entry(int sel, u32 base,  u32 limit, u8 type, u8 flags)
+void set_gdt_entry(int sel, unsigned long base,  u32 limit, u8 type, u8 flags)
 {
 	gdt_entry_t *entry = &gdt[sel >> 3];
 
@@ -302,8 +325,17 @@ void set_gdt_entry(int sel, u32 base,  u32 limit, u8 type, u8 flags)
 	/* Setup the descriptor limits, type and flags */
 	entry->limit1 = (limit & 0xFFFF);
 	entry->type_limit_flags = ((limit & 0xF0000) >> 8) | ((flags & 0xF0) << 8) | type;
+
+#ifdef __x86_64__
+	if (!entry->s) {
+		struct system_desc64 *entry16 = (struct system_desc64 *)entry;
+		entry16->zero = 0;
+		entry16->base4 = base >> 32;
+	}
+#endif
 }
 
+#ifndef __x86_64__
 void set_gdt_task_gate(u16 sel, u16 tss_sel)
 {
     set_gdt_entry(sel, tss_sel, 0, 0x85, 0); // task, present
@@ -380,7 +412,7 @@ void set_intr_alt_stack(int e, void *addr)
 
 void setup_alt_stack(void)
 {
-	tss.ist1 = (u64)intr_alt_stack + 4096;
+	tss[0].ist1 = (u64)intr_alt_stack + 4096;
 }
 #endif
 
diff --git a/lib/x86/desc.h b/lib/x86/desc.h
index c0817d8..ddfae04 100644
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -204,7 +204,7 @@ void set_idt_task_gate(int vec, u16 sel);
 void set_intr_task_gate(int vec, void *fn);
 void setup_tss32(void);
 #else
-extern tss64_t tss;
+extern tss64_t tss[];
 #endif
 extern gdt_entry_t gdt[];
 
@@ -215,7 +215,7 @@ bool exception_rflags_rf(void);
 void set_idt_entry(int vec, void *addr, int dpl);
 void set_idt_sel(int vec, u16 sel);
 void set_idt_dpl(int vec, u16 dpl);
-void set_gdt_entry(int sel, u32 base,  u32 limit, u8 access, u8 gran);
+void set_gdt_entry(int sel, unsigned long base, u32 limit, u8 access, u8 gran);
 void set_intr_alt_stack(int e, void *fn);
 void print_current_tss_info(void);
 handler handle_exception(u8 v, handler fn);
diff --git a/lib/x86/setup.c b/lib/x86/setup.c
index 7befe09..ec005b5 100644
--- a/lib/x86/setup.c
+++ b/lib/x86/setup.c
@@ -2,6 +2,7 @@
  * Initialize machine setup information
  *
  * Copyright (C) 2017, Red Hat Inc, Andrew Jones <drjones@redhat.com>
+ * Copyright (C) 2021, Google Inc, Zixuan Wang <zixuanwang@google.com>
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
@@ -9,6 +10,10 @@
 #include "fwcfg.h"
 #include "alloc_phys.h"
 #include "argv.h"
+#include "desc.h"
+#include "apic.h"
+#include "apic-defs.h"
+#include "asm/setup.h"
 
 extern char edata;
 
@@ -97,6 +102,26 @@ void find_highmem(void)
 		phys_alloc_init(best_start, best_end - best_start);
 	}
 }
+
+/* Setup TSS for the current processor, and return TSS offset within GDT */
+unsigned long setup_tss(void)
+{
+	u32 id;
+	tss64_t *tss_entry;
+
+	id = apic_id();
+
+	/* Runtime address of current TSS */
+	tss_entry = &tss[id];
+
+	/* Update TSS */
+	memset((void *)tss_entry, 0, sizeof(tss64_t));
+
+	/* Update TSS descriptors; each descriptor takes up 2 entries */
+	set_gdt_entry(TSS_MAIN + id * 16, (unsigned long)tss_entry, 0xffff, 0x89, 0);
+
+	return TSS_MAIN + id * 16;
+}
 #endif
 
 void setup_multiboot(struct mbi_bootinfo *bi)
diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c
index 49b87b2..2e77831 100644
--- a/lib/x86/usermode.c
+++ b/lib/x86/usermode.c
@@ -95,7 +95,7 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector,
 			"mov %[rsp0], %%rsp\n\t"
 			:
 			"+a"(rax),
-			[rsp0]"=m"(tss.rsp0)
+			[rsp0]"=m"(tss[0].rsp0)
 			:
 			[arg1]"m"(arg1),
 			[arg2]"m"(arg2),
diff --git a/x86/access.c b/x86/access.c
index 49d31b1..a781a0c 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -741,7 +741,7 @@ static int ac_test_do_access(ac_test_t *at)
 		  ".section .text \n\t"
 		  "back_to_kernel:"
 		  : [reg]"+r"(r), "+a"(fault), "=b"(e), "=&d"(rsp),
-		    [rsp0]"=m"(tss.rsp0)
+		    [rsp0]"=m"(tss[0].rsp0)
 		  : [addr]"r"(at->virt),
 		    [write]"r"(F(AC_ACCESS_WRITE)),
 		    [user]"r"(F(AC_ACCESS_USER)),
diff --git a/x86/cstart64.S b/x86/cstart64.S
index 18c7457..c6daa34 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -1,8 +1,6 @@
 
 #include "apic-defs.h"
 
-.globl gdt
-.globl gdt_descr
 .globl online_cpus
 .globl cpu_online_count
 
@@ -44,49 +42,6 @@ ptl5:
 
 .align 4096
 
-gdt_descr:
-	.word gdt_end - gdt - 1
-	.quad gdt
-
-gdt:
-	.quad 0
-	.quad 0x00af9b000000ffff // 64-bit code segment
-	.quad 0x00cf93000000ffff // 32/64-bit data segment
-	.quad 0x00af1b000000ffff // 64-bit code segment, not present
-	.quad 0x00cf9b000000ffff // 32-bit code segment
-	.quad 0x008f9b000000FFFF // 16-bit code segment
-	.quad 0x008f93000000FFFF // 16-bit data segment
-	.quad 0x00cffb000000ffff // 32-bit code segment (user)
-	.quad 0x00cff3000000ffff // 32/64-bit data segment (user)
-	.quad 0x00affb000000ffff // 64-bit code segment (user)
-
-	.quad 0			 // 6 spare selectors
-	.quad 0
-	.quad 0
-	.quad 0
-	.quad 0
-	.quad 0
-
-tss_descr:
-	.rept max_cpus
-	.quad 0x000089000000ffff // 64-bit avail tss
-	.quad 0                  // tss high addr
-	.endr
-gdt_end:
-
-i = 0
-.globl tss
-tss:
-	.rept max_cpus
-	.long 0
-	.quad 0
-	.quad 0, 0
-	.quad 0, 0, 0, 0, 0, 0, 0, 0
-	.long 0, 0, 0
-i = i + 1
-	.endr
-tss_end:
-
 mb_boot_info:	.quad 0
 
 pt_root:	.quad ptl4
@@ -111,6 +66,12 @@ MSR_GS_BASE = 0xc0000101
 	wrmsr
 .endm
 
+.macro load_tss
+	lidtq idt_descr
+	call setup_tss
+	ltr %ax
+.endm
+
 .macro setup_segments
 	mov $MSR_GS_BASE, %ecx
 	rdmsr
@@ -228,7 +189,7 @@ save_id:
 
 ap_start64:
 	call reset_apic
-	call load_tss
+	load_tss
 	call enable_apic
 	call save_id
 	call enable_x2apic
@@ -241,7 +202,7 @@ ap_start64:
 
 start64:
 	call reset_apic
-	call load_tss
+	load_tss
 	call mask_pic_interrupts
 	call enable_apic
 	call save_id
@@ -280,27 +241,6 @@ lvl5:
 online_cpus:
 	.fill (max_cpus + 7) / 8, 1, 0
 
-load_tss:
-	lidtq idt_descr
-	mov $(APIC_DEFAULT_PHYS_BASE + APIC_ID), %eax
-	mov (%rax), %eax
-	shr $24, %eax
-	mov %eax, %ebx
-	shl $4, %ebx
-	mov $((tss_end - tss) / max_cpus), %edx
-	imul %edx
-	add $tss, %rax
-	mov %ax, tss_descr+2(%rbx)
-	shr $16, %rax
-	mov %al, tss_descr+4(%rbx)
-	shr $8, %rax
-	mov %al, tss_descr+7(%rbx)
-	shr $8, %rax
-	mov %eax, tss_descr+8(%rbx)
-	lea tss_descr-gdt(%rbx), %rax
-	ltr %ax
-	ret
-
 ap_init:
 	cld
 	lea sipi_entry, %rsi
diff --git a/x86/umip.c b/x86/umip.c
index 1936989..0a52342 100644
--- a/x86/umip.c
+++ b/x86/umip.c
@@ -159,7 +159,7 @@ static noinline int do_ring3(void (*fn)(const char *), const char *arg)
 		  "1:\n\t"
 		  : [ret] "=&a" (ret),
 #ifdef __x86_64__
-		    [sp0] "=m" (tss.rsp0)
+		    [sp0] "=m" (tss[0].rsp0)
 #else
 		    [sp0] "=m" (tss.esp0)
 #endif
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 09/14] x86: Move 32-bit GDT and TSS to desc.c
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (7 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 08/14] x86: Move 64-bit GDT and TSS to desc.c Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them Aaron Lewis
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Move the GDT and TSS data structures from x86/cstart.S to
lib/x86/desc.c, for consistency with the 64-bit version.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 lib/x86/asm/setup.h |  2 +-
 lib/x86/desc.c      | 22 ++++++++--
 lib/x86/desc.h      |  2 +-
 lib/x86/setup.c     | 26 +++++++++++-
 x86/cstart.S        | 98 +++++++--------------------------------------
 x86/cstart64.S      |  1 +
 x86/smap.c          |  2 +-
 x86/taskswitch.c    |  2 +-
 x86/umip.c          |  2 +-
 9 files changed, 63 insertions(+), 94 deletions(-)

diff --git a/lib/x86/asm/setup.h b/lib/x86/asm/setup.h
index e3c3bfb..4310132 100644
--- a/lib/x86/asm/setup.h
+++ b/lib/x86/asm/setup.h
@@ -1,6 +1,6 @@
 #ifndef _X86_ASM_SETUP_H_
 #define _X86_ASM_SETUP_H_
 
-unsigned long setup_tss(void);
+unsigned long setup_tss(u8 *stacktop);
 
 #endif /* _X86_ASM_SETUP_H_ */
diff --git a/lib/x86/desc.c b/lib/x86/desc.c
index c185c01..16b7256 100644
--- a/lib/x86/desc.c
+++ b/lib/x86/desc.c
@@ -14,8 +14,22 @@ struct descriptor_table_ptr idt_descr = {
 	.base = (unsigned long)boot_idt,
 };
 
-#ifdef __x86_64__
+#ifndef __x86_64__
 /* GDT, TSS and descriptors */
+gdt_entry_t gdt[TSS_MAIN / 8 + MAX_TEST_CPUS * 2] = {
+	{     0, 0, 0, .type_limit_flags = 0x0000}, /* 0x00 null */
+	{0xffff, 0, 0, .type_limit_flags = 0xcf9b}, /* flat 32-bit code segment */
+	{0xffff, 0, 0, .type_limit_flags = 0xcf93}, /* flat 32-bit data segment */
+	{0xffff, 0, 0, .type_limit_flags = 0xcf1b}, /* flat 32-bit code segment, not present */
+	{     0, 0, 0, .type_limit_flags = 0x0000}, /* TSS for task gates */
+	{0xffff, 0, 0, .type_limit_flags = 0x8f9b}, /* 16-bit code segment */
+	{0xffff, 0, 0, .type_limit_flags = 0x8f93}, /* 16-bit data segment */
+	{0xffff, 0, 0, .type_limit_flags = 0xcffb}, /* 32-bit code segment (user) */
+	{0xffff, 0, 0, .type_limit_flags = 0xcff3}, /* 32-bit data segment (user) */
+};
+
+tss32_t tss[MAX_TEST_CPUS] = {0};
+#else
 gdt_entry_t gdt[TSS_MAIN / 8 + MAX_TEST_CPUS * 2] = {
 	{     0, 0, 0, .type_limit_flags = 0x0000}, /* 0x00 null */
 	{0xffff, 0, 0, .type_limit_flags = 0xaf9b}, /* 0x08 64-bit code segment */
@@ -30,12 +44,12 @@ gdt_entry_t gdt[TSS_MAIN / 8 + MAX_TEST_CPUS * 2] = {
 };
 
 tss64_t tss[MAX_TEST_CPUS] = {0};
+#endif
 
 struct descriptor_table_ptr gdt_descr = {
 	.limit = sizeof(gdt) - 1,
 	.base = (unsigned long)gdt,
 };
-#endif
 
 #ifndef __x86_64__
 __attribute__((regparm(1)))
@@ -365,7 +379,7 @@ void setup_tss32(void)
 {
 	u16 desc_size = sizeof(tss32_t);
 
-	tss.cr3 = read_cr3();
+	tss[0].cr3 = read_cr3();
 	tss_intr.cr3 = read_cr3();
 	tss_intr.ss0 = tss_intr.ss1 = tss_intr.ss2 = 0x10;
 	tss_intr.esp = tss_intr.esp0 = tss_intr.esp1 = tss_intr.esp2 =
@@ -401,7 +415,7 @@ void print_current_tss_info(void)
 		printf("Unknown TSS %x\n", tr);
 	else
 		printf("TR=%x (%s) Main TSS back link %x. Intr TSS back link %x\n",
-		       tr, tr ? "interrupt" : "main", tss.prev, tss_intr.prev);
+		       tr, tr ? "interrupt" : "main", tss[0].prev, tss_intr.prev);
 }
 #else
 void set_intr_alt_stack(int e, void *addr)
diff --git a/lib/x86/desc.h b/lib/x86/desc.h
index ddfae04..b65539e 100644
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -197,7 +197,7 @@ struct system_desc64 {
 extern idt_entry_t boot_idt[256];
 
 #ifndef __x86_64__
-extern tss32_t tss;
+extern tss32_t tss[];
 extern tss32_t tss_intr;
 void set_gdt_task_gate(u16 tss_sel, u16 sel);
 void set_idt_task_gate(int vec, u16 sel);
diff --git a/lib/x86/setup.c b/lib/x86/setup.c
index ec005b5..9c4393f 100644
--- a/lib/x86/setup.c
+++ b/lib/x86/setup.c
@@ -104,7 +104,7 @@ void find_highmem(void)
 }
 
 /* Setup TSS for the current processor, and return TSS offset within GDT */
-unsigned long setup_tss(void)
+unsigned long setup_tss(u8 *stacktop)
 {
 	u32 id;
 	tss64_t *tss_entry;
@@ -122,6 +122,30 @@ unsigned long setup_tss(void)
 
 	return TSS_MAIN + id * 16;
 }
+#else
+/* Setup TSS for the current processor, and return TSS offset within GDT */
+unsigned long setup_tss(u8 *stacktop)
+{
+	u32 id;
+	tss32_t *tss_entry;
+
+	id = apic_id();
+
+	/* Runtime address of current TSS */
+	tss_entry = &tss[id];
+
+	/* Update TSS */
+	memset((void *)tss_entry, 0, sizeof(tss32_t));
+	tss_entry->ss0 = KERNEL_DS;
+
+	/* Update descriptors for TSS and percpu data segment.  */
+	set_gdt_entry(TSS_MAIN + id * 8,
+		      (unsigned long)tss_entry, 0xffff, 0x89, 0);
+	set_gdt_entry(TSS_MAIN + MAX_TEST_CPUS * 8 + id * 8,
+		      (unsigned long)stacktop - 4096, 0xfffff, 0x93, 0xc0);
+
+	return TSS_MAIN + id * 8;
+}
 #endif
 
 void setup_multiboot(struct mbi_bootinfo *bi)
diff --git a/x86/cstart.S b/x86/cstart.S
index e9100a4..2c0eec7 100644
--- a/x86/cstart.S
+++ b/x86/cstart.S
@@ -23,50 +23,6 @@ i = 0
         i = i + 1
         .endr
 
-.globl gdt
-gdt:
-	.quad 0
-	.quad 0x00cf9b000000ffff // flat 32-bit code segment
-	.quad 0x00cf93000000ffff // flat 32-bit data segment
-	.quad 0x00cf1b000000ffff // flat 32-bit code segment, not present
-	.quad 0                  // TSS for task gates
-	.quad 0x008f9b000000FFFF // 16-bit code segment
-	.quad 0x008f93000000FFFF // 16-bit data segment
-	.quad 0x00cffb000000ffff // 32-bit code segment (user)
-	.quad 0x00cff3000000ffff // 32-bit data segment (user)
-	.quad 0                  // unused
-
-	.quad 0			 // 6 spare selectors
-	.quad 0
-	.quad 0
-	.quad 0
-	.quad 0
-	.quad 0
-
-tss_descr:
-        .rept max_cpus
-        .quad 0x000089000000ffff // 32-bit avail tss
-        .endr
-percpu_descr:
-        .rept max_cpus
-        .quad 0x00cf93000000ffff // 32-bit data segment for perCPU area
-        .endr
-gdt_end:
-
-i = 0
-.globl tss
-tss:
-        .rept max_cpus
-        .long 0
-        .long 0
-        .long 16
-        .quad 0, 0
-        .quad 0, 0, 0, 0, 0, 0, 0, 0
-        .long 0, 0, 0
-        i = i + 1
-        .endr
-tss_end:
-
 .section .init
 
 .code32
@@ -78,21 +34,14 @@ mb_flags = 0x0
 	.long mb_magic, mb_flags, 0 - (mb_magic + mb_flags)
 mb_cmdline = 16
 
-.macro setup_percpu_area
-	lea -4096(%esp), %eax
-
-	/* fill GS_BASE in the GDT, do not clobber %ebx (multiboot info) */
-	mov (APIC_DEFAULT_PHYS_BASE + APIC_ID), %ecx
-	shr $24, %ecx
-	mov %ax, percpu_descr+2(,%ecx,8)
-
-	shr $16, %eax
-	mov %al, percpu_descr+4(,%ecx,8)
-	mov %ah, percpu_descr+7(,%ecx,8)
-
-	lea percpu_descr-gdt(,%ecx,8), %eax
+.macro setup_tr_and_percpu
+	lidt idt_descr
+	push %esp
+	call setup_tss
+	addl $4, %esp
+	ltr %ax
+	add $(max_cpus * 8), %ax
 	mov %ax, %gs
-
 .endm
 
 .macro setup_segments
@@ -109,7 +58,6 @@ start:
         lgdtl gdt_descr
         setup_segments
         mov $stacktop, %esp
-        setup_percpu_area
 
         push %ebx
         call setup_multiboot
@@ -147,11 +95,10 @@ ap_start32:
 	setup_segments
 	mov $-4096, %esp
 	lock xaddl %esp, smp_stacktop
-	setup_percpu_area
+	setup_tr_and_percpu
 	call prepare_32
 	call reset_apic
 	call save_id
-	call load_tss
 	call enable_apic
 	call enable_x2apic
 	sti
@@ -162,9 +109,9 @@ ap_start32:
 	jmp 1b
 
 start32:
+	setup_tr_and_percpu
 	call reset_apic
 	call save_id
-	call load_tss
 	call mask_pic_interrupts
 	call enable_apic
 	call ap_init
@@ -177,26 +124,9 @@ start32:
 	push %eax
 	call exit
 
-load_tss:
-	lidt idt_descr
-	mov $16, %eax
-	mov %ax, %ss
-	mov (APIC_DEFAULT_PHYS_BASE + APIC_ID), %eax
-	shr $24, %eax
-	mov %eax, %ebx
-	mov $((tss_end - tss) / max_cpus), %edx
-	imul %edx
-	add $tss, %eax
-	mov %ax, tss_descr+2(,%ebx,8)
-	shr $16, %eax
-	mov %al, tss_descr+4(,%ebx,8)
-	mov %ah, tss_descr+7(,%ebx,8)
-	lea tss_descr-gdt(,%ebx,8), %eax
-	ltr %ax
-	ret
-
 ap_init:
 	cld
+	sgdtl ap_gdt_descr // must be close to sipi_entry for real mode access to work
 	lea sipi_entry, %esi
 	xor %edi, %edi
 	mov $(sipi_end - sipi_entry), %ecx
@@ -220,11 +150,11 @@ sipi_entry:
 	mov %cr0, %eax
 	or $1, %eax
 	mov %eax, %cr0
-	lgdtl gdt_descr - sipi_entry
+	lgdtl ap_gdt_descr - sipi_entry
 	ljmpl $8, $ap_start32
 
-gdt_descr:
-	.word gdt_end - gdt - 1
-	.long gdt
+ap_gdt_descr:
+	.word 0
+	.long 0
 
 sipi_end:
diff --git a/x86/cstart64.S b/x86/cstart64.S
index c6daa34..ddb83a0 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -68,6 +68,7 @@ MSR_GS_BASE = 0xc0000101
 
 .macro load_tss
 	lidtq idt_descr
+	movq %rsp, %rdi
 	call setup_tss
 	ltr %ax
 .endm
diff --git a/x86/smap.c b/x86/smap.c
index ac2c8d5..c6ddf38 100644
--- a/x86/smap.c
+++ b/x86/smap.c
@@ -20,7 +20,7 @@ void do_pf_tss(unsigned long error_code)
 	save = test;
 
 #ifndef __x86_64__
-	tss.eflags |= X86_EFLAGS_AC;
+	tss[0].eflags |= X86_EFLAGS_AC;
 #endif
 }
 
diff --git a/x86/taskswitch.c b/x86/taskswitch.c
index 1d6e6e2..0d31149 100644
--- a/x86/taskswitch.c
+++ b/x86/taskswitch.c
@@ -19,7 +19,7 @@ fault_handler(unsigned long error_code)
 	print_current_tss_info();
 	printf("error code %lx\n", error_code);
 
-	tss.eip += 2;
+	tss[0].eip += 2;
 
 	gdt[TSS_MAIN / 8].type &= ~DESC_BUSY;
 
diff --git a/x86/umip.c b/x86/umip.c
index 0a52342..af8db59 100644
--- a/x86/umip.c
+++ b/x86/umip.c
@@ -161,7 +161,7 @@ static noinline int do_ring3(void (*fn)(const char *), const char *arg)
 #ifdef __x86_64__
 		    [sp0] "=m" (tss[0].rsp0)
 #else
-		    [sp0] "=m" (tss.esp0)
+		    [sp0] "=m" (tss[0].esp0)
 #endif
 		  : [user_ds] "i" (USER_DS),
 		    [user_cs] "i" (USER_CS),
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (8 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 09/14] x86: Move 32-bit " Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-29 19:43   ` Babu Moger
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 11/14] x86: Prepare access test for running in L2 Aaron Lewis
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Rather than assuming which PTEs the SMEP test runs on, look them up to
ensure they are correct.  If this test were to run on a different page
table (ie: run in an L2 test) the wrong PTEs would be set.  Switch to
looking up the PTEs to avoid this from happening.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 lib/libcflat.h |  1 +
 lib/x86/vm.c   | 21 +++++++++++++++++++++
 lib/x86/vm.h   |  3 +++
 x86/access.c   | 26 ++++++++++++++++++--------
 x86/cstart64.S |  1 -
 x86/flat.lds   |  1 +
 6 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index 9bb7e08..c1fd31f 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -35,6 +35,7 @@
 #define __ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
 #define __ALIGN(x, a)		__ALIGN_MASK(x, (typeof(x))(a) - 1)
 #define ALIGN(x, a)		__ALIGN((x), (a))
+#define ALIGN_DOWN(x, a)	__ALIGN((x) - ((a) - 1), (a))
 #define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)
 
 #define MIN(a, b)		((a) < (b) ? (a) : (b))
diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index 5cd2ee4..6a70ef6 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -281,3 +281,24 @@ void force_4k_page(void *addr)
 	if (pte & PT_PAGE_SIZE_MASK)
 		split_large_page(ptep, 2);
 }
+
+/*
+ * Call the callback on each page from virt to virt + len.
+ */
+void walk_pte(void *virt, size_t len, pte_callback_t callback)
+{
+    pgd_t *cr3 = current_page_table();
+    uintptr_t start = (uintptr_t)virt;
+    uintptr_t end = (uintptr_t)virt + len;
+    struct pte_search search;
+    size_t page_size;
+    uintptr_t curr;
+
+    for (curr = start; curr < end; curr = ALIGN_DOWN(curr + page_size, page_size)) {
+        search = find_pte_level(cr3, (void *)curr, 1);
+        assert(found_leaf_pte(search));
+        page_size = 1ul << PGDIR_BITS(search.level);
+
+        callback(search, (void *)curr);
+    }
+}
diff --git a/lib/x86/vm.h b/lib/x86/vm.h
index d9753c3..4c6dff9 100644
--- a/lib/x86/vm.h
+++ b/lib/x86/vm.h
@@ -52,4 +52,7 @@ struct vm_vcpu_info {
         u64 cr0;
 };
 
+typedef void (*pte_callback_t)(struct pte_search search, void *va);
+void walk_pte(void *virt, size_t len, pte_callback_t callback);
+
 #endif
diff --git a/x86/access.c b/x86/access.c
index a781a0c..8e3a718 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -201,10 +201,24 @@ static void set_cr0_wp(int wp)
     }
 }
 
+static void clear_user_mask(struct pte_search search, void *va)
+{
+	*search.pte &= ~PT_USER_MASK;
+}
+
+static void set_user_mask(struct pte_search search, void *va)
+{
+	*search.pte |= PT_USER_MASK;
+
+	/* Flush to avoid spurious #PF */
+	invlpg(va);
+}
+
 static unsigned set_cr4_smep(int smep)
 {
+    extern char stext, etext;
+    size_t len = (size_t)&etext - (size_t)&stext;
     unsigned long cr4 = shadow_cr4;
-    extern u64 ptl2[];
     unsigned r;
 
     cr4 &= ~CR4_SMEP_MASK;
@@ -214,14 +228,10 @@ static unsigned set_cr4_smep(int smep)
         return 0;
 
     if (smep)
-        ptl2[2] &= ~PT_USER_MASK;
+        walk_pte(&stext, len, clear_user_mask);
     r = write_cr4_checking(cr4);
-    if (r || !smep) {
-        ptl2[2] |= PT_USER_MASK;
-
-	/* Flush to avoid spurious #PF */
-	invlpg((void *)(2 << 21));
-    }
+    if (r || !smep)
+        walk_pte(&stext, len, set_user_mask);
     if (!r)
         shadow_cr4 = cr4;
     return r;
diff --git a/x86/cstart64.S b/x86/cstart64.S
index ddb83a0..ff79ae7 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -17,7 +17,6 @@ stacktop:
 .data
 
 .align 4096
-.globl ptl2
 ptl2:
 i = 0
 	.rept 512 * 4
diff --git a/x86/flat.lds b/x86/flat.lds
index a278b56..337bc44 100644
--- a/x86/flat.lds
+++ b/x86/flat.lds
@@ -3,6 +3,7 @@ SECTIONS
     . = 4M + SIZEOF_HEADERS;
     stext = .;
     .text : { *(.init) *(.text) *(.text.*) }
+    etext = .;
     . = ALIGN(4K);
     .data : {
           *(.data)
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 11/14] x86: Prepare access test for running in L2
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (9 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 12/14] x86: Fix tabs in access.c Aaron Lewis
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Move main out of access.c in preparation for running the test in L2.
This allows access.c to be used as common code that will be
included in a nested tests later in this series.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 x86/Makefile.common |  2 ++
 x86/Makefile.x86_64 |  2 +-
 x86/access.c        | 24 +++---------------------
 x86/access.h        |  8 ++++++++
 x86/access_test.c   | 22 ++++++++++++++++++++++
 x86/unittests.cfg   |  4 ++--
 6 files changed, 38 insertions(+), 24 deletions(-)
 create mode 100644 x86/access.h
 create mode 100644 x86/access_test.c

diff --git a/x86/Makefile.common b/x86/Makefile.common
index 52bb7aa..a665854 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -72,6 +72,8 @@ $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o
 
 $(TEST_DIR)/realmode.o: bits = $(if $(call cc-option,-m16,""),16,32)
 
+$(TEST_DIR)/access_test.elf: $(TEST_DIR)/access.o
+
 $(TEST_DIR)/kvmclock_test.elf: $(TEST_DIR)/kvmclock.o
 
 $(TEST_DIR)/hyperv_synic.elf: $(TEST_DIR)/hyperv.o
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index 8134952..390d0e9 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -9,7 +9,7 @@ cflatobjs += lib/x86/setjmp64.o
 cflatobjs += lib/x86/intel-iommu.o
 cflatobjs += lib/x86/usermode.o
 
-tests = $(TEST_DIR)/access.flat $(TEST_DIR)/apic.flat \
+tests = $(TEST_DIR)/access_test.flat $(TEST_DIR)/apic.flat \
 	  $(TEST_DIR)/emulator.flat $(TEST_DIR)/idt_test.flat \
 	  $(TEST_DIR)/xsave.flat $(TEST_DIR)/rmap_chain.flat \
 	  $(TEST_DIR)/pcid.flat $(TEST_DIR)/debug.flat \
diff --git a/x86/access.c b/x86/access.c
index 8e3a718..de6726e 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -1,9 +1,9 @@
-
 #include "libcflat.h"
 #include "desc.h"
 #include "processor.h"
 #include "asm/page.h"
 #include "x86/vm.h"
+#include "access.h"
 
 #define smp_id() 0
 
@@ -14,7 +14,7 @@ static _Bool verbose = false;
 
 typedef unsigned long pt_element_t;
 static int invalid_mask;
-static int page_table_levels;
+int page_table_levels;
 
 #define PT_BASE_ADDR_MASK ((pt_element_t)((((pt_element_t)1 << 36) - 1) & PAGE_MASK))
 #define PT_PSE_BASE_ADDR_MASK (PT_BASE_ADDR_MASK & ~(1ull << 21))
@@ -1069,7 +1069,7 @@ const ac_test_fn ac_test_cases[] =
 	check_effective_sp_permissions,
 };
 
-static int ac_test_run(void)
+int ac_test_run()
 {
     ac_test_t at;
     ac_pool_t pool;
@@ -1150,21 +1150,3 @@ static int ac_test_run(void)
 
     return successes == tests;
 }
-
-int main(void)
-{
-    int r;
-
-    printf("starting test\n\n");
-    page_table_levels = 4;
-    r = ac_test_run();
-
-    if (this_cpu_has(X86_FEATURE_LA57)) {
-        page_table_levels = 5;
-        printf("starting 5-level paging test.\n\n");
-        setup_5level_page_table();
-        r = ac_test_run();
-    }
-
-    return r ? 0 : 1;
-}
diff --git a/x86/access.h b/x86/access.h
new file mode 100644
index 0000000..4f67b62
--- /dev/null
+++ b/x86/access.h
@@ -0,0 +1,8 @@
+#ifndef X86_ACCESS_H
+#define X86_ACCESS_H
+
+int ac_test_run(void);
+
+extern int page_table_levels;
+
+#endif // X86_ACCESS_H
\ No newline at end of file
diff --git a/x86/access_test.c b/x86/access_test.c
new file mode 100644
index 0000000..497f286
--- /dev/null
+++ b/x86/access_test.c
@@ -0,0 +1,22 @@
+#include "libcflat.h"
+#include "processor.h"
+#include "x86/vm.h"
+#include "access.h"
+
+int main(void)
+{
+    int r;
+
+    printf("starting test\n\n");
+    page_table_levels = 4;
+    r = ac_test_run();
+
+    if (this_cpu_has(X86_FEATURE_LA57)) {
+        page_table_levels = 5;
+        printf("starting 5-level paging test.\n\n");
+        setup_5level_page_table();
+        r = ac_test_run();
+    }
+
+    return r ? 0 : 1;
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 3000e53..dbeb8a2 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -114,13 +114,13 @@ groups = vmexit
 extra_params = -cpu qemu64,+x2apic,+tsc-deadline -append tscdeadline_immed
 
 [access]
-file = access.flat
+file = access_test.flat
 arch = x86_64
 extra_params = -cpu max
 timeout = 180
 
 [access-reduced-maxphyaddr]
-file = access.flat
+file = access_test.flat
 arch = x86_64
 extra_params = -cpu IvyBridge,phys-bits=36,host-phys-bits=off
 timeout = 180
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 12/14] x86: Fix tabs in access.c
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (10 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 11/14] x86: Prepare access test for running in L2 Aaron Lewis
@ 2021-11-10 21:19 ` Aaron Lewis
  2021-11-10 21:20 ` [kvm-unit-tests PATCH 13/14] x86: Clean up the global, page_table_levels, " Aaron Lewis
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:19 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Tabs and spaces in this file are inconsistent and don't follow the
coding style.  Correct them to adhere to the standard and make it easier
to work in this file.

No functional change intended.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 x86/access.c | 1383 +++++++++++++++++++++++++-------------------------
 1 file changed, 691 insertions(+), 692 deletions(-)

diff --git a/x86/access.c b/x86/access.c
index de6726e..f832385 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -30,57 +30,57 @@ int page_table_levels;
 #define PFERR_PK_MASK (1U << 5)
 
 #define MSR_EFER 0xc0000080
-#define EFER_NX_MASK		(1ull << 11)
+#define EFER_NX_MASK            (1ull << 11)
 
 #define PT_INDEX(address, level)       \
-       ((address) >> (12 + ((level)-1) * 9)) & 511
+	  ((address) >> (12 + ((level)-1) * 9)) & 511
 
 /*
  * page table access check tests
  */
 
 enum {
-    AC_PTE_PRESENT_BIT,
-    AC_PTE_WRITABLE_BIT,
-    AC_PTE_USER_BIT,
-    AC_PTE_ACCESSED_BIT,
-    AC_PTE_DIRTY_BIT,
-    AC_PTE_NX_BIT,
-    AC_PTE_BIT51_BIT,
-    AC_PTE_BIT36_BIT,
-
-    AC_PDE_PRESENT_BIT,
-    AC_PDE_WRITABLE_BIT,
-    AC_PDE_USER_BIT,
-    AC_PDE_ACCESSED_BIT,
-    AC_PDE_DIRTY_BIT,
-    AC_PDE_PSE_BIT,
-    AC_PDE_NX_BIT,
-    AC_PDE_BIT51_BIT,
-    AC_PDE_BIT36_BIT,
-    AC_PDE_BIT13_BIT,
-
-    /*
-     *  special test case to DISABLE writable bit on page directory
-     *  pointer table entry.
-     */
-    AC_PDPTE_NO_WRITABLE_BIT,
-
-    AC_PKU_AD_BIT,
-    AC_PKU_WD_BIT,
-    AC_PKU_PKEY_BIT,
-
-    AC_ACCESS_USER_BIT,
-    AC_ACCESS_WRITE_BIT,
-    AC_ACCESS_FETCH_BIT,
-    AC_ACCESS_TWICE_BIT,
-
-    AC_CPU_EFER_NX_BIT,
-    AC_CPU_CR0_WP_BIT,
-    AC_CPU_CR4_SMEP_BIT,
-    AC_CPU_CR4_PKE_BIT,
-
-    NR_AC_FLAGS
+	AC_PTE_PRESENT_BIT,
+	AC_PTE_WRITABLE_BIT,
+	AC_PTE_USER_BIT,
+	AC_PTE_ACCESSED_BIT,
+	AC_PTE_DIRTY_BIT,
+	AC_PTE_NX_BIT,
+	AC_PTE_BIT51_BIT,
+	AC_PTE_BIT36_BIT,
+
+	AC_PDE_PRESENT_BIT,
+	AC_PDE_WRITABLE_BIT,
+	AC_PDE_USER_BIT,
+	AC_PDE_ACCESSED_BIT,
+	AC_PDE_DIRTY_BIT,
+	AC_PDE_PSE_BIT,
+	AC_PDE_NX_BIT,
+	AC_PDE_BIT51_BIT,
+	AC_PDE_BIT36_BIT,
+	AC_PDE_BIT13_BIT,
+
+	/*
+	 *  special test case to DISABLE writable bit on page directory
+	 *  pointer table entry.
+	 */
+	AC_PDPTE_NO_WRITABLE_BIT,
+
+	AC_PKU_AD_BIT,
+	AC_PKU_WD_BIT,
+	AC_PKU_PKEY_BIT,
+
+	AC_ACCESS_USER_BIT,
+	AC_ACCESS_WRITE_BIT,
+	AC_ACCESS_FETCH_BIT,
+	AC_ACCESS_TWICE_BIT,
+
+	AC_CPU_EFER_NX_BIT,
+	AC_CPU_CR0_WP_BIT,
+	AC_CPU_CR4_SMEP_BIT,
+	AC_CPU_CR4_PKE_BIT,
+
+	NR_AC_FLAGS
 };
 
 #define AC_PTE_PRESENT_MASK   (1 << AC_PTE_PRESENT_BIT)
@@ -120,65 +120,65 @@ enum {
 #define AC_CPU_CR4_PKE_MASK   (1 << AC_CPU_CR4_PKE_BIT)
 
 const char *ac_names[] = {
-    [AC_PTE_PRESENT_BIT] = "pte.p",
-    [AC_PTE_ACCESSED_BIT] = "pte.a",
-    [AC_PTE_WRITABLE_BIT] = "pte.rw",
-    [AC_PTE_USER_BIT] = "pte.user",
-    [AC_PTE_DIRTY_BIT] = "pte.d",
-    [AC_PTE_NX_BIT] = "pte.nx",
-    [AC_PTE_BIT51_BIT] = "pte.51",
-    [AC_PTE_BIT36_BIT] = "pte.36",
-    [AC_PDE_PRESENT_BIT] = "pde.p",
-    [AC_PDE_ACCESSED_BIT] = "pde.a",
-    [AC_PDE_WRITABLE_BIT] = "pde.rw",
-    [AC_PDE_USER_BIT] = "pde.user",
-    [AC_PDE_DIRTY_BIT] = "pde.d",
-    [AC_PDE_PSE_BIT] = "pde.pse",
-    [AC_PDE_NX_BIT] = "pde.nx",
-    [AC_PDE_BIT51_BIT] = "pde.51",
-    [AC_PDE_BIT36_BIT] = "pde.36",
-    [AC_PDE_BIT13_BIT] = "pde.13",
-    [AC_PDPTE_NO_WRITABLE_BIT] = "pdpte.ro",
-    [AC_PKU_AD_BIT] = "pkru.ad",
-    [AC_PKU_WD_BIT] = "pkru.wd",
-    [AC_PKU_PKEY_BIT] = "pkey=1",
-    [AC_ACCESS_WRITE_BIT] = "write",
-    [AC_ACCESS_USER_BIT] = "user",
-    [AC_ACCESS_FETCH_BIT] = "fetch",
-    [AC_ACCESS_TWICE_BIT] = "twice",
-    [AC_CPU_EFER_NX_BIT] = "efer.nx",
-    [AC_CPU_CR0_WP_BIT] = "cr0.wp",
-    [AC_CPU_CR4_SMEP_BIT] = "cr4.smep",
-    [AC_CPU_CR4_PKE_BIT] = "cr4.pke",
+	[AC_PTE_PRESENT_BIT] = "pte.p",
+	[AC_PTE_ACCESSED_BIT] = "pte.a",
+	[AC_PTE_WRITABLE_BIT] = "pte.rw",
+	[AC_PTE_USER_BIT] = "pte.user",
+	[AC_PTE_DIRTY_BIT] = "pte.d",
+	[AC_PTE_NX_BIT] = "pte.nx",
+	[AC_PTE_BIT51_BIT] = "pte.51",
+	[AC_PTE_BIT36_BIT] = "pte.36",
+	[AC_PDE_PRESENT_BIT] = "pde.p",
+	[AC_PDE_ACCESSED_BIT] = "pde.a",
+	[AC_PDE_WRITABLE_BIT] = "pde.rw",
+	[AC_PDE_USER_BIT] = "pde.user",
+	[AC_PDE_DIRTY_BIT] = "pde.d",
+	[AC_PDE_PSE_BIT] = "pde.pse",
+	[AC_PDE_NX_BIT] = "pde.nx",
+	[AC_PDE_BIT51_BIT] = "pde.51",
+	[AC_PDE_BIT36_BIT] = "pde.36",
+	[AC_PDE_BIT13_BIT] = "pde.13",
+	[AC_PDPTE_NO_WRITABLE_BIT] = "pdpte.ro",
+	[AC_PKU_AD_BIT] = "pkru.ad",
+	[AC_PKU_WD_BIT] = "pkru.wd",
+	[AC_PKU_PKEY_BIT] = "pkey=1",
+	[AC_ACCESS_WRITE_BIT] = "write",
+	[AC_ACCESS_USER_BIT] = "user",
+	[AC_ACCESS_FETCH_BIT] = "fetch",
+	[AC_ACCESS_TWICE_BIT] = "twice",
+	[AC_CPU_EFER_NX_BIT] = "efer.nx",
+	[AC_CPU_CR0_WP_BIT] = "cr0.wp",
+	[AC_CPU_CR4_SMEP_BIT] = "cr4.smep",
+	[AC_CPU_CR4_PKE_BIT] = "cr4.pke",
 };
 
 static inline void *va(pt_element_t phys)
 {
-    return (void *)phys;
+	return (void *)phys;
 }
 
 typedef struct {
-    pt_element_t pt_pool;
-    unsigned pt_pool_size;
-    unsigned pt_pool_current;
+	pt_element_t pt_pool;
+	unsigned pt_pool_size;
+	unsigned pt_pool_current;
 } ac_pool_t;
 
 typedef struct {
-    unsigned flags;
-    void *virt;
-    pt_element_t phys;
-    pt_element_t *ptep;
-    pt_element_t expected_pte;
-    pt_element_t *pdep;
-    pt_element_t expected_pde;
-    pt_element_t ignore_pde;
-    int expected_fault;
-    unsigned expected_error;
+	unsigned flags;
+	void *virt;
+	pt_element_t phys;
+	pt_element_t *ptep;
+	pt_element_t expected_pte;
+	pt_element_t *pdep;
+	pt_element_t expected_pde;
+	pt_element_t ignore_pde;
+	int expected_fault;
+	unsigned expected_error;
 } ac_test_t;
 
 typedef struct {
-    unsigned short limit;
-    unsigned long linear_addr;
+	unsigned short limit;
+	unsigned long linear_addr;
 } __attribute__((packed)) descriptor_table_t;
 
 
@@ -190,15 +190,15 @@ static unsigned long long shadow_efer;
 
 static void set_cr0_wp(int wp)
 {
-    unsigned long cr0 = shadow_cr0;
-
-    cr0 &= ~CR0_WP_MASK;
-    if (wp)
-	cr0 |= CR0_WP_MASK;
-    if (cr0 != shadow_cr0) {
-        write_cr0(cr0);
-        shadow_cr0 = cr0;
-    }
+	unsigned long cr0 = shadow_cr0;
+
+	cr0 &= ~CR0_WP_MASK;
+	if (wp)
+		cr0 |= CR0_WP_MASK;
+	if (cr0 != shadow_cr0) {
+		write_cr0(cr0);
+		shadow_cr0 = cr0;
+	}
 }
 
 static void clear_user_mask(struct pte_search search, void *va)
@@ -216,404 +216,405 @@ static void set_user_mask(struct pte_search search, void *va)
 
 static unsigned set_cr4_smep(int smep)
 {
-    extern char stext, etext;
-    size_t len = (size_t)&etext - (size_t)&stext;
-    unsigned long cr4 = shadow_cr4;
-    unsigned r;
-
-    cr4 &= ~CR4_SMEP_MASK;
-    if (smep)
-	cr4 |= CR4_SMEP_MASK;
-    if (cr4 == shadow_cr4)
-        return 0;
-
-    if (smep)
-        walk_pte(&stext, len, clear_user_mask);
-    r = write_cr4_checking(cr4);
-    if (r || !smep)
-        walk_pte(&stext, len, set_user_mask);
-    if (!r)
-        shadow_cr4 = cr4;
-    return r;
+	extern char stext, etext;
+	size_t len = (size_t)&etext - (size_t)&stext;
+	unsigned long cr4 = shadow_cr4;
+	unsigned r;
+
+	cr4 &= ~CR4_SMEP_MASK;
+	if (smep)
+		cr4 |= CR4_SMEP_MASK;
+	if (cr4 == shadow_cr4)
+		return 0;
+
+	if (smep)
+		walk_pte(&stext, len, clear_user_mask);
+	r = write_cr4_checking(cr4);
+	if (r || !smep)
+		walk_pte(&stext, len, set_user_mask);
+	if (!r)
+		shadow_cr4 = cr4;
+	return r;
 }
 
 static void set_cr4_pke(int pke)
 {
-    unsigned long cr4 = shadow_cr4;
-
-    cr4 &= ~X86_CR4_PKE;
-    if (pke)
-	cr4 |= X86_CR4_PKE;
-    if (cr4 == shadow_cr4)
-        return;
-
-    /* Check that protection keys do not affect accesses when CR4.PKE=0.  */
-    if ((shadow_cr4 & X86_CR4_PKE) && !pke)
-        write_pkru(0xfffffffc);
-    write_cr4(cr4);
-    shadow_cr4 = cr4;
+	unsigned long cr4 = shadow_cr4;
+
+	cr4 &= ~X86_CR4_PKE;
+	if (pke)
+		cr4 |= X86_CR4_PKE;
+	if (cr4 == shadow_cr4)
+		return;
+
+	/* Check that protection keys do not affect accesses when CR4.PKE=0.  */
+	if ((shadow_cr4 & X86_CR4_PKE) && !pke)
+		write_pkru(0xfffffffc);
+	write_cr4(cr4);
+	shadow_cr4 = cr4;
 }
 
 static void set_efer_nx(int nx)
 {
-    unsigned long long efer = shadow_efer;
-
-    efer &= ~EFER_NX_MASK;
-    if (nx)
-	efer |= EFER_NX_MASK;
-    if (efer != shadow_efer) {
-        wrmsr(MSR_EFER, efer);
-        shadow_efer = efer;
-    }
+	unsigned long long efer = shadow_efer;
+
+	efer &= ~EFER_NX_MASK;
+	if (nx)
+		efer |= EFER_NX_MASK;
+	if (efer != shadow_efer) {
+		wrmsr(MSR_EFER, efer);
+		shadow_efer = efer;
+	}
 }
 
 static void ac_env_int(ac_pool_t *pool)
 {
-    extern char page_fault, kernel_entry;
-    set_idt_entry(14, &page_fault, 0);
-    set_idt_entry(0x20, &kernel_entry, 3);
+	extern char page_fault, kernel_entry;
+	set_idt_entry(14, &page_fault, 0);
+	set_idt_entry(0x20, &kernel_entry, 3);
 
-    pool->pt_pool = 33 * 1024 * 1024;
-    pool->pt_pool_size = 120 * 1024 * 1024 - pool->pt_pool;
-    pool->pt_pool_current = 0;
+	pool->pt_pool = 33 * 1024 * 1024;
+	pool->pt_pool_size = 120 * 1024 * 1024 - pool->pt_pool;
+	pool->pt_pool_current = 0;
 }
 
 static void ac_test_init(ac_test_t *at, void *virt)
 {
-    set_efer_nx(1);
-    set_cr0_wp(1);
-    at->flags = 0;
-    at->virt = virt;
-    at->phys = 32 * 1024 * 1024;
+	set_efer_nx(1);
+	set_cr0_wp(1);
+	at->flags = 0;
+	at->virt = virt;
+	at->phys = 32 * 1024 * 1024;
 }
 
 static int ac_test_bump_one(ac_test_t *at)
 {
-    at->flags = ((at->flags | invalid_mask) + 1) & ~invalid_mask;
-    return at->flags < (1 << NR_AC_FLAGS);
+	at->flags = ((at->flags | invalid_mask) + 1) & ~invalid_mask;
+	return at->flags < (1 << NR_AC_FLAGS);
 }
 
 #define F(x)  ((flags & x##_MASK) != 0)
 
 static _Bool ac_test_legal(ac_test_t *at)
 {
-    int flags = at->flags;
-    unsigned reserved;
-
-    if (F(AC_ACCESS_FETCH) && F(AC_ACCESS_WRITE))
-	return false;
-
-    /*
-     * Since we convert current page to kernel page when cr4.smep=1,
-     * we can't switch to user mode.
-     */
-    if (F(AC_ACCESS_USER) && F(AC_CPU_CR4_SMEP))
-	return false;
-
-    /*
-     * Only test protection key faults if CR4.PKE=1.
-     */
-    if (!F(AC_CPU_CR4_PKE) &&
-        (F(AC_PKU_AD) || F(AC_PKU_WD))) {
-	return false;
-    }
-
-    /*
-     * pde.bit13 checks handling of reserved bits in largepage PDEs.  It is
-     * meaningless if there is a PTE.
-     */
-    if (!F(AC_PDE_PSE) && F(AC_PDE_BIT13))
-        return false;
-
-    /*
-     * Shorten the test by avoiding testing too many reserved bit combinations.
-     * Skip testing multiple reserved bits to shorten the test. Reserved bit
-     * page faults are terminal and multiple reserved bits do not affect the
-     * error code; the odds of a KVM bug are super low, and the odds of actually
-     * being able to detect a bug are even lower.
-     */
-    reserved = (AC_PDE_BIT51_MASK | AC_PDE_BIT36_MASK | AC_PDE_BIT13_MASK |
-	        AC_PTE_BIT51_MASK | AC_PTE_BIT36_MASK);
-    if (!F(AC_CPU_EFER_NX))
-        reserved |= AC_PDE_NX_MASK | AC_PTE_NX_MASK;
-
-    /* Only test one reserved bit at a time.  */
-    reserved &= flags;
-    if (reserved & (reserved - 1))
-        return false;
-
-    return true;
+	int flags = at->flags;
+	unsigned reserved;
+
+	if (F(AC_ACCESS_FETCH) && F(AC_ACCESS_WRITE))
+		return false;
+
+	/*
+	 * Since we convert current page to kernel page when cr4.smep=1,
+	 * we can't switch to user mode.
+	 */
+	if (F(AC_ACCESS_USER) && F(AC_CPU_CR4_SMEP))
+		return false;
+
+	/*
+	 * Only test protection key faults if CR4.PKE=1.
+	 */
+	if (!F(AC_CPU_CR4_PKE) &&
+		(F(AC_PKU_AD) || F(AC_PKU_WD))) {
+		return false;
+	}
+
+	/*
+	 * pde.bit13 checks handling of reserved bits in largepage PDEs.  It is
+	 * meaningless if there is a PTE.
+	 */
+	if (!F(AC_PDE_PSE) && F(AC_PDE_BIT13))
+		return false;
+
+	/*
+	 * Shorten the test by avoiding testing too many reserved bit combinations.
+	 * Skip testing multiple reserved bits to shorten the test. Reserved bit
+	 * page faults are terminal and multiple reserved bits do not affect the
+	 * error code; the odds of a KVM bug are super low, and the odds of actually
+	 * being able to detect a bug are even lower.
+	 */
+	reserved = (AC_PDE_BIT51_MASK | AC_PDE_BIT36_MASK | AC_PDE_BIT13_MASK |
+		   AC_PTE_BIT51_MASK | AC_PTE_BIT36_MASK);
+	if (!F(AC_CPU_EFER_NX))
+		reserved |= AC_PDE_NX_MASK | AC_PTE_NX_MASK;
+
+	/* Only test one reserved bit at a time.  */
+	reserved &= flags;
+	if (reserved & (reserved - 1))
+		return false;
+
+	return true;
 }
 
 static int ac_test_bump(ac_test_t *at)
 {
-    int ret;
+	int ret;
 
-    ret = ac_test_bump_one(at);
-    while (ret && !ac_test_legal(at))
 	ret = ac_test_bump_one(at);
-    return ret;
+	while (ret && !ac_test_legal(at))
+		ret = ac_test_bump_one(at);
+	return ret;
 }
 
 static pt_element_t ac_test_alloc_pt(ac_pool_t *pool)
 {
-    pt_element_t ret = pool->pt_pool + pool->pt_pool_current;
-    pool->pt_pool_current += PAGE_SIZE;
-    memset(va(ret), 0, PAGE_SIZE);
-    return ret;
+	pt_element_t ret = pool->pt_pool + pool->pt_pool_current;
+	pool->pt_pool_current += PAGE_SIZE;
+	memset(va(ret), 0, PAGE_SIZE);
+	return ret;
 }
 
 static _Bool ac_test_enough_room(ac_pool_t *pool)
 {
-    return pool->pt_pool_current + 5 * PAGE_SIZE <= pool->pt_pool_size;
+	return pool->pt_pool_current + 5 * PAGE_SIZE <= pool->pt_pool_size;
 }
 
 static void ac_test_reset_pt_pool(ac_pool_t *pool)
 {
-    pool->pt_pool_current = 0;
+	pool->pt_pool_current = 0;
 }
 
 static pt_element_t ac_test_permissions(ac_test_t *at, unsigned flags,
-                                        bool writable, bool user,
-                                        bool executable)
+					bool writable, bool user,
+					bool executable)
 {
-    bool kwritable = !F(AC_CPU_CR0_WP) && !F(AC_ACCESS_USER);
-    pt_element_t expected = 0;
-
-    if (F(AC_ACCESS_USER) && !user)
-	at->expected_fault = 1;
-
-    if (F(AC_ACCESS_WRITE) && !writable && !kwritable)
-	at->expected_fault = 1;
-
-    if (F(AC_ACCESS_FETCH) && !executable)
-	at->expected_fault = 1;
-
-    if (F(AC_ACCESS_FETCH) && user && F(AC_CPU_CR4_SMEP))
-        at->expected_fault = 1;
-
-    if (user && !F(AC_ACCESS_FETCH) && F(AC_PKU_PKEY) && F(AC_CPU_CR4_PKE)) {
-        if (F(AC_PKU_AD)) {
-            at->expected_fault = 1;
-            at->expected_error |= PFERR_PK_MASK;
-        } else if (F(AC_ACCESS_WRITE) && F(AC_PKU_WD) && !kwritable) {
-            at->expected_fault = 1;
-            at->expected_error |= PFERR_PK_MASK;
-        }
-    }
-
-    if (!at->expected_fault) {
-        expected |= PT_ACCESSED_MASK;
-        if (F(AC_ACCESS_WRITE))
-            expected |= PT_DIRTY_MASK;
-    }
-
-    return expected;
+	bool kwritable = !F(AC_CPU_CR0_WP) && !F(AC_ACCESS_USER);
+	pt_element_t expected = 0;
+
+	if (F(AC_ACCESS_USER) && !user)
+		at->expected_fault = 1;
+
+	if (F(AC_ACCESS_WRITE) && !writable && !kwritable)
+		at->expected_fault = 1;
+
+	if (F(AC_ACCESS_FETCH) && !executable)
+		at->expected_fault = 1;
+
+	if (F(AC_ACCESS_FETCH) && user && F(AC_CPU_CR4_SMEP))
+		at->expected_fault = 1;
+
+	if (user && !F(AC_ACCESS_FETCH) && F(AC_PKU_PKEY) && F(AC_CPU_CR4_PKE)) {
+		if (F(AC_PKU_AD)) {
+			at->expected_fault = 1;
+			at->expected_error |= PFERR_PK_MASK;
+		} else if (F(AC_ACCESS_WRITE) && F(AC_PKU_WD) && !kwritable) {
+			at->expected_fault = 1;
+			at->expected_error |= PFERR_PK_MASK;
+		}
+	}
+
+	if (!at->expected_fault) {
+		expected |= PT_ACCESSED_MASK;
+		if (F(AC_ACCESS_WRITE))
+			expected |= PT_DIRTY_MASK;
+	}
+
+	return expected;
 }
 
 static void ac_emulate_access(ac_test_t *at, unsigned flags)
 {
-    bool pde_valid, pte_valid;
-    bool user, writable, executable;
-
-    if (F(AC_ACCESS_USER))
-	at->expected_error |= PFERR_USER_MASK;
-
-    if (F(AC_ACCESS_WRITE))
-	at->expected_error |= PFERR_WRITE_MASK;
-
-    if (F(AC_ACCESS_FETCH))
-	at->expected_error |= PFERR_FETCH_MASK;
-
-    if (!F(AC_PDE_ACCESSED))
-        at->ignore_pde = PT_ACCESSED_MASK;
-
-    pde_valid = F(AC_PDE_PRESENT)
-        && !F(AC_PDE_BIT51) && !F(AC_PDE_BIT36) && !F(AC_PDE_BIT13)
-        && !(F(AC_PDE_NX) && !F(AC_CPU_EFER_NX));
-
-    if (!pde_valid) {
-        at->expected_fault = 1;
-	if (F(AC_PDE_PRESENT)) {
-            at->expected_error |= PFERR_RESERVED_MASK;
-        } else {
-            at->expected_error &= ~PFERR_PRESENT_MASK;
-        }
-	goto fault;
-    }
-
-    writable = !F(AC_PDPTE_NO_WRITABLE) && F(AC_PDE_WRITABLE);
-    user = F(AC_PDE_USER);
-    executable = !F(AC_PDE_NX);
-
-    if (F(AC_PDE_PSE)) {
-        at->expected_pde |= ac_test_permissions(at, flags, writable, user,
-                                                executable);
-	goto no_pte;
-    }
-
-    at->expected_pde |= PT_ACCESSED_MASK;
-
-    pte_valid = F(AC_PTE_PRESENT)
-        && !F(AC_PTE_BIT51) && !F(AC_PTE_BIT36)
-        && !(F(AC_PTE_NX) && !F(AC_CPU_EFER_NX));
-
-    if (!pte_valid) {
-        at->expected_fault = 1;
-	if (F(AC_PTE_PRESENT)) {
-            at->expected_error |= PFERR_RESERVED_MASK;
-        } else {
-            at->expected_error &= ~PFERR_PRESENT_MASK;
-        }
-	goto fault;
-    }
-
-    writable &= F(AC_PTE_WRITABLE);
-    user &= F(AC_PTE_USER);
-    executable &= !F(AC_PTE_NX);
-
-    at->expected_pte |= ac_test_permissions(at, flags, writable, user,
-                                            executable);
+	bool pde_valid, pte_valid;
+	bool user, writable, executable;
+
+	if (F(AC_ACCESS_USER))
+		at->expected_error |= PFERR_USER_MASK;
+
+	if (F(AC_ACCESS_WRITE))
+		at->expected_error |= PFERR_WRITE_MASK;
+
+	if (F(AC_ACCESS_FETCH))
+		at->expected_error |= PFERR_FETCH_MASK;
+
+	if (!F(AC_PDE_ACCESSED))
+		at->ignore_pde = PT_ACCESSED_MASK;
+
+	pde_valid = F(AC_PDE_PRESENT)
+		&& !F(AC_PDE_BIT51) && !F(AC_PDE_BIT36) && !F(AC_PDE_BIT13)
+		&& !(F(AC_PDE_NX) && !F(AC_CPU_EFER_NX));
+
+	if (!pde_valid) {
+		at->expected_fault = 1;
+		if (F(AC_PDE_PRESENT)) {
+			at->expected_error |= PFERR_RESERVED_MASK;
+		} else {
+			at->expected_error &= ~PFERR_PRESENT_MASK;
+		}
+		goto fault;
+	}
+
+	writable = !F(AC_PDPTE_NO_WRITABLE) && F(AC_PDE_WRITABLE);
+	user = F(AC_PDE_USER);
+	executable = !F(AC_PDE_NX);
+
+	if (F(AC_PDE_PSE)) {
+		at->expected_pde |= ac_test_permissions(at, flags, writable,
+							user, executable);
+		goto no_pte;
+	}
+
+	at->expected_pde |= PT_ACCESSED_MASK;
+
+	pte_valid = F(AC_PTE_PRESENT)
+		    && !F(AC_PTE_BIT51) && !F(AC_PTE_BIT36)
+		    && !(F(AC_PTE_NX) && !F(AC_CPU_EFER_NX));
+
+	if (!pte_valid) {
+		at->expected_fault = 1;
+		if (F(AC_PTE_PRESENT)) {
+			at->expected_error |= PFERR_RESERVED_MASK;
+		} else {
+			at->expected_error &= ~PFERR_PRESENT_MASK;
+		}
+		goto fault;
+	}
+
+	writable &= F(AC_PTE_WRITABLE);
+	user &= F(AC_PTE_USER);
+	executable &= !F(AC_PTE_NX);
+
+	at->expected_pte |= ac_test_permissions(at, flags, writable, user,
+						executable);
 
 no_pte:
 fault:
-    if (!at->expected_fault)
-        at->ignore_pde = 0;
-    if (!F(AC_CPU_EFER_NX) && !F(AC_CPU_CR4_SMEP))
-        at->expected_error &= ~PFERR_FETCH_MASK;
+	if (!at->expected_fault)
+		at->ignore_pde = 0;
+	if (!F(AC_CPU_EFER_NX) && !F(AC_CPU_CR4_SMEP))
+		at->expected_error &= ~PFERR_FETCH_MASK;
 }
 
 static void ac_set_expected_status(ac_test_t *at)
 {
-    invlpg(at->virt);
-
-    if (at->ptep)
-	at->expected_pte = *at->ptep;
-    at->expected_pde = *at->pdep;
-    at->ignore_pde = 0;
-    at->expected_fault = 0;
-    at->expected_error = PFERR_PRESENT_MASK;
-
-    if (at->flags & AC_ACCESS_TWICE_MASK) {
-        ac_emulate_access(at, at->flags & ~AC_ACCESS_WRITE_MASK
-                          & ~AC_ACCESS_FETCH_MASK & ~AC_ACCESS_USER_MASK);
-        at->expected_fault = 0;
+	invlpg(at->virt);
+
+	if (at->ptep)
+		at->expected_pte = *at->ptep;
+	at->expected_pde = *at->pdep;
+	at->ignore_pde = 0;
+	at->expected_fault = 0;
 	at->expected_error = PFERR_PRESENT_MASK;
-        at->ignore_pde = 0;
-    }
 
-    ac_emulate_access(at, at->flags);
+	if (at->flags & AC_ACCESS_TWICE_MASK) {
+		ac_emulate_access(at, at->flags &
+				  ~AC_ACCESS_WRITE_MASK &
+				  ~AC_ACCESS_FETCH_MASK &
+				  ~AC_ACCESS_USER_MASK);
+		at->expected_fault = 0;
+		at->expected_error = PFERR_PRESENT_MASK;
+		at->ignore_pde = 0;
+	}
+
+	ac_emulate_access(at, at->flags);
 }
 
 static void __ac_setup_specific_pages(ac_test_t *at, ac_pool_t *pool, bool reuse,
 				      u64 pd_page, u64 pt_page)
-
 {
-    unsigned long root = read_cr3();
-    int flags = at->flags;
-    bool skip = true;
-
-    if (!ac_test_enough_room(pool))
-	ac_test_reset_pt_pool(pool);
-
-    at->ptep = 0;
-    for (int i = page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
-	pt_element_t *vroot = va(root & PT_BASE_ADDR_MASK);
-	unsigned index = PT_INDEX((unsigned long)at->virt, i);
-	pt_element_t pte = 0;
+	unsigned long root = read_cr3();
+	int flags = at->flags;
+	bool skip = true;
 
-	/*
-	 * Reuse existing page tables along the path to the test code and data
-	 * (which is in the bottom 2MB).
-	 */
-	if (skip && i >= 2 && index == 0) {
-	    goto next;
-	}
-	skip = false;
-	if (reuse && vroot[index]) {
-	    switch (i) {
-	    case 2:
-		at->pdep = &vroot[index];
-		break;
-	    case 1:
-		at->ptep = &vroot[index];
-		break;
-	    }
-	    goto next;
-	}
+	if (!ac_test_enough_room(pool))
+		ac_test_reset_pt_pool(pool);
 
-	switch (i) {
-	case 5:
-	case 4:
-	    pte = ac_test_alloc_pt(pool);
-	    pte |= PT_PRESENT_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
-	    break;
-	case 3:
-	    pte = pd_page ? pd_page : ac_test_alloc_pt(pool);
-	    pte |= PT_PRESENT_MASK | PT_USER_MASK;
-	    if (!F(AC_PDPTE_NO_WRITABLE))
-		pte |= PT_WRITABLE_MASK;
-	    break;
-	case 2:
-	    if (!F(AC_PDE_PSE)) {
-		pte = pt_page ? pt_page : ac_test_alloc_pt(pool);
-		/* The protection key is ignored on non-leaf entries.  */
-                if (F(AC_PKU_PKEY))
-                    pte |= 2ull << 59;
-	    } else {
-		pte = at->phys & PT_PSE_BASE_ADDR_MASK;
-		pte |= PT_PAGE_SIZE_MASK;
-                if (F(AC_PKU_PKEY))
-                    pte |= 1ull << 59;
-	    }
-	    if (F(AC_PDE_PRESENT))
-		pte |= PT_PRESENT_MASK;
-	    if (F(AC_PDE_WRITABLE))
-		pte |= PT_WRITABLE_MASK;
-	    if (F(AC_PDE_USER))
-		pte |= PT_USER_MASK;
-	    if (F(AC_PDE_ACCESSED))
-		pte |= PT_ACCESSED_MASK;
-	    if (F(AC_PDE_DIRTY))
-		pte |= PT_DIRTY_MASK;
-	    if (F(AC_PDE_NX))
-		pte |= PT64_NX_MASK;
-	    if (F(AC_PDE_BIT51))
-		pte |= 1ull << 51;
-	    if (F(AC_PDE_BIT36))
-                pte |= 1ull << 36;
-	    if (F(AC_PDE_BIT13))
-		pte |= 1ull << 13;
-	    at->pdep = &vroot[index];
-	    break;
-	case 1:
-	    pte = at->phys & PT_BASE_ADDR_MASK;
-	    if (F(AC_PKU_PKEY))
-		pte |= 1ull << 59;
-	    if (F(AC_PTE_PRESENT))
-		pte |= PT_PRESENT_MASK;
-	    if (F(AC_PTE_WRITABLE))
-		pte |= PT_WRITABLE_MASK;
-	    if (F(AC_PTE_USER))
-		pte |= PT_USER_MASK;
-	    if (F(AC_PTE_ACCESSED))
-		pte |= PT_ACCESSED_MASK;
-	    if (F(AC_PTE_DIRTY))
-		pte |= PT_DIRTY_MASK;
-	    if (F(AC_PTE_NX))
-		pte |= PT64_NX_MASK;
-	    if (F(AC_PTE_BIT51))
-		pte |= 1ull << 51;
-	    if (F(AC_PTE_BIT36))
-                pte |= 1ull << 36;
-	    at->ptep = &vroot[index];
-	    break;
-	}
-	vroot[index] = pte;
+	at->ptep = 0;
+	for (int i = page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
+		pt_element_t *vroot = va(root & PT_BASE_ADDR_MASK);
+		unsigned index = PT_INDEX((unsigned long)at->virt, i);
+		pt_element_t pte = 0;
+
+		/*
+		 * Reuse existing page tables along the path to the test code and data
+		 * (which is in the bottom 2MB).
+		 */
+		if (skip && i >= 2 && index == 0) {
+			goto next;
+		}
+		skip = false;
+		if (reuse && vroot[index]) {
+			switch (i) {
+			case 2:
+				at->pdep = &vroot[index];
+				break;
+			case 1:
+				at->ptep = &vroot[index];
+				break;
+			}
+			goto next;
+		}
+
+		switch (i) {
+		case 5:
+		case 4:
+			pte = ac_test_alloc_pt(pool);
+			pte |= PT_PRESENT_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
+			break;
+		case 3:
+			pte = pd_page ? pd_page : ac_test_alloc_pt(pool);
+			pte |= PT_PRESENT_MASK | PT_USER_MASK;
+			if (!F(AC_PDPTE_NO_WRITABLE))
+				pte |= PT_WRITABLE_MASK;
+			break;
+		case 2:
+			if (!F(AC_PDE_PSE)) {
+				pte = pt_page ? pt_page : ac_test_alloc_pt(pool);
+				/* The protection key is ignored on non-leaf entries.  */
+				if (F(AC_PKU_PKEY))
+					pte |= 2ull << 59;
+			} else {
+				pte = at->phys & PT_PSE_BASE_ADDR_MASK;
+				pte |= PT_PAGE_SIZE_MASK;
+				if (F(AC_PKU_PKEY))
+					pte |= 1ull << 59;
+			}
+			if (F(AC_PDE_PRESENT))
+				pte |= PT_PRESENT_MASK;
+			if (F(AC_PDE_WRITABLE))
+				pte |= PT_WRITABLE_MASK;
+			if (F(AC_PDE_USER))
+				pte |= PT_USER_MASK;
+			if (F(AC_PDE_ACCESSED))
+				pte |= PT_ACCESSED_MASK;
+			if (F(AC_PDE_DIRTY))
+				pte |= PT_DIRTY_MASK;
+			if (F(AC_PDE_NX))
+				pte |= PT64_NX_MASK;
+			if (F(AC_PDE_BIT51))
+				pte |= 1ull << 51;
+			if (F(AC_PDE_BIT36))
+				pte |= 1ull << 36;
+			if (F(AC_PDE_BIT13))
+				pte |= 1ull << 13;
+			at->pdep = &vroot[index];
+			break;
+		case 1:
+			pte = at->phys & PT_BASE_ADDR_MASK;
+			if (F(AC_PKU_PKEY))
+				pte |= 1ull << 59;
+			if (F(AC_PTE_PRESENT))
+				pte |= PT_PRESENT_MASK;
+			if (F(AC_PTE_WRITABLE))
+				pte |= PT_WRITABLE_MASK;
+			if (F(AC_PTE_USER))
+				pte |= PT_USER_MASK;
+			if (F(AC_PTE_ACCESSED))
+				pte |= PT_ACCESSED_MASK;
+			if (F(AC_PTE_DIRTY))
+				pte |= PT_DIRTY_MASK;
+			if (F(AC_PTE_NX))
+				pte |= PT64_NX_MASK;
+			if (F(AC_PTE_BIT51))
+				pte |= 1ull << 51;
+			if (F(AC_PTE_BIT36))
+				pte |= 1ull << 36;
+			at->ptep = &vroot[index];
+			break;
+		}
+		vroot[index] = pte;
  next:
-	root = vroot[index];
-    }
-    ac_set_expected_status(at);
+		root = vroot[index];
+	}
+	ac_set_expected_status(at);
 }
 
 static void ac_test_setup_pte(ac_test_t *at, ac_pool_t *pool)
@@ -630,11 +631,11 @@ static void ac_setup_specific_pages(ac_test_t *at, ac_pool_t *pool,
 static void dump_mapping(ac_test_t *at)
 {
 	unsigned long root = read_cr3();
-        int flags = at->flags;
+	int flags = at->flags;
 	int i;
 
 	printf("Dump mapping: address: %p\n", at->virt);
-	for (i = page_table_levels ; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
+	for (i = page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
 		pt_element_t *vroot = va(root & PT_BASE_ADDR_MASK);
 		unsigned index = PT_INDEX((unsigned long)at->virt, i);
 		pt_element_t pte = vroot[index];
@@ -645,168 +646,166 @@ static void dump_mapping(ac_test_t *at)
 }
 
 static void ac_test_check(ac_test_t *at, _Bool *success_ret, _Bool cond,
-                          const char *fmt, ...)
+			  const char *fmt, ...)
 {
-    va_list ap;
-    char buf[500];
+	va_list ap;
+	char buf[500];
 
-    if (!*success_ret) {
-        return;
-    }
+	if (!*success_ret) {
+		return;
+	}
 
-    if (!cond) {
-        return;
-    }
+	if (!cond) {
+		return;
+	}
 
-    *success_ret = false;
+	*success_ret = false;
 
-    if (!verbose) {
-        puts("\n");
-        ac_test_show(at);
-    }
+	if (!verbose) {
+		puts("\n");
+		ac_test_show(at);
+	}
 
-    va_start(ap, fmt);
-    vsnprintf(buf, sizeof(buf), fmt, ap);
-    va_end(ap);
-    printf("FAIL: %s\n", buf);
-    dump_mapping(at);
+	va_start(ap, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, ap);
+	va_end(ap);
+	printf("FAIL: %s\n", buf);
+	dump_mapping(at);
 }
 
 static int pt_match(pt_element_t pte1, pt_element_t pte2, pt_element_t ignore)
 {
-    pte1 &= ~ignore;
-    pte2 &= ~ignore;
-    return pte1 == pte2;
+	pte1 &= ~ignore;
+	pte2 &= ~ignore;
+	return pte1 == pte2;
 }
 
 static int ac_test_do_access(ac_test_t *at)
 {
-    static unsigned unique = 42;
-    int fault = 0;
-    unsigned e;
-    static unsigned char user_stack[4096];
-    unsigned long rsp;
-    _Bool success = true;
-    int flags = at->flags;
-
-    ++unique;
-    if (!(unique & 65535)) {
-        puts(".");
-    }
-
-    *((unsigned char *)at->phys) = 0xc3; /* ret */
-
-    unsigned r = unique;
-    set_cr0_wp(F(AC_CPU_CR0_WP));
-    set_efer_nx(F(AC_CPU_EFER_NX));
-    set_cr4_pke(F(AC_CPU_CR4_PKE));
-    if (F(AC_CPU_CR4_PKE)) {
-        /* WD2=AD2=1, WD1=F(AC_PKU_WD), AD1=F(AC_PKU_AD) */
-        write_pkru(0x30 | (F(AC_PKU_WD) ? 8 : 0) |
-                   (F(AC_PKU_AD) ? 4 : 0));
-    }
-
-    set_cr4_smep(F(AC_CPU_CR4_SMEP));
-
-    if (F(AC_ACCESS_TWICE)) {
-	asm volatile (
-	    "mov $fixed2, %%rsi \n\t"
-	    "mov (%[addr]), %[reg] \n\t"
-	    "fixed2:"
-	    : [reg]"=r"(r), [fault]"=a"(fault), "=b"(e)
-	    : [addr]"r"(at->virt)
-	    : "rsi"
-	    );
-	fault = 0;
-    }
-
-    asm volatile ("mov $fixed1, %%rsi \n\t"
-		  "mov %%rsp, %[rsp0] \n\t"
-		  "cmp $0, %[user] \n\t"
-		  "jz do_access \n\t"
-		  "push %%rax; mov %[user_ds], %%ax; mov %%ax, %%ds; pop %%rax  \n\t"
-		  "pushq %[user_ds] \n\t"
-		  "pushq %[user_stack_top] \n\t"
-		  "pushfq \n\t"
-		  "pushq %[user_cs] \n\t"
-		  "pushq $do_access \n\t"
-		  "iretq \n"
-		  "do_access: \n\t"
-		  "cmp $0, %[fetch] \n\t"
-		  "jnz 2f \n\t"
-		  "cmp $0, %[write] \n\t"
-		  "jnz 1f \n\t"
-		  "mov (%[addr]), %[reg] \n\t"
-		  "jmp done \n\t"
-		  "1: mov %[reg], (%[addr]) \n\t"
-		  "jmp done \n\t"
-		  "2: call *%[addr] \n\t"
-		  "done: \n"
-		  "fixed1: \n"
-		  "int %[kernel_entry_vector] \n\t"
-		  ".section .text.entry \n\t"
-		  "kernel_entry: \n\t"
-		  "mov %[rsp0], %%rsp \n\t"
-		  "jmp back_to_kernel \n\t"
-		  ".section .text \n\t"
-		  "back_to_kernel:"
-		  : [reg]"+r"(r), "+a"(fault), "=b"(e), "=&d"(rsp),
-		    [rsp0]"=m"(tss[0].rsp0)
-		  : [addr]"r"(at->virt),
-		    [write]"r"(F(AC_ACCESS_WRITE)),
-		    [user]"r"(F(AC_ACCESS_USER)),
-		    [fetch]"r"(F(AC_ACCESS_FETCH)),
-		    [user_ds]"i"(USER_DS),
-		    [user_cs]"i"(USER_CS),
-		    [user_stack_top]"r"(user_stack + sizeof user_stack),
-		    [kernel_entry_vector]"i"(0x20)
-		  : "rsi");
-
-    asm volatile (".section .text.pf \n\t"
-		  "page_fault: \n\t"
-		  "pop %rbx \n\t"
-		  "mov %rsi, (%rsp) \n\t"
-		  "movl $1, %eax \n\t"
-		  "iretq \n\t"
-		  ".section .text");
-
-    ac_test_check(at, &success, fault && !at->expected_fault,
-                  "unexpected fault");
-    ac_test_check(at, &success, !fault && at->expected_fault,
-                  "unexpected access");
-    ac_test_check(at, &success, fault && e != at->expected_error,
-                  "error code %x expected %x", e, at->expected_error);
-    if (at->ptep)
-        ac_test_check(at, &success, *at->ptep != at->expected_pte,
-                      "pte %x expected %x", *at->ptep, at->expected_pte);
-    ac_test_check(at, &success,
-                  !pt_match(*at->pdep, at->expected_pde, at->ignore_pde),
-                  "pde %x expected %x", *at->pdep, at->expected_pde);
-
-    if (success && verbose) {
-	if (at->expected_fault) {
-            printf("PASS (%x)\n", at->expected_error);
-	} else {
-            printf("PASS\n");
+	static unsigned unique = 42;
+	int fault = 0;
+	unsigned e;
+	static unsigned char user_stack[4096];
+	unsigned long rsp;
+	_Bool success = true;
+	int flags = at->flags;
+
+	++unique;
+	if (!(unique & 65535)) {
+		puts(".");
 	}
-    }
-    return success;
+
+	*((unsigned char *)at->phys) = 0xc3; /* ret */
+
+	unsigned r = unique;
+	set_cr0_wp(F(AC_CPU_CR0_WP));
+	set_efer_nx(F(AC_CPU_EFER_NX));
+	set_cr4_pke(F(AC_CPU_CR4_PKE));
+	if (F(AC_CPU_CR4_PKE)) {
+		/* WD2=AD2=1, WD1=F(AC_PKU_WD), AD1=F(AC_PKU_AD) */
+		write_pkru(0x30 | (F(AC_PKU_WD) ? 8 : 0) |
+			   (F(AC_PKU_AD) ? 4 : 0));
+	}
+
+	set_cr4_smep(F(AC_CPU_CR4_SMEP));
+
+	if (F(AC_ACCESS_TWICE)) {
+		asm volatile ("mov $fixed2, %%rsi \n\t"
+			      "mov (%[addr]), %[reg] \n\t"
+			      "fixed2:"
+			      : [reg]"=r"(r), [fault]"=a"(fault), "=b"(e)
+			      : [addr]"r"(at->virt)
+			      : "rsi");
+		fault = 0;
+	}
+
+	asm volatile ("mov $fixed1, %%rsi \n\t"
+		      "mov %%rsp, %[rsp0] \n\t"
+		      "cmp $0, %[user] \n\t"
+		      "jz do_access \n\t"
+		      "push %%rax; mov %[user_ds], %%ax; mov %%ax, %%ds; pop %%rax \n\t"
+		      "pushq %[user_ds] \n\t"
+		      "pushq %[user_stack_top] \n\t"
+		      "pushfq \n\t"
+		      "pushq %[user_cs] \n\t"
+		      "pushq $do_access \n\t"
+		      "iretq \n"
+		      "do_access: \n\t"
+		      "cmp $0, %[fetch] \n\t"
+		      "jnz 2f \n\t"
+		      "cmp $0, %[write] \n\t"
+		      "jnz 1f \n\t"
+		      "mov (%[addr]), %[reg] \n\t"
+		      "jmp done \n\t"
+		      "1: mov %[reg], (%[addr]) \n\t"
+		      "jmp done \n\t"
+		      "2: call *%[addr] \n\t"
+		      "done: \n"
+		      "fixed1: \n"
+		      "int %[kernel_entry_vector] \n\t"
+		      ".section .text.entry \n\t"
+		      "kernel_entry: \n\t"
+		      "mov %[rsp0], %%rsp \n\t"
+		      "jmp back_to_kernel \n\t"
+		      ".section .text \n\t"
+		      "back_to_kernel:"
+		      : [reg]"+r"(r), "+a"(fault), "=b"(e), "=&d"(rsp),
+			[rsp0]"=m"(tss[0].rsp0)
+		      : [addr]"r"(at->virt),
+			[write]"r"(F(AC_ACCESS_WRITE)),
+			[user]"r"(F(AC_ACCESS_USER)),
+			[fetch]"r"(F(AC_ACCESS_FETCH)),
+			[user_ds]"i"(USER_DS),
+			[user_cs]"i"(USER_CS),
+			[user_stack_top]"r"(user_stack + sizeof user_stack),
+			[kernel_entry_vector]"i"(0x20)
+		      : "rsi");
+
+	asm volatile (".section .text.pf \n\t"
+		      "page_fault: \n\t"
+		      "pop %rbx \n\t"
+		      "mov %rsi, (%rsp) \n\t"
+		      "movl $1, %eax \n\t"
+		      "iretq \n\t"
+		      ".section .text");
+
+	ac_test_check(at, &success, fault && !at->expected_fault,
+		      "unexpected fault");
+	ac_test_check(at, &success, !fault && at->expected_fault,
+		      "unexpected access");
+	ac_test_check(at, &success, fault && e != at->expected_error,
+		      "error code %x expected %x", e, at->expected_error);
+	if (at->ptep)
+		ac_test_check(at, &success, *at->ptep != at->expected_pte,
+			      "pte %x expected %x", *at->ptep, at->expected_pte);
+	ac_test_check(at, &success,
+		      !pt_match(*at->pdep, at->expected_pde, at->ignore_pde),
+		      "pde %x expected %x", *at->pdep, at->expected_pde);
+
+	if (success && verbose) {
+		if (at->expected_fault) {
+			printf("PASS (%x)\n", at->expected_error);
+		} else {
+			printf("PASS\n");
+		}
+	}
+	return success;
 }
 
 static void ac_test_show(ac_test_t *at)
 {
-    char line[5000];
-
-    *line = 0;
-    strcat(line, "test");
-    for (int i = 0; i < NR_AC_FLAGS; ++i)
-	if (at->flags & (1 << i)) {
-	    strcat(line, " ");
-	    strcat(line, ac_names[i]);
-	}
-
-    strcat(line, ": ");
-    printf("%s", line);
+	char line[5000];
+
+	*line = 0;
+	strcat(line, "test");
+	for (int i = 0; i < NR_AC_FLAGS; ++i)
+		if (at->flags & (1 << i)) {
+			strcat(line, " ");
+			strcat(line, ac_names[i]);
+		}
+
+	strcat(line, ": ");
+	printf("%s", line);
 }
 
 /*
@@ -815,36 +814,36 @@ static void ac_test_show(ac_test_t *at)
  */
 static int corrupt_hugepage_triger(ac_pool_t *pool)
 {
-    ac_test_t at1, at2;
+	ac_test_t at1, at2;
 
-    ac_test_init(&at1, (void *)(0x123400000000));
-    ac_test_init(&at2, (void *)(0x666600000000));
+	ac_test_init(&at1, (void *)(0x123400000000));
+	ac_test_init(&at2, (void *)(0x666600000000));
 
-    at2.flags = AC_CPU_CR0_WP_MASK | AC_PDE_PSE_MASK | AC_PDE_PRESENT_MASK;
-    ac_test_setup_pte(&at2, pool);
-    if (!ac_test_do_access(&at2))
-        goto err;
+	at2.flags = AC_CPU_CR0_WP_MASK | AC_PDE_PSE_MASK | AC_PDE_PRESENT_MASK;
+	ac_test_setup_pte(&at2, pool);
+	if (!ac_test_do_access(&at2))
+		goto err;
 
-    at1.flags = at2.flags | AC_PDE_WRITABLE_MASK;
-    ac_test_setup_pte(&at1, pool);
-    if (!ac_test_do_access(&at1))
-        goto err;
+	at1.flags = at2.flags | AC_PDE_WRITABLE_MASK;
+	ac_test_setup_pte(&at1, pool);
+	if (!ac_test_do_access(&at1))
+		goto err;
 
-    at1.flags |= AC_ACCESS_WRITE_MASK;
-    ac_set_expected_status(&at1);
-    if (!ac_test_do_access(&at1))
-        goto err;
+	at1.flags |= AC_ACCESS_WRITE_MASK;
+	ac_set_expected_status(&at1);
+	if (!ac_test_do_access(&at1))
+		goto err;
 
-    at2.flags |= AC_ACCESS_WRITE_MASK;
-    ac_set_expected_status(&at2);
-    if (!ac_test_do_access(&at2))
-        goto err;
+	at2.flags |= AC_ACCESS_WRITE_MASK;
+	ac_set_expected_status(&at2);
+	if (!ac_test_do_access(&at2))
+		goto err;
 
-    return 1;
+	return 1;
 
 err:
-    printf("corrupt_hugepage_triger test fail\n");
-    return 0;
+	printf("corrupt_hugepage_triger test fail\n");
+	return 0;
 }
 
 /*
@@ -861,24 +860,24 @@ static int check_pfec_on_prefetch_pte(ac_pool_t *pool)
 	at1.flags = AC_PDE_PRESENT_MASK | AC_PTE_PRESENT_MASK;
 	ac_setup_specific_pages(&at1, pool, 30 * 1024 * 1024, 30 * 1024 * 1024);
 
-        at2.flags = at1.flags | AC_PTE_NX_MASK;
+	at2.flags = at1.flags | AC_PTE_NX_MASK;
 	ac_setup_specific_pages(&at2, pool, 30 * 1024 * 1024, 30 * 1024 * 1024);
 
 	if (!ac_test_do_access(&at1)) {
 		printf("%s: prepare fail\n", __FUNCTION__);
-		goto err;
+			goto err;
 	}
 
 	if (!ac_test_do_access(&at2)) {
 		printf("%s: check PFEC on prefetch pte path fail\n",
-			__FUNCTION__);
+		       __FUNCTION__);
 		goto err;
 	}
 
 	return 1;
 
 err:
-    return 0;
+	return 0;
 }
 
 /*
@@ -903,14 +902,14 @@ static int check_large_pte_dirty_for_nowp(ac_pool_t *pool)
 	ac_test_init(&at1, (void *)(0x123403000000));
 	ac_test_init(&at2, (void *)(0x666606000000));
 
-        at2.flags = AC_PDE_PRESENT_MASK | AC_PDE_PSE_MASK;
+	at2.flags = AC_PDE_PRESENT_MASK | AC_PDE_PSE_MASK;
 	ac_test_setup_pte(&at2, pool);
 	if (!ac_test_do_access(&at2)) {
 		printf("%s: read on the first mapping fail.\n", __FUNCTION__);
 		goto err;
 	}
 
-        at1.flags = at2.flags | AC_ACCESS_WRITE_MASK;
+	at1.flags = at2.flags | AC_ACCESS_WRITE_MASK;
 	ac_test_setup_pte(&at1, pool);
 	if (!ac_test_do_access(&at1)) {
 		printf("%s: write on the second mapping fail.\n", __FUNCTION__);
@@ -936,17 +935,17 @@ static int check_smep_andnot_wp(ac_pool_t *pool)
 	int err_prepare_andnot_wp, err_smep_andnot_wp;
 
 	if (!this_cpu_has(X86_FEATURE_SMEP)) {
-	    return 1;
+		return 1;
 	}
 
 	ac_test_init(&at1, (void *)(0x123406001000));
 
 	at1.flags = AC_PDE_PRESENT_MASK | AC_PTE_PRESENT_MASK |
-            AC_PDE_USER_MASK | AC_PTE_USER_MASK |
-            AC_PDE_ACCESSED_MASK | AC_PTE_ACCESSED_MASK |
-            AC_CPU_CR4_SMEP_MASK |
-            AC_CPU_CR0_WP_MASK |
-            AC_ACCESS_WRITE_MASK;
+		    AC_PDE_USER_MASK | AC_PTE_USER_MASK |
+		    AC_PDE_ACCESSED_MASK | AC_PTE_ACCESSED_MASK |
+		    AC_CPU_CR4_SMEP_MASK |
+		    AC_CPU_CR0_WP_MASK |
+		    AC_ACCESS_WRITE_MASK;
 	ac_test_setup_pte(&at1, pool);
 
 	/*
@@ -960,10 +959,10 @@ static int check_smep_andnot_wp(ac_pool_t *pool)
 		goto clean_up;
 	}
 
-        at1.flags &= ~AC_ACCESS_WRITE_MASK;
-        at1.flags |= AC_ACCESS_FETCH_MASK;
-        ac_set_expected_status(&at1);
-        err_smep_andnot_wp = ac_test_do_access(&at1);
+	at1.flags &= ~AC_ACCESS_WRITE_MASK;
+	at1.flags |= AC_ACCESS_FETCH_MASK;
+	ac_set_expected_status(&at1);
+	err_smep_andnot_wp = ac_test_do_access(&at1);
 
 clean_up:
 	set_cr4_smep(0);
@@ -1049,14 +1048,14 @@ static int check_effective_sp_permissions(ac_pool_t *pool)
 
 static int ac_test_exec(ac_test_t *at, ac_pool_t *pool)
 {
-    int r;
-
-    if (verbose) {
-        ac_test_show(at);
-    }
-    ac_test_setup_pte(at, pool);
-    r = ac_test_do_access(at);
-    return r;
+	int r;
+
+	if (verbose) {
+		ac_test_show(at);
+	}
+	ac_test_setup_pte(at, pool);
+	r = ac_test_do_access(at);
+	return r;
 }
 
 typedef int (*ac_test_fn)(ac_pool_t *pool);
@@ -1071,82 +1070,82 @@ const ac_test_fn ac_test_cases[] =
 
 int ac_test_run()
 {
-    ac_test_t at;
-    ac_pool_t pool;
-    int i, tests, successes;
-
-    printf("run\n");
-    tests = successes = 0;
-
-    shadow_cr0 = read_cr0();
-    shadow_cr4 = read_cr4();
-    shadow_efer = rdmsr(MSR_EFER);
-
-    if (cpuid_maxphyaddr() >= 52) {
-        invalid_mask |= AC_PDE_BIT51_MASK;
-        invalid_mask |= AC_PTE_BIT51_MASK;
-    }
-    if (cpuid_maxphyaddr() >= 37) {
-        invalid_mask |= AC_PDE_BIT36_MASK;
-        invalid_mask |= AC_PTE_BIT36_MASK;
-    }
-
-    if (this_cpu_has(X86_FEATURE_PKU)) {
-        set_cr4_pke(1);
-        set_cr4_pke(0);
-        /* Now PKRU = 0xFFFFFFFF.  */
-    } else {
-	tests++;
-	if (write_cr4_checking(shadow_cr4 | X86_CR4_PKE) == GP_VECTOR) {
-            successes++;
-            invalid_mask |= AC_PKU_AD_MASK;
-            invalid_mask |= AC_PKU_WD_MASK;
-            invalid_mask |= AC_PKU_PKEY_MASK;
-            invalid_mask |= AC_CPU_CR4_PKE_MASK;
-            printf("CR4.PKE not available, disabling PKE tests\n");
-	} else {
-            printf("Set PKE in CR4 - expect #GP: FAIL!\n");
-            set_cr4_pke(0);
+	ac_test_t at;
+	ac_pool_t pool;
+	int i, tests, successes;
+
+	printf("run\n");
+	tests = successes = 0;
+
+	shadow_cr0 = read_cr0();
+	shadow_cr4 = read_cr4();
+	shadow_efer = rdmsr(MSR_EFER);
+
+	if (cpuid_maxphyaddr() >= 52) {
+		invalid_mask |= AC_PDE_BIT51_MASK;
+		invalid_mask |= AC_PTE_BIT51_MASK;
+	}
+	if (cpuid_maxphyaddr() >= 37) {
+		invalid_mask |= AC_PDE_BIT36_MASK;
+		invalid_mask |= AC_PTE_BIT36_MASK;
 	}
-    }
-
-    if (!this_cpu_has(X86_FEATURE_SMEP)) {
-	tests++;
-	if (set_cr4_smep(1) == GP_VECTOR) {
-            successes++;
-            invalid_mask |= AC_CPU_CR4_SMEP_MASK;
-            printf("CR4.SMEP not available, disabling SMEP tests\n");
+
+	if (this_cpu_has(X86_FEATURE_PKU)) {
+		set_cr4_pke(1);
+		set_cr4_pke(0);
+		/* Now PKRU = 0xFFFFFFFF.  */
 	} else {
-            printf("Set SMEP in CR4 - expect #GP: FAIL!\n");
-            set_cr4_smep(0);
+		tests++;
+		if (write_cr4_checking(shadow_cr4 | X86_CR4_PKE) == GP_VECTOR) {
+			successes++;
+			invalid_mask |= AC_PKU_AD_MASK;
+			invalid_mask |= AC_PKU_WD_MASK;
+			invalid_mask |= AC_PKU_PKEY_MASK;
+			invalid_mask |= AC_CPU_CR4_PKE_MASK;
+			printf("CR4.PKE not available, disabling PKE tests\n");
+		} else {
+			printf("Set PKE in CR4 - expect #GP: FAIL!\n");
+			set_cr4_pke(0);
+		}
+	}
+
+	if (!this_cpu_has(X86_FEATURE_SMEP)) {
+		tests++;
+		if (set_cr4_smep(1) == GP_VECTOR) {
+			successes++;
+			invalid_mask |= AC_CPU_CR4_SMEP_MASK;
+			printf("CR4.SMEP not available, disabling SMEP tests\n");
+		} else {
+			printf("Set SMEP in CR4 - expect #GP: FAIL!\n");
+			set_cr4_smep(0);
+		}
+	}
+
+	/* Toggling LA57 in 64-bit mode (guaranteed for this test) is illegal. */
+	if (this_cpu_has(X86_FEATURE_LA57)) {
+		tests++;
+		if (write_cr4_checking(shadow_cr4 ^ X86_CR4_LA57) == GP_VECTOR)
+			successes++;
+
+		/* Force a VM-Exit on KVM, which doesn't intercept LA57 itself. */
+		tests++;
+		if (write_cr4_checking(shadow_cr4 ^ (X86_CR4_LA57 | X86_CR4_PSE)) == GP_VECTOR)
+			successes++;
 	}
-    }
-
-    /* Toggling LA57 in 64-bit mode (guaranteed for this test) is illegal. */
-    if (this_cpu_has(X86_FEATURE_LA57)) {
-        tests++;
-        if (write_cr4_checking(shadow_cr4 ^ X86_CR4_LA57) == GP_VECTOR)
-            successes++;
-
-        /* Force a VM-Exit on KVM, which doesn't intercept LA57 itself. */
-        tests++;
-        if (write_cr4_checking(shadow_cr4 ^ (X86_CR4_LA57 | X86_CR4_PSE)) == GP_VECTOR)
-            successes++;
-    }
-
-    ac_env_int(&pool);
-    ac_test_init(&at, (void *)(0x123400000000 + 16 * smp_id()));
-    do {
-	++tests;
-	successes += ac_test_exec(&at, &pool);
-    } while (ac_test_bump(&at));
-
-    for (i = 0; i < ARRAY_SIZE(ac_test_cases); i++) {
-	++tests;
-	successes += ac_test_cases[i](&pool);
-    }
-
-    printf("\n%d tests, %d failures\n", tests, tests - successes);
-
-    return successes == tests;
+
+	ac_env_int(&pool);
+	ac_test_init(&at, (void *)(0x123400000000 + 16 * smp_id()));
+	do {
+		++tests;
+		successes += ac_test_exec(&at, &pool);
+	} while (ac_test_bump(&at));
+
+	for (i = 0; i < ARRAY_SIZE(ac_test_cases); i++) {
+		++tests;
+		successes += ac_test_cases[i](&pool);
+	}
+
+	printf("\n%d tests, %d failures\n", tests, tests - successes);
+
+	return successes == tests;
 }
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 13/14] x86: Clean up the global, page_table_levels, in access.c
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (11 preceding siblings ...)
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 12/14] x86: Fix tabs in access.c Aaron Lewis
@ 2021-11-10 21:20 ` Aaron Lewis
  2021-11-10 21:20 ` [kvm-unit-tests PATCH 14/14] x86: Add tests that run ac_test_run() in an L2 guest Aaron Lewis
  2021-11-11 17:51 ` [kvm-unit-tests PATCH 00/14] Run access test " Paolo Bonzini
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:20 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Remove the global, page table levels, from access.c and store it in the
test struct ac_test_t instead.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 x86/access.c      | 50 ++++++++++++++++++++++++-----------------------
 x86/access.h      |  5 +++--
 x86/access_test.c |  6 ++----
 3 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/x86/access.c b/x86/access.c
index f832385..c5e71db 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -14,7 +14,6 @@ static _Bool verbose = false;
 
 typedef unsigned long pt_element_t;
 static int invalid_mask;
-int page_table_levels;
 
 #define PT_BASE_ADDR_MASK ((pt_element_t)((((pt_element_t)1 << 36) - 1) & PAGE_MASK))
 #define PT_PSE_BASE_ADDR_MASK (PT_BASE_ADDR_MASK & ~(1ull << 21))
@@ -174,6 +173,7 @@ typedef struct {
 	pt_element_t ignore_pde;
 	int expected_fault;
 	unsigned expected_error;
+	int page_table_levels;
 } ac_test_t;
 
 typedef struct {
@@ -278,13 +278,14 @@ static void ac_env_int(ac_pool_t *pool)
 	pool->pt_pool_current = 0;
 }
 
-static void ac_test_init(ac_test_t *at, void *virt)
+static void ac_test_init(ac_test_t *at, void *virt, int page_table_levels)
 {
 	set_efer_nx(1);
 	set_cr0_wp(1);
 	at->flags = 0;
 	at->virt = virt;
 	at->phys = 32 * 1024 * 1024;
+	at->page_table_levels = page_table_levels;
 }
 
 static int ac_test_bump_one(ac_test_t *at)
@@ -518,7 +519,7 @@ static void __ac_setup_specific_pages(ac_test_t *at, ac_pool_t *pool, bool reuse
 		ac_test_reset_pt_pool(pool);
 
 	at->ptep = 0;
-	for (int i = page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
+	for (int i = at->page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
 		pt_element_t *vroot = va(root & PT_BASE_ADDR_MASK);
 		unsigned index = PT_INDEX((unsigned long)at->virt, i);
 		pt_element_t pte = 0;
@@ -635,7 +636,7 @@ static void dump_mapping(ac_test_t *at)
 	int i;
 
 	printf("Dump mapping: address: %p\n", at->virt);
-	for (i = page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
+	for (i = at->page_table_levels; i >= 1 && (i >= 2 || !F(AC_PDE_PSE)); --i) {
 		pt_element_t *vroot = va(root & PT_BASE_ADDR_MASK);
 		unsigned index = PT_INDEX((unsigned long)at->virt, i);
 		pt_element_t pte = vroot[index];
@@ -812,12 +813,12 @@ static void ac_test_show(ac_test_t *at)
  * This test case is used to triger the bug which is fixed by
  * commit e09e90a5 in the kvm tree
  */
-static int corrupt_hugepage_triger(ac_pool_t *pool)
+static int corrupt_hugepage_triger(ac_pool_t *pool, int page_table_levels)
 {
 	ac_test_t at1, at2;
 
-	ac_test_init(&at1, (void *)(0x123400000000));
-	ac_test_init(&at2, (void *)(0x666600000000));
+	ac_test_init(&at1, (void *)(0x123400000000), page_table_levels);
+	ac_test_init(&at2, (void *)(0x666600000000), page_table_levels);
 
 	at2.flags = AC_CPU_CR0_WP_MASK | AC_PDE_PSE_MASK | AC_PDE_PRESENT_MASK;
 	ac_test_setup_pte(&at2, pool);
@@ -850,12 +851,12 @@ err:
  * This test case is used to triger the bug which is fixed by
  * commit 3ddf6c06e13e in the kvm tree
  */
-static int check_pfec_on_prefetch_pte(ac_pool_t *pool)
+static int check_pfec_on_prefetch_pte(ac_pool_t *pool, int page_table_levels)
 {
 	ac_test_t at1, at2;
 
-	ac_test_init(&at1, (void *)(0x123406001000));
-	ac_test_init(&at2, (void *)(0x123406003000));
+	ac_test_init(&at1, (void *)(0x123406001000), page_table_levels);
+	ac_test_init(&at2, (void *)(0x123406003000), page_table_levels);
 
 	at1.flags = AC_PDE_PRESENT_MASK | AC_PTE_PRESENT_MASK;
 	ac_setup_specific_pages(&at1, pool, 30 * 1024 * 1024, 30 * 1024 * 1024);
@@ -895,12 +896,12 @@ err:
  *
  * Note: to trigger this bug, hugepage should be disabled on host.
  */
-static int check_large_pte_dirty_for_nowp(ac_pool_t *pool)
+static int check_large_pte_dirty_for_nowp(ac_pool_t *pool, int page_table_levels)
 {
 	ac_test_t at1, at2;
 
-	ac_test_init(&at1, (void *)(0x123403000000));
-	ac_test_init(&at2, (void *)(0x666606000000));
+	ac_test_init(&at1, (void *)(0x123403000000), page_table_levels);
+	ac_test_init(&at2, (void *)(0x666606000000), page_table_levels);
 
 	at2.flags = AC_PDE_PRESENT_MASK | AC_PDE_PSE_MASK;
 	ac_test_setup_pte(&at2, pool);
@@ -929,7 +930,7 @@ err:
 	return 0;
 }
 
-static int check_smep_andnot_wp(ac_pool_t *pool)
+static int check_smep_andnot_wp(ac_pool_t *pool, int page_table_levels)
 {
 	ac_test_t at1;
 	int err_prepare_andnot_wp, err_smep_andnot_wp;
@@ -938,7 +939,7 @@ static int check_smep_andnot_wp(ac_pool_t *pool)
 		return 1;
 	}
 
-	ac_test_init(&at1, (void *)(0x123406001000));
+	ac_test_init(&at1, (void *)(0x123406001000), page_table_levels);
 
 	at1.flags = AC_PDE_PRESENT_MASK | AC_PTE_PRESENT_MASK |
 		    AC_PDE_USER_MASK | AC_PTE_USER_MASK |
@@ -979,7 +980,7 @@ err:
 	return 0;
 }
 
-static int check_effective_sp_permissions(ac_pool_t *pool)
+static int check_effective_sp_permissions(ac_pool_t *pool, int page_table_levels)
 {
 	unsigned long ptr1 = 0x123480000000;
 	unsigned long ptr2 = ptr1 + SZ_2M;
@@ -1000,22 +1001,22 @@ static int check_effective_sp_permissions(ac_pool_t *pool)
 	 * pud1 and pud2 point to the same pmd page.
 	 */
 
-	ac_test_init(&at1, (void *)(ptr1));
+	ac_test_init(&at1, (void *)(ptr1), page_table_levels);
 	at1.flags = AC_PDE_PRESENT_MASK | AC_PTE_PRESENT_MASK |
 		    AC_PDE_USER_MASK | AC_PTE_USER_MASK |
 		    AC_PDE_ACCESSED_MASK | AC_PTE_ACCESSED_MASK |
 		    AC_PTE_WRITABLE_MASK | AC_ACCESS_USER_MASK;
 	__ac_setup_specific_pages(&at1, pool, false, pmd, 0);
 
-	ac_test_init(&at2, (void *)(ptr2));
+	ac_test_init(&at2, (void *)(ptr2), page_table_levels);
 	at2.flags = at1.flags | AC_PDE_WRITABLE_MASK | AC_PTE_DIRTY_MASK | AC_ACCESS_WRITE_MASK;
 	__ac_setup_specific_pages(&at2, pool, true, pmd, 0);
 
-	ac_test_init(&at3, (void *)(ptr3));
+	ac_test_init(&at3, (void *)(ptr3), page_table_levels);
 	at3.flags = AC_PDPTE_NO_WRITABLE_MASK | at1.flags;
 	__ac_setup_specific_pages(&at3, pool, true, pmd, 0);
 
-	ac_test_init(&at4, (void *)(ptr4));
+	ac_test_init(&at4, (void *)(ptr4), page_table_levels);
 	at4.flags = AC_PDPTE_NO_WRITABLE_MASK | at2.flags;
 	__ac_setup_specific_pages(&at4, pool, true, pmd, 0);
 
@@ -1058,7 +1059,7 @@ static int ac_test_exec(ac_test_t *at, ac_pool_t *pool)
 	return r;
 }
 
-typedef int (*ac_test_fn)(ac_pool_t *pool);
+typedef int (*ac_test_fn)(ac_pool_t *pool, int page_table_levels);
 const ac_test_fn ac_test_cases[] =
 {
 	corrupt_hugepage_triger,
@@ -1068,7 +1069,7 @@ const ac_test_fn ac_test_cases[] =
 	check_effective_sp_permissions,
 };
 
-int ac_test_run()
+int ac_test_run(int page_table_levels)
 {
 	ac_test_t at;
 	ac_pool_t pool;
@@ -1134,7 +1135,8 @@ int ac_test_run()
 	}
 
 	ac_env_int(&pool);
-	ac_test_init(&at, (void *)(0x123400000000 + 16 * smp_id()));
+	ac_test_init(&at, (void *)(0x123400000000 + 16 * smp_id()),
+		page_table_levels);
 	do {
 		++tests;
 		successes += ac_test_exec(&at, &pool);
@@ -1142,7 +1144,7 @@ int ac_test_run()
 
 	for (i = 0; i < ARRAY_SIZE(ac_test_cases); i++) {
 		++tests;
-		successes += ac_test_cases[i](&pool);
+		successes += ac_test_cases[i](&pool, page_table_levels);
 	}
 
 	printf("\n%d tests, %d failures\n", tests, tests - successes);
diff --git a/x86/access.h b/x86/access.h
index 4f67b62..bcfa7b2 100644
--- a/x86/access.h
+++ b/x86/access.h
@@ -1,8 +1,9 @@
 #ifndef X86_ACCESS_H
 #define X86_ACCESS_H
 
-int ac_test_run(void);
+#define PT_LEVEL_PML4 4
+#define PT_LEVEL_PML5 5
 
-extern int page_table_levels;
+int ac_test_run(int page_table_levels);
 
 #endif // X86_ACCESS_H
\ No newline at end of file
diff --git a/x86/access_test.c b/x86/access_test.c
index 497f286..991f333 100644
--- a/x86/access_test.c
+++ b/x86/access_test.c
@@ -8,14 +8,12 @@ int main(void)
     int r;
 
     printf("starting test\n\n");
-    page_table_levels = 4;
-    r = ac_test_run();
+    r = ac_test_run(PT_LEVEL_PML4);
 
     if (this_cpu_has(X86_FEATURE_LA57)) {
-        page_table_levels = 5;
         printf("starting 5-level paging test.\n\n");
         setup_5level_page_table();
-        r = ac_test_run();
+        r = ac_test_run(PT_LEVEL_PML5);
     }
 
     return r ? 0 : 1;
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [kvm-unit-tests PATCH 14/14] x86: Add tests that run ac_test_run() in an L2 guest
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (12 preceding siblings ...)
  2021-11-10 21:20 ` [kvm-unit-tests PATCH 13/14] x86: Clean up the global, page_table_levels, " Aaron Lewis
@ 2021-11-10 21:20 ` Aaron Lewis
  2021-11-11 17:51 ` [kvm-unit-tests PATCH 00/14] Run access test " Paolo Bonzini
  14 siblings, 0 replies; 20+ messages in thread
From: Aaron Lewis @ 2021-11-10 21:20 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, Aaron Lewis

Add tests vmx_pf_exception_test and vmx_pf_exception_test_reduced_maxphyaddr
to vmx_tests.c.

The purpose of these tests are to test the reflection logic in KVM to
ensure exceptions are being routed to were they are intended to go.  For
example, it will test that we are not accidentally reflecting exceptions
into L1 when L1 isn't expecting them.  Commit 18712c13709d ("KVM: nVMX:
Use vmx_need_pf_intercept() when deciding if L0 wants a #PF") fixed an
issue related to this which went undetected because there was no testing
in place.  This adds testing to ensure there is coverage for such
issues.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 x86/Makefile.common |  2 ++
 x86/unittests.cfg   | 13 ++++++++++++
 x86/vmx_tests.c     | 49 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+)

diff --git a/x86/Makefile.common b/x86/Makefile.common
index a665854..461de51 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -74,6 +74,8 @@ $(TEST_DIR)/realmode.o: bits = $(if $(call cc-option,-m16,""),16,32)
 
 $(TEST_DIR)/access_test.elf: $(TEST_DIR)/access.o
 
+$(TEST_DIR)/vmx.elf: $(TEST_DIR)/access.o
+
 $(TEST_DIR)/kvmclock_test.elf: $(TEST_DIR)/kvmclock.o
 
 $(TEST_DIR)/hyperv_synic.elf: $(TEST_DIR)/hyperv.o
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index dbeb8a2..4069e4c 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -347,6 +347,19 @@ extra_params = -cpu max,+vmx -append vmx_vmcs_shadow_test
 arch = x86_64
 groups = vmx
 
+[vmx_pf_exception_test]
+file = vmx.flat
+extra_params = -cpu max,+vmx -append vmx_pf_exception_test
+arch = x86_64
+groups = vmx nested_exception
+
+[vmx_pf_exception_test_reduced_maxphyaddr]
+file = vmx.flat
+extra_params = -cpu IvyBridge,phys-bits=36,host-phys-bits=off,+vmx -append vmx_pf_exception_test
+arch = x86_64
+groups = vmx nested_exception
+check = /sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr=Y
+
 [debug]
 file = debug.flat
 arch = x86_64
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 9ee6653..8cf3543 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -20,6 +20,7 @@
 #include "alloc_page.h"
 #include "smp.h"
 #include "delay.h"
+#include "access.h"
 
 #define VPID_CAP_INVVPID_TYPES_SHIFT 40
 
@@ -10658,6 +10659,53 @@ static void atomic_switch_overflow_msrs_test(void)
 		test_skip("Test is only supported on KVM");
 }
 
+static void vmx_pf_exception_test_guest(void)
+{
+	ac_test_run(PT_LEVEL_PML4);
+}
+
+static void vmx_pf_exception_test(void)
+{
+	u64 efer;
+	struct cpuid cpuid;
+
+	test_set_guest(vmx_pf_exception_test_guest);
+
+	enter_guest();
+
+	while (vmcs_read(EXI_REASON) != VMX_VMCALL) {
+		switch (vmcs_read(EXI_REASON)) {
+		case VMX_RDMSR:
+			assert(regs.rcx == MSR_EFER);
+			efer = vmcs_read(GUEST_EFER);
+			regs.rdx = efer >> 32;
+			regs.rax = efer & 0xffffffff;
+			break;
+		case VMX_WRMSR:
+			assert(regs.rcx == MSR_EFER);
+			efer = regs.rdx << 32 | (regs.rax & 0xffffffff);
+			vmcs_write(GUEST_EFER, efer);
+			break;
+		case VMX_CPUID:
+			cpuid = (struct cpuid) {0, 0, 0, 0};
+			cpuid = raw_cpuid(regs.rax, regs.rcx);
+			regs.rax = cpuid.a;
+			regs.rbx = cpuid.b;
+			regs.rcx = cpuid.c;
+			regs.rdx = cpuid.d;
+			break;
+		default:
+			assert_msg(false,
+				"Unexpected exit to L1, exit_reason: %s (0x%lx)",
+				exit_reason_description(vmcs_read(EXI_REASON)),
+				vmcs_read(EXI_REASON));
+		}
+		skip_exit_insn();
+		enter_guest();
+	}
+
+	assert_exit_reason(VMX_VMCALL);
+}
 #define TEST(name) { #name, .v2 = name }
 
 /* name/init/guest_main/exit_handler/syscall_handler/guest_regs */
@@ -10763,5 +10811,6 @@ struct vmx_test vmx_tests[] = {
 	TEST(rdtsc_vmexit_diff_test),
 	TEST(vmx_mtf_test),
 	TEST(vmx_mtf_pdpte_test),
+	TEST(vmx_pf_exception_test),
 	{ NULL, NULL, NULL, NULL, NULL, {0} },
 };
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest
  2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
                   ` (13 preceding siblings ...)
  2021-11-10 21:20 ` [kvm-unit-tests PATCH 14/14] x86: Add tests that run ac_test_run() in an L2 guest Aaron Lewis
@ 2021-11-11 17:51 ` Paolo Bonzini
  14 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2021-11-11 17:51 UTC (permalink / raw)
  To: Aaron Lewis, kvm; +Cc: jmattson, seanjc

On 11/10/21 22:19, Aaron Lewis wrote:
> The motivation behind this change is to test the routing logic when an
> exception occurs in an L2 guest and ensure the exception goes to the
> correct place.  For example, if an exception occurs in L2, does L1 want
> to get involved, or L0, or do niether of them care about it and leave
> it to L2 to handle.  Test that the exception doesn't end up going to L1
> When L1 didn't ask for it.  This was occurring before commit 18712c13709d
> ("KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF")
> fixed the issue.  Without that fix, running
> vmx_pf_exception_test_reduced_maxphyaddr with allow_smaller_maxphyaddr=Y
> would have resulted in the test failing with the following error:
> 
> x86/vmx_tests.c:10698: assert failed: false: Unexpected exit to L1,
> exit_reason: VMX_EXC_NMI (0x0)
> 
> This series only tests the routing logic for #PFs.  A future
> series will address other exceptions, however, getting #PF testing in
> place is a big enough chunk that the other exceptions will be submitted
> seperately (in a future series).
> 
> This series is dependant on Paolo's changes (inlcuded). Without them,
> running ac_test_run() on one of the userspace test fails.  Of note:  the
> commit ("x86: get rid of ring0stacktop") has been updated to include a fix
> for a compiler error to get it building on clang.
> 
> This series is also dependant on the commit ("x86: Look up the PTEs rather
> than assuming them").  This was sent out for review seperately, however,
> it is needed to get ac_test_run() running on a different cr3 than the one
> access_test runs on, so it is included here as well.  This is also v2 of
> that commit.  While preparing this series a review came in, so I just
> included the changes here.

Queued, thanks.

Paolo


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them
  2021-11-10 21:19 ` [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them Aaron Lewis
@ 2021-11-29 19:43   ` Babu Moger
  2021-11-29 20:43     ` Sean Christopherson
  0 siblings, 1 reply; 20+ messages in thread
From: Babu Moger @ 2021-11-29 19:43 UTC (permalink / raw)
  To: Aaron Lewis, kvm; +Cc: pbonzini, jmattson, seanjc

Hi,

This patch caused the regression. Here is the failure. Failure seems to
happen only on 5-level paging. Still investigating.. Let me know if you
have any idea,

#./tests/access
BUILD_HEAD=f3e081d7
timeout -k 1s --foreground 180 /usr/local/bin/qemu-system-x86_64
--no-reboot -nodefaults -device pc-testdev

-device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio
-device pci-testdev -machine accel=kvm

-kernel /tmp/tmp.w1JL6jhyfN -smp 1 -cpu max # -initrd /tmp/tmp.9coF1FJSwD
enabling apic
starting test

starting 5-level paging test.

run
............................FAIL access

=============================

Git bisect log.

 git bisect log
git bisect start
# bad: [68aa4e32f2b717b991e4dce7dfdde2247366abbc] x86: do not run
vmx_pf_exception_test twice
git bisect bad 68aa4e32f2b717b991e4dce7dfdde2247366abbc
# good: [c90c646d7381c99ac7d9d7812bd8535214458978] access: treat NX as
reserved if EFER.NXE=0
git bisect good c90c646d7381c99ac7d9d7812bd8535214458978
# good: [c90c646d7381c99ac7d9d7812bd8535214458978] access: treat NX as
reserved if EFER.NXE=0
git bisect good c90c646d7381c99ac7d9d7812bd8535214458978
# good: [c73cc92d8060dd79b71cbd2ded1454a6e924b771] s390x: uv-host: Fence a
destroy cpu test on z15
git bisect good c73cc92d8060dd79b71cbd2ded1454a6e924b771
# good: [c73cc92d8060dd79b71cbd2ded1454a6e924b771] s390x: uv-host: Fence a
destroy cpu test on z15
git bisect good c73cc92d8060dd79b71cbd2ded1454a6e924b771
# good: [2e88ad238a19253b94e9f410e4c86ed632c134a0] unify field names and
definitions for GDT descriptors
git bisect good 2e88ad238a19253b94e9f410e4c86ed632c134a0
# good: [91abf0b9aa0bac4ca17df0be63871ca6e1562eac] Merge branch
'gdt-idt-cleanup' into master
git bisect good 91abf0b9aa0bac4ca17df0be63871ca6e1562eac
# bad: [0f10d9aea13631a414a3023699dd2dfd47dfd02f] x86: Prepare access test
for running in L2
git bisect bad 0f10d9aea13631a414a3023699dd2dfd47dfd02f
# good: [7a14c1d9468626d4cdd0d883097c7caaa36a91bf] x86: Fix operand size
for lldt
git bisect good 7a14c1d9468626d4cdd0d883097c7caaa36a91bf
# bad: [f3e081d74812ee05be7e744eb8be3f04a2f65c87] x86: Look up the PTEs
rather than assuming them
git bisect bad f3e081d74812ee05be7e744eb8be3f04a2f65c87
# good: [f7599ce50db691c4281479002f03d611927bed1c] x86: Add a regression
test for L1 LDTR persistence bug
git bisect good f7599ce50db691c4281479002f03d611927bed1c
# first bad commit: [f3e081d74812ee05be7e744eb8be3f04a2f65c87] x86: Look
up the PTEs rather than assuming them

Thanks

Babu


On 11/10/21 3:19 PM, Aaron Lewis wrote:
> Rather than assuming which PTEs the SMEP test runs on, look them up to
> ensure they are correct.  If this test were to run on a different page
> table (ie: run in an L2 test) the wrong PTEs would be set.  Switch to
> looking up the PTEs to avoid this from happening.
>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
>  lib/libcflat.h |  1 +
>  lib/x86/vm.c   | 21 +++++++++++++++++++++
>  lib/x86/vm.h   |  3 +++
>  x86/access.c   | 26 ++++++++++++++++++--------
>  x86/cstart64.S |  1 -
>  x86/flat.lds   |  1 +
>  6 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/lib/libcflat.h b/lib/libcflat.h
> index 9bb7e08..c1fd31f 100644
> --- a/lib/libcflat.h
> +++ b/lib/libcflat.h
> @@ -35,6 +35,7 @@
>  #define __ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
>  #define __ALIGN(x, a)		__ALIGN_MASK(x, (typeof(x))(a) - 1)
>  #define ALIGN(x, a)		__ALIGN((x), (a))
> +#define ALIGN_DOWN(x, a)	__ALIGN((x) - ((a) - 1), (a))
>  #define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)
>  
>  #define MIN(a, b)		((a) < (b) ? (a) : (b))
> diff --git a/lib/x86/vm.c b/lib/x86/vm.c
> index 5cd2ee4..6a70ef6 100644
> --- a/lib/x86/vm.c
> +++ b/lib/x86/vm.c
> @@ -281,3 +281,24 @@ void force_4k_page(void *addr)
>  	if (pte & PT_PAGE_SIZE_MASK)
>  		split_large_page(ptep, 2);
>  }
> +
> +/*
> + * Call the callback on each page from virt to virt + len.
> + */
> +void walk_pte(void *virt, size_t len, pte_callback_t callback)
> +{
> +    pgd_t *cr3 = current_page_table();
> +    uintptr_t start = (uintptr_t)virt;
> +    uintptr_t end = (uintptr_t)virt + len;
> +    struct pte_search search;
> +    size_t page_size;
> +    uintptr_t curr;
> +
> +    for (curr = start; curr < end; curr = ALIGN_DOWN(curr + page_size, page_size)) {
> +        search = find_pte_level(cr3, (void *)curr, 1);
> +        assert(found_leaf_pte(search));
> +        page_size = 1ul << PGDIR_BITS(search.level);
> +
> +        callback(search, (void *)curr);
> +    }
> +}
> diff --git a/lib/x86/vm.h b/lib/x86/vm.h
> index d9753c3..4c6dff9 100644
> --- a/lib/x86/vm.h
> +++ b/lib/x86/vm.h
> @@ -52,4 +52,7 @@ struct vm_vcpu_info {
>          u64 cr0;
>  };
>  
> +typedef void (*pte_callback_t)(struct pte_search search, void *va);
> +void walk_pte(void *virt, size_t len, pte_callback_t callback);
> +
>  #endif
> diff --git a/x86/access.c b/x86/access.c
> index a781a0c..8e3a718 100644
> --- a/x86/access.c
> +++ b/x86/access.c
> @@ -201,10 +201,24 @@ static void set_cr0_wp(int wp)
>      }
>  }
>  
> +static void clear_user_mask(struct pte_search search, void *va)
> +{
> +	*search.pte &= ~PT_USER_MASK;
> +}
> +
> +static void set_user_mask(struct pte_search search, void *va)
> +{
> +	*search.pte |= PT_USER_MASK;
> +
> +	/* Flush to avoid spurious #PF */
> +	invlpg(va);
> +}
> +
>  static unsigned set_cr4_smep(int smep)
>  {
> +    extern char stext, etext;
> +    size_t len = (size_t)&etext - (size_t)&stext;
>      unsigned long cr4 = shadow_cr4;
> -    extern u64 ptl2[];
>      unsigned r;
>  
>      cr4 &= ~CR4_SMEP_MASK;
> @@ -214,14 +228,10 @@ static unsigned set_cr4_smep(int smep)
>          return 0;
>  
>      if (smep)
> -        ptl2[2] &= ~PT_USER_MASK;
> +        walk_pte(&stext, len, clear_user_mask);
>      r = write_cr4_checking(cr4);
> -    if (r || !smep) {
> -        ptl2[2] |= PT_USER_MASK;
> -
> -	/* Flush to avoid spurious #PF */
> -	invlpg((void *)(2 << 21));
> -    }
> +    if (r || !smep)
> +        walk_pte(&stext, len, set_user_mask);
>      if (!r)
>          shadow_cr4 = cr4;
>      return r;
> diff --git a/x86/cstart64.S b/x86/cstart64.S
> index ddb83a0..ff79ae7 100644
> --- a/x86/cstart64.S
> +++ b/x86/cstart64.S
> @@ -17,7 +17,6 @@ stacktop:
>  .data
>  
>  .align 4096
> -.globl ptl2
>  ptl2:
>  i = 0
>  	.rept 512 * 4
> diff --git a/x86/flat.lds b/x86/flat.lds
> index a278b56..337bc44 100644
> --- a/x86/flat.lds
> +++ b/x86/flat.lds
> @@ -3,6 +3,7 @@ SECTIONS
>      . = 4M + SIZEOF_HEADERS;
>      stext = .;
>      .text : { *(.init) *(.text) *(.text.*) }
> +    etext = .;
>      . = ALIGN(4K);
>      .data : {
>            *(.data)

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them
  2021-11-29 19:43   ` Babu Moger
@ 2021-11-29 20:43     ` Sean Christopherson
  2021-11-29 21:04       ` Babu Moger
  2021-11-29 21:30       ` Babu Moger
  0 siblings, 2 replies; 20+ messages in thread
From: Sean Christopherson @ 2021-11-29 20:43 UTC (permalink / raw)
  To: Babu Moger; +Cc: Aaron Lewis, kvm, pbonzini, jmattson

On Mon, Nov 29, 2021, Babu Moger wrote:
> Hi,
> 
> This patch caused the regression. Here is the failure. Failure seems to
> happen only on 5-level paging. Still investigating.. Let me know if you
> have any idea,

Fix posted, though you may need to take the entire series up to that point for
things to actually work.

https://lore.kernel.org/all/20211125012857.508243-11-seanjc@google.com/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them
  2021-11-29 20:43     ` Sean Christopherson
@ 2021-11-29 21:04       ` Babu Moger
  2021-11-29 21:30       ` Babu Moger
  1 sibling, 0 replies; 20+ messages in thread
From: Babu Moger @ 2021-11-29 21:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Aaron Lewis, kvm, pbonzini, jmattson


On 11/29/21 2:43 PM, Sean Christopherson wrote:
> On Mon, Nov 29, 2021, Babu Moger wrote:
>> Hi,
>>
>> This patch caused the regression. Here is the failure. Failure seems to
>> happen only on 5-level paging. Still investigating.. Let me know if you
>> have any idea,
> Fix posted, though you may need to take the entire series up to that point for
> things to actually work.

Sure. Will test and let you know.

Thanks


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them
  2021-11-29 20:43     ` Sean Christopherson
  2021-11-29 21:04       ` Babu Moger
@ 2021-11-29 21:30       ` Babu Moger
  1 sibling, 0 replies; 20+ messages in thread
From: Babu Moger @ 2021-11-29 21:30 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Aaron Lewis, kvm, pbonzini, jmattson


On 11/29/21 2:43 PM, Sean Christopherson wrote:
> On Mon, Nov 29, 2021, Babu Moger wrote:
>> Hi,
>>
>> This patch caused the regression. Here is the failure. Failure seems to
>> happen only on 5-level paging. Still investigating.. Let me know if you
>> have any idea,
> Fix posted, though you may need to take the entire series up to that point for
> things to actually work.
Yes. Verified the fix. Looks good. Thanks



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-11-29 22:24 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-10 21:19 [kvm-unit-tests PATCH 00/14] Run access test in an L2 guest Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 01/14] x86: cleanup handling of 16-byte GDT descriptors Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 02/14] x86: fix call to set_gdt_entry Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 03/14] unify field names and definitions for GDT descriptors Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 04/14] replace tss_descr global with a function Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 05/14] x86: Move IDT to desc.c Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 06/14] x86: unify name of 32-bit and 64-bit GDT Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 07/14] x86: get rid of ring0stacktop Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 08/14] x86: Move 64-bit GDT and TSS to desc.c Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 09/14] x86: Move 32-bit " Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 10/14] x86: Look up the PTEs rather than assuming them Aaron Lewis
2021-11-29 19:43   ` Babu Moger
2021-11-29 20:43     ` Sean Christopherson
2021-11-29 21:04       ` Babu Moger
2021-11-29 21:30       ` Babu Moger
2021-11-10 21:19 ` [kvm-unit-tests PATCH 11/14] x86: Prepare access test for running in L2 Aaron Lewis
2021-11-10 21:19 ` [kvm-unit-tests PATCH 12/14] x86: Fix tabs in access.c Aaron Lewis
2021-11-10 21:20 ` [kvm-unit-tests PATCH 13/14] x86: Clean up the global, page_table_levels, " Aaron Lewis
2021-11-10 21:20 ` [kvm-unit-tests PATCH 14/14] x86: Add tests that run ac_test_run() in an L2 guest Aaron Lewis
2021-11-11 17:51 ` [kvm-unit-tests PATCH 00/14] Run access test " Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.