linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 00/13] XOM for KVM guest userspace
@ 2019-10-03 21:23 Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits Rick Edgecombe
                   ` (15 more replies)
  0 siblings, 16 replies; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

This patchset enables KVM guests to create execute-only (XO) memory by
utilizing EPT-based XO permissions. XO memory is natively supported on
Intel hardware for CPUs with PKU, but this enables it on older
platforms, and it can support XO for kernel memory as well.

In the guest, this patchset enables XO memory for userspace, using the
existing interface (mprotect with PROT_EXEC && !PROT_READ) used for
arm64 and x86 PKU HW. A larger follow-on enables setting the kernel
text as XO, but this posting is just the KVM pieces and guest
userspace. The as-yet-unposted QEMU patches to work with these changes
are here:
https://github.com/redgecombe/qemu/

Guest Interface
===============
The way XO is exposed to the guest is by creating a virtual XO permission bit in
the guest page tables.

There are normally four kinds of page table bits:
1. Bits ignored by the hardware
2. Bits that must be 0 or else the hardware throws a RSVD page fault
3. Bits used by the hardware for addresses
4. Bits used by the hardware for permissions and other features

We want to find a bit in the guest page tables to use to mean
execute-only memory, so that the guest can map the same physical memory
with different permissions simultaneously, like other permission bits
allow. We also want the translations to be done by the hardware, which
means we can't use ignored or reserved bits, and we can't easily
re-purpose an existing feature bit. This leaves address bits. The idea
here is to take an address bit and re-purpose it as a feature bit.

The first thing we have to do is tell the guest that it can't use the
address bit we are stealing. Luckily there is an existing CPUID leaf,
already intercepted by KVM, that conveys the number of physical address
bits, so the reported value can be reduced as needed. This puts what
was previously the top physical address bit into what is defined as the
"reserved area" of the PTE.

Here is how the PTE would be transformed, where M is the number of physical bits
exposed by the CPUID leaf.

Normal:
|--------------------------------------------------------|
| .. |     RSVD (51 to M)     |   PFN (M-1 to 12)   | .. |
|--------------------------------------------------------|

KVM XO (with M reduced by 1):
|--------------------------------------------------------|
| .. |  RSVD (51 to M+1) | XO |   PFN (M-1 to 12)   | .. |
|--------------------------------------------------------|

So the way XOM is exposed to the guest is by having the VMM provide two aliases
in the guest physical address space for the same memory. The first half has
normal EPT permissions, and the second half has XO permissions. This way the
high PFN bit in the guest page tables acts like an XO permission bit. The VMM
reports to the guest a number of physical address bits that exclude the XO bit,
so from the guest perspective the XO bit is in the region that would be
"reserved", and from the CPU's perspective the bit is still a normal PFN bit.
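The aliasing above can be sketched as plain arithmetic (a hypothetical
helper for illustration; m is whatever MAXPHYADDR the reduced CPUID
leaf reports):

```c
#include <stdint.h>

#define GUEST_PAGE_SHIFT 12

/* Build the address field of a guest PTE for the given pfn. With
 * MAXPHYADDR reported as m, setting bit m selects the XO alias of the
 * same physical memory, so the bit behaves like a permission bit from
 * the guest's point of view.
 */
static uint64_t make_guest_pte_addr(uint64_t pfn, unsigned int m, int xo)
{
	uint64_t pte = pfn << GUEST_PAGE_SHIFT;

	if (xo)
		pte |= 1ULL << m;	/* the repurposed address bit */
	return pte;
}
```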

Backwards Compatibility
-----------------------
Software would previously have received a #PF with the RSVD error code
set when the HW encountered any set bits in the region 51 to M. There
was some internal discussion on whether this feature should therefore
be gated on a virtual MSR, so the OS turns it on only if it knows it
isn't relying on that behavior for bit M. The argument against needing
an MSR is this blurb from the Intel SDM about reserved bits:
"Bits reserved in the paging-structure entries are reserved for future
functionality. Software developers should be aware that such bits may be used in
the future and that a paging-structure entry that causes a page-fault exception
on one processor might not do so in the future."

So in the current patchset there is no MSR write required for the guest
to turn on this feature. The guest gets this behavior whenever QEMU is
run with "-cpu +xo".

KVM XO CPUID Feature Bit
------------------------
Although this patchset targets KVM, the idea is that this interface
might be implemented by other hypervisors. Since it appears very much
like a normal CPU feature, it would be nice if there were a single
CPUID bit to check across different implementations, as there often is
for real CPU features. In the past there was a proposal for "generic
leaves" [1], where regions are assigned for VMMs to define, but where
the behavior will not change across VMMs. This patchset follows that
proposal and defines a bit in a new leaf to expose the presence of the
behavior described above. I'm hoping to get some suggestions from this
RFC on the right way to expose it.
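As a sketch, guest-side detection against the proposed leaf would just
test bit 0 of EAX. The constants match this patchset; the helper is
hypothetical and takes the already-read EAX value rather than issuing
CPUID itself:

```c
#define KVM_CPUID_FEAT_GENERIC	0x40000030
#define KVM_FEATURE_GENERIC_XO	0

/* Returns nonzero if the XO feature is advertised in the EAX value of
 * the generic leaf (read by the caller via CPUID 0x40000030).
 */
static int kvm_xo_supported(unsigned int feat_generic_eax)
{
	return (feat_generic_eax >> KVM_FEATURE_GENERIC_XO) & 1;
}
```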

Injecting Page Faults
---------------------
When there is an attempt to read memory from an XO address range, a #PF is
injected into the guest with P=1, W/R=0, RSVD=0, I/D=0. When there is an attempt
to write, it is P=1, W/R=1, RSVD=0, I/D=0.
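A sketch of how that injected error code is assembled. The X86_PF_*
values are the architectural #PF error code bits; the U/S handling
follows what patch 5 does in try_inject_exec_only_pf(), where the bit
is set for CPL 3 accesses:

```c
#define X86_PF_PROT	(1 << 0)	/* P: fault on a present page */
#define X86_PF_WRITE	(1 << 1)	/* W/R: fault was a write */
#define X86_PF_USER	(1 << 2)	/* U/S: fault from user mode */

/* Build the #PF error code injected for an XO violation: always P=1,
 * W/R set only for writes, RSVD=0, I/D=0.
 */
static unsigned int xo_fault_error_code(int is_write, int cpl)
{
	unsigned int ec = X86_PF_PROT;

	if (is_write)
		ec |= X86_PF_WRITE;
	if (cpl == 3)
		ec |= X86_PF_USER;
	return ec;
}
```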

Implementation
==============
In KVM this patchset adds a new memslot flag, KVM_MEM_EXECONLY, which
maps memory as execute-only via EPT permissions and injects a #PF into
the guest if there is a violation. The x86 emulator is also made aware
of XO memory permissions, and virtualized features that act on PFNs are
made aware that KVM's view of the GFN includes the permission bit (and
so it needs to be masked off to get the guest's view of the PFN).

QEMU manipulates the physical address bits exposed to the guest and adds an
extra KVM_MEM_EXECONLY memslot that points to the same userspace memory in the
XO range for every memslot added in the normal range.
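A rough sketch of that aliasing, with hypothetical names
(KVM_MEM_EXECONLY matches the flag added in patch 3; gpa_bits stands
for the reduced guest MAXPHYADDR):

```c
#include <stdint.h>

#define KVM_MEM_EXECONLY	(1UL << 2)

struct slot_alias {
	uint64_t normal_gpa;	/* slot with normal EPT permissions */
	uint64_t xo_gpa;	/* same memory, XO EPT permissions */
	uint32_t xo_flags;
};

/* For each normal memslot at gpa, the VMM registers a second slot at
 * the same userspace address, offset into the XO half of the guest
 * physical address space.
 */
static struct slot_alias make_slot_alias(uint64_t gpa, unsigned int gpa_bits)
{
	struct slot_alias a = {
		.normal_gpa	= gpa,
		.xo_gpa		= gpa | (1ULL << gpa_bits),
		.xo_flags	= KVM_MEM_EXECONLY,
	};
	return a;
}
```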

The faulting linear address is determined from the EPT feature that
provides the linear address of the violation, if available. If it is
not available, the violating instruction is emulated to determine which
linear address to use in the injected fault.

Performance
===========
The performance impact has not been fully characterized yet. In the
larger patchset that sets the kernel text to be XO, there wasn't any
measurable impact when compiling the kernel. The hope is that there
will not be a large impact, but more testing is needed.

Status
======
Regression testing is still needed including the nested virtualization case and
impact of XO in the other memslot address spaces. This is based on 5.3.

[1] https://lwn.net/Articles/301888/


Rick Edgecombe (13):
  kvm: Enable MTRR to work with GFNs with perm bits
  kvm: Add support for X86_FEATURE_KVM_XO
  kvm: Add XO memslot type
  kvm, vmx: Add support for gva exit qualification
  kvm: Add #PF injection for KVM XO
  kvm: Add KVM_CAP_EXECONLY_MEM
  kvm: Add docs for KVM_CAP_EXECONLY_MEM
  x86/boot: Rename USE_EARLY_PGTABLE_L5
  x86/cpufeature: Add detection of KVM XO
  x86/mm: Add NR page bit for KVM XO
  x86, ptdump: Add NR bit to page table dump
  mmap: Add XO support for KVM XO
  x86/Kconfig: Add Kconfig for KVM based XO

 Documentation/virt/kvm/api.txt                | 16 ++--
 arch/x86/Kconfig                              | 13 +++
 arch/x86/boot/compressed/misc.h               |  2 +-
 arch/x86/include/asm/cpufeature.h             |  7 +-
 arch/x86/include/asm/cpufeatures.h            |  5 +-
 arch/x86/include/asm/disabled-features.h      |  3 +-
 arch/x86/include/asm/kvm_host.h               |  7 ++
 arch/x86/include/asm/pgtable_32_types.h       |  1 +
 arch/x86/include/asm/pgtable_64_types.h       | 30 ++++++-
 arch/x86/include/asm/pgtable_types.h          | 13 +++
 arch/x86/include/asm/required-features.h      |  3 +-
 arch/x86/include/asm/sparsemem.h              |  4 +-
 arch/x86/include/asm/vmx.h                    |  1 +
 arch/x86/include/uapi/asm/kvm_para.h          |  3 +
 arch/x86/kernel/cpu/common.c                  |  7 +-
 arch/x86/kernel/head64.c                      | 43 +++++++++-
 arch/x86/kvm/cpuid.c                          |  7 ++
 arch/x86/kvm/cpuid.h                          |  1 +
 arch/x86/kvm/mmu.c                            | 79 +++++++++++++++++--
 arch/x86/kvm/mtrr.c                           |  8 ++
 arch/x86/kvm/paging_tmpl.h                    | 29 +++++--
 arch/x86/kvm/svm.c                            |  6 ++
 arch/x86/kvm/vmx/vmx.c                        |  6 ++
 arch/x86/kvm/x86.c                            |  9 ++-
 arch/x86/mm/dump_pagetables.c                 |  6 +-
 arch/x86/mm/init.c                            |  3 +
 arch/x86/mm/kasan_init_64.c                   |  2 +-
 include/uapi/linux/kvm.h                      |  2 +
 mm/mmap.c                                     | 30 +++++--
 .../arch/x86/include/asm/disabled-features.h  |  3 +-
 tools/include/uapi/linux/kvm.h                |  1 +
 virt/kvm/kvm_main.c                           | 15 +++-
 32 files changed, 322 insertions(+), 43 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-14  6:47   ` Yu Zhang
  2019-10-03 21:23 ` [RFC PATCH 02/13] kvm: Add support for X86_FEATURE_KVM_XO Rick Edgecombe
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Mask the gfn by maxphyaddr in kvm_mtrr_get_guest_memory_type() so that
the guest's view of the gfn is used when high bits of the physical
address are used as extra permission bits. This supports the KVM XO
feature.

TODO: Since MTRR is emulated using EPT permissions, the XO alias of a
gpa range will not inherit the MTRR type with this implementation.
There shouldn't be any legacy use of KVM XO, but hypothetically it
could interfere with the uncacheable MTRR type.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/kvm/mtrr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 25ce3edd1872..da38f3b83e51 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -621,6 +621,14 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn)
 	const int wt_wb_mask = (1 << MTRR_TYPE_WRBACK)
 			       | (1 << MTRR_TYPE_WRTHROUGH);
 
+	/*
+	 * Handle situations where gfn bits are used as permission bits by
+	 * masking KVM's view of the gfn with the guest's physical address
+	 * width so that it matches the guest's view of the physical address.
+	 * For normal situations this will have no effect.
+	 */
+	gfn &= (1ULL << (cpuid_maxphyaddr(vcpu) - PAGE_SHIFT)) - 1;
+
 	start = gfn_to_gpa(gfn);
 	end = start + PAGE_SIZE;
 
-- 
2.17.1



* [RFC PATCH 02/13] kvm: Add support for X86_FEATURE_KVM_XO
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 03/13] kvm: Add XO memslot type Rick Edgecombe
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add X86_FEATURE_KVM_XO, which reduces the physical address bits exposed
by CPUID and uses the host's highest physical address bit as an XO/NR
permission bit in the guest page tables. Adjust the reserved-bit mask
so KVM's guest page table walks are aware this bit is not reserved.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/cpufeatures.h   |  3 +++
 arch/x86/include/uapi/asm/kvm_para.h |  3 +++
 arch/x86/kvm/cpuid.c                 |  7 +++++++
 arch/x86/kvm/cpuid.h                 |  1 +
 arch/x86/kvm/mmu.c                   | 18 ++++++++++++------
 6 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 58acda503817..17127ffbc2a2 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -30,6 +30,7 @@ enum cpuid_leafs
 	CPUID_7_ECX,
 	CPUID_8000_0007_EBX,
 	CPUID_7_EDX,
+	CPUID_4000_0030_EAX
 };
 
 #ifdef CONFIG_X86_FEATURE_NAMES
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index e880f2408e29..7ba217e894ea 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -364,6 +364,9 @@
 #define X86_FEATURE_ARCH_CAPABILITIES	(18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
 #define X86_FEATURE_SPEC_CTRL_SSBD	(18*32+31) /* "" Speculative Store Bypass Disable */
 
+/* KVM-defined CPU features, CPUID level 0x40000030 (EAX), word 19 */
+#define X86_FEATURE_KVM_XO		(19*32+0) /* KVM EPT based execute only memory support */
+
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 2a8e0b6b9805..ecff0ff25cf4 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -34,6 +34,9 @@
 
 #define KVM_HINTS_REALTIME      0
 
+#define KVM_CPUID_FEAT_GENERIC	0x40000030
+#define KVM_FEATURE_GENERIC_XO		0
+
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
  */
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 22c2720cd948..bcbf3f93602d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -700,6 +700,12 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
 		if (sched_info_on())
 			entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
 
+		entry->ebx = 0;
+		entry->ecx = 0;
+		entry->edx = 0;
+		break;
+	case KVM_CPUID_FEAT_GENERIC:
+		entry->eax = (1 << KVM_FEATURE_GENERIC_XO);
 		entry->ebx = 0;
 		entry->ecx = 0;
 		entry->edx = 0;
@@ -845,6 +851,7 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 		{ .func = 0x80000000 },
 		{ .func = 0xC0000000, .qualifier = is_centaur_cpu },
 		{ .func = KVM_CPUID_SIGNATURE },
+		{ .func = KVM_CPUID_FEAT_GENERIC },
 	};
 
 	if (cpuid->nent < 1)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index d78a61408243..c36d462a0e01 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -53,6 +53,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
 	[CPUID_7_ECX]         = {         7, 0, CPUID_ECX},
 	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
 	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
+	[CPUID_4000_0030_EAX] = {0x40000030, 0, CPUID_EAX},
 };
 
 static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a63964e7cec7..e44a8053af78 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4358,12 +4358,15 @@ static void
 __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 			struct rsvd_bits_validate *rsvd_check,
 			int maxphyaddr, int level, bool nx, bool gbpages,
-			bool pse, bool amd)
+			bool pse, bool amd, bool xo)
 {
 	u64 exb_bit_rsvd = 0;
 	u64 gbpages_bit_rsvd = 0;
 	u64 nonleaf_bit8_rsvd = 0;
 
+	/* Adjust maxphyaddr to include the XO bit if in use */
+	maxphyaddr += xo;
+
 	rsvd_check->bad_mt_xwr = 0;
 
 	if (!nx)
@@ -4448,10 +4451,12 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu *context)
 {
 	__reset_rsvds_bits_mask(vcpu, &context->guest_rsvd_check,
-				cpuid_maxphyaddr(vcpu), context->root_level,
+				cpuid_maxphyaddr(vcpu),
+				context->root_level,
 				context->nx,
 				guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES),
-				is_pse(vcpu), guest_cpuid_is_amd(vcpu));
+				is_pse(vcpu), guest_cpuid_is_amd(vcpu),
+				guest_cpuid_has(vcpu, X86_FEATURE_KVM_XO));
 }
 
 static void
@@ -4520,7 +4525,7 @@ reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 				shadow_phys_bits,
 				context->shadow_root_level, uses_nx,
 				guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES),
-				is_pse(vcpu), true);
+				is_pse(vcpu), true, false);
 
 	if (!shadow_me_mask)
 		return;
@@ -4557,7 +4562,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 					shadow_phys_bits,
 					context->shadow_root_level, false,
 					boot_cpu_has(X86_FEATURE_GBPAGES),
-					true, true);
+					true, true, false);
 	else
 		__reset_rsvds_bits_mask_ept(shadow_zero_check,
 					    shadow_phys_bits,
@@ -4818,7 +4823,8 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu)
 	ext.cr4_pse = !!is_pse(vcpu);
 	ext.cr4_pke = !!kvm_read_cr4_bits(vcpu, X86_CR4_PKE);
 	ext.cr4_la57 = !!kvm_read_cr4_bits(vcpu, X86_CR4_LA57);
-	ext.maxphyaddr = cpuid_maxphyaddr(vcpu);
+	ext.maxphyaddr = cpuid_maxphyaddr(vcpu)
+			 + guest_cpuid_has(vcpu, X86_FEATURE_KVM_XO);
 
 	ext.valid = 1;
 
-- 
2.17.1



* [RFC PATCH 03/13] kvm: Add XO memslot type
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 02/13] kvm: Add support for X86_FEATURE_KVM_XO Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-04  7:27   ` Paolo Bonzini
  2019-10-03 21:23 ` [RFC PATCH 04/13] kvm, vmx: Add support for gva exit qualification Rick Edgecombe
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe, Yu Zhang

Add XO memslot type to create execute-only guest physical memory based on
the RO memslot. Like the RO memslot, disallow changing the memslot type
to/from XO.

In the EPT case ACC_USER_MASK represents the readable bit, so add the
ability for set_spte() to unset this.

This is based in part on a patch by Yu Zhang.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/kvm/mmu.c             |  9 ++++++++-
 include/uapi/linux/kvm.h       |  1 +
 tools/include/uapi/linux/kvm.h |  1 +
 virt/kvm/kvm_main.c            | 15 ++++++++++++++-
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e44a8053af78..338cc64cc821 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2981,6 +2981,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 
 	if (pte_access & ACC_USER_MASK)
 		spte |= shadow_user_mask;
+	else
+		spte &= ~shadow_user_mask;
 
 	if (level > PT_PAGE_TABLE_LEVEL)
 		spte |= PT_PAGE_SIZE_MASK;
@@ -3203,6 +3205,11 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write,
 	int ret;
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	gfn_t base_gfn = gfn;
+	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+	unsigned int pte_access = ACC_ALL;
+
+	if (slot && slot->flags & KVM_MEM_EXECONLY)
+		pte_access = ACC_EXEC_MASK;
 
 	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
 		return RET_PF_RETRY;
@@ -3222,7 +3229,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write,
 		}
 	}
 
-	ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL,
+	ret = mmu_set_spte(vcpu, it.sptep, pte_access,
 			   write, level, base_gfn, pfn, prefault,
 			   map_writable);
 	direct_pte_prefetch(vcpu, it.sptep);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5e3f12d5359e..ede487b7b216 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -109,6 +109,7 @@ struct kvm_userspace_memory_region {
  */
 #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
 #define KVM_MEM_READONLY	(1UL << 1)
+#define KVM_MEM_EXECONLY	(1UL << 2)
 
 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 5e3f12d5359e..ede487b7b216 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -109,6 +109,7 @@ struct kvm_userspace_memory_region {
  */
 #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
 #define KVM_MEM_READONLY	(1UL << 1)
+#define KVM_MEM_EXECONLY	(1UL << 2)
 
 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c6a91b044d8d..65087c1d67be 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -865,6 +865,8 @@ static int check_memory_region_flags(const struct kvm_userspace_memory_region *m
 	valid_flags |= KVM_MEM_READONLY;
 #endif
 
+	valid_flags |= KVM_MEM_EXECONLY;
+
 	if (mem->flags & ~valid_flags)
 		return -EINVAL;
 
@@ -969,9 +971,12 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		if (!old.npages)
 			change = KVM_MR_CREATE;
 		else { /* Modify an existing slot. */
+			const __u8 changeable = KVM_MEM_READONLY
+					       | KVM_MEM_EXECONLY;
+
 			if ((mem->userspace_addr != old.userspace_addr) ||
 			    (npages != old.npages) ||
-			    ((new.flags ^ old.flags) & KVM_MEM_READONLY))
+			    ((new.flags ^ old.flags) & changeable))
 				goto out;
 
 			if (base_gfn != old.base_gfn)
@@ -1356,6 +1361,11 @@ static bool memslot_is_readonly(struct kvm_memory_slot *slot)
 	return slot->flags & KVM_MEM_READONLY;
 }
 
+static bool memslot_is_execonly(struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_EXECONLY;
+}
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {
@@ -1365,6 +1375,9 @@ static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
 	if (memslot_is_readonly(slot) && write)
 		return KVM_HVA_ERR_RO_BAD;
 
+	if (memslot_is_execonly(slot) && write)
+		return KVM_HVA_ERR_RO_BAD;
+
 	if (nr_pages)
 		*nr_pages = slot->npages - (gfn - slot->base_gfn);
 
-- 
2.17.1



* [RFC PATCH 04/13] kvm, vmx: Add support for gva exit qualification
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (2 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 03/13] kvm: Add XO memslot type Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO Rick Edgecombe
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

VMX supports providing the guest virtual address that caused an EPT
violation. Add support for this so it can be used by the KVM XO
feature.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 4 ++++
 arch/x86/include/asm/vmx.h      | 1 +
 arch/x86/kvm/vmx/vmx.c          | 5 +++++
 arch/x86/kvm/x86.c              | 1 +
 4 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index bdc16b0aa7c6..b363a7fc47b0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -781,6 +781,10 @@ struct kvm_vcpu_arch {
 	bool gpa_available;
 	gpa_t gpa_val;
 
+	/* GVA available */
+	bool gva_available;
+	gva_t gva_val;
+
 	/* be preempted when it's in kernel-mode(cpl=0) */
 	bool preempted_in_kernel;
 
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index a39136b0d509..67457f2d19e2 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -522,6 +522,7 @@ struct vmx_msr_entry {
 #define EPT_VIOLATION_READABLE_BIT	3
 #define EPT_VIOLATION_WRITABLE_BIT	4
 #define EPT_VIOLATION_EXECUTABLE_BIT	5
+#define EPT_VIOLATION_GVA_LINEAR_VALID	7
 #define EPT_VIOLATION_GVA_TRANSLATED_BIT 8
 #define EPT_VIOLATION_ACC_READ		(1 << EPT_VIOLATION_ACC_READ_BIT)
 #define EPT_VIOLATION_ACC_WRITE		(1 << EPT_VIOLATION_ACC_WRITE_BIT)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c030c96fc81a..a30dbab8a2d4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5116,6 +5116,11 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
 	error_code |= (exit_qualification & 0x100) != 0 ?
 	       PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK;
 
+	if (exit_qualification & (1 << EPT_VIOLATION_GVA_LINEAR_VALID)) {
+		vcpu->arch.gva_available = true;
+		vcpu->arch.gva_val = vmcs_readl(GUEST_LINEAR_ADDRESS);
+	}
+
 	vcpu->arch.exit_qualification = exit_qualification;
 	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 91602d310a3f..aa138d3a86c5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8092,6 +8092,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		kvm_lapic_sync_from_vapic(vcpu);
 
 	vcpu->arch.gpa_available = false;
+	vcpu->arch.gva_available = false;
 	r = kvm_x86_ops->handle_exit(vcpu);
 	return r;
 
-- 
2.17.1



* [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (3 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 04/13] kvm, vmx: Add support for gva exit qualification Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-04  7:42   ` Paolo Bonzini
  2019-10-03 21:23 ` [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM Rick Edgecombe
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

If there is a read or write violation on the gfn range of an XO
memslot, inject a page fault into the guest with the guest virtual
address that faulted. This can be done directly if the hardware
provides the gva that caused the fault. Otherwise, the violating
instruction needs to be emulated to figure it out.

TODO:
Currently ACC_USER_MASK is used to mean not-readable in the EPT case,
but in the x86 page tables case it means the real user bit and so can't
be overloaded to mean not-readable. Probably a new dedicated ACC_ flag
is needed for not-readable in XOM cases. Instead of changing that
everywhere, a conditional is added in paging_tmpl.h to check for the
KVM XO bit. This should probably be made to work with the logic in
permission_fault() instead of having a special case.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.c              | 52 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/paging_tmpl.h      | 29 ++++++++++++++----
 arch/x86/kvm/x86.c              |  5 +++-
 4 files changed, 82 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b363a7fc47b0..6d06c794d720 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -785,6 +785,8 @@ struct kvm_vcpu_arch {
 	bool gva_available;
 	gva_t gva_val;
 
+	bool xo_fault;
+
 	/* be preempted when it's in kernel-mode(cpl=0) */
 	bool preempted_in_kernel;
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 338cc64cc821..d5ba44066b62 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -45,6 +45,7 @@
 #include <asm/io.h>
 #include <asm/vmx.h>
 #include <asm/kvm_page_track.h>
+#include <asm/traps.h>
 #include "trace.h"
 
 /*
@@ -4130,6 +4131,34 @@ check_hugepage_cache_consistency(struct kvm_vcpu *vcpu, gfn_t gfn, int level)
 	return kvm_mtrr_check_gfn_range_consistency(vcpu, gfn, page_num);
 }
 
+
+static int try_inject_exec_only_pf(struct kvm_vcpu *vcpu, u64 error_code)
+{
+	struct x86_exception fault;
+	int cpl = kvm_x86_ops->get_cpl(vcpu);
+	/*
+	 * There is an assumption here that if there is an TDP violation for an
+	 * XO memslot, then it must be a read or write fault.
+	 */
+	u16 fault_error_code = X86_PF_PROT | (cpl == 3 ? X86_PF_USER : 0);
+
+	if (!vcpu->arch.gva_available)
+		return 0;
+
+	if (error_code & PFERR_WRITE_MASK)
+		fault_error_code |= X86_PF_WRITE;
+
+	fault.vector = PF_VECTOR;
+	fault.error_code_valid = true;
+	fault.error_code = fault_error_code;
+	fault.nested_page_fault = false;
+	fault.address = vcpu->arch.gva_val;
+	fault.async_page_fault = true;
+	kvm_inject_page_fault(vcpu, &fault);
+
+	return 1;
+}
+
 static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 			  bool prefault)
 {
@@ -4141,12 +4170,35 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	unsigned long mmu_seq;
 	int write = error_code & PFERR_WRITE_MASK;
 	bool map_writable;
+	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 
 	MMU_WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa));
 
 	if (page_fault_handle_page_track(vcpu, error_code, gfn))
 		return RET_PF_EMULATE;
 
+	/*
+	 * Set xo_fault when the fault is a read or write fault on an xo memslot
+	 * so that the emulator knows it needs to check page table permissions
+	 * and will inject a fault.
+	 */
+	vcpu->arch.xo_fault = false;
+	if (slot && unlikely((slot->flags & KVM_MEM_EXECONLY)
+		&& !(error_code & PFERR_FETCH_MASK)))
+		vcpu->arch.xo_fault = true;
+
+	/* If memslot is xo, need to inject fault */
+	if (unlikely(vcpu->arch.xo_fault)) {
+		/*
+		 * If not enough information to inject the fault,
+		 * emulate to figure it out and emulate the PF.
+		 */
+		if (!try_inject_exec_only_pf(vcpu, error_code))
+			return RET_PF_EMULATE;
+
+		return 1;
+	}
+
 	r = mmu_topup_memory_caches(vcpu);
 	if (r)
 		return r;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 7d5cdb3af594..eae1871c5225 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -307,7 +307,9 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	gpa_t pte_gpa;
 	bool have_ad;
 	int offset;
-	u64 walk_nx_mask = 0;
+	u64 walk_mask = 0;
+	u64 walk_nr_mask = 0;
+	bool kvm_xo = guest_cpuid_has(vcpu, X86_FEATURE_KVM_XO);
 	const int write_fault = access & PFERR_WRITE_MASK;
 	const int user_fault  = access & PFERR_USER_MASK;
 	const int fetch_fault = access & PFERR_FETCH_MASK;
@@ -322,7 +324,11 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	have_ad       = PT_HAVE_ACCESSED_DIRTY(mmu);
 
 #if PTTYPE == 64
-	walk_nx_mask = 1ULL << PT64_NX_SHIFT;
+	walk_mask = 1ULL << PT64_NX_SHIFT;
+	if (kvm_xo) {
+		walk_nr_mask = 1ULL << cpuid_maxphyaddr(vcpu);
+		walk_mask |= walk_nr_mask;
+	}
 	if (walker->level == PT32E_ROOT_LEVEL) {
 		pte = mmu->get_pdptr(vcpu, (addr >> 30) & 3);
 		trace_kvm_mmu_paging_element(pte, walker->level);
@@ -395,7 +401,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 		 * Inverting the NX it lets us AND it like other
 		 * permission bits.
 		 */
-		pte_access = pt_access & (pte ^ walk_nx_mask);
+		pte_access = pt_access & (pte ^ walk_mask);
 
 		if (unlikely(!FNAME(is_present_gpte)(pte)))
 			goto error;
@@ -412,12 +418,25 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
 
 	/* Convert to ACC_*_MASK flags for struct guest_walker.  */
-	walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
-	walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
+	walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_mask);
+	walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_mask);
+
 	errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
 	if (unlikely(errcode))
 		goto error;
 
+	/*
+	 * The KVM XO bit is not checked in permission_fault(), so check it
+	 * here and inject the appropriate fault.
+	 */
+	if (kvm_xo && !fetch_fault
+	    && (walk_nr_mask & (pte_access ^ walk_nr_mask))) {
+		errcode = PFERR_PRESENT_MASK;
+		if (write_fault)
+			errcode	|= PFERR_WRITE_MASK;
+		goto error;
+	}
+
 	gfn = gpte_to_gfn_lvl(pte, walker->level);
 	gfn += (addr & PT_LVL_OFFSET_MASK(walker->level)) >> PAGE_SHIFT;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index aa138d3a86c5..2e321d788672 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5494,8 +5494,11 @@ static int emulator_read_write_onepage(unsigned long addr, void *val,
 	 * Note, this cannot be used on string operations since string
 	 * operation using rep will only have the initial GPA from the NPF
 	 * occurred.
+	 *
+	 * If the fault was an XO fault, we need to walk the page tables to
+	 * determine the gva and emulate the PF.
 	 */
-	if (vcpu->arch.gpa_available &&
+	if (!vcpu->arch.xo_fault && vcpu->arch.gpa_available &&
 	    emulator_can_use_gpa(ctxt) &&
 	    (addr & ~PAGE_MASK) == (vcpu->arch.gpa_val & ~PAGE_MASK)) {
 		gpa = vcpu->arch.gpa_val;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (4 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-04  7:24   ` Paolo Bonzini
  2019-10-03 21:23 ` [RFC PATCH 07/13] kvm: Add docs for KVM_CAP_EXECONLY_MEM Rick Edgecombe
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add a KVM capability for the KVM_MEM_EXECONLY memslot type. This memslot
type is supported if the HW supports execute-only TDP.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/svm.c              | 6 ++++++
 arch/x86/kvm/vmx/vmx.c          | 1 +
 arch/x86/kvm/x86.c              | 3 +++
 include/uapi/linux/kvm.h        | 1 +
 5 files changed, 12 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6d06c794d720..be3ff71e6227 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1132,6 +1132,7 @@ struct kvm_x86_ops {
 	bool (*xsaves_supported)(void);
 	bool (*umip_emulated)(void);
 	bool (*pt_supported)(void);
+	bool (*tdp_xo_supported)(void);
 
 	int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
 	void (*request_immediate_exit)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e0368076a1ef..f9f25f32e946 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6005,6 +6005,11 @@ static bool svm_pt_supported(void)
 	return false;
 }
 
+static bool svm_xo_supported(void)
+{
+	return false;
+}
+
 static bool svm_has_wbinvd_exit(void)
 {
 	return true;
@@ -7293,6 +7298,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.xsaves_supported = svm_xsaves_supported,
 	.umip_emulated = svm_umip_emulated,
 	.pt_supported = svm_pt_supported,
+	.tdp_xo_supported = svm_xo_supported,
 
 	.set_supported_cpuid = svm_set_supported_cpuid,
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a30dbab8a2d4..7e7260c715f2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7767,6 +7767,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.xsaves_supported = vmx_xsaves_supported,
 	.umip_emulated = vmx_umip_emulated,
 	.pt_supported = vmx_pt_supported,
+	.tdp_xo_supported = cpu_has_vmx_ept_execute_only,
 
 	.request_immediate_exit = vmx_request_immediate_exit,
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2e321d788672..810cfdb1a315 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3183,6 +3183,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = kvm_x86_ops->get_nested_state ?
 			kvm_x86_ops->get_nested_state(NULL, NULL, 0) : 0;
 		break;
+	case KVM_CAP_EXECONLY_MEM:
+		r = kvm_x86_ops->tdp_xo_supported();
+		break;
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index ede487b7b216..7778a1f03b78 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -997,6 +997,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
 #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
 #define KVM_CAP_PMU_EVENT_FILTER 173
+#define KVM_CAP_EXECONLY_MEM 174
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.17.1



* [RFC PATCH 07/13] kvm: Add docs for KVM_CAP_EXECONLY_MEM
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (5 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 08/13] x86/boot: Rename USE_EARLY_PGTABLE_L5 Rick Edgecombe
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add documentation for the KVM_CAP_EXECONLY_MEM capability and
KVM_MEM_EXECONLY memslot.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 Documentation/virt/kvm/api.txt | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index 2d067767b617..a8001f996a8a 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -1096,6 +1096,7 @@ struct kvm_userspace_memory_region {
 /* for kvm_memory_region::flags */
 #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
 #define KVM_MEM_READONLY	(1UL << 1)
+#define KVM_MEM_EXECONLY	(1UL << 2)
 
 This ioctl allows the user to create, modify or delete a guest physical
 memory slot.  Bits 0-15 of "slot" specify the slot id and this value
@@ -1123,12 +1124,15 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical.  This allows large pages in the guest to be backed by large
 pages in the host.
 
-The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
-KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
-writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
-use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
-to make a new slot read-only.  In this case, writes to this memory will be
-posted to userspace as KVM_EXIT_MMIO exits.
+The flags field supports three flags: KVM_MEM_LOG_DIRTY_PAGES, KVM_MEM_READONLY
+and KVM_MEM_EXECONLY.  KVM_MEM_LOG_DIRTY_PAGES can be set to instruct KVM to
+keep track of writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to
+know how to use it.  KVM_MEM_READONLY can be set, if KVM_CAP_READONLY_MEM
+capability allows it, to make a new slot read-only.  In this case, writes to
+this memory will be posted to userspace as KVM_EXIT_MMIO exits. KVM_MEM_EXECONLY
+can be set, if KVM_CAP_EXECONLY_MEM capability allows it, to make a new slot
+exec-only. Guest read accesses to a KVM_MEM_EXECONLY slot will trigger an
+appropriate fault injected into the guest, in support of X86_FEATURE_KVM_XO.
 
 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
 the memory region are automatically reflected into the guest.  For example, an
-- 
2.17.1



* [RFC PATCH 08/13] x86/boot: Rename USE_EARLY_PGTABLE_L5
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (6 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 07/13] kvm: Add docs for KVM_CAP_EXECONLY_MEM Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO Rick Edgecombe
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Rename USE_EARLY_PGTABLE_L5 to USE_EARLY_PGTABLE so that it can be used
by other early boot detectable page table features.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/boot/compressed/misc.h         | 2 +-
 arch/x86/include/asm/pgtable_64_types.h | 4 ++--
 arch/x86/kernel/cpu/common.c            | 2 +-
 arch/x86/kernel/head64.c                | 2 +-
 arch/x86/mm/kasan_init_64.c             | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index c8181392f70d..45a23aa807bd 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -14,7 +14,7 @@
 #undef CONFIG_KASAN
 
 /* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
+#define USE_EARLY_PGTABLE
 
 #include <linux/linkage.h>
 #include <linux/screen_info.h>
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 52e5f5f2240d..6b55b837ead4 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -23,7 +23,7 @@ typedef struct { pteval_t pte; } pte_t;
 #ifdef CONFIG_X86_5LEVEL
 extern unsigned int __pgtable_l5_enabled;
 
-#ifdef USE_EARLY_PGTABLE_L5
+#ifdef USE_EARLY_PGTABLE
 /*
  * cpu_feature_enabled() is not available in early boot code.
  * Use variable instead.
@@ -34,7 +34,7 @@ static inline bool pgtable_l5_enabled(void)
 }
 #else
 #define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
-#endif /* USE_EARLY_PGTABLE_L5 */
+#endif /* USE_EARLY_PGTABLE */
 
 #else
 #define pgtable_l5_enabled() 0
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f125bf7ecb6f..4f08e164c0b1 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
+#define USE_EARLY_PGTABLE
 
 #include <linux/memblock.h>
 #include <linux/linkage.h>
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 29ffa495bd1c..55f5294c3cdf 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -8,7 +8,7 @@
 #define DISABLE_BRANCH_PROFILING
 
 /* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
+#define USE_EARLY_PGTABLE
 
 #include <linux/init.h>
 #include <linux/linkage.h>
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 296da58f3013..9466d7abae49 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -3,7 +3,7 @@
 #define pr_fmt(fmt) "kasan: " fmt
 
 /* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
+#define USE_EARLY_PGTABLE
 
 #include <linux/memblock.h>
 #include <linux/kasan.h>
-- 
2.17.1



* [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (7 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 08/13] x86/boot: Rename USE_EARLY_PGTABLE_L5 Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-29 23:33   ` Kees Cook
  2019-10-03 21:23 ` [RFC PATCH 10/13] x86/mm: Add NR page bit for " Rick Edgecombe
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add a new CPUID leaf to hold the contents of CPUID 0x40000030 EAX to
detect KVM defined generic VMM features.

The leaf was proposed to allow KVM to communicate features that are
defined by KVM, but available for any VMM to implement.

Add cpu_feature_enabled() support for features in this leaf (KVM XO), and
a pgtable_kvmxo_enabled() helper similar to pgtable_l5_enabled() so that
pgtable_kvmxo_enabled() can be used in early code that includes
arch/x86/include/asm/sparsemem.h.

Lastly, in head64.c detect this feature and perform the necessary
adjustments to physical_mask.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/cpufeature.h             |  6 ++-
 arch/x86/include/asm/cpufeatures.h            |  2 +-
 arch/x86/include/asm/disabled-features.h      |  3 +-
 arch/x86/include/asm/pgtable_32_types.h       |  1 +
 arch/x86/include/asm/pgtable_64_types.h       | 26 ++++++++++++-
 arch/x86/include/asm/required-features.h      |  3 +-
 arch/x86/include/asm/sparsemem.h              |  4 +-
 arch/x86/kernel/cpu/common.c                  |  5 +++
 arch/x86/kernel/head64.c                      | 38 ++++++++++++++++++-
 .../arch/x86/include/asm/disabled-features.h  |  3 +-
 10 files changed, 80 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 17127ffbc2a2..7d04ea4f1623 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -82,8 +82,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
+	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 19, feature_bit) ||	\
 	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 20))
 
 #define DISABLED_MASK_BIT_SET(feature_bit)				\
 	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -105,8 +106,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
+	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 19, feature_bit) ||	\
 	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 20))
 
 #define cpu_has(c, bit)							\
 	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 7ba217e894ea..9c1b07674401 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -13,7 +13,7 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS			19	   /* N 32-bit words worth of info */
+#define NCAPINTS			20	   /* N 32-bit words worth of info */
 #define NBUGINTS			1	   /* N 32-bit bug flags */
 
 /*
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index a5ea841cc6d2..f0f935f8d917 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -84,6 +84,7 @@
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK19	0
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h
index b0bc0fff5f1f..57a11692715e 100644
--- a/arch/x86/include/asm/pgtable_32_types.h
+++ b/arch/x86/include/asm/pgtable_32_types.h
@@ -16,6 +16,7 @@
 #endif
 
 #define pgtable_l5_enabled() 0
+#define pgtable_kvmxo_enabled() 0
 
 #define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 6b55b837ead4..7c7c9d1a199a 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -43,10 +43,34 @@ static inline bool pgtable_l5_enabled(void)
 extern unsigned int pgdir_shift;
 extern unsigned int ptrs_per_p4d;
 
+#ifdef CONFIG_KVM_XO
+extern unsigned int __pgtable_kvmxo_enabled;
+
+#ifdef USE_EARLY_PGTABLE
+/*
+ * cpu_feature_enabled() is not available in early boot code.
+ * Use variable instead.
+ */
+static inline bool pgtable_kvmxo_enabled(void)
+{
+	return __pgtable_kvmxo_enabled;
+}
+#else
+#define pgtable_kvmxo_enabled() cpu_feature_enabled(X86_FEATURE_KVM_XO)
+#endif /* USE_EARLY_PGTABLE */
+
+#else
+#define pgtable_kvmxo_enabled() 0
+#endif /* CONFIG_KVM_XO */
+
 #endif	/* !__ASSEMBLY__ */
 
 #define SHARED_KERNEL_PMD	0
 
+#if defined(CONFIG_X86_5LEVEL) || defined(CONFIG_KVM_XO)
+#define MAX_POSSIBLE_PHYSMEM_BITS	52
+#endif
+
 #ifdef CONFIG_X86_5LEVEL
 
 /*
@@ -64,8 +88,6 @@ extern unsigned int ptrs_per_p4d;
 #define P4D_SIZE		(_AC(1, UL) << P4D_SHIFT)
 #define P4D_MASK		(~(P4D_SIZE - 1))
 
-#define MAX_POSSIBLE_PHYSMEM_BITS	52
-
 #else /* CONFIG_X86_5LEVEL */
 
 /*
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index 6847d85400a8..fa5700097f64 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -101,6 +101,7 @@
 #define REQUIRED_MASK16	0
 #define REQUIRED_MASK17	0
 #define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK19	0
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
 
 #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index 199218719a86..24b305195369 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -27,8 +27,8 @@
 # endif
 #else /* CONFIG_X86_32 */
 # define SECTION_SIZE_BITS	27 /* matt - 128 is convenient right now */
-# define MAX_PHYSADDR_BITS	(pgtable_l5_enabled() ? 52 : 44)
-# define MAX_PHYSMEM_BITS	(pgtable_l5_enabled() ? 52 : 46)
+# define MAX_PHYSADDR_BITS	((pgtable_l5_enabled() ? 52 : 44) - !!pgtable_kvmxo_enabled())
+# define MAX_PHYSMEM_BITS	((pgtable_l5_enabled() ? 52 : 46) - !!pgtable_kvmxo_enabled())
 #endif
 
 #endif /* CONFIG_SPARSEMEM */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4f08e164c0b1..ee204aefbcfd 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -933,6 +933,11 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 		c->x86_capability[CPUID_D_1_EAX] = eax;
 	}
 
+	eax = cpuid_eax(0x40000000);
+	c->extended_cpuid_level = eax;
+	if (c->extended_cpuid_level >= 0x40000030)
+		c->x86_capability[CPUID_4000_0030_EAX] = cpuid_eax(0x40000030);
+
 	/* AMD-defined flags: level 0x80000001 */
 	eax = cpuid_eax(0x80000000);
 	c->extended_cpuid_level = eax;
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 55f5294c3cdf..7091702a7bec 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -52,6 +52,11 @@ unsigned int ptrs_per_p4d __ro_after_init = 1;
 EXPORT_SYMBOL(ptrs_per_p4d);
 #endif
 
+#ifdef CONFIG_KVM_XO
+unsigned int __pgtable_kvmxo_enabled __ro_after_init;
+unsigned int __pgtable_kvmxo_bit __ro_after_init;
+#endif /* CONFIG_KVM_XO */
+
 #ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
 unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
 EXPORT_SYMBOL(page_offset_base);
@@ -73,12 +78,14 @@ static unsigned long __head *fixup_long(void *ptr, unsigned long physaddr)
 	return fixup_pointer(ptr, physaddr);
 }
 
-#ifdef CONFIG_X86_5LEVEL
+#if defined(CONFIG_X86_5LEVEL) || defined(CONFIG_KVM_XO)
 static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
 {
 	return fixup_pointer(ptr, physaddr);
 }
+#endif
 
+#ifdef CONFIG_X86_5LEVEL
 static bool __head check_la57_support(unsigned long physaddr)
 {
 	/*
@@ -104,6 +111,33 @@ static bool __head check_la57_support(unsigned long physaddr)
 }
 #endif
 
+#ifdef CONFIG_KVM_XO
+static void __head check_kvmxo_support(unsigned long physaddr)
+{
+	unsigned long physbits;
+
+	if ((native_cpuid_eax(0x40000000) < 0x40000030) ||
+	    !(native_cpuid_eax(0x40000030) & (1 << (X86_FEATURE_KVM_XO & 31))))
+		return;
+
+	if (native_cpuid_eax(0x80000000) < 0x80000008)
+		return;
+
+	physbits = native_cpuid_eax(0x80000008) & 0xff;
+
+	/*
+	 * If KVM XO is active, the top physical address bit is the permission
+	 * bit, so zero it in the mask.
+	 */
+	physical_mask &= ~(1UL << physbits);
+
+	*fixup_int(&__pgtable_kvmxo_enabled, physaddr) = 1;
+	*fixup_int(&__pgtable_kvmxo_bit, physaddr) = physbits;
+}
+#else /* CONFIG_KVM_XO */
+static void __head check_kvmxo_support(unsigned long physaddr) { }
+#endif /* CONFIG_KVM_XO */
+
 /* Code in __startup_64() can be relocated during execution, but the compiler
  * doesn't have to generate PC-relative relocations when accessing globals from
  * that function. Clang actually does not generate them, which leads to
@@ -127,6 +161,8 @@ unsigned long __head __startup_64(unsigned long physaddr,
 
 	la57 = check_la57_support(physaddr);
 
+	check_kvmxo_support(physaddr);
+
 	/* Is the address too large? */
 	if (physaddr >> MAX_PHYSMEM_BITS)
 		for (;;);
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index a5ea841cc6d2..f0f935f8d917 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -84,6 +84,7 @@
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK19	0
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
-- 
2.17.1



* [RFC PATCH 10/13] x86/mm: Add NR page bit for KVM XO
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (8 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-04  7:33   ` Paolo Bonzini
  2019-10-03 21:23 ` [RFC PATCH 11/13] x86, ptdump: Add NR bit to page table dump Rick Edgecombe
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add _PAGE_BIT_NR and _PAGE_NR, the values of which are determined
dynamically at boot. This page type is only valid after checking for
the KVM XO CPUID bit.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/pgtable_types.h | 11 +++++++++++
 arch/x86/mm/init.c                   |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index b5e49e6bac63..d3c92c992089 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -30,6 +30,14 @@
 #define _PAGE_BIT_PKEY_BIT3	62	/* Protection Keys, bit 4/4 */
 #define _PAGE_BIT_NX		63	/* No execute: only valid after cpuid check */
 
+#if defined(CONFIG_KVM_XO) && !defined(__ASSEMBLY__)
+extern unsigned int __pgtable_kvmxo_bit;
+/* KVM based not-readable: only valid after cpuid check */
+#define _PAGE_BIT_NR		(__pgtable_kvmxo_bit)
+#else /* defined(CONFIG_KVM_XO) && !defined(__ASSEMBLY__) */
+#define _PAGE_BIT_NR		0
+#endif /* defined(CONFIG_KVM_XO) && !defined(__ASSEMBLY__) */
+
 #define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
@@ -39,6 +47,9 @@
 /* - if the user mapped it with PROT_NONE; pte_present gives true */
 #define _PAGE_BIT_PROTNONE	_PAGE_BIT_GLOBAL
 
+#define _PAGE_NR	(pgtable_kvmxo_enabled() ? \
+			(_AT(pteval_t, 1) << _PAGE_BIT_NR) : 0)
+
 #define _PAGE_PRESENT	(_AT(pteval_t, 1) << _PAGE_BIT_PRESENT)
 #define _PAGE_RW	(_AT(pteval_t, 1) << _PAGE_BIT_RW)
 #define _PAGE_USER	(_AT(pteval_t, 1) << _PAGE_BIT_USER)
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index fd10d91a6115..7298156a76d5 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -195,6 +195,9 @@ static void __init probe_page_size_mask(void)
 		__supported_pte_mask |= _PAGE_GLOBAL;
 	}
 
+	if (pgtable_kvmxo_enabled())
+		__supported_pte_mask |= _PAGE_NR;
+
 	/* By the default is everything supported: */
 	__default_kernel_pte_mask = __supported_pte_mask;
 	/* Except when with PTI where the kernel is mostly non-Global: */
-- 
2.17.1



* [RFC PATCH 11/13] x86, ptdump: Add NR bit to page table dump
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (9 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 10/13] x86/mm: Add NR page bit for " Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-03 21:23 ` [RFC PATCH 12/13] mmap: Add XO support for KVM XO Rick Edgecombe
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add printing of the NR permission to the page table dump code.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/mm/dump_pagetables.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index ab67822fd2f4..8932aa9e3a9e 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -182,7 +182,7 @@ static void printk_prot(struct seq_file *m, pgprot_t prot, int level, bool dmsg)
 
 	if (!(pr & _PAGE_PRESENT)) {
 		/* Not present */
-		pt_dump_cont_printf(m, dmsg, "                              ");
+		pt_dump_cont_printf(m, dmsg, "                                 ");
 	} else {
 		if (pr & _PAGE_USER)
 			pt_dump_cont_printf(m, dmsg, "USR ");
@@ -219,6 +219,10 @@ static void printk_prot(struct seq_file *m, pgprot_t prot, int level, bool dmsg)
 			pt_dump_cont_printf(m, dmsg, "NX ");
 		else
 			pt_dump_cont_printf(m, dmsg, "x  ");
+		if (pr & _PAGE_NR)
+			pt_dump_cont_printf(m, dmsg, "NR ");
+		else
+			pt_dump_cont_printf(m, dmsg, "r  ");
 	}
 	pt_dump_cont_printf(m, dmsg, "%s\n", level_name[level]);
 }
-- 
2.17.1



* [RFC PATCH 12/13] mmap: Add XO support for KVM XO
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (10 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 11/13] x86, ptdump: Add NR bit to page table dump Rick Edgecombe
@ 2019-10-03 21:23 ` Rick Edgecombe
  2019-10-04  7:34   ` Paolo Bonzini
  2019-10-03 21:24 ` [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO Rick Edgecombe
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:23 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

The KVM XO feature enables the creation of execute-only virtual
memory. Use this feature to create XO memory when PROT_EXEC is
requested without PROT_READ, matching the behavior of protection keys
for userspace and of some arm64 platforms.

When execute-only memory can be created both with protection keys and
natively, prefer the KVM XO method so that a protection key is saved.

Set the __P100 and __S100 entries of protection_map during boot instead
of statically because the actual KVM XO bit in the PTE is determined at
boot time and so can't be known at compile time.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/pgtable_types.h |  2 ++
 arch/x86/kernel/head64.c             |  3 +++
 mm/mmap.c                            | 30 +++++++++++++++++++++++-----
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d3c92c992089..fe976b4f0132 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -176,6 +176,8 @@ enum page_cache_mode {
 					 _PAGE_ACCESSED | _PAGE_NX)
 #define PAGE_READONLY_EXEC	__pgprot(_PAGE_PRESENT | _PAGE_USER |	\
 					 _PAGE_ACCESSED)
+#define PAGE_EXECONLY		__pgprot(_PAGE_PRESENT | _PAGE_USER |	\
+					 _PAGE_ACCESSED | _PAGE_NR)
 
 #define __PAGE_KERNEL_EXEC						\
 	(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_GLOBAL)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 7091702a7bec..69772b6e1810 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -133,6 +133,9 @@ static void __head check_kvmxo_support(unsigned long physaddr)
 
 	*fixup_int(&__pgtable_kvmxo_enabled, physaddr) = 1;
 	*fixup_int(&__pgtable_kvmxo_bit, physaddr) = physbits;
+
+	protection_map[4] = PAGE_EXECONLY;
+	protection_map[12] = PAGE_EXECONLY;
 }
 #else /* CONFIG_KVM_XO */
 static void __head check_kvmxo_support(unsigned long physaddr) { }
diff --git a/mm/mmap.c b/mm/mmap.c
index 7e8c3e8ae75f..034ffa0255b2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1379,6 +1379,29 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode,
 	return true;
 }
 
+static inline int get_pkey(unsigned long flags)
+{
+	const unsigned long p_xo = pgprot_val(protection_map[4]);
+	const unsigned long p_xr = pgprot_val(protection_map[5]);
+	const unsigned long s_xo = pgprot_val(protection_map[12]);
+	const unsigned long s_xr = pgprot_val(protection_map[13]);
+	int pkey;
+
+	/* Prefer non-pkey XO capability if available, to save a pkey */
+
+	if (flags & MAP_PRIVATE && (p_xo != p_xr))
+		return 0;
+
+	if (flags & MAP_SHARED && (s_xo != s_xr))
+		return 0;
+
+	pkey = execute_only_pkey(current->mm);
+	if (pkey < 0)
+		pkey = 0;
+
+	return pkey;
+}
+
 /*
  * The caller must hold down_write(&current->mm->mmap_sem).
  */
@@ -1440,11 +1463,8 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			return -EEXIST;
 	}
 
-	if (prot == PROT_EXEC) {
-		pkey = execute_only_pkey(mm);
-		if (pkey < 0)
-			pkey = 0;
-	}
+	if (prot == PROT_EXEC)
+		pkey = get_pkey(flags);
 
 	/* Do simple checking here so the lower-level routines won't have
 	 * to. we assume access permissions have been handled by the open
-- 
2.17.1



* [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (11 preceding siblings ...)
  2019-10-03 21:23 ` [RFC PATCH 12/13] mmap: Add XO support for KVM XO Rick Edgecombe
@ 2019-10-03 21:24 ` Rick Edgecombe
  2019-10-29 23:36   ` Kees Cook
  2019-10-04  7:22 ` [RFC PATCH 00/13] XOM for KVM guest userspace Paolo Bonzini
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 41+ messages in thread
From: Rick Edgecombe @ 2019-10-03 21:24 UTC (permalink / raw)
  To: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Rick Edgecombe

Add CONFIG_KVM_XO for supporting KVM based execute only memory.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/Kconfig | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 222855cc0158..3a3af2a456e8 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -802,6 +802,19 @@ config KVM_GUEST
 	  underlying device model, the host provides the guest with
 	  timing infrastructure such as time of day, and system time
 
+config KVM_XO
+	bool "Support for KVM based execute only virtual memory permissions"
+	select DYNAMIC_PHYSICAL_MASK
+	select SPARSEMEM_VMEMMAP
+	depends on KVM_GUEST && X86_64
+	default y
+	help
+	  This option enables support for execute-only memory for KVM guests. If
+	  support from the underlying VMM is not detected at boot, this
+	  capability is automatically disabled.
+
+	  If you are unsure how to answer this question, answer Y.
+
 config PVH
 	bool "Support for running PVH guests"
 	---help---
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (12 preceding siblings ...)
  2019-10-03 21:24 ` [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO Rick Edgecombe
@ 2019-10-04  7:22 ` Paolo Bonzini
  2019-10-04 19:03   ` Edgecombe, Rick P
  2019-10-04 14:56 ` Andy Lutomirski
  2019-10-29 23:40 ` Kees Cook
  15 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-04  7:22 UTC (permalink / raw)
  To: Rick Edgecombe, kvm, linux-kernel, x86, linux-mm, luto, peterz,
	dave.hansen, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock

On 03/10/19 23:23, Rick Edgecombe wrote:
> Since software would have previously received a #PF with the RSVD error code
> set, when the HW encountered any set bits in the region 51 to M, there was some
> internal discussion on whether this should have a virtual MSR for the OS to turn
> it on only if the OS knows it isn't relying on this behavior for bit M. The
> argument against needing an MSR is this blurb from the Intel SDM about reserved
> bits:
> "Bits reserved in the paging-structure entries are reserved for future
> functionality. Software developers should be aware that such bits may be used in
> the future and that a paging-structure entry that causes a page-fault exception
> on one processor might not do so in the future."
> 
> So in the current patchset there is no MSR write required for the guest to turn
> on this feature. It will have this behavior whenever qemu is run with
> "-cpu +xo".

I think the part of the manual that you quote is out of date.  Whenever
Intel has "unreserved" bits in the page tables they have done that only
if specific bits in CR4 or EFER or VMCS execution controls are set; this
is a good thing, and I'd really like it to be codified in the SDM.

The only bits for which this does not (and should not) apply are indeed
bits 51:MAXPHYADDR.  But the SDM makes it clear that bits 51:MAXPHYADDR
are reserved, hence "unreserving" bits based on just a QEMU command line
option would be against the specification.  So, please don't do this and
introduce an MSR that enables the feature.

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM
  2019-10-03 21:23 ` [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM Rick Edgecombe
@ 2019-10-04  7:24   ` Paolo Bonzini
  2019-10-04 19:11     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-04  7:24 UTC (permalink / raw)
  To: Rick Edgecombe, kvm, linux-kernel, x86, linux-mm, luto, peterz,
	dave.hansen, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock

On 03/10/19 23:23, Rick Edgecombe wrote:
> Add a KVM capability for the KVM_MEM_EXECONLY memslot type. This memslot
> type is supported if the HW supports execute-only TDP.
> 
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 1 +
>  arch/x86/kvm/svm.c              | 6 ++++++
>  arch/x86/kvm/vmx/vmx.c          | 1 +
>  arch/x86/kvm/x86.c              | 3 +++
>  include/uapi/linux/kvm.h        | 1 +
>  5 files changed, 12 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6d06c794d720..be3ff71e6227 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1132,6 +1132,7 @@ struct kvm_x86_ops {
>  	bool (*xsaves_supported)(void);
>  	bool (*umip_emulated)(void);
>  	bool (*pt_supported)(void);
> +	bool (*tdp_xo_supported)(void);
>  
>  	int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
>  	void (*request_immediate_exit)(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index e0368076a1ef..f9f25f32e946 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -6005,6 +6005,11 @@ static bool svm_pt_supported(void)
>  	return false;
>  }
>  
> +static bool svm_xo_supported(void)
> +{
> +	return false;
> +}
> +
>  static bool svm_has_wbinvd_exit(void)
>  {
>  	return true;
> @@ -7293,6 +7298,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
>  	.xsaves_supported = svm_xsaves_supported,
>  	.umip_emulated = svm_umip_emulated,
>  	.pt_supported = svm_pt_supported,
> +	.tdp_xo_supported = svm_xo_supported,
>  
>  	.set_supported_cpuid = svm_set_supported_cpuid,
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a30dbab8a2d4..7e7260c715f2 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7767,6 +7767,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
>  	.xsaves_supported = vmx_xsaves_supported,
>  	.umip_emulated = vmx_umip_emulated,
>  	.pt_supported = vmx_pt_supported,
> +	.tdp_xo_supported = cpu_has_vmx_ept_execute_only,
>  
>  	.request_immediate_exit = vmx_request_immediate_exit,
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2e321d788672..810cfdb1a315 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3183,6 +3183,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  		r = kvm_x86_ops->get_nested_state ?
>  			kvm_x86_ops->get_nested_state(NULL, NULL, 0) : 0;
>  		break;
> +	case KVM_CAP_EXECONLY_MEM:
> +		r = kvm_x86_ops->tdp_xo_supported();
> +		break;
>  	default:
>  		break;
>  	}
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index ede487b7b216..7778a1f03b78 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -997,6 +997,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
>  #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
>  #define KVM_CAP_PMU_EVENT_FILTER 173
> +#define KVM_CAP_EXECONLY_MEM 174
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> 

This is not needed, execution only can be a CPUID bit in the hypervisor
range (see Documentation/virt/kvm/cpuid.txt).  Userspace can use
KVM_GET_SUPPORTED_CPUID to check whether the host supports it.

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 03/13] kvm: Add XO memslot type
  2019-10-03 21:23 ` [RFC PATCH 03/13] kvm: Add XO memslot type Rick Edgecombe
@ 2019-10-04  7:27   ` Paolo Bonzini
  2019-10-04 19:06     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-04  7:27 UTC (permalink / raw)
  To: Rick Edgecombe, kvm, linux-kernel, x86, linux-mm, luto, peterz,
	dave.hansen, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock, Yu Zhang

On 03/10/19 23:23, Rick Edgecombe wrote:
> Add XO memslot type to create execute-only guest physical memory based on
> the RO memslot. Like the RO memslot, disallow changing the memslot type
> to/from XO.
> 
> In the EPT case ACC_USER_MASK represents the readable bit, so add the
> ability for set_spte() to unset this.
> 
> This is based in part on a patch by Yu Zhang.
> 
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

Instead of this, why not check the exit qualification gpa and, if it has
the XO bit set, mask away both the XO bit and the R bit?  It can be done
unconditionally for all memslots.  This should require no change to
userspace.

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 10/13] x86/mm: Add NR page bit for KVM XO
  2019-10-03 21:23 ` [RFC PATCH 10/13] x86/mm: Add NR page bit for " Rick Edgecombe
@ 2019-10-04  7:33   ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-04  7:33 UTC (permalink / raw)
  To: Rick Edgecombe, kvm, linux-kernel, x86, linux-mm, luto, peterz,
	dave.hansen, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock

On 03/10/19 23:23, Rick Edgecombe wrote:
> +/* KVM based not-readable: only valid after cpuid check */
> +#define _PAGE_BIT_NR		(__pgtable_kvmxo_bit)
> +#else /* defined(CONFIG_KVM_XO) && !defined(__ASSEMBLY__) */
> +#define _PAGE_BIT_NR		0
> +#endif /* defined(CONFIG_KVM_XO) && !defined(__ASSEMBLY__) */

Please do not #define _PAGE_BIT_NR and _PAGE_NR, so that it's clear that
they are variables.

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 12/13] mmap: Add XO support for KVM XO
  2019-10-03 21:23 ` [RFC PATCH 12/13] mmap: Add XO support for KVM XO Rick Edgecombe
@ 2019-10-04  7:34   ` Paolo Bonzini
  2019-10-04 19:12     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-04  7:34 UTC (permalink / raw)
  To: Rick Edgecombe, kvm, linux-kernel, x86, linux-mm, luto, peterz,
	dave.hansen, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock

On 03/10/19 23:23, Rick Edgecombe wrote:
> +
> +	protection_map[4] = PAGE_EXECONLY;
> +	protection_map[12] = PAGE_EXECONLY;

Can you add #defines for the bits in protection_map?  Also perhaps you
can replace the p_xo/p_xr/s_xo/s_xr checks with just with "if
(pgtable_kvmxo_enabled()".

Paolo

> +	/* Prefer non-pkey XO capability if available, to save a pkey */
> +
> +	if (flags & MAP_PRIVATE && (p_xo != p_xr))
> +		return 0;
> +
> +	if (flags & MAP_SHARED && (s_xo != s_xr))
> +		return 0;
>
> +	pkey = execute_only_pkey(current->mm);
> +	if (pkey < 0)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO
  2019-10-03 21:23 ` [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO Rick Edgecombe
@ 2019-10-04  7:42   ` Paolo Bonzini
  2019-10-04 19:11     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-04  7:42 UTC (permalink / raw)
  To: Rick Edgecombe, kvm, linux-kernel, x86, linux-mm, luto, peterz,
	dave.hansen, sean.j.christopherson, keescook
  Cc: kristen, deneen.t.dock

On 03/10/19 23:23, Rick Edgecombe wrote:
> +	if (!vcpu->arch.gva_available)
> +		return 0;

Please return RET_PF_* constants, RET_PF_EMULATE here.

> +	if (error_code & PFERR_WRITE_MASK)
> +		fault_error_code |= X86_PF_WRITE;
> +
> +	fault.vector = PF_VECTOR;
> +	fault.error_code_valid = true;
> +	fault.error_code = fault_error_code;
> +	fault.nested_page_fault = false;
> +	fault.address = vcpu->arch.gva_val;
> +	fault.async_page_fault = true;

Not an async page fault.

> +	kvm_inject_page_fault(vcpu, &fault);
> +
> +	return 1;

Here you would return RET_PF_RETRY - you've injected the page fault and
all that's left to do is reenter execution of the vCPU.

[...]

> +	if (unlikely(vcpu->arch.xo_fault)) {
> +		/*
> +		 * If not enough information to inject the fault,
> +		 * emulate to figure it out and emulate the PF.
> +		 */
> +		if (!try_inject_exec_only_pf(vcpu, error_code))
> +			return RET_PF_EMULATE;
> +
> +		return 1;
> +	}

Returning 1 is wrong, it's also RET_PF_EMULATE.  If you change
try_inject_exec_only_pf return values to RET_PF_*, you can simply return
the value of try_inject_exec_only_pf(vcpu, error_code).

That said, I wonder if it's better to just handle this in
handle_ept_violation.  Basically, if bits 5:3 of the exit qualification
are 100 you can bypass the whole mmu.c page fault handling and just
inject an exec-only page fault.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (13 preceding siblings ...)
  2019-10-04  7:22 ` [RFC PATCH 00/13] XOM for KVM guest userspace Paolo Bonzini
@ 2019-10-04 14:56 ` Andy Lutomirski
  2019-10-04 20:09   ` Edgecombe, Rick P
  2019-10-29 23:40 ` Kees Cook
  15 siblings, 1 reply; 41+ messages in thread
From: Andy Lutomirski @ 2019-10-04 14:56 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: kvm list, LKML, X86 ML, Linux-MM, Andrew Lutomirski,
	Peter Zijlstra, Dave Hansen, Paolo Bonzini, Christopherson,
	Sean J, Kees Cook, Kristen Carlson Accardi, Dock, Deneen T

On Thu, Oct 3, 2019 at 2:38 PM Rick Edgecombe
<rick.p.edgecombe@intel.com> wrote:
>
> This patchset enables the ability for KVM guests to create execute-only (XO)
> memory by utilizing EPT based XO permissions. XO memory is currently supported
> on Intel hardware natively for CPU's with PKU, but this enables it on older
> platforms, and can support XO for kernel memory as well.

The patchset seems to sometimes call this feature "XO" and sometimes
call it "NR".  To me, XO implies no-read and no-write, whereas NR
implies just no-read.  Can you please clarify *exactly* what the new
bit does and be consistent?

I suggest that you make it NR, which allows for PROT_EXEC and
PROT_EXEC|PROT_WRITE and plain PROT_WRITE.  WX is of dubious value,
but I can imagine plain W being genuinely useful for logging and for
JITs that could maintain a W and a separate X mapping of some code.
In other words, with an NR bit, all 8 logical access modes are
possible.  Also, keeping the paging bits more orthogonal seems nice --
we already have a bit that controls write access.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-04  7:22 ` [RFC PATCH 00/13] XOM for KVM guest userspace Paolo Bonzini
@ 2019-10-04 19:03   ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-04 19:03 UTC (permalink / raw)
  To: kvm, linux-kernel, peterz, keescook, Christopherson, Sean J,
	linux-mm, x86, luto, pbonzini, Hansen, Dave
  Cc: kristen, Dock, Deneen T

On Fri, 2019-10-04 at 09:22 +0200, Paolo Bonzini wrote:
> On 03/10/19 23:23, Rick Edgecombe wrote:
> > Since software would have previously received a #PF with the RSVD error code
> > set, when the HW encountered any set bits in the region 51 to M, there was
> > some
> > internal discussion on whether this should have a virtual MSR for the OS to
> > turn
> > it on only if the OS knows it isn't relying on this behavior for bit M. The
> > argument against needing an MSR is this blurb from the Intel SDM about
> > reserved
> > bits:
> > "Bits reserved in the paging-structure entries are reserved for future
> > functionality. Software developers should be aware that such bits may be
> > used in
> > the future and that a paging-structure entry that causes a page-fault
> > exception
> > on one processor might not do so in the future."
> > 
> > So in the current patchset there is no MSR write required for the guest to
> > turn
> > on this feature. It will have this behavior whenever qemu is run with
> > "-cpu +xo".
> 
> I think the part of the manual that you quote is out of date.  Whenever
> Intel has "unreserved" bits in the page tables they have done that only
> if specific bits in CR4 or EFER or VMCS execution controls are set; this
> is a good thing, and I'd really like it to be codified in the SDM.
> 
> The only bits for which this does not (and should not) apply are indeed
> bits 51:MAXPHYADDR.  But the SDM makes it clear that bits 51:MAXPHYADDR
> are reserved, hence "unreserving" bits based on just a QEMU command line
> option would be against the specification.  So, please don't do this and
> introduce an MSR that enables the feature.
> 
> Paolo
> 
Hi Paolo,

Thanks for taking a look!

Fair enough, MSR it is.

Rick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 03/13] kvm: Add XO memslot type
  2019-10-04  7:27   ` Paolo Bonzini
@ 2019-10-04 19:06     ` Edgecombe, Rick P
  2019-10-06 16:15       ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-04 19:06 UTC (permalink / raw)
  To: kvm, linux-kernel, peterz, keescook, Christopherson, Sean J,
	linux-mm, x86, luto, pbonzini, Hansen, Dave
  Cc: kristen, Dock, Deneen T, yu.c.zhang

On Fri, 2019-10-04 at 09:27 +0200, Paolo Bonzini wrote:
> On 03/10/19 23:23, Rick Edgecombe wrote:
> > Add XO memslot type to create execute-only guest physical memory based on
> > the RO memslot. Like the RO memslot, disallow changing the memslot type
> > to/from XO.
> > 
> > In the EPT case ACC_USER_MASK represents the readable bit, so add the
> > ability for set_spte() to unset this.
> > 
> > This is based in part on a patch by Yu Zhang.
> > 
> > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> 
> Instead of this, why not check the exit qualification gpa and, if it has
> the XO bit set, mask away both the XO bit and the R bit?  It can be done
> unconditionally for all memslots.  This should require no change to
> userspace.
> 
> Paolo
> 
The reasoning was that it seems like KVM leaves it to userspace to control the
physical address space layout, since userspace decides the supported physical
address bits and lays out memory in the physical address space. So the
duplication with XO memslots was an attempt to keep the logic around that
together.

I'll take another look at doing it this way though. I think userspace may still
need to adjust MAXPHYADDR and be aware that it can't lay out memory in the XO
range.

Thanks,

Rick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO
  2019-10-04  7:42   ` Paolo Bonzini
@ 2019-10-04 19:11     ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-04 19:11 UTC (permalink / raw)
  To: kvm, linux-kernel, peterz, keescook, Christopherson, Sean J,
	linux-mm, x86, luto, pbonzini, Hansen, Dave
  Cc: kristen, Dock, Deneen T

On Fri, 2019-10-04 at 09:42 +0200, Paolo Bonzini wrote:
> On 03/10/19 23:23, Rick Edgecombe wrote:
> > +	if (!vcpu->arch.gva_available)
> > +		return 0;
> 
> Please return RET_PF_* constants, RET_PF_EMULATE here.

Ok.

> > +	if (error_code & PFERR_WRITE_MASK)
> > +		fault_error_code |= X86_PF_WRITE;
> > +
> > +	fault.vector = PF_VECTOR;
> > +	fault.error_code_valid = true;
> > +	fault.error_code = fault_error_code;
> > +	fault.nested_page_fault = false;
> > +	fault.address = vcpu->arch.gva_val;
> > +	fault.async_page_fault = true;
> 
> Not an async page fault.

Right.

> > +	kvm_inject_page_fault(vcpu, &fault);
> > +
> > +	return 1;
> 
> Here you would return RET_PF_RETRY - you've injected the page fault and
> all that's left to do is reenter execution of the vCPU.
> 
> [...]
> 
> > +	if (unlikely(vcpu->arch.xo_fault)) {
> > +		/*
> > +		 * If not enough information to inject the fault,
> > +		 * emulate to figure it out and emulate the PF.
> > +		 */
> > +		if (!try_inject_exec_only_pf(vcpu, error_code))
> > +			return RET_PF_EMULATE;
> > +
> > +		return 1;
> > +	}
> 
> Returning 1 is wrong, it's also RET_PF_EMULATE.  If you change
> try_inject_exec_only_pf return values to RET_PF_*, you can simply return
> the value of try_inject_exec_only_pf(vcpu, error_code).

Oh right! I must have broken this at some point. Thanks. 

> That said, I wonder if it's better to just handle this in
> handle_ept_violation.  Basically, if bits 5:3 of the exit qualification
> are 100 you can bypass the whole mmu.c page fault handling and just
> inject an exec-only page fault.
> 
> Thanks,
> 
> Paolo

Hmm, that could be cleaner. I'll see how it fits together when I fix the nested
case, since some of that logic looks to be in mmu.c.

Thanks,

Rick

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM
  2019-10-04  7:24   ` Paolo Bonzini
@ 2019-10-04 19:11     ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-04 19:11 UTC (permalink / raw)
  To: kvm, linux-kernel, peterz, keescook, Christopherson, Sean J,
	linux-mm, x86, luto, pbonzini, Hansen, Dave
  Cc: kristen, Dock, Deneen T

On Fri, 2019-10-04 at 09:24 +0200, Paolo Bonzini wrote:
> On 03/10/19 23:23, Rick Edgecombe wrote:
> > Add a KVM capability for the KVM_MEM_EXECONLY memslot type. This memslot
> > type is supported if the HW supports execute-only TDP.
> > 
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h | 1 +
> >  arch/x86/kvm/svm.c              | 6 ++++++
> >  arch/x86/kvm/vmx/vmx.c          | 1 +
> >  arch/x86/kvm/x86.c              | 3 +++
> >  include/uapi/linux/kvm.h        | 1 +
> >  5 files changed, 12 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h
> > b/arch/x86/include/asm/kvm_host.h
> > index 6d06c794d720..be3ff71e6227 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1132,6 +1132,7 @@ struct kvm_x86_ops {
> >  	bool (*xsaves_supported)(void);
> >  	bool (*umip_emulated)(void);
> >  	bool (*pt_supported)(void);
> > +	bool (*tdp_xo_supported)(void);
> >  
> >  	int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
> >  	void (*request_immediate_exit)(struct kvm_vcpu *vcpu);
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index e0368076a1ef..f9f25f32e946 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -6005,6 +6005,11 @@ static bool svm_pt_supported(void)
> >  	return false;
> >  }
> >  
> > +static bool svm_xo_supported(void)
> > +{
> > +	return false;
> > +}
> > +
> >  static bool svm_has_wbinvd_exit(void)
> >  {
> >  	return true;
> > @@ -7293,6 +7298,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init
> > = {
> >  	.xsaves_supported = svm_xsaves_supported,
> >  	.umip_emulated = svm_umip_emulated,
> >  	.pt_supported = svm_pt_supported,
> > +	.tdp_xo_supported = svm_xo_supported,
> >  
> >  	.set_supported_cpuid = svm_set_supported_cpuid,
> >  
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index a30dbab8a2d4..7e7260c715f2 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7767,6 +7767,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init
> > = {
> >  	.xsaves_supported = vmx_xsaves_supported,
> >  	.umip_emulated = vmx_umip_emulated,
> >  	.pt_supported = vmx_pt_supported,
> > +	.tdp_xo_supported = cpu_has_vmx_ept_execute_only,
> >  
> >  	.request_immediate_exit = vmx_request_immediate_exit,
> >  
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 2e321d788672..810cfdb1a315 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3183,6 +3183,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long
> > ext)
> >  		r = kvm_x86_ops->get_nested_state ?
> >  			kvm_x86_ops->get_nested_state(NULL, NULL, 0) : 0;
> >  		break;
> > +	case KVM_CAP_EXECONLY_MEM:
> > +		r = kvm_x86_ops->tdp_xo_supported();
> > +		break;
> >  	default:
> >  		break;
> >  	}
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index ede487b7b216..7778a1f03b78 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -997,6 +997,7 @@ struct kvm_ppc_resize_hpt {
> >  #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
> >  #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
> >  #define KVM_CAP_PMU_EVENT_FILTER 173
> > +#define KVM_CAP_EXECONLY_MEM 174
> >  
> >  #ifdef KVM_CAP_IRQ_ROUTING
> >  
> > 
> 
> This is not needed, execution only can be a CPUID bit in the hypervisor
> range (see Documentation/virt/kvm/cpuid.txt).  Userspace can use
> KVM_GET_SUPPORTED_CPUID to check whether the host supports it.
> 
Oh yea. I didn't see this. Definitely seems better.

Thanks,

Rick


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 12/13] mmap: Add XO support for KVM XO
  2019-10-04  7:34   ` Paolo Bonzini
@ 2019-10-04 19:12     ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-04 19:12 UTC (permalink / raw)
  To: kvm, linux-kernel, peterz, keescook, Christopherson, Sean J,
	linux-mm, x86, luto, pbonzini, Hansen, Dave
  Cc: kristen, Dock, Deneen T

On Fri, 2019-10-04 at 09:34 +0200, Paolo Bonzini wrote:
> On 03/10/19 23:23, Rick Edgecombe wrote:
> > +
> > +	protection_map[4] = PAGE_EXECONLY;
> > +	protection_map[12] = PAGE_EXECONLY;
> 
> Can you add #defines for the bits in protection_map?  Also perhaps you
> can replace the p_xo/p_xr/s_xo/s_xr checks with just with "if
> (pgtable_kvmxo_enabled()".
> 
> Paolo

PAGE_EXECONLY is not known at compile time, since the NR bit position depends
on the number of physical address bits. So it can't be set the way the other
ones are in protection_map[], if that's what you are saying.

I didn't love the p_xo/p_xr/s_xo/s_xr checks, but since mm/mmap.c is cross-arch
it seemed the best option. Maybe a cross-arch helper like
non_pkey_xo_supported() instead?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-04 14:56 ` Andy Lutomirski
@ 2019-10-04 20:09   ` Edgecombe, Rick P
  2019-10-05  1:33     ` Andy Lutomirski
  0 siblings, 1 reply; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-04 20:09 UTC (permalink / raw)
  To: luto
  Cc: kvm, linux-kernel, peterz, keescook, Dock, Deneen T,
	Christopherson, Sean J, linux-mm, x86, kristen, pbonzini, Hansen,
	Dave

On Fri, 2019-10-04 at 07:56 -0700, Andy Lutomirski wrote:
> On Thu, Oct 3, 2019 at 2:38 PM Rick Edgecombe
> <rick.p.edgecombe@intel.com> wrote:
> > 
> > This patchset enables the ability for KVM guests to create execute-only (XO)
> > memory by utilizing EPT based XO permissions. XO memory is currently
> > supported
> > on Intel hardware natively for CPU's with PKU, but this enables it on older
> > platforms, and can support XO for kernel memory as well.
> 
> The patchset seems to sometimes call this feature "XO" and sometimes
> call it "NR".  To me, XO implies no-read and no-write, whereas NR
> implies just no-read.  Can you please clarify *exactly* what the new
> bit does and be consistent?
> 
> I suggest that you make it NR, which allows for PROT_EXEC and
> PROT_EXEC|PROT_WRITE and plain PROT_WRITE.  WX is of dubious value,
> but I can imagine plain W being genuinely useful for logging and for
> JITs that could maintain a W and a separate X mapping of some code.
> In other words, with an NR bit, all 8 logical access modes are
> possible.  Also, keeping the paging bits more orthogonal seems nice --
> we already have a bit that controls write access.

Sorry, yes the behavior of this bit needs to be documented a lot better. I will
definitely do this for the next version.

To clarify: since the EPT permissions in the XO/NR range are executable, and not
readable or writable, the new bit really means XO, but only when NX is 0, since
the guest page tables are checked as well. When NR=1, W=1, and NX=0, the memory
is still XO.

NR was picked over XO for the reason you give. The idea is that, for KVM XO, NR
plus writable can be defined as an invalid combination, just as writable but
not readable is defined as invalid for the EPT.

I *think* that whenever NX=1, NR=1 it should be similar to not-present, in that
it can't be used for anything or have its translation cached. I am not 100% sure
on the caching part and was thinking of just making the "spec" say that the
translation caching behavior is undefined. I can look into this if anyone thinks
we need to know. In the current patchset it shouldn't be possible to create this
combination.

Since write-only memory isn't supported in EPT, we can't do the same trick to
create a new HW permission. But I guess if we emulate it, we could make the new
bit mean just NR, and support write-only by allowing emulation when KVM gets a
write EPT violation on NR memory. It might still be useful for the JIT case you
mentioned, or a shared memory mailbox. On the other hand, userspace might be
surprised to find that memory runs at different speeds depending on the
permission. I also wonder if any userspace apps are asking for just PROT_WRITE
and expecting readable memory.

Thanks,

Rick


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-04 20:09   ` Edgecombe, Rick P
@ 2019-10-05  1:33     ` Andy Lutomirski
  2019-10-07 18:14       ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Andy Lutomirski @ 2019-10-05  1:33 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: luto, kvm, linux-kernel, peterz, keescook, Dock, Deneen T,
	Christopherson, Sean J, linux-mm, x86, kristen, pbonzini, Hansen,
	Dave

On Fri, Oct 4, 2019 at 1:10 PM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Fri, 2019-10-04 at 07:56 -0700, Andy Lutomirski wrote:
> > On Thu, Oct 3, 2019 at 2:38 PM Rick Edgecombe
> > <rick.p.edgecombe@intel.com> wrote:
> > >
> > > This patchset enables the ability for KVM guests to create execute-only (XO)
> > > memory by utilizing EPT based XO permissions. XO memory is currently
> > > supported
> > > on Intel hardware natively for CPU's with PKU, but this enables it on older
> > > platforms, and can support XO for kernel memory as well.
> >
> > The patchset seems to sometimes call this feature "XO" and sometimes
> > call it "NR".  To me, XO implies no-read and no-write, whereas NR
> > implies just no-read.  Can you please clarify *exactly* what the new
> > bit does and be consistent?
> >
> > I suggest that you make it NR, which allows for PROT_EXEC and
> > PROT_EXEC|PROT_WRITE and plain PROT_WRITE.  WX is of dubious value,
> > but I can imagine plain W being genuinely useful for logging and for
> > JITs that could maintain a W and a separate X mapping of some code.
> > In other words, with an NR bit, all 8 logical access modes are
> > possible.  Also, keeping the paging bits more orthogonal seems nice --
> > we already have a bit that controls write access.
>
> Sorry, yes the behavior of this bit needs to be documented a lot better. I will
> definitely do this for the next version.
>
> To clarify: since the EPT permissions in the XO/NR range are executable, and not
> readable or writable, the new bit really means XO, but only when NX is 0, since
> the guest page tables are checked as well. When NR=1, W=1, and NX=0, the
> memory is still XO.
>
> NR was picked over XO for the reason you give. The idea is that, for KVM XO,
> NR plus writable can be defined as an invalid combination, just as writable
> but not readable is defined as invalid for the EPT.
>

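Restating the quoted rule as a concrete model may help; this is one reading of the description above, not code from the patchset:

```c
#include <stdbool.h>

/*
 * Hypothetical model of the guest-visible check described above: the
 * virtual NR (not-readable) bit in the guest PTE combines with the
 * existing W and NX bits. Because the EPT permissions in the NR range
 * are execute-only, NR=1 masks both read and write regardless of W,
 * so NR=1/W=1/NX=0 still comes out as execute-only memory.
 */
struct gpte { bool nr; bool w; bool nx; };

static bool can_read(struct gpte p)  { return !p.nr; }
static bool can_write(struct gpte p) { return !p.nr && p.w; }
static bool can_exec(struct gpte p)  { return !p.nx; }

/* Execute-only: executable but neither readable nor writeable. */
static bool is_xo(struct gpte p)
{
	return can_exec(p) && !can_read(p) && !can_write(p);
}
```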
Ugh, I see, this is an "EPT Misconfiguration".  Oh, well.  I guess
just keep things as they are and document things better, please.
Don't try to emulate.

I don't suppose Intel could be convinced to get rid of that in a
future CPU and allow write-only memory?

BTW, is your patch checking for support in IA32_VMX_EPT_VPID_CAP?  I
didn't notice it, but I didn't look that hard.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 03/13] kvm: Add XO memslot type
  2019-10-04 19:06     ` Edgecombe, Rick P
@ 2019-10-06 16:15       ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-10-06 16:15 UTC (permalink / raw)
  To: Edgecombe, Rick P, kvm, linux-kernel, peterz, keescook,
	Christopherson, Sean J, linux-mm, x86, luto, Hansen, Dave
  Cc: kristen, Dock, Deneen T, yu.c.zhang

On 04/10/19 21:06, Edgecombe, Rick P wrote:
> The reasoning was that it seems like KVM leaves it to userspace to control the
> physical address space layout, since userspace decides the supported physical
> address bits and lays out memory in the physical address space. So duplication
> with XO memslots was an attempt to keep the logic around that together.
> 
> I'll take another look at doing it this way though. I think userspace may still
> need to adjust MAXPHYADDR and be aware that it can't lay out memory in the XO
> range.

Right, you would have to use KVM_ENABLE_CAP passing the desired X bit
(which must be < MAXPHYADDR) as the argument.  Userspace needs to know
that it must then make MAXPHYADDR in the guest CPUID equal to the
argument.  When the MSR is written to 1, bit "MAXPHYADDR-1" in the page
table entries becomes an XO bit.
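A hypothetical sketch of that flow from the userspace side (KVM_CAP_VM_XO and its number are invented for illustration; the real series would define the capability in <linux/kvm.h>):

```c
#include <stdint.h>
#include <string.h>

/*
 * args[0] carries the bit position to repurpose as the XO bit, which
 * must be below the host MAXPHYADDR; userspace must then advertise
 * that value as MAXPHYADDR in the guest CPUID.
 */
#define KVM_CAP_VM_XO 9999	/* invented capability number */

struct enable_cap_args {	/* stand-in for struct kvm_enable_cap */
	uint32_t cap;
	uint32_t flags;
	uint64_t args[4];
};

static int build_xo_enable(struct enable_cap_args *a,
			   unsigned int xo_bit, unsigned int host_maxphyaddr)
{
	if (xo_bit >= host_maxphyaddr)
		return -1;	/* the chosen bit must be < MAXPHYADDR */
	memset(a, 0, sizeof(*a));
	a->cap = KVM_CAP_VM_XO;
	a->args[0] = xo_bit;
	return 0;		/* caller would ioctl(vm_fd, KVM_ENABLE_CAP, a) */
}
```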

Paolo



* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-05  1:33     ` Andy Lutomirski
@ 2019-10-07 18:14       ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-07 18:14 UTC (permalink / raw)
  To: luto
  Cc: kvm, linux-kernel, peterz, keescook, Dock, Deneen T,
	Christopherson, Sean J, linux-mm, x86, kristen, pbonzini, Hansen,
	Dave

On Fri, 2019-10-04 at 18:33 -0700, Andy Lutomirski wrote:
> On Fri, Oct 4, 2019 at 1:10 PM Edgecombe, Rick P
> <rick.p.edgecombe@intel.com> wrote:
> > 
> > On Fri, 2019-10-04 at 07:56 -0700, Andy Lutomirski wrote:
> > > On Thu, Oct 3, 2019 at 2:38 PM Rick Edgecombe
> > > <rick.p.edgecombe@intel.com> wrote:
> > > > 
> > > > This patchset enables the ability for KVM guests to create execute-only
> > > > (XO)
> > > > memory by utilizing EPT based XO permissions. XO memory is currently
> > > > supported
> > > > on Intel hardware natively for CPU's with PKU, but this enables it on
> > > > older
> > > > platforms, and can support XO for kernel memory as well.
> > > 
> > > The patchset seems to sometimes call this feature "XO" and sometimes
> > > call it "NR".  To me, XO implies no-read and no-write, whereas NR
> > > implies just no-read.  Can you please clarify *exactly* what the new
> > > bit does and be consistent?
> > > 
> > > I suggest that you make it NR, which allows for PROT_EXEC and
> > > PROT_EXEC|PROT_WRITE and plain PROT_WRITE.  WX is of dubious value,
> > > but I can imagine plain W being genuinely useful for logging and for
> > > JITs that could maintain a W and a separate X mapping of some code.
> > > In other words, with an NR bit, all 8 logical access modes are
> > > possible.  Also, keeping the paging bits more orthogonal seems nice --
> > > we already have a bit that controls write access.
> > 
> > Sorry, yes the behavior of this bit needs to be documented a lot better. I
> > will
> > definitely do this for the next version.
> > 
> > To clarify: since the EPT permissions in the XO/NR range are executable but
> > not readable or writeable, the new bit really means XO, but only when NX is
> > 0, since the guest page tables are checked as well. When NR=1, W=1, and
> > NX=0, the memory is still XO.
> > 
> > NR was picked over XO for the reasons you give. The idea is that, for KVM
> > XO, NR combined with writable can be defined as an invalid combination, just
> > as writeable-but-not-readable is defined as invalid for the EPT.
> > 
> 
> Ugh, I see, this is an "EPT Misconfiguration".  Oh, well.  I guess
> just keep things as they are and document things better, please.
> Don't try to emulate.

Ah, I see what you were thinking. Ok will do.

> I don't suppose Intel could be convinced to get rid of that in a
> future CPU and allow write-only memory?

Hmm, I'm not sure. I can try to pass it along.

> BTW, is your patch checking for support in IA32_VMX_EPT_VPID_CAP?  I
> didn't notice it, but I didn't look that hard.

Yep, there was already a helper: cpu_has_vmx_ept_execute_only().
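For reference, that helper reduces to a single capability bit: bit 0 of the IA32_VMX_EPT_VPID_CAP MSR (0x48c) advertises support for execute-only EPT translations. A minimal standalone sketch of the check (the MSR value is passed in rather than read with rdmsr, so the logic can run anywhere):

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR index and bit layout as in asm/vmx.h / the Intel SDM. */
#define MSR_IA32_VMX_EPT_VPID_CAP	0x48c
#define VMX_EPT_EXECUTE_ONLY_BIT	(1ULL << 0)

/*
 * Mirrors what KVM's cpu_has_vmx_ept_execute_only() boils down to:
 * does the EPT half of the capability MSR advertise execute-only
 * translations?
 */
static bool ept_supports_execute_only(uint64_t ept_vpid_cap)
{
	return ept_vpid_cap & VMX_EPT_EXECUTE_ONLY_BIT;
}
```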



* Re: [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits
  2019-10-03 21:23 ` [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits Rick Edgecombe
@ 2019-10-14  6:47   ` Yu Zhang
  2019-10-14 18:44     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Yu Zhang @ 2019-10-14  6:47 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, keescook, kristen,
	deneen.t.dock

On Thu, Oct 03, 2019 at 02:23:48PM -0700, Rick Edgecombe wrote:
> Mask gfn by maxphyaddr in kvm_mtrr_get_guest_memory_type so that the
> guests view of gfn is used when high bits of the physical memory are
> used as extra permissions bits. This supports the KVM XO feature.
> 
> TODO: Since MTRR is emulated using EPT permissions, the XO version of
> the gpa range will not inherit the MTRR type with this implementation.
> There shouldn't be any legacy use of KVM XO, but hypothetically it could
> interfere with the uncacheable MTRR type.
> 
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  arch/x86/kvm/mtrr.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index 25ce3edd1872..da38f3b83e51 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -621,6 +621,14 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn)
>  	const int wt_wb_mask = (1 << MTRR_TYPE_WRBACK)
>  			       | (1 << MTRR_TYPE_WRTHROUGH);
>  
> +	/*
> +	 * Handle situations where gfn bits are used as permission bits by
> +	 * masking KVM's view of the gfn with the guest's physical address bits
> +	 * in order to match the guest's view of the physical address. For
> +	 * normal situations this will have no effect.
> +	 */
> +	gfn &= (1ULL << (cpuid_maxphyaddr(vcpu) - PAGE_SHIFT));
> +

Won't this break the MTRR calculation for normal gfns?
Are you suggesting using the same MTRR value for the XO range as for the normal one?
If so, may be we should use:

	if (guest_cpuid_has(vcpu, X86_FEATURE_KVM_XO))
		gfn &= ~(1ULL << (cpuid_maxphyaddr(vcpu) - PAGE_SHIFT));


>  	start = gfn_to_gpa(gfn);
>  	end = start + PAGE_SIZE;
>  
> -- 
> 2.17.1
> 

B.R.
Yu


* Re: [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits
  2019-10-14  6:47   ` Yu Zhang
@ 2019-10-14 18:44     ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-14 18:44 UTC (permalink / raw)
  To: yu.c.zhang
  Cc: kvm, linux-kernel, peterz, keescook, Dock, Deneen T,
	Christopherson, Sean J, linux-mm, x86, kristen, luto, pbonzini,
	Hansen, Dave

On Mon, 2019-10-14 at 14:47 +0800, Yu Zhang wrote:
> On Thu, Oct 03, 2019 at 02:23:48PM -0700, Rick Edgecombe wrote:
> > Mask gfn by maxphyaddr in kvm_mtrr_get_guest_memory_type so that the
> > guests view of gfn is used when high bits of the physical memory are
> > used as extra permissions bits. This supports the KVM XO feature.
> > 
> > TODO: Since MTRR is emulated using EPT permissions, the XO version of
> > the gpa range will not inherit the MTRR type with this implementation.
> > There shouldn't be any legacy use of KVM XO, but hypothetically it could
> > interfere with the uncacheable MTRR type.
> > 
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > ---
> >  arch/x86/kvm/mtrr.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> > index 25ce3edd1872..da38f3b83e51 100644
> > --- a/arch/x86/kvm/mtrr.c
> > +++ b/arch/x86/kvm/mtrr.c
> > @@ -621,6 +621,14 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu
> > *vcpu, gfn_t gfn)
> >  	const int wt_wb_mask = (1 << MTRR_TYPE_WRBACK)
> >  			       | (1 << MTRR_TYPE_WRTHROUGH);
> >  
> > +	/*
> > +	 * Handle situations where gfn bits are used as permissions bits by
> > +	 * masking KVMs view of the gfn with the guests physical address bits
> > +	 * in order to match the guests view of physical address. For normal
> > +	 * situations this will have no effect.
> > +	 */
> > +	gfn &= (1ULL << (cpuid_maxphyaddr(vcpu) - PAGE_SHIFT));
> > +
> 
> Won't this break the MTRR calculation for normal gfns?
> Are you suggesting using the same MTRR value for the XO range as for the
> normal one?
> If so, may be we should use:
> 
> 	if (guest_cpuid_has(vcpu, X86_FEATURE_KVM_XO))
> 		gfn &= ~(1ULL << (cpuid_maxphyaddr(vcpu) - PAGE_SHIFT));
> 
Yes, you're right, this is broken, but zeroing a bit beyond the max physical
address should be ok here I think, so you shouldn't need the feature
check.
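To make the breakage concrete, here is a standalone comparison of the posted mask with the suggested one (the 46-bit guest MAXPHYADDR in the test values is just an assumption for illustration):

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* The posted version: keeps at most one bit of the gfn, dropping all
 * the normal address bits. */
static uint64_t mask_gfn_broken(uint64_t gfn, unsigned int maxphyaddr)
{
	return gfn & (1ULL << (maxphyaddr - PAGE_SHIFT));
}

/*
 * The suggested version: clear only the permission bit, which sits
 * just above the guest's physical address bits, and leave the normal
 * gfn bits intact.
 */
static uint64_t mask_gfn_fixed(uint64_t gfn, unsigned int maxphyaddr)
{
	return gfn & ~(1ULL << (maxphyaddr - PAGE_SHIFT));
}
```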

In any case, this logic will go away anyway after the suggestions to mask the
GPA soon after the exit. Then most of KVM can just operate on the guest view of
the GPA as normal.

Design-wise, I think MTRRs should affect the XO GPAs as well: if we are going
to pretend the XO bit is not a PFN bit, that would be the expected behavior. I
am not sure it would actually break anything though, unless someone did
uncacheable XO, so that could maybe also just be declared illegal.

> >  	start = gfn_to_gpa(gfn);
> >  	end = start + PAGE_SIZE;
> >  
> > -- 
> > 2.17.1
> > 
> 
> B.R.
> Yu


* Re: [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO
  2019-10-03 21:23 ` [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO Rick Edgecombe
@ 2019-10-29 23:33   ` Kees Cook
  2019-10-29 23:52     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-10-29 23:33 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, kristen, deneen.t.dock

On Thu, Oct 03, 2019 at 02:23:56PM -0700, Rick Edgecombe wrote:
> Add a new CPUID leaf to hold the contents of CPUID 0x40000030 EAX to
> detect KVM defined generic VMM features.
> 
> The leaf was proposed to allow KVM to communicate features that are
> defined by KVM, but available for any VMM to implement.
> 
> Add cpu_feature_enabled() support for features in this leaf (KVM XO), and
> a pgtable_kvmxo_enabled() helper similar to pgtable_l5_enabled() so that
> pgtable_kvmxo_enabled() can be used in early code that includes
> arch/x86/include/asm/sparsemem.h.
> 
> Lastly, in head64.c, detect this feature and perform the necessary
> adjustments to physical_mask.

Can this be exposed in /proc/cpuinfo so a guest userspace can determine
if this feature is enabled?

-Kees

> 
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  arch/x86/include/asm/cpufeature.h             |  6 ++-
>  arch/x86/include/asm/cpufeatures.h            |  2 +-
>  arch/x86/include/asm/disabled-features.h      |  3 +-
>  arch/x86/include/asm/pgtable_32_types.h       |  1 +
>  arch/x86/include/asm/pgtable_64_types.h       | 26 ++++++++++++-
>  arch/x86/include/asm/required-features.h      |  3 +-
>  arch/x86/include/asm/sparsemem.h              |  4 +-
>  arch/x86/kernel/cpu/common.c                  |  5 +++
>  arch/x86/kernel/head64.c                      | 38 ++++++++++++++++++-
>  .../arch/x86/include/asm/disabled-features.h  |  3 +-
>  10 files changed, 80 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 17127ffbc2a2..7d04ea4f1623 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -82,8 +82,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
>  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
>  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
>  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
> +	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 19, feature_bit) ||	\
>  	   REQUIRED_MASK_CHECK					  ||	\
> -	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
> +	   BUILD_BUG_ON_ZERO(NCAPINTS != 20))
>  
>  #define DISABLED_MASK_BIT_SET(feature_bit)				\
>  	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
> @@ -105,8 +106,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
>  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
>  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
>  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
> +	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 19, feature_bit) ||	\
>  	   DISABLED_MASK_CHECK					  ||	\
> -	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
> +	   BUILD_BUG_ON_ZERO(NCAPINTS != 20))
>  
>  #define cpu_has(c, bit)							\
>  	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 7ba217e894ea..9c1b07674401 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -13,7 +13,7 @@
>  /*
>   * Defines x86 CPU feature bits
>   */
> -#define NCAPINTS			19	   /* N 32-bit words worth of info */
> +#define NCAPINTS			20	   /* N 32-bit words worth of info */
>  #define NBUGINTS			1	   /* N 32-bit bug flags */
>  
>  /*
> diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
> index a5ea841cc6d2..f0f935f8d917 100644
> --- a/arch/x86/include/asm/disabled-features.h
> +++ b/arch/x86/include/asm/disabled-features.h
> @@ -84,6 +84,7 @@
>  #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
>  #define DISABLED_MASK17	0
>  #define DISABLED_MASK18	0
> -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
> +#define DISABLED_MASK19	0
> +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
>  
>  #endif /* _ASM_X86_DISABLED_FEATURES_H */
> diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h
> index b0bc0fff5f1f..57a11692715e 100644
> --- a/arch/x86/include/asm/pgtable_32_types.h
> +++ b/arch/x86/include/asm/pgtable_32_types.h
> @@ -16,6 +16,7 @@
>  #endif
>  
>  #define pgtable_l5_enabled() 0
> +#define pgtable_kvmxo_enabled() 0
>  
>  #define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
>  #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
> index 6b55b837ead4..7c7c9d1a199a 100644
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -43,10 +43,34 @@ static inline bool pgtable_l5_enabled(void)
>  extern unsigned int pgdir_shift;
>  extern unsigned int ptrs_per_p4d;
>  
> +#ifdef CONFIG_KVM_XO
> +extern unsigned int __pgtable_kvmxo_enabled;
> +
> +#ifdef USE_EARLY_PGTABLE
> +/*
> + * cpu_feature_enabled() is not available in early boot code.
> + * Use variable instead.
> + */
> +static inline bool pgtable_kvmxo_enabled(void)
> +{
> +	return __pgtable_kvmxo_enabled;
> +}
> +#else
> +#define pgtable_kvmxo_enabled() cpu_feature_enabled(X86_FEATURE_KVM_XO)
> +#endif /* USE_EARLY_PGTABLE */
> +
> +#else
> +#define pgtable_kvmxo_enabled() 0
> +#endif /* CONFIG_KVM_XO */
> +
>  #endif	/* !__ASSEMBLY__ */
>  
>  #define SHARED_KERNEL_PMD	0
>  
> +#if defined(CONFIG_X86_5LEVEL) || defined(CONFIG_KVM_XO)
> +#define MAX_POSSIBLE_PHYSMEM_BITS	52
> +#endif
> +
>  #ifdef CONFIG_X86_5LEVEL
>  
>  /*
> @@ -64,8 +88,6 @@ extern unsigned int ptrs_per_p4d;
>  #define P4D_SIZE		(_AC(1, UL) << P4D_SHIFT)
>  #define P4D_MASK		(~(P4D_SIZE - 1))
>  
> -#define MAX_POSSIBLE_PHYSMEM_BITS	52
> -
>  #else /* CONFIG_X86_5LEVEL */
>  
>  /*
> diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
> index 6847d85400a8..fa5700097f64 100644
> --- a/arch/x86/include/asm/required-features.h
> +++ b/arch/x86/include/asm/required-features.h
> @@ -101,6 +101,7 @@
>  #define REQUIRED_MASK16	0
>  #define REQUIRED_MASK17	0
>  #define REQUIRED_MASK18	0
> -#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
> +#define REQUIRED_MASK19	0
> +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
>  
>  #endif /* _ASM_X86_REQUIRED_FEATURES_H */
> diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
> index 199218719a86..24b305195369 100644
> --- a/arch/x86/include/asm/sparsemem.h
> +++ b/arch/x86/include/asm/sparsemem.h
> @@ -27,8 +27,8 @@
>  # endif
>  #else /* CONFIG_X86_32 */
>  # define SECTION_SIZE_BITS	27 /* matt - 128 is convenient right now */
> -# define MAX_PHYSADDR_BITS	(pgtable_l5_enabled() ? 52 : 44)
> -# define MAX_PHYSMEM_BITS	(pgtable_l5_enabled() ? 52 : 46)
> +# define MAX_PHYSADDR_BITS	((pgtable_l5_enabled() ? 52 : 44) - !!pgtable_kvmxo_enabled())
> +# define MAX_PHYSMEM_BITS	((pgtable_l5_enabled() ? 52 : 46) - !!pgtable_kvmxo_enabled())
>  #endif
>  
>  #endif /* CONFIG_SPARSEMEM */
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 4f08e164c0b1..ee204aefbcfd 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -933,6 +933,11 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
>  		c->x86_capability[CPUID_D_1_EAX] = eax;
>  	}
>  
> +	eax = cpuid_eax(0x40000000);
> +	c->extended_cpuid_level = eax;
> +	if (c->extended_cpuid_level >= 0x40000030)
> +		c->x86_capability[CPUID_4000_0030_EAX] = cpuid_eax(0x40000030);
> +
>  	/* AMD-defined flags: level 0x80000001 */
>  	eax = cpuid_eax(0x80000000);
>  	c->extended_cpuid_level = eax;
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 55f5294c3cdf..7091702a7bec 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -52,6 +52,11 @@ unsigned int ptrs_per_p4d __ro_after_init = 1;
>  EXPORT_SYMBOL(ptrs_per_p4d);
>  #endif
>  
> +#ifdef CONFIG_KVM_XO
> +unsigned int __pgtable_kvmxo_enabled __ro_after_init;
> +unsigned int __pgtable_kvmxo_bit __ro_after_init;
> +#endif /* CONFIG_KVM_XO */
> +
>  #ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
>  unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
>  EXPORT_SYMBOL(page_offset_base);
> @@ -73,12 +78,14 @@ static unsigned long __head *fixup_long(void *ptr, unsigned long physaddr)
>  	return fixup_pointer(ptr, physaddr);
>  }
>  
> -#ifdef CONFIG_X86_5LEVEL
> +#if defined(CONFIG_X86_5LEVEL) || defined(CONFIG_KVM_XO)
>  static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
>  {
>  	return fixup_pointer(ptr, physaddr);
>  }
> +#endif
>  
> +#ifdef CONFIG_X86_5LEVEL
>  static bool __head check_la57_support(unsigned long physaddr)
>  {
>  	/*
> @@ -104,6 +111,33 @@ static bool __head check_la57_support(unsigned long physaddr)
>  }
>  #endif
>  
> +#ifdef CONFIG_KVM_XO
> +static void __head check_kvmxo_support(unsigned long physaddr)
> +{
> +	unsigned long physbits;
> +
> +	if ((native_cpuid_eax(0x40000000) < 0x40000030) ||
> +	    !(native_cpuid_eax(0x40000030) & (1 << (X86_FEATURE_KVM_XO & 31))))
> +		return;
> +
> +	if (native_cpuid_eax(0x80000000) < 0x80000008)
> +		return;
> +
> +	physbits = native_cpuid_eax(0x80000008) & 0xff;
> +
> +	/*
> +	 * If KVM XO is active, the top physical address bit is the permisison
> +	 * bit, so zero it in the mask.
> +	 */
> +	physical_mask &= ~(1UL << physbits);
> +
> +	*fixup_int(&__pgtable_kvmxo_enabled, physaddr) = 1;
> +	*fixup_int(&__pgtable_kvmxo_bit, physaddr) = physbits;
> +}
> +#else /* CONFIG_KVM_XO */
> +static void __head check_kvmxo_support(unsigned long physaddr) { }
> +#endif /* CONFIG_KVM_XO */
> +
>  /* Code in __startup_64() can be relocated during execution, but the compiler
>   * doesn't have to generate PC-relative relocations when accessing globals from
>   * that function. Clang actually does not generate them, which leads to
> @@ -127,6 +161,8 @@ unsigned long __head __startup_64(unsigned long physaddr,
>  
>  	la57 = check_la57_support(physaddr);
>  
> +	check_kvmxo_support(physaddr);
> +
>  	/* Is the address too large? */
>  	if (physaddr >> MAX_PHYSMEM_BITS)
>  		for (;;);
> diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
> index a5ea841cc6d2..f0f935f8d917 100644
> --- a/tools/arch/x86/include/asm/disabled-features.h
> +++ b/tools/arch/x86/include/asm/disabled-features.h
> @@ -84,6 +84,7 @@
>  #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
>  #define DISABLED_MASK17	0
>  #define DISABLED_MASK18	0
> -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
> +#define DISABLED_MASK19	0
> +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
>  
>  #endif /* _ASM_X86_DISABLED_FEATURES_H */
> -- 
> 2.17.1
> 

-- 
Kees Cook


* Re: [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO
  2019-10-03 21:24 ` [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO Rick Edgecombe
@ 2019-10-29 23:36   ` Kees Cook
  2019-10-30  0:01     ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-10-29 23:36 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, kristen, deneen.t.dock

On Thu, Oct 03, 2019 at 02:24:00PM -0700, Rick Edgecombe wrote:
> Add CONFIG_KVM_XO for supporting KVM based execute only memory.

I would expect this config to be added earlier in the series so that the
code being added that depends on it can be incrementally build tested...

(Also, if this is default=y, why have a Kconfig for it at all? Guests
need to know to use this already, yes?)

-Kees

> 
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  arch/x86/Kconfig | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 222855cc0158..3a3af2a456e8 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -802,6 +802,19 @@ config KVM_GUEST
>  	  underlying device model, the host provides the guest with
>  	  timing infrastructure such as time of day, and system time
>  
> +config KVM_XO
> +	bool "Support for KVM based execute only virtual memory permissions"
> +	select DYNAMIC_PHYSICAL_MASK
> +	select SPARSEMEM_VMEMMAP
> +	depends on KVM_GUEST && X86_64
> +	default y
> +	help
> +	  This option enables support for execute only memory for KVM guests. If
> +	  support from the underlying VMM is not detected at boot, this
> +	  capability will automatically disable.
> +
> +	  If you are unsure how to answer this question, answer Y.
> +
>  config PVH
>  	bool "Support for running PVH guests"
>  	---help---
> -- 
> 2.17.1
> 

-- 
Kees Cook


* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
                   ` (14 preceding siblings ...)
  2019-10-04 14:56 ` Andy Lutomirski
@ 2019-10-29 23:40 ` Kees Cook
  2019-10-30  0:27   ` Edgecombe, Rick P
  15 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-10-29 23:40 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: kvm, linux-kernel, x86, linux-mm, luto, peterz, dave.hansen,
	pbonzini, sean.j.christopherson, kristen, deneen.t.dock

On Thu, Oct 03, 2019 at 02:23:47PM -0700, Rick Edgecombe wrote:
> larger follow on to this enables setting the kernel text as XO, but this is just

Is the kernel side series visible somewhere public yet?

-- 
Kees Cook


* Re: [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO
  2019-10-29 23:33   ` Kees Cook
@ 2019-10-29 23:52     ` Edgecombe, Rick P
  2019-10-30 14:55       ` Sean Christopherson
  0 siblings, 1 reply; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-29 23:52 UTC (permalink / raw)
  To: keescook
  Cc: kvm, linux-kernel, peterz, Dock, Deneen T, Christopherson,
	Sean J, linux-mm, x86, kristen, luto, pbonzini, Hansen, Dave

On Tue, 2019-10-29 at 16:33 -0700, Kees Cook wrote:
> On Thu, Oct 03, 2019 at 02:23:56PM -0700, Rick Edgecombe wrote:
> > Add a new CPUID leaf to hold the contents of CPUID 0x40000030 EAX to
> > detect KVM defined generic VMM features.
> > 
> > The leaf was proposed to allow KVM to communicate features that are
> > defined by KVM, but available for any VMM to implement.
> > 
> > Add cpu_feature_enabled() support for features in this leaf (KVM XO), and
> > a pgtable_kvmxo_enabled() helper similar to pgtable_l5_enabled() so that
> > pgtable_kvmxo_enabled() can be used in early code that includes
> > arch/x86/include/asm/sparsemem.h.
> > 
> > Lastly, in head64.c, detect this feature and perform the necessary
> > adjustments to physical_mask.
> 
> Can this be exposed to /proc/cpuinfo so a guest userspace can determine
> if this feature is enabled?
> 
> -Kees

Is there a good place to expose the information that the PROT_EXEC &&
!PROT_READ combo creates execute-only memory? That way apps could check one
place for the support and not worry about whether the implementation is this
series, x86 protection keys, arm64, or something else.


* Re: [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO
  2019-10-29 23:36   ` Kees Cook
@ 2019-10-30  0:01     ` Edgecombe, Rick P
  2019-10-30 18:36       ` Kees Cook
  0 siblings, 1 reply; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-30  0:01 UTC (permalink / raw)
  To: keescook
  Cc: kvm, linux-kernel, peterz, Dock, Deneen T, Christopherson,
	Sean J, linux-mm, x86, kristen, luto, pbonzini, Hansen, Dave

On Tue, 2019-10-29 at 16:36 -0700, Kees Cook wrote:
> On Thu, Oct 03, 2019 at 02:24:00PM -0700, Rick Edgecombe wrote:
> > Add CONFIG_KVM_XO for supporting KVM based execute only memory.
> 
> I would expect this config to be added earlier in the series so that the
> code being added that depends on it can be incrementally build tested...
> 
> (Also, if this is default=y, why have a Kconfig for it at all? Guests
> need to know to use this already, yes?)
> 
> -Kees
Hmm, good point. One reason could be that this requires SPARSEMEM_VMEMMAP due to
some pre-processor tricks that need a compile-time-known max physical address
size. So maybe someone could want KVM_GUEST and !SPARSEMEM_VMEMMAP. I'm not
sure.

> > 
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > ---
> >  arch/x86/Kconfig | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 222855cc0158..3a3af2a456e8 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -802,6 +802,19 @@ config KVM_GUEST
> >  	  underlying device model, the host provides the guest with
> >  	  timing infrastructure such as time of day, and system time
> >  
> > +config KVM_XO
> > +	bool "Support for KVM based execute only virtual memory permissions"
> > +	select DYNAMIC_PHYSICAL_MASK
> > +	select SPARSEMEM_VMEMMAP
> > +	depends on KVM_GUEST && X86_64
> > +	default y
> > +	help
> > +	  This option enables support for execute only memory for KVM guests. If
> > +	  support from the underlying VMM is not detected at boot, this
> > +	  capability will automatically disable.
> > +
> > +	  If you are unsure how to answer this question, answer Y.
> > +
> >  config PVH
> >  	bool "Support for running PVH guests"
> >  	---help---
> > -- 
> > 2.17.1
> > 
> 
> 


* Re: [RFC PATCH 00/13] XOM for KVM guest userspace
  2019-10-29 23:40 ` Kees Cook
@ 2019-10-30  0:27   ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-30  0:27 UTC (permalink / raw)
  To: keescook
  Cc: kvm, linux-kernel, peterz, Dock, Deneen T, Christopherson,
	Sean J, linux-mm, x86, kristen, luto, pbonzini, Hansen, Dave

On Tue, 2019-10-29 at 16:40 -0700, Kees Cook wrote:
> On Thu, Oct 03, 2019 at 02:23:47PM -0700, Rick Edgecombe wrote:
> > larger follow on to this enables setting the kernel text as XO, but this is
> > just
> 
> Is the kernel side series visible somewhere public yet?
> 
The proof of concept from my Plumbers talk is up here:
https://github.com/redgecombe/linux/commits/exec_only

It doesn't work with this KVM series though, as I made changes on the KVM side.
I don't consider it ready for posting on the list yet. Luckily, PeterZ's
switching of ftrace to text_poke() and your exception table patchset will make
it easier when the time comes.

Right now I am re-doing the KVM pieces to get rid of the memslot duplication. I
am ending up having to touch a lot more KVM mmu code, and it's taken some time
to work through. Then I want to get some more performance numbers before dropping
the RFC tag. So it may still be a bit before I can pick up the kernel text piece
again.


* Re: [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO
  2019-10-29 23:52     ` Edgecombe, Rick P
@ 2019-10-30 14:55       ` Sean Christopherson
  2019-10-30 21:02         ` Edgecombe, Rick P
  0 siblings, 1 reply; 41+ messages in thread
From: Sean Christopherson @ 2019-10-30 14:55 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: keescook, kvm, linux-kernel, peterz, Dock, Deneen T, linux-mm,
	x86, kristen, luto, pbonzini, Hansen, Dave

On Tue, Oct 29, 2019 at 04:52:08PM -0700, Edgecombe, Rick P wrote:
> On Tue, 2019-10-29 at 16:33 -0700, Kees Cook wrote:
> > On Thu, Oct 03, 2019 at 02:23:56PM -0700, Rick Edgecombe wrote:
> > > Add a new CPUID leaf to hold the contents of CPUID 0x40000030 EAX to
> > > detect KVM defined generic VMM features.
> > >
> > > The leaf was proposed to allow KVM to communicate features that are
> > > defined by KVM, but available for any VMM to implement.

This doesn't necessarily work the way you intend, KVM's base CPUID isn't
guaranteed to be 0x40000000.  E.g. KVM supports advertising itself as
HyperV *and* KVM, in which case KVM's CPUID base will be 0x40000100.

I think you're better off just making this a standard KVM CPUID feature.
If a different hypervisor wants to reuse guest support as is, it can
advertise KVM support at a lower priority.

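To illustrate why a fixed 0x40000000 leaf is fragile: the guest already discovers KVM's base by scanning the hypervisor CPUID range for the "KVMKVMKVM\0\0\0" signature (as hypervisor_cpuid_base() does), so a KVM-defined feature leaf would naturally be expressed relative to the discovered base. A rough standalone sketch of the matching step:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * CPUID leaf 0x400000xx returns the hypervisor signature in
 * ebx:ecx:edx. A real guest would issue CPUID for base = 0x40000000,
 * 0x40000100, ..., 0x4000ff00 and use the first base whose signature
 * matches; only the comparison is modeled here so it can run anywhere.
 */
static bool sig_matches(uint32_t ebx, uint32_t ecx, uint32_t edx,
			const char *sig)
{
	uint32_t regs[3] = { ebx, ecx, edx };

	return memcmp(regs, sig, 12) == 0;
}
```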
Note, querying guest CPUID isn't straightforward in either case.  But,
KVM doesn't support disabling its other CPUID-base paravirt features, e.g.
KVM emulates the kvm_clock MSRs regardless of what userspace advertises to
the guest.  Depending on what changes are required in KVM's MMU, this may
also need to be a KVM-wide feature, i.e. controlled via a module param.

> > > Add cpu_feature_enabled() support for features in this leaf (KVM XO), and
> > > a pgtable_kvmxo_enabled() helper similar to pgtable_l5_enabled() so that
> > > pgtable_kvmxo_enabled() can be used in early code that includes
> > > arch/x86/include/asm/sparsemem.h.
> > >
> > > Lastly, in head64.c, detect this feature and perform the necessary
> > > adjustments to physical_mask.
> >
> > Can this be exposed to /proc/cpuinfo so a guest userspace can determine
> > if this feature is enabled?
> >
> > -Kees
>
> Is there a good place to expose the information that the PROT_EXEC and
> !PROT_READ combo creates execute-only memory? That way apps could check one
> place for the support without worrying about whether the implementation is
> this series, x86 pkeys, arm64 or something else.

I don't think so?  Assuming there's no common method, it can be displayed
in /proc/cpuinfo by adding a synthetic bit, e.g. in Linux-defined word 8
(virtualization) instead of a dedicated word.  The bit can then be
set if the feature exists and is enabled (by the guest).

I'd also name the feature EXEC_ONLY.  XO is unnecessarily terse IMO, and
including "KVM" in the name may be misconstrued as a host KVM feature and
will be flat out wrong if hardware ever supports XO natively.
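(If such a synthetic bit were added, guest userspace would see it as one more word in the "flags" line of /proc/cpuinfo, and checking for it correctly means whole-word matching, not a bare substring search. A sketch of such a check; the flag name "exec_only" follows the naming suggested above but is purely hypothetical, since no such bit exists at the time of this thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Whole-word match of one flag in a /proc/cpuinfo "flags" line.
 * A plain strstr() would false-positive if another flag merely
 * contained the name as a substring, hence the boundary checks. */
static bool flags_line_has(const char *line, const char *flag)
{
	size_t n = strlen(flag);
	const char *p = line;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == line) || (p[-1] == ' ') || (p[-1] == '\t');
		bool ends = (p[n] == '\0') || (p[n] == ' ') || (p[n] == '\n');

		if (starts && ends)
			return true;
		p += n;
	}
	return false;
}
```

An application would read the "flags" line once and call flags_line_has(line, "exec_only") before relying on execute-only semantics.)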

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO
  2019-10-30  0:01     ` Edgecombe, Rick P
@ 2019-10-30 18:36       ` Kees Cook
  0 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2019-10-30 18:36 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: kvm, linux-kernel, peterz, Dock, Deneen T, Christopherson,
	Sean J, linux-mm, x86, kristen, luto, pbonzini, Hansen, Dave

On Wed, Oct 30, 2019 at 12:01:18AM +0000, Edgecombe, Rick P wrote:
> On Tue, 2019-10-29 at 16:36 -0700, Kees Cook wrote:
> > On Thu, Oct 03, 2019 at 02:24:00PM -0700, Rick Edgecombe wrote:
> > > Add CONFIG_KVM_XO for supporting KVM based execute only memory.
> > 
> > I would expect this config to be added earlier in the series so that the
> > code being added that depends on it can be incrementally build tested...
> > 
> > (Also, if this is default=y, why have a Kconfig for it at all? Guests
> > need to know to use this already, yes?)
> > 
> > -Kees
> Hmm, good point. One reason could be that this requires SPARSEMEM_VMEMMAP due to
> some preprocessor tricks that need a compile-time-known max physical address
> size. So maybe someone could want KVM_GUEST and !SPARSEMEM_VMEMMAP. I'm not
> sure.

Good point about the combination of other CONFIGs. All the more reason
to move it earlier, though.

-Kees

> 
> > > 
> > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > > ---
> > >  arch/x86/Kconfig | 13 +++++++++++++
> > >  1 file changed, 13 insertions(+)
> > > 
> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > index 222855cc0158..3a3af2a456e8 100644
> > > --- a/arch/x86/Kconfig
> > > +++ b/arch/x86/Kconfig
> > > @@ -802,6 +802,19 @@ config KVM_GUEST
> > >  	  underlying device model, the host provides the guest with
> > >  	  timing infrastructure such as time of day, and system time
> > >  
> > > +config KVM_XO
> > > +	bool "Support for KVM based execute only virtual memory permissions"
> > > +	select DYNAMIC_PHYSICAL_MASK
> > > +	select SPARSEMEM_VMEMMAP
> > > +	depends on KVM_GUEST && X86_64
> > > +	default y
> > > +	help
> > > +	  This option enables support for execute only memory for KVM guests. If
> > > +	  support from the underlying VMM is not detected at boot, this
> > > +	  capability will be disabled automatically.
> > > +
> > > +	  If you are unsure how to answer this question, answer Y.
> > > +
> > >  config PVH
> > >  	bool "Support for running PVH guests"
> > >  	---help---
> > > -- 
> > > 2.17.1
> > > 
> > 
> > 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO
  2019-10-30 14:55       ` Sean Christopherson
@ 2019-10-30 21:02         ` Edgecombe, Rick P
  0 siblings, 0 replies; 41+ messages in thread
From: Edgecombe, Rick P @ 2019-10-30 21:02 UTC (permalink / raw)
  To: Christopherson, Sean J
  Cc: kvm, linux-kernel, peterz, keescook, Dock, Deneen T, linux-mm,
	x86, kristen, luto, pbonzini, Hansen, Dave

On Wed, 2019-10-30 at 07:55 -0700, Sean Christopherson wrote:
> On Tue, Oct 29, 2019 at 04:52:08PM -0700, Edgecombe, Rick P wrote:
> > On Tue, 2019-10-29 at 16:33 -0700, Kees Cook wrote:
> > > On Thu, Oct 03, 2019 at 02:23:56PM -0700, Rick Edgecombe wrote:
> > > > Add a new CPUID leaf to hold the contents of CPUID 0x40000030 EAX to
> > > > detect KVM defined generic VMM features.
> > > > 
> > > > The leaf was proposed to allow KVM to communicate features that are
> > > > defined by KVM, but available for any VMM to implement.
> 
> This doesn't necessarily work the way you intend; KVM's base CPUID isn't
> guaranteed to be 0x40000000.  E.g. KVM supports advertising itself as
> HyperV *and* KVM, in which case KVM's CPUID base will be 0x40000100.
> 
> I think you're better off just making this a standard KVM CPUID feature.
> If a different hypervisor wants to reuse guest support as is, it can
> advertise KVM support at a lower priority.
> 
Ok, I'm fine with going with the simpler KVM CPUID bit. Per-VMM CPUID leaf
meanings aren't a new scenario anyway.

> Note, querying guest CPUID isn't straightforward in either case.  But,
> KVM doesn't support disabling its other CPUID-based paravirt features, e.g.
> KVM emulates the kvm_clock MSRs regardless of what userspace advertises to
> the guest.  Depending on what changes are required in KVM's MMU, this may
> also need to be a KVM-wide feature, i.e. controlled via a module param.
> > > > Add cpu_feature_enabled() support for features in this leaf (KVM XO), and
> > > > a pgtable_kvmxo_enabled() helper similar to pgtable_l5_enabled() so that
> > > > pgtable_kvmxo_enabled() can be used in early code that includes
> > > > arch/x86/include/asm/sparsemem.h.
> > > > 
> > > > Lastly, in head64.c detect this feature and perform the necessary
> > > > adjustments to physical_mask.
> > > 
> > > Can this be exposed to /proc/cpuinfo so a guest userspace can determine
> > > if this feature is enabled?
> > > 
> > > -Kees
> > 
> > Is there a good place to expose the information that the PROT_EXEC and
> > !PROT_READ combo creates execute-only memory? That way apps could check one
> > place for the support without worrying about whether the implementation is
> > this series, x86 pkeys, arm64 or something else.
> 
> I don't think so?  Assuming there's no common method, it can be displayed
> in /proc/cpuinfo by adding a synthetic bit, e.g. in Linux-defined word 8
> (virtualization) instead of a dedicated word.  The bit can then be
> set if the feature exists and is enabled (by the guest).
> 
> I'd also name the feature EXEC_ONLY.  XO is unnecessarily terse IMO, and
> including "KVM" in the name may be misconstrued as a host KVM feature and
> will be flat out wrong if hardware ever supports XO natively.

Ok, if there is no generic way I guess I'll do this.
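(For reference, the userspace side of the convention discussed throughout the thread is just an mprotect() call. The sketch below writes an x86-64 "mov eax, 42; ret" stub into an anonymous page and flips it to PROT_EXEC with no PROT_READ. Under a kernel with execute-only support (this series, x86 PKU, or arm64) a subsequent data read of the page would fault; without such support mprotect() still succeeds and the page simply stays readable, so calling through the stub works in both cases:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

static int run_xo_stub(void)
{
	/* x86-64 machine code: mov eax, 42; ret */
	static const uint8_t stub[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
	void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (page == MAP_FAILED)
		return -1;
	memcpy(page, stub, sizeof(stub));

	/* The PROT_EXEC && !PROT_READ combo: request execute-only. */
	if (mprotect(page, 4096, PROT_EXEC))
		return -1;

	int ret = ((int (*)(void))page)();

	munmap(page, 4096);
	return ret;
}
```

This is the one call site an application has; which mechanism actually enforces the no-read part is exactly what the /proc/cpuinfo question above is about.)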



^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2019-10-30 21:02 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-03 21:23 [RFC PATCH 00/13] XOM for KVM guest userspace Rick Edgecombe
2019-10-03 21:23 ` [RFC PATCH 01/13] kvm: Enable MTRR to work with GFNs with perm bits Rick Edgecombe
2019-10-14  6:47   ` Yu Zhang
2019-10-14 18:44     ` Edgecombe, Rick P
2019-10-03 21:23 ` [RFC PATCH 02/13] kvm: Add support for X86_FEATURE_KVM_XO Rick Edgecombe
2019-10-03 21:23 ` [RFC PATCH 03/13] kvm: Add XO memslot type Rick Edgecombe
2019-10-04  7:27   ` Paolo Bonzini
2019-10-04 19:06     ` Edgecombe, Rick P
2019-10-06 16:15       ` Paolo Bonzini
2019-10-03 21:23 ` [RFC PATCH 04/13] kvm, vmx: Add support for gva exit qualification Rick Edgecombe
2019-10-03 21:23 ` [RFC PATCH 05/13] kvm: Add #PF injection for KVM XO Rick Edgecombe
2019-10-04  7:42   ` Paolo Bonzini
2019-10-04 19:11     ` Edgecombe, Rick P
2019-10-03 21:23 ` [RFC PATCH 06/13] kvm: Add KVM_CAP_EXECONLY_MEM Rick Edgecombe
2019-10-04  7:24   ` Paolo Bonzini
2019-10-04 19:11     ` Edgecombe, Rick P
2019-10-03 21:23 ` [RFC PATCH 07/13] kvm: Add docs for KVM_CAP_EXECONLY_MEM Rick Edgecombe
2019-10-03 21:23 ` [RFC PATCH 08/13] x86/boot: Rename USE_EARLY_PGTABLE_L5 Rick Edgecombe
2019-10-03 21:23 ` [RFC PATCH 09/13] x86/cpufeature: Add detection of KVM XO Rick Edgecombe
2019-10-29 23:33   ` Kees Cook
2019-10-29 23:52     ` Edgecombe, Rick P
2019-10-30 14:55       ` Sean Christopherson
2019-10-30 21:02         ` Edgecombe, Rick P
2019-10-03 21:23 ` [RFC PATCH 10/13] x86/mm: Add NR page bit for " Rick Edgecombe
2019-10-04  7:33   ` Paolo Bonzini
2019-10-03 21:23 ` [RFC PATCH 11/13] x86, ptdump: Add NR bit to page table dump Rick Edgecombe
2019-10-03 21:23 ` [RFC PATCH 12/13] mmap: Add XO support for KVM XO Rick Edgecombe
2019-10-04  7:34   ` Paolo Bonzini
2019-10-04 19:12     ` Edgecombe, Rick P
2019-10-03 21:24 ` [RFC PATCH 13/13] x86/Kconfig: Add Kconfig for KVM based XO Rick Edgecombe
2019-10-29 23:36   ` Kees Cook
2019-10-30  0:01     ` Edgecombe, Rick P
2019-10-30 18:36       ` Kees Cook
2019-10-04  7:22 ` [RFC PATCH 00/13] XOM for KVM guest userspace Paolo Bonzini
2019-10-04 19:03   ` Edgecombe, Rick P
2019-10-04 14:56 ` Andy Lutomirski
2019-10-04 20:09   ` Edgecombe, Rick P
2019-10-05  1:33     ` Andy Lutomirski
2019-10-07 18:14       ` Edgecombe, Rick P
2019-10-29 23:40 ` Kees Cook
2019-10-30  0:27   ` Edgecombe, Rick P

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).