* [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling
@ 2023-06-06  9:18 Binbin Wu
  2023-06-06  9:18 ` [PATCH v9 1/6] KVM: x86: Consolidate flags for __linearize() Binbin Wu
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

===Feature Introduction===

Linear-address masking (LAM) [1] modifies the checking that is applied to
*64-bit* linear addresses, allowing software to use the untranslated address
bits for metadata; the processor masks the metadata bits before using the
addresses to access memory.

When the feature is virtualized and exposed to the guest, it can be used for
efficient implementation of address sanitizers (ASAN) and for optimizations in
JITs and virtual machines.

Regarding which pointer bits are masked and can be used for metadata, LAM has 2
modes:
- LAM_48: metadata bits 62:48, i.e. LAM width of 15.
- LAM_57: metadata bits 62:57, i.e. LAM width of 6.

* For user pointers:
  CR3.LAM_U57 = CR3.LAM_U48 = 0, LAM is off;
  CR3.LAM_U57 = 1, LAM57 is active;
  CR3.LAM_U57 = 0 and CR3.LAM_U48 = 1, LAM48 is active.
* For supervisor pointers:
  CR4.LAM_SUP = 0, LAM is off;
  CR4.LAM_SUP = 1 with 5-level paging mode, LAM57 is active;
  CR4.LAM_SUP = 1 with 4-level paging mode, LAM48 is active.

The modified LAM canonicality check:
* LAM_S48                : [ 1 ][ metadata ][ 1 ]
                             63               47
* LAM_U48                : [ 0 ][ metadata ][ 0 ]
                             63               47
* LAM_S57                : [ 1 ][ metadata ][ 1 ]
                             63               56
* LAM_U57 + 5-lvl paging : [ 0 ][ metadata ][ 0 ]
                             63               56
* LAM_U57 + 4-lvl paging : [ 0 ][ metadata ][ 0...0 ]
                             63               56..47
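
The tables above boil down to sign-extending the value of bit 47 (LAM48) or
bit 56 (LAM57) while preserving bit 63. Below is a minimal C sketch of that
untag step, mirroring the VMX implementation in patch 4 of this series; the
standalone wrapper lam_untag() is illustrative only and not part of the patches.

/* Pick the sign-extension bit implied by the tables above, or -1 if LAM is off. */
static int lam_sign_extend_bit(struct kvm_vcpu *vcpu, gva_t addr)
{
	if (!(addr >> 63)) {		/* bit 63 == 0: user pointer */
		u64 cr3 = kvm_read_cr3(vcpu);

		if (cr3 & X86_CR3_LAM_U57)
			return 56;
		if (cr3 & X86_CR3_LAM_U48)
			return 47;
	} else {			/* bit 63 == 1: supervisor pointer */
		u64 cr4 = kvm_read_cr4_bits(vcpu, X86_CR4_LAM_SUP | X86_CR4_LA57);

		if (cr4 == (X86_CR4_LAM_SUP | X86_CR4_LA57))
			return 56;
		if (cr4 & X86_CR4_LAM_SUP)
			return 47;
	}
	return -1;
}

/* Untag: sign-extend from the chosen bit while keeping bit 63 untouched. */
static gva_t lam_untag(struct kvm_vcpu *vcpu, gva_t gva)
{
	int bit = lam_sign_extend_bit(vcpu, gva);

	if (bit < 0)
		return gva;
	return (sign_extend64(gva, bit) & ~BIT_ULL(63)) | (gva & BIT_ULL(63));
}

For example (hypothetical value), with LAM_U57 active a tagged user pointer
0x7e00001234567890 (metadata 0x3f in bits 62:57) untags to 0x0000001234567890,
which is then subject to the ordinary canonicality check.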

Note:
1. LAM applies only to data addresses, not to instruction fetches.
2. LAM identification of an address as user or supervisor is based solely on the
   value of pointer bit 63 and does not depend on the CPL.
3. LAM doesn't apply to addresses written to control registers or MSRs.
4. LAM masking applies before paging, so the faulting linear address in CR2
   doesn't contain the metadata.
5. The guest linear address saved in the VMCS doesn't contain metadata.
6. For user-mode addresses, it is possible that 5-level paging and LAM_U48 are
   both active; in this case, the effective usable linear address width is 48 bits.
   (Currently, only LAM_U57 is enabled in the Linux kernel. [2])

===LAM KVM Design===
LAM KVM enabling includes the following parts:
- Feature Enumeration
  The LAM feature is enumerated by CPUID.7.1:EAX.LAM[bit 26].
  If hardware supports LAM and the host doesn't disable it explicitly (e.g. via
  clearcpuid), the LAM feature will be exposed to the userspace VMM.

- CR4 Virtualization
  LAM uses CR4.LAM_SUP (bit 28) to configure LAM on supervisor pointers.
  Add support to allow guests to set the new CR4 control bit to enable LAM on
  supervisor pointers.

- CR3 Virtualization
  LAM uses CR3.LAM_U48 (bit 62) and CR3.LAM_U57 (bit 61) to configure LAM on user
  pointers.
  Add support to allow guests to set the two new CR3 non-address control bits to
  enable LAM on user pointers.

- Modified Canonicality Check and Metadata Mask
  When LAM is enabled, a 64-bit linear address may be tagged with metadata.
  Linear addresses should be checked for modified canonicality and untagged in
  instruction emulation and VM-exit handlers when LAM is applicable (a simplified
  sketch of the emulator flow follows this list).
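
For instruction emulation (patch 5), the untag step is hooked into __linearize()'s
64-bit path right before the canonical check. A simplified sketch of that flow,
with the surrounding context trimmed (see the actual diff in patch 5):

	case X86EMUL_MODE_PROT64:
		/*
		 * Strip LAM metadata first (a no-op when LAM is inactive or
		 * X86EMUL_F_SKIPLAM is set), then run the usual canonical
		 * check on the untagged address.
		 */
		ctxt->ops->untag_addr(ctxt, &la, flags);
		*linear = la;
		va_bits = ctxt_virt_addr_bits(ctxt);
		if (!__is_canonical_address(la, va_bits))
			goto bad;
		break;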

LAM support in SGX enclave mode needs additional enabling and is not
included in this patch series.

LAM QEMU patch:
https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg07843.html

LAM kvm-unit-tests patch:
https://lore.kernel.org/kvm/20230530024356.24870-1-binbin.wu@linux.intel.com/

===Test===
1. Add test cases in kvm-unit-tests [3] for LAM, including LAM_SUP and LAM_{U57,U48}.
   For supervisor pointers, the tests cover CR4.LAM_SUP bit toggling, memory/MMIO
   access with tagged pointers, and some special instructions (INVLPG, INVPCID,
   INVVPID); the INVVPID cases are also used to cover the VMX instruction VM-exit path.
   For user pointers, the tests cover CR3 LAM bit toggling and memory/MMIO access
   with tagged pointers.
   MMIO cases are used to trigger the instruction emulation path.
   Run the unit tests with the LAM feature both on and off (i.e. including negative cases).
   Run the unit tests in an L1 guest with the LAM feature both on and off.
2. Run kernel LAM kselftests [2] in the guest, with both EPT=Y and EPT=N.
3. Launch a nested guest.

All tests have passed in a Simics environment.

[1] Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368
    Chapter Linear Address Masking (LAM)
[2] https://lore.kernel.org/all/20230312112612.31869-9-kirill.shutemov@linux.intel.com/
[3] https://lore.kernel.org/kvm/20230530024356.24870-1-binbin.wu@linux.intel.com/

---
Changelog
v8 --> v9:
- Remove unnecessary local variables and some modification of commit message of patch 1. [Chao, David, Kai]
- Add back description about CR4.LAM_SUP setting outside of 64-bit mode in commit message of patch 2. [Kai]
- Clear LAM bits from cr3_ctrl_bits if LAM isn't exposed to the guest in patch 3. [Chao]
- Some comments and commit message changes in patch 4. [Chao]
- Use 'gva_t *' instead of 'u64 *' to avoid explicit type casts in patch 5. [Chao]

v7 --> v8:
- Rebase the patch series on Linux kernel v6.4-rc1, which has all dependencies
  of this patch series.
- Add a cleanup patch to consolidate flags for __linearize().
- Some changes in commit message of patch 2.
- Rewrite the commit message of patch 3 (Thanks Kai Huang for the suggestion).
  Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.
  Opportunistically use kvm_vcpu_is_legal_cr3() to check CR3 in SVM nested code,
  to provide a clear distinction b/t CR3 and GPA checks.
- Change interface untag_addr() to optional version to avoid adding a dummy
  version in SVM in patch 4.
- Verified and confirmed that LAM doesn't apply to the address in descriptor of
  invvpid, drop the untag action for it in patch 5.

v6 --> v7:
- Changes to CR3 virtualization when EPT off
  * Leave LAM bits in root.pgd to force a new root for a CR3+LAM combination. (Sean)
  * Perform GFN calculation from guest CR3/PGD generically by extracting the maximal 
    base address mask. (Sean)
- Remove de-reference of ctxt->vcpu in the emulator. (Sean)
- Fix a bug in v6, which hardcoded "write" to "false" by mistake in linearize(). (Chao Gao)
- Add Chao Gao's reviewed-by in Patch 5.
- Add Xuelian Guo's tested-by in the patch set.
- Separate cleanup patches from the patch set.

v5 --> v6:
Add Patch 2 to fix the check of 64-bit mode.
Add untag_addr() to kvm_x86_ops to hide vendor specific code.
Simplify the LAM canonicality check per Chao's suggestion.
Add cr3_ctrl_bits to kvm_vcpu_arch to simplify cr3 invalidation/extract/mask (Chao Gao)
Extend the patchset scope to include nested virtualization and SGX ENCLS handling.
- Add X86_CR4_LAM_SUP in cr4_fixed1_update for nested vmx. (Chao Gao)
- Add SGX ENCLS VMExit handling
- Add VMX instruction VMExit handling
More descriptions in cover letter.
Add Chao's reviewed-by on Patch 4.
Add more test cases in kvm-unit-test.

v4 --> v5:
Reorder and merge the patches surrounding the CR3.LAM bits into Patch 3 of this
version.
Revise Patch 1's subject and description
Drop Patch 3
Use kvm_read_cr4_bits() instead of kvm_read_cr4()
Fix: no need to untag the address when writing to an MSR; it should be the legacy canonical check
Rename kvm_is_valid_cr3() --> kvm_vcpu_is_valid_cr3(), and update some call
sites of kvm_vcpu_is_valid_cr3() to use kvm_is_valid_cr3().
Other refactors and Miscs.

v3 --> v4:
Drop unrelated Patch 1 in v3 (Binbin, Sean, Xiaoyao)
Intercept CR4.LAM_SUP instead of pass through to guest (Sean)
Just filter out CR3.LAM_{U48, U57}, instead of all reserved high bits
(Sean, Yuan)
Use the existing __canonical_address() helper instead of writing a new one (Weijiang)
Add LAM handling in KVM emulation (Yu, Yuan)
Add Jingqi's reviewed-by on Patch 7
Rebased to Kirill's latest code, which is 6.2-rc1 base.

v2 --> v3:
As LAM Kernel patches are in tip tree now, rebase to it.
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/

v1 --> v2:
1. Fixes i386-allyesconfig build error on get_pgd(), where
   CR3_HIGH_RSVD_MASK isn't applicable.
   (Reported-by: kernel test robot <lkp@intel.com>)
2. In kvm_set_cr3(), be conservative about skipping the TLB flush when only the
   LAM bits toggle. (Kirill)

Binbin Wu (3):
  KVM: x86: Consolidate flags for __linearize()
  KVM: x86: Introduce untag_addr() in kvm_x86_ops
  KVM: x86: Untag address when LAM applicable

Robert Hoo (3):
  KVM: x86: Virtualize CR4.LAM_SUP
  KVM: x86: Virtualize CR3.LAM_{U48,U57}
  KVM: x86: Expose LAM feature to userspace VMM

 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    | 10 +++-
 arch/x86/kvm/cpuid.c               |  2 +-
 arch/x86/kvm/cpuid.h               |  5 ++
 arch/x86/kvm/emulate.c             | 31 +++++++----
 arch/x86/kvm/kvm_emulate.h         |  7 +++
 arch/x86/kvm/mmu.h                 |  5 ++
 arch/x86/kvm/mmu/mmu.c             |  8 ++-
 arch/x86/kvm/mmu/mmu_internal.h    |  1 +
 arch/x86/kvm/mmu/paging_tmpl.h     |  3 +-
 arch/x86/kvm/mmu/spte.h            |  2 +-
 arch/x86/kvm/svm/nested.c          |  4 +-
 arch/x86/kvm/vmx/nested.c          |  6 ++-
 arch/x86/kvm/vmx/sgx.c             |  1 +
 arch/x86/kvm/vmx/vmx.c             | 84 +++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h             |  2 +
 arch/x86/kvm/x86.c                 | 11 +++-
 arch/x86/kvm/x86.h                 |  2 +
 18 files changed, 163 insertions(+), 22 deletions(-)


base-commit: 06b66e050095db599f4f598ee627d1dc9c933018
-- 
2.25.1


* [PATCH v9 1/6] KVM: x86: Consolidate flags for __linearize()
  2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
@ 2023-06-06  9:18 ` Binbin Wu
  2023-06-06  9:18 ` [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP Binbin Wu
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

Consolidate the two bool parameters (write/fetch) of __linearize() into a
'u32 flags' parameter to make the function more concise and extensible, i.e.
to support Intel Linear Address Masking (LAM), which allows high non-address
bits of a linear address to be used as metadata.

Define two flags to replace the two bools.  A new flag will be added later to
support LAM, to skip masking off the metadata bits of a linear address under
some conditions.

No functional change intended.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/emulate.c     | 19 ++++++++++---------
 arch/x86/kvm/kvm_emulate.h |  4 ++++
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 936a397a08cd..e89afc39e56f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
 static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 				       struct segmented_address addr,
 				       unsigned *max_size, unsigned size,
-				       bool write, bool fetch,
-				       enum x86emul_mode mode, ulong *linear)
+				       enum x86emul_mode mode, ulong *linear,
+				       u32 flags)
 {
 	struct desc_struct desc;
 	bool usable;
@@ -718,10 +718,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 			goto bad;
 		/* code segment in protected mode or read-only data segment */
 		if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8))
-					|| !(desc.type & 2)) && write)
+			|| !(desc.type & 2)) && (flags & X86EMUL_F_WRITE))
 			goto bad;
 		/* unreadable code segment */
-		if (!fetch && (desc.type & 8) && !(desc.type & 2))
+		if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
 			goto bad;
 		lim = desc_limit_scaled(&desc);
 		if (!(desc.type & 8) && (desc.type & 4)) {
@@ -757,8 +757,8 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
 		     ulong *linear)
 {
 	unsigned max_size;
-	return __linearize(ctxt, addr, &max_size, size, write, false,
-			   ctxt->mode, linear);
+	return __linearize(ctxt, addr, &max_size, size, ctxt->mode, linear,
+			   write ? X86EMUL_F_WRITE : 0);
 }
 
 static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -771,7 +771,8 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
-	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+	rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
+			 X86EMUL_F_FETCH);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
@@ -907,8 +908,8 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	 * boundary check itself.  Instead, we use max_size to check
 	 * against op_size.
 	 */
-	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
-			 &linear);
+	rc = __linearize(ctxt, addr, &max_size, 0, ctxt->mode, &linear,
+			 X86EMUL_F_FETCH);
 	if (unlikely(rc != X86EMUL_CONTINUE))
 		return rc;
 
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index ab65f3a47dfd..5b9ec610b2cb 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
 #define X86EMUL_IO_NEEDED       5 /* IO is needed to complete emulation */
 #define X86EMUL_INTERCEPTED     6 /* Intercepted by nested VMCB/VMCS */
 
+/* x86-specific emulation flags */
+#define X86EMUL_F_FETCH			BIT(0)
+#define X86EMUL_F_WRITE			BIT(1)
+
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
 	/*
-- 
2.25.1


* [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP
  2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
  2023-06-06  9:18 ` [PATCH v9 1/6] KVM: x86: Consolidate flags for __linearize() Binbin Wu
@ 2023-06-06  9:18 ` Binbin Wu
  2023-06-07  3:40   ` Huang, Kai
  2023-06-06  9:18 ` [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57} Binbin Wu
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

From: Robert Hoo <robert.hu@linux.intel.com>

Add support to allow guests to set the new CR4 control bit to enable the new
Intel CPU feature Linear Address Masking (LAM) on supervisor pointers.

LAM modifies the checking that is applied to 64-bit linear addresses, allowing
software to use the untranslated address bits for metadata; the processor masks
the metadata bits before using the addresses to access memory. LAM uses
CR4.LAM_SUP (bit 28) to configure LAM for supervisor pointers. LAM also changes
VMENTER to allow the bit to be set in the VMCS's HOST_CR4 and GUEST_CR4 for
virtualization. Note CR4.LAM_SUP is allowed to be set even outside 64-bit mode,
but it will not take effect since LAM only applies to 64-bit linear addresses.

Move CR4.LAM_SUP out of CR4_RESERVED_BITS and its reservation depends on vcpu
supporting LAM feature or not. Leave the bit intercepted to avoid vmread every
time when KVM fetches its value, with the expectation that guest won't toggle
the bit frequently.

Set the CR4.LAM_SUP bit in the emulated IA32_VMX_CR4_FIXED1 MSR to allow guests
to enable LAM for supervisor pointers in nested VMX operation.

Hardware is not required to flush the TLB when CR4.LAM_SUP is toggled, so KVM
doesn't need to emulate a TLB flush for it. There is no connection to other
features or vmx_exec_controls, so no other code is needed in {kvm,vmx}_set_cr4().

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 3 ++-
 arch/x86/kvm/vmx/vmx.c          | 3 +++
 arch/x86/kvm/x86.h              | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fb9d1f2d6136..c6f03d151c31 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_LAM_SUP))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2d9d155691a7..0dd2970ba5c8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7600,6 +7600,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
 
+	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+	cr4_fixed1_update(X86_CR4_LAM_SUP,    eax, feature_bit(LAM));
+
 #undef cr4_fixed1_update
 }
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 82e3dafc5453..24e2b56356b8 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -528,6 +528,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
 		__reserved_bits |= X86_CR4_VMXE;        \
 	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
 		__reserved_bits |= X86_CR4_PCIDE;       \
+	if (!__cpu_has(__c, X86_FEATURE_LAM))           \
+		__reserved_bits |= X86_CR4_LAM_SUP;     \
 	__reserved_bits;                                \
 })
 
-- 
2.25.1


* [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
  2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
  2023-06-06  9:18 ` [PATCH v9 1/6] KVM: x86: Consolidate flags for __linearize() Binbin Wu
  2023-06-06  9:18 ` [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP Binbin Wu
@ 2023-06-06  9:18 ` Binbin Wu
  2023-06-27 23:40   ` Sean Christopherson
  2023-06-06  9:18 ` [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops Binbin Wu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

From: Robert Hoo <robert.hu@linux.intel.com>

Add support to allow guests to set the two new CR3 non-address control bits to
enable the new Intel CPU feature Linear Address Masking (LAM) on user pointers.

LAM modifies the checking that is applied to 64-bit linear addresses, allowing
software to use the untranslated address bits for metadata; the processor masks
the metadata bits before using the addresses to access memory. LAM uses two new
CR3 non-address bits, LAM_U48 (bit 62) and LAM_U57 (bit 61), to configure LAM
for user pointers. LAM also changes VMENTER to allow both bits to be set in
VMCS's HOST_CR3 and GUEST_CR3 for virtualization.

When EPT is on, CR3 is not trapped by KVM and it's up to the guest to set either
of the two LAM control bits. However, when EPT is off, the actual CR3 used by the
guest is generated from the shadow MMU root, which is different from the CR3 that
is *set* by the guest, and KVM needs to manually apply any active control bits to
VMCS's GUEST_CR3 based on the cached CR3 *seen* by the guest.

KVM manually checks the guest's CR3 to make sure it points to a valid guest
physical address (i.e. to support a smaller MAXPHYSADDR in the guest). Extend
this check to allow the two LAM control bits to be set. To make the check
generic, introduce a new field 'cr3_ctrl_bits' in the vCPU to record all feature
control bits that are allowed to be set by the guest. After the check,
non-address bits of the guest CR3 are stripped off to extract the guest physical
address.

In the nested case, for a guest which supports LAM, both VMCS12's HOST_CR3 and
GUEST_CR3 are allowed to have the new LAM control bits set, i.e. when L0 enters
L1 to emulate a VMEXIT from L2 to L1 or when L0 enters L2 directly. KVM also
manually checks that VMCS12's HOST_CR3 and GUEST_CR3 are valid physical
addresses. Extend that check to allow the new LAM control bits too.

Note, LAM doesn't have a global control bit to turn it on/off completely;
whether it can be enabled purely depends on the hardware's CPUID. That means,
when EPT is on, even when KVM doesn't expose LAM to the guest, the guest can
still set the LAM control bits in CR3 without causing problems. This is an
unfortunate virtualization hole. KVM could choose to intercept CR3 in this case
and inject a fault, but this would hurt performance when running a normal VM
without LAM support, which is undesirable. Just let the guest do such an illegal
thing; the worst case is the guest being killed when KVM eventually finds out
about the illegal behaviour, and that is the guest's own fault.

Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.
Opportunistically use kvm_vcpu_is_legal_cr3() to check CR3 in SVM nested code,
to provide a clear distinction b/t CR3 and GPA checks.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 5 +++++
 arch/x86/kvm/cpuid.h            | 5 +++++
 arch/x86/kvm/mmu.h              | 5 +++++
 arch/x86/kvm/mmu/mmu.c          | 8 +++++++-
 arch/x86/kvm/mmu/mmu_internal.h | 1 +
 arch/x86/kvm/mmu/paging_tmpl.h  | 3 ++-
 arch/x86/kvm/mmu/spte.h         | 2 +-
 arch/x86/kvm/svm/nested.c       | 4 ++--
 arch/x86/kvm/vmx/nested.c       | 4 ++--
 arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
 arch/x86/kvm/x86.c              | 4 ++--
 11 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c6f03d151c31..46471dd9cc1b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -727,6 +727,11 @@ struct kvm_vcpu_arch {
 	unsigned long cr0_guest_owned_bits;
 	unsigned long cr2;
 	unsigned long cr3;
+	/*
+	 * CR3 non-address feature control bits.
+	 * Guest CR3 may contain any of those bits at runtime.
+	 */
+	u64 cr3_ctrl_bits;
 	unsigned long cr4;
 	unsigned long cr4_guest_owned_bits;
 	unsigned long cr4_guest_rsvd_bits;
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index b1658c0de847..ef8e1b912d7d 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
 	return vcpu->arch.maxphyaddr;
 }
 
+static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
+{
+	return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);
+}
+
 static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 {
 	return !(gpa & vcpu->arch.reserved_gpa_bits);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 92d5a1924fc1..81d8a433dae1 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
 	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
 }
 
+static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)
+{
+	return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
+}
+
 static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
 {
 	u64 root_hpa = vcpu->arch.mmu->root.hpa;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c8961f45e3b1..deea9a9f0c75 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	hpa_t root;
 
 	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
-	root_gfn = root_pgd >> PAGE_SHIFT;
+	/*
+	 * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
+	 * additional control bits (e.g. LAM control bits). To be generic,
+	 * unconditionally strip non-address bits when computing the GFN since
+	 * the guest PGD has already been checked for validity.
+	 */
+	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 
 	if (mmu_check_root(vcpu, root_gfn))
 		return 1;
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index d39af5639ce9..7d2105432d66 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -21,6 +21,7 @@ extern bool dbg;
 #endif
 
 /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
+#define __PT_BASE_ADDR_MASK GENMASK_ULL(51, 12)
 #define __PT_LEVEL_SHIFT(level, bits_per_level)	\
 	(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
 #define __PT_INDEX(address, level, bits_per_level) \
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 0662e0278e70..394733ac9088 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -62,7 +62,7 @@
 #endif
 
 /* Common logic, but per-type values.  These also need to be undefined. */
-#define PT_BASE_ADDR_MASK	((pt_element_t)(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)))
+#define PT_BASE_ADDR_MASK	((pt_element_t)__PT_BASE_ADDR_MASK)
 #define PT_LVL_ADDR_MASK(lvl)	__PT_LVL_ADDR_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
 #define PT_LVL_OFFSET_MASK(lvl)	__PT_LVL_OFFSET_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
 #define PT_INDEX(addr, lvl)	__PT_INDEX(addr, lvl, PT_LEVEL_BITS)
@@ -324,6 +324,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	trace_kvm_mmu_pagetable_walk(addr, access);
 retry_walk:
 	walker->level = mmu->cpu_role.base.level;
+	/* gpte_to_gfn() will strip non-address bits. */
 	pte           = kvm_mmu_get_guest_pgd(vcpu, mmu);
 	have_ad       = PT_HAVE_ACCESSED_DIRTY(mmu);
 
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 1279db2eab44..777f7d443e3b 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -36,7 +36,7 @@ static_assert(SPTE_TDP_AD_ENABLED == 0);
 #ifdef CONFIG_DYNAMIC_PHYSICAL_MASK
 #define SPTE_BASE_ADDR_MASK (physical_mask & ~(u64)(PAGE_SIZE-1))
 #else
-#define SPTE_BASE_ADDR_MASK (((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1))
+#define SPTE_BASE_ADDR_MASK __PT_BASE_ADDR_MASK
 #endif
 
 #define SPTE_PERM_MASK (PT_PRESENT_MASK | PT_WRITABLE_MASK | shadow_user_mask \
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 96936ddf1b3c..1df801a48451 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -311,7 +311,7 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
 	if ((save->efer & EFER_LME) && (save->cr0 & X86_CR0_PG)) {
 		if (CC(!(save->cr4 & X86_CR4_PAE)) ||
 		    CC(!(save->cr0 & X86_CR0_PE)) ||
-		    CC(kvm_vcpu_is_illegal_gpa(vcpu, save->cr3)))
+		    CC(!kvm_vcpu_is_legal_cr3(vcpu, save->cr3)))
 			return false;
 	}
 
@@ -520,7 +520,7 @@ static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu)
 static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 			       bool nested_npt, bool reload_pdptrs)
 {
-	if (CC(kvm_vcpu_is_illegal_gpa(vcpu, cr3)))
+	if (CC(!kvm_vcpu_is_legal_cr3(vcpu, cr3)))
 		return -EINVAL;
 
 	if (reload_pdptrs && !nested_npt && is_pae_paging(vcpu) &&
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e35cf0bd0df9..11b12a75ca91 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1085,7 +1085,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 			       bool nested_ept, bool reload_pdptrs,
 			       enum vm_entry_failure_code *entry_failure_code)
 {
-	if (CC(kvm_vcpu_is_illegal_gpa(vcpu, cr3))) {
+	if (CC(!kvm_vcpu_is_legal_cr3(vcpu, cr3))) {
 		*entry_failure_code = ENTRY_FAIL_DEFAULT;
 		return -EINVAL;
 	}
@@ -2913,7 +2913,7 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
 
 	if (CC(!nested_host_cr0_valid(vcpu, vmcs12->host_cr0)) ||
 	    CC(!nested_host_cr4_valid(vcpu, vmcs12->host_cr4)) ||
-	    CC(kvm_vcpu_is_illegal_gpa(vcpu, vmcs12->host_cr3)))
+	    CC(!kvm_vcpu_is_legal_cr3(vcpu, vmcs12->host_cr3)))
 		return -EINVAL;
 
 	if (CC(is_noncanonical_address(vmcs12->host_ia32_sysenter_esp, vcpu)) ||
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0dd2970ba5c8..52dcf3c00bb8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3358,7 +3358,8 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
 			update_guest_cr3 = false;
 		vmx_ept_load_pdptrs(vcpu);
 	} else {
-		guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu);
+		guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu) |
+		            kvm_get_active_cr3_ctrl_bits(vcpu);
 	}
 
 	if (update_guest_cr3)
@@ -7740,6 +7741,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		vmx->msr_ia32_feature_control_valid_bits &=
 			~FEAT_CTL_SGX_LC_ENABLED;
 
+	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))
+		vcpu->arch.cr3_ctrl_bits |= X86_CR3_LAM_U48 | X86_CR3_LAM_U57;
+	else
+		vcpu->arch.cr3_ctrl_bits &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+
 	/* Refresh #PF interception to account for MAXPHYADDR changes. */
 	vmx_update_exception_bitmap(vcpu);
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ad55ef71433..709fc920f378 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1275,7 +1275,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
 	 * the current vCPU mode is accurate.
 	 */
-	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
+	if (!kvm_vcpu_is_legal_cr3(vcpu, cr3))
 		return 1;
 
 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
@@ -11456,7 +11456,7 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 		 */
 		if (!(sregs->cr4 & X86_CR4_PAE) || !(sregs->efer & EFER_LMA))
 			return false;
-		if (kvm_vcpu_is_illegal_gpa(vcpu, sregs->cr3))
+		if (!kvm_vcpu_is_legal_cr3(vcpu, sregs->cr3))
 			return false;
 	} else {
 		/*
-- 
2.25.1


* [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
                   ` (2 preceding siblings ...)
  2023-06-06  9:18 ` [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57} Binbin Wu
@ 2023-06-06  9:18 ` Binbin Wu
  2023-06-28  0:15   ` Sean Christopherson
  2023-06-06  9:18 ` [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable Binbin Wu
  2023-06-06  9:18 ` [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM Binbin Wu
  5 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

Introduce a new optional interface untag_addr() to kvm_x86_ops to untag the
metadata from a linear address. Implement the LAM version in VMX.

When a feature like Intel Linear Address Masking (LAM) or AMD Upper Address
Ignore (UAI) is enabled, a linear address may be tagged with metadata. Linear
addresses should be checked for modified canonicality and untagged in
instruction emulation or VM-exit handlers if LAM or UAI is applicable.

Introduce untag_addr() in kvm_x86_ops to hide the vendor-specific code. Pass
'flags' so that the common emulator path doesn't have to distinguish the
processor vendor for cases whose untag policies may differ in the future.

For VMX, the LAM version is implemented.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 +
 arch/x86/kvm/kvm_emulate.h         |  1 +
 arch/x86/kvm/vmx/vmx.c             | 73 ++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h             |  2 +
 5 files changed, 79 insertions(+)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 13bc212cd4bc..c0cebe671d41 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -52,6 +52,7 @@ KVM_X86_OP(cache_reg)
 KVM_X86_OP(get_rflags)
 KVM_X86_OP(set_rflags)
 KVM_X86_OP(get_if_flag)
+KVM_X86_OP_OPTIONAL(untag_addr)
 KVM_X86_OP(flush_tlb_all)
 KVM_X86_OP(flush_tlb_current)
 KVM_X86_OP_OPTIONAL(flush_remote_tlbs)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 46471dd9cc1b..62a72560fa65 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1588,6 +1588,8 @@ struct kvm_x86_ops {
 	void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
 	bool (*get_if_flag)(struct kvm_vcpu *vcpu);
 
+	void (*untag_addr)(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags);
+
 	void (*flush_tlb_all)(struct kvm_vcpu *vcpu);
 	void (*flush_tlb_current)(struct kvm_vcpu *vcpu);
 	int  (*flush_remote_tlbs)(struct kvm *kvm);
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 5b9ec610b2cb..c2091e24a6b9 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -91,6 +91,7 @@ struct x86_instruction_info {
 /* x86-specific emulation flags */
 #define X86EMUL_F_FETCH			BIT(0)
 #define X86EMUL_F_WRITE			BIT(1)
+#define X86EMUL_F_SKIPLAM		BIT(2)
 
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 52dcf3c00bb8..82a225d1000e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8133,6 +8133,77 @@ static void vmx_vm_destroy(struct kvm *kvm)
 	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
 }
 
+#define LAM_S57_EN_MASK (X86_CR4_LAM_SUP | X86_CR4_LA57)
+static int lam_sign_extend_bit(struct kvm_vcpu *vcpu, gva_t addr)
+{
+	u64 cr3, cr4;
+
+	/*
+	 * The LAM identification of a pointer as user or supervisor is
+	 * based solely on the value of pointer bit 63.
+	 */
+	if (!(addr >> 63)) {
+		cr3 = kvm_read_cr3(vcpu);
+		if (cr3 & X86_CR3_LAM_U57)
+			return 56;
+		if (cr3 & X86_CR3_LAM_U48)
+			return 47;
+	} else {
+		cr4 = kvm_read_cr4_bits(vcpu, LAM_S57_EN_MASK);
+		if (cr4 == LAM_S57_EN_MASK)
+			return 56;
+		if (cr4 & X86_CR4_LAM_SUP)
+			return 47;
+	}
+	return -1;
+}
+
+/*
+ * Only called in 64-bit mode.
+ *
+ * LAM has a modified canonical check when applicable:
+ * LAM_S48                : [ 1 ][ metadata ][ 1 ]
+ *                            63               47
+ * LAM_U48                : [ 0 ][ metadata ][ 0 ]
+ *                            63               47
+ * LAM_S57                : [ 1 ][ metadata ][ 1 ]
+ *                            63               56
+ * LAM_U57 + 5-lvl paging : [ 0 ][ metadata ][ 0 ]
+ *                            63               56
+ * LAM_U57 + 4-lvl paging : [ 0 ][ metadata ][ 0...0 ]
+ *                            63               56..47
+ *
+ * Untag the metadata bits by sign-extending the value of bit 47 (LAM48) or
+ * bit 56 (LAM57). The resulting address after untag isn't guaranteed to be
+ * canonical. Callers should perform the original canonical check and raise
+ * #GP/#SS if the address is non-canonical.
+ *
+ * Note that KVM masks the metadata in addresses, performs the (original)
+ * canonicality checking and then walks page table. This is slightly
+ * different from hardware behavior but achieves the same effect.
+ * Specifically, if LAM is enabled, the processor performs a modified
+ * canonicality checking where the metadata are ignored instead of
+ * masked. After the modified canonicality checking, the processor masks
+ * the metadata before passing addresses for paging translation.
+ */
+void vmx_untag_addr(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags)
+{
+	int sign_ext_bit;
+
+	/*
+	 * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
+	 * If not set, vCPU doesn't supports LAM.
+	 */
+	if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||
+	    (flags & X86EMUL_F_SKIPLAM) || WARN_ON_ONCE(!is_64_bit_mode(vcpu)))
+		return;
+
+	sign_ext_bit = lam_sign_extend_bit(vcpu, *gva);
+	if (sign_ext_bit > 0)
+		*gva = (sign_extend64(*gva, sign_ext_bit) & ~BIT_ULL(63)) |
+		       (*gva & BIT_ULL(63));
+}
+
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -8181,6 +8252,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.set_rflags = vmx_set_rflags,
 	.get_if_flag = vmx_get_if_flag,
 
+	.untag_addr = vmx_untag_addr,
+
 	.flush_tlb_all = vmx_flush_tlb_all,
 	.flush_tlb_current = vmx_flush_tlb_current,
 	.flush_tlb_gva = vmx_flush_tlb_gva,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9e66531861cf..c4bbd3024fa8 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
 u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
 u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
 
+void vmx_untag_addr(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags);
+
 static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
 					     int type, bool value)
 {
-- 
2.25.1


* [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable
  2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
                   ` (3 preceding siblings ...)
  2023-06-06  9:18 ` [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops Binbin Wu
@ 2023-06-06  9:18 ` Binbin Wu
  2023-06-28  0:19   ` Sean Christopherson
  2023-06-06  9:18 ` [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM Binbin Wu
  5 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

Untag addresses for 64-bit memory/MMIO operands in instruction emulation
and VM-exit handlers when LAM is applicable.

For instruction emulation, untag the address in __linearize() before the
canonical check. LAM doesn't apply to addresses used for instruction
fetches or to those that specify the targets of jump and call instructions,
so use X86EMUL_F_SKIPLAM to skip LAM untagging.

For VM-exit handlers related to 64-bit linear addresses:
- Cases that need address untagging:
  Operand(s) of VMX instructions and INVPCID.
  Operand(s) of SGX ENCLS.
- Cases LAM doesn't apply to:
  Operand of INVLPG.
  Linear address in INVPCID descriptor (no change needed).
  Linear address in INVVPID descriptor (it has been confirmed that LAM doesn't
  apply, although it is not called out in the LAM spec; no change needed).
  BASEADDR specified in the SECS of ECREATE (no change needed).

Note:
LAM doesn't apply to addresses written to control registers or MSRs.
LAM masking applies before paging, so the faulting linear address in CR2
doesn't contain the metadata.
The guest linear address saved in the VMCS doesn't contain metadata.

Co-developed-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/emulate.c     | 16 +++++++++++++---
 arch/x86/kvm/kvm_emulate.h |  2 ++
 arch/x86/kvm/vmx/nested.c  |  2 ++
 arch/x86/kvm/vmx/sgx.c     |  1 +
 arch/x86/kvm/x86.c         |  7 +++++++
 5 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index e89afc39e56f..c135adb26f1e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -701,6 +701,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 	*max_size = 0;
 	switch (mode) {
 	case X86EMUL_MODE_PROT64:
+		ctxt->ops->untag_addr(ctxt, &la, flags);
 		*linear = la;
 		va_bits = ctxt_virt_addr_bits(ctxt);
 		if (!__is_canonical_address(la, va_bits))
@@ -771,8 +772,12 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
+	/*
+	 * LAM doesn't apply to addresses that specify the targets of jump and
+	 * call instructions.
+	 */
 	rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
-			 X86EMUL_F_FETCH);
+			 X86EMUL_F_FETCH | X86EMUL_F_SKIPLAM);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
@@ -907,9 +912,11 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	 * __linearize is called with size 0 so that it does not do any
 	 * boundary check itself.  Instead, we use max_size to check
 	 * against op_size.
+	 *
+	 * LAM doesn't apply to addresses used for instruction fetches.
 	 */
 	rc = __linearize(ctxt, addr, &max_size, 0, ctxt->mode, &linear,
-			 X86EMUL_F_FETCH);
+			 X86EMUL_F_FETCH | X86EMUL_F_SKIPLAM);
 	if (unlikely(rc != X86EMUL_CONTINUE))
 		return rc;
 
@@ -3442,8 +3449,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
 {
 	int rc;
 	ulong linear;
+	unsigned max_size;
 
-	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+	/* LAM doesn't apply to invlpg */
+	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
+		&linear, X86EMUL_F_SKIPLAM);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->ops->invlpg(ctxt, linear);
 	/* Disable writeback. */
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c2091e24a6b9..3875ab175cd2 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -230,6 +230,8 @@ struct x86_emulate_ops {
 	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+
+	void (*untag_addr)(struct x86_emulate_ctxt *ctxt, gva_t *addr, u32 flags);
 };
 
 /* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 11b12a75ca91..6c8dab9999f2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4981,6 +4981,7 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 		else
 			*ret = off;
 
+		vmx_untag_addr(vcpu, ret, 0);
 		/* Long mode: #GP(0)/#SS(0) if the memory address is in a
 		 * non-canonical form. This is the only check on the memory
 		 * destination for long mode!
@@ -5798,6 +5799,7 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
 	vpid02 = nested_get_vpid02(vcpu);
 	switch (type) {
 	case VMX_VPID_EXTENT_INDIVIDUAL_ADDR:
+		/* LAM doesn't apply to the address in descriptor of invvpid */
 		if (!operand.vpid ||
 		    is_noncanonical_address(operand.gla, vcpu))
 			return nested_vmx_fail(vcpu,
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 2261b684a7d4..b4faa94bace7 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -37,6 +37,7 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
 	if (!IS_ALIGNED(*gva, alignment)) {
 		fault = true;
 	} else if (likely(is_64_bit_mode(vcpu))) {
+		vmx_untag_addr(vcpu, gva, 0);
 		fault = is_noncanonical_address(*gva, vcpu);
 	} else {
 		*gva &= 0xffffffff;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 709fc920f378..ed2dca55573b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8296,6 +8296,11 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
 		kvm_vm_bugged(kvm);
 }
 
+static void emulator_untag_addr(struct x86_emulate_ctxt *ctxt, gva_t *addr, u32 flags)
+{
+	static_call(kvm_x86_untag_addr)(emul_to_vcpu(ctxt), addr, flags);
+}
+
 static const struct x86_emulate_ops emulate_ops = {
 	.vm_bugged           = emulator_vm_bugged,
 	.read_gpr            = emulator_read_gpr,
@@ -8341,6 +8346,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.leave_smm           = emulator_leave_smm,
 	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
+	.untag_addr          = emulator_untag_addr,
 };
 
 static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
@@ -13367,6 +13373,7 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
 
 	switch (type) {
 	case INVPCID_TYPE_INDIV_ADDR:
+		/* LAM doesn't apply to the address in descriptor of invpcid */
 		if ((!pcid_enabled && (operand.pcid != 0)) ||
 		    is_noncanonical_address(operand.gla, vcpu)) {
 			kvm_inject_gp(vcpu, 0);
-- 
2.25.1


* [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM
  2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
                   ` (4 preceding siblings ...)
  2023-06-06  9:18 ` [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable Binbin Wu
@ 2023-06-06  9:18 ` Binbin Wu
  2023-06-07  3:52   ` Huang, Kai
  5 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-06  9:18 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: seanjc, pbonzini, chao.gao, kai.huang, David.Laight, robert.hu,
	binbin.wu

From: Robert Hoo <robert.hu@linux.intel.com>

LAM feature is enumerated by CPUID.7.1:EAX.LAM[bit 26].
Expose the feature to userspace as the final step after the following
supports:
- CR4.LAM_SUP virtualization
- CR3.LAM_U48 and CR3.LAM_U57 virtualization
- Check and untag 64-bit linear address when LAM applies in instruction
  emulations and VMExit handlers.

LAM support in SGX enclave mode needs additional enabling and is not
included in this patch series.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 241f554f1764..166243fb5705 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -643,7 +643,7 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
 		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
 		F(FZRM) | F(FSRS) | F(FSRC) |
-		F(AMX_FP16) | F(AVX_IFMA)
+		F(AMX_FP16) | F(AVX_IFMA) | F(LAM)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-- 
2.25.1


* Re: [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP
  2023-06-06  9:18 ` [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP Binbin Wu
@ 2023-06-07  3:40   ` Huang, Kai
  2023-06-07  4:55     ` Binbin Wu
  0 siblings, 1 reply; 26+ messages in thread
From: Huang, Kai @ 2023-06-07  3:40 UTC (permalink / raw)
  To: kvm, linux-kernel, binbin.wu
  Cc: robert.hu, pbonzini, Christopherson,, Sean, Gao, Chao, David.Laight

On Tue, 2023-06-06 at 17:18 +0800, Binbin Wu wrote:
> Move CR4.LAM_SUP out of CR4_RESERVED_BITS and its reservation depends on vcpu
> supporting LAM feature or not. Leave the bit intercepted to avoid vmread every
> time when KVM fetches its value, with the expectation that guest won't toggle
> the bit frequently.

KVM only needs to do vmread once to cache guest's CR4, and presumable vmread is
a lot cheaper than a VMEXIT.  So I don't see the value of intercepting it if
there's no need to do.

But presumably I think we cannot allow guest to own this bit because KVM wants
to return a valid CR4 if LAM isn't exposed to guest?  Otherwise guest can still
set this bit even LAM isn't exposed to guest.

Am I missing something?

If not, your justification of intercepting this bit isn't correct and needs
update.


* Re: [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM
  2023-06-06  9:18 ` [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM Binbin Wu
@ 2023-06-07  3:52   ` Huang, Kai
  2023-06-16  1:45     ` Binbin Wu
  0 siblings, 1 reply; 26+ messages in thread
From: Huang, Kai @ 2023-06-07  3:52 UTC (permalink / raw)
  To: kvm, linux-kernel, binbin.wu
  Cc: robert.hu, pbonzini, Christopherson,, Sean, Gao, Chao, David.Laight

On Tue, 2023-06-06 at 17:18 +0800, Binbin Wu wrote:
> From: Robert Hoo <robert.hu@linux.intel.com>
> 
> LAM feature is enumerated by CPUID.7.1:EAX.LAM[bit 26].
> Expose the feature to userspace as the final step after the following
> supports:
> - CR4.LAM_SUP virtualization
> - CR3.LAM_U48 and CR3.LAM_U57 virtualization
> - Check and untag 64-bit linear address when LAM applies in instruction
>   emulations and VMExit handlers.
> 
> LAM support in SGX enclave mode needs additional enabling and is not
> included in this patch series.

"needs additional enabling" may not be accurate.  Just say:

Exposing SGX LAM support is not supported yet.  SGX LAM support is enumerated in
SGX's own CPUID and there's no hard requirement that it must be supported when
LAM is reported in CPUID leaf 0x7.

?

Or have you found the answer to above question that I asked in previous series.

Anyway:

Reviewed-by: Kai Huang <kai.huang@intel.com>

> 
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
> Reviewed-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 241f554f1764..166243fb5705 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -643,7 +643,7 @@ void kvm_set_cpu_caps(void)
>  	kvm_cpu_cap_mask(CPUID_7_1_EAX,
>  		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
>  		F(FZRM) | F(FSRS) | F(FSRC) |
> -		F(AMX_FP16) | F(AVX_IFMA)
> +		F(AMX_FP16) | F(AVX_IFMA) | F(LAM)
>  	);
>  
>  	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,


* Re: [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP
  2023-06-07  3:40   ` Huang, Kai
@ 2023-06-07  4:55     ` Binbin Wu
  2023-06-07  9:20       ` Huang, Kai
  0 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-07  4:55 UTC (permalink / raw)
  To: Huang, Kai
  Cc: kvm, linux-kernel, robert.hu, pbonzini, Christopherson,,
	Sean, Gao, Chao, David.Laight



On 6/7/2023 11:40 AM, Huang, Kai wrote:
> On Tue, 2023-06-06 at 17:18 +0800, Binbin Wu wrote:
>> Move CR4.LAM_SUP out of CR4_RESERVED_BITS and its reservation depends on vcpu
>> supporting LAM feature or not. Leave the bit intercepted to avoid vmread every
>> time when KVM fetches its value, with the expectation that guest won't toggle
>> the bit frequently.
> KVM only needs to do vmread once to cache guest's CR4, and presumable vmread is
> a lot cheaper than a VMEXIT.  So I don't see the value of intercepting it if
> there's no need to do.
Here is the discussion about the general rule of interception of CR4 bit.
Sean mentioned:  "As a base
rule, KVM intercepts CR4 bits unless there's a reason not to, e.g. if 
the CR4 bit
in question is written frequently by real guests and/or never consumed 
by KVM."
https://lore.kernel.org/all/Y7xA53sLxCwzfvgD@google.com/

And the CR4.LAM_SUP value will be used to determine the LAM mode when applying
LAM masking in instruction emulation / VMExit handlers,
and if the bit is passed through, it will be a vmread in these paths.

>
> But presumably I think we cannot allow guest to own this bit because KVM wants
> to return a valid CR4 if LAM isn't exposed to guest?  Otherwise guest can still
> set this bit even LAM isn't exposed to guest.
>
> Am I missing something?
Right, this is also a reason why the CR4.LAM_SUP bit should be intercepted.
Will update the justification.
I suppose this reason is enough for justification, will remove the 
performance part in changelog.

Thanks.
>
> If not, your justification of intercepting this bit isn't correct and needs
> update.
>


* Re: [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP
  2023-06-07  4:55     ` Binbin Wu
@ 2023-06-07  9:20       ` Huang, Kai
  0 siblings, 0 replies; 26+ messages in thread
From: Huang, Kai @ 2023-06-07  9:20 UTC (permalink / raw)
  To: binbin.wu
  Cc: kvm, robert.hu, pbonzini, linux-kernel, Christopherson,,
	Sean, Gao, Chao, David.Laight

On Wed, 2023-06-07 at 12:55 +0800, Binbin Wu wrote:
> 
> On 6/7/2023 11:40 AM, Huang, Kai wrote:
> > On Tue, 2023-06-06 at 17:18 +0800, Binbin Wu wrote:
> > > Move CR4.LAM_SUP out of CR4_RESERVED_BITS and its reservation depends on vcpu
> > > supporting LAM feature or not. Leave the bit intercepted to avoid vmread every
> > > time when KVM fetches its value, with the expectation that guest won't toggle
> > > the bit frequently.
> > KVM only needs to do vmread once to cache guest's CR4, and presumable vmread is
> > a lot cheaper than a VMEXIT.  So I don't see the value of intercepting it if
> > there's no need to do.
> Here is the discussion about the general rule of interception of CR4 bit.
> Sean mentioned:  "As a base
> rule, KVM intercepts CR4 bits unless there's a reason not to, e.g. if 
> the CR4 bit
> in question is written frequently by real guests and/or never consumed 
> by KVM."
> https://lore.kernel.org/all/Y7xA53sLxCwzfvgD@google.com/
> 
> And the CR4.LAM_SUP value will be used to determine the LAM mode when applying
> LAM masking in instruction emulation / VMExit handlers,
> and if the bit is passed through, it will be a vmread in these paths.

Yeah agreed.

> 
> > 
> > But presumably I think we cannot allow guest to own this bit because KVM wants
> > to return a valid CR4 if LAM isn't exposed to guest?  Otherwise guest can still
> > set this bit even LAM isn't exposed to guest.
> > 
> > Am I missing something?
> Right, this is also a reason why the CR4.LAM_SUP bit should be intercepted.
> Will update the justification.
> I suppose this reason is enough for justification, will remove the 
> performance part in changelog.

Anyway,

Reviewed-by: Kai Huang <kai.huang@intel.com>



* Re: [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM
  2023-06-07  3:52   ` Huang, Kai
@ 2023-06-16  1:45     ` Binbin Wu
  0 siblings, 0 replies; 26+ messages in thread
From: Binbin Wu @ 2023-06-16  1:45 UTC (permalink / raw)
  To: Huang, Kai
  Cc: kvm, linux-kernel, robert.hu, pbonzini, Christopherson,,
	Sean, Gao, Chao, David.Laight



On 6/7/2023 11:52 AM, Huang, Kai wrote:
> On Tue, 2023-06-06 at 17:18 +0800, Binbin Wu wrote:
>> From: Robert Hoo<robert.hu@linux.intel.com>
>>
>> LAM feature is enumerated by CPUID.7.1:EAX.LAM[bit 26].
>> Expose the feature to userspace as the final step after the following
>> supports:
>> - CR4.LAM_SUP virtualization
>> - CR3.LAM_U48 and CR3.LAM_U57 virtualization
>> - Check and untag 64-bit linear address when LAM applies in instruction
>>    emulations and VMExit handlers.
>>
>> LAM support in SGX enclave mode needs additional enabling and is not
>> included in this patch series.
> "needs additional enabling" may not be accurate.  Just say:
>
> Exposing SGX LAM support is not supported yet.  SGX LAM support is enumerated in
> SGX's own CPUID and there's no hard requirement that it must be supported when
> LAM is reported in CPUID leaf 0x7.
> ?
>
> Or have you found the answer to above question that I asked in previous series.
Sorry for the late response.
It has just been confirmed that there is _NO_ hard requirement that SGX LAM
must be supported when LAM is reported in CPUID leaf 0x7.
The changelog you suggested above is right and LGTM. Thanks.

And another question "Is it possible that 
CPUID.(EAX=12H,ECX=1):EAX[9:8]=0x3, while CPUID.7.1:EAX.LAM[bit 26] is 0?"
No certain answer yet. Will figure it out before adding SGX LAM support.

> Anyway:
>
> Reviewed-by: Kai Huang<kai.huang@intel.com>
>
>> Signed-off-by: Robert Hoo<robert.hu@linux.intel.com>
>> Signed-off-by: Binbin Wu<binbin.wu@linux.intel.com>
>> Reviewed-by: Jingqi Liu<jingqi.liu@intel.com>
>> Reviewed-by: Chao Gao<chao.gao@intel.com>
>> Tested-by: Xuelian Guo<xuelian.guo@intel.com>
>> ---
>>   arch/x86/kvm/cpuid.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 241f554f1764..166243fb5705 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -643,7 +643,7 @@ void kvm_set_cpu_caps(void)
>>   	kvm_cpu_cap_mask(CPUID_7_1_EAX,
>>   		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
>>   		F(FZRM) | F(FSRS) | F(FSRC) |
>> -		F(AMX_FP16) | F(AVX_IFMA)
>> +		F(AMX_FP16) | F(AVX_IFMA) | F(LAM)
>>   	);
>>   
>>   	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
  2023-06-06  9:18 ` [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57} Binbin Wu
@ 2023-06-27 23:40   ` Sean Christopherson
  2023-06-28  3:05     ` Binbin Wu
  0 siblings, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-06-27 23:40 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu

On Tue, Jun 06, 2023, Binbin Wu wrote:
> Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.

These are not the type of changes to do opportunistically.  Opportunistic changes
are things like fixing comment typos, dropping unnecessary semicolons, fixing
coding style violations, etc.

> Opportunistically use kvm_vcpu_is_legal_cr3() to check CR3 in SVM nested code,
> to provide a clear distinction b/t CR3 and GPA checks.

This *shouldn't* be an opportunistic thing.  That you felt compelled to call it
out is a symptom of this patch doing too much.

In short, split this into three patches:

  1. Do the __PT_BASE_ADDR_MASK() changes
  2. Add and use kvm_vcpu_is_legal_cr3()
  3. Add support for CR3.LAM bits

> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Chao Gao <chao.gao@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 5 +++++
>  arch/x86/kvm/cpuid.h            | 5 +++++
>  arch/x86/kvm/mmu.h              | 5 +++++
>  arch/x86/kvm/mmu/mmu.c          | 8 +++++++-
>  arch/x86/kvm/mmu/mmu_internal.h | 1 +
>  arch/x86/kvm/mmu/paging_tmpl.h  | 3 ++-
>  arch/x86/kvm/mmu/spte.h         | 2 +-
>  arch/x86/kvm/svm/nested.c       | 4 ++--
>  arch/x86/kvm/vmx/nested.c       | 4 ++--
>  arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>  arch/x86/kvm/x86.c              | 4 ++--
>  11 files changed, 39 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c6f03d151c31..46471dd9cc1b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -727,6 +727,11 @@ struct kvm_vcpu_arch {
>  	unsigned long cr0_guest_owned_bits;
>  	unsigned long cr2;
>  	unsigned long cr3;
> +	/*
> +	 * CR3 non-address feature control bits.
> +	 * Guest CR3 may contain any of those bits at runtime.
> +	 */
> +	u64 cr3_ctrl_bits;

This should be an "unsigned long".

Hmm, "ctrl_bits" is unnecessarily generic at this point.  It's also arguably wrong,
because X86_CR3_PCID_NOFLUSH is also a control bit, it's just allowed in CR3 itself.

I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
feature" framework.

More below.

>  	unsigned long cr4;
>  	unsigned long cr4_guest_owned_bits;
>  	unsigned long cr4_guest_rsvd_bits;
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index b1658c0de847..ef8e1b912d7d 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
>  	return vcpu->arch.maxphyaddr;
>  }
>  
> +static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)

Heh, I think it makes sense to wrap this one.  I'll probably tell you differently
tomorrow, but today, let's wrap.

> +{
> +	return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);

Don't open code something for which there is a perfect helper, i.e. use
kvm_vcpu_is_legal_gpa().

If we go the governed feature route, this becomes:

static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu,
					 unsigned long cr3)
{
	if (guest_can_use(vcpu, X86_FEATURE_LAM))
		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);

	return kvm_vcpu_is_legal_gpa(cr3);
}

> +}
> +
>  static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
>  {
>  	return !(gpa & vcpu->arch.reserved_gpa_bits);
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 92d5a1924fc1..81d8a433dae1 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
>  	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
>  }
>  
> +static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)

And then this becomes:

static inline u64 kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
{
	if (!guest_can_use(vcpu, X86_FEATURE_LAM))
		return 0;

	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
}

> +{
> +	return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
> +}
> +
>  static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
>  {
>  	u64 root_hpa = vcpu->arch.mmu->root.hpa;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c8961f45e3b1..deea9a9f0c75 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	hpa_t root;
>  
>  	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
> -	root_gfn = root_pgd >> PAGE_SHIFT;
> +	/*
> +	 * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
> +	 * additional control bits (e.g. LAM control bits). To be generic,
> +	 * unconditionally strip non-address bits when computing the GFN since
> +	 * the guest PGD has already been checked for validity.
> +	 */

Drop this comment, the code is self-explanatory, and the comment is incomplete,
e.g. it can also be nCR3.

> +	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
>  
>  	if (mmu_check_root(vcpu, root_gfn))
>  		return 1;
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index d39af5639ce9..7d2105432d66 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -21,6 +21,7 @@ extern bool dbg;
>  #endif
>  
>  /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
> +#define __PT_BASE_ADDR_MASK GENMASK_ULL(51, 12)
>  #define __PT_LEVEL_SHIFT(level, bits_per_level)	\
>  	(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
>  #define __PT_INDEX(address, level, bits_per_level) \
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 0662e0278e70..394733ac9088 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -62,7 +62,7 @@
>  #endif
>  
>  /* Common logic, but per-type values.  These also need to be undefined. */
> -#define PT_BASE_ADDR_MASK	((pt_element_t)(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)))
> +#define PT_BASE_ADDR_MASK	((pt_element_t)__PT_BASE_ADDR_MASK)
>  #define PT_LVL_ADDR_MASK(lvl)	__PT_LVL_ADDR_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>  #define PT_LVL_OFFSET_MASK(lvl)	__PT_LVL_OFFSET_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>  #define PT_INDEX(addr, lvl)	__PT_INDEX(addr, lvl, PT_LEVEL_BITS)
> @@ -324,6 +324,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>  	trace_kvm_mmu_pagetable_walk(addr, access);
>  retry_walk:
>  	walker->level = mmu->cpu_role.base.level;
> +	/* gpte_to_gfn() will strip non-address bits. */

Drop this comment too, it's not relevant to the immediate code, i.e. it'd be
better suited about this code:

	table_gfn = gpte_to_gfn(pte);

but IMO that code is quite self-explanatory too.

> @@ -7740,6 +7741,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  		vmx->msr_ia32_feature_control_valid_bits &=
>  			~FEAT_CTL_SGX_LC_ENABLED;
>  
> +	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))

This is wrong, KVM needs to check that the host supports LAM too, otherwise KVM
will allow userspace to shove garbage into guest CR3 and induce VM-Entry failures
and whatnot.  If we go the guest_can_use() route, this problem solves itself.
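
A sketch of what that hunk could become on top of the governed-feature series;
kvm_governed_feature_check_and_set() is the helper name from that RFC, so
treat it as an assumption:

	/*
	 * In vmx_vcpu_after_set_cpuid(): mark LAM usable only if both the host
	 * and the guest CPUID report it, so userspace can't stuff LAM bits into
	 * CR3 on hardware that would reject them at VM-Entry.
	 */
	if (boot_cpu_has(X86_FEATURE_LAM))
		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LAM);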

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-06  9:18 ` [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops Binbin Wu
@ 2023-06-28  0:15   ` Sean Christopherson
  2023-06-29  6:12     ` Binbin Wu
  0 siblings, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-06-28  0:15 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu

On Tue, Jun 06, 2023, Binbin Wu wrote:
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 5b9ec610b2cb..c2091e24a6b9 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>  /* x86-specific emulation flags */
>  #define X86EMUL_F_FETCH			BIT(0)
>  #define X86EMUL_F_WRITE			BIT(1)
> +#define X86EMUL_F_SKIPLAM		BIT(2)

See my comments in the LASS series about describing the access, not dictating
the end behavior.

>  
>  struct x86_emulate_ops {
>  	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 52dcf3c00bb8..82a225d1000e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8133,6 +8133,77 @@ static void vmx_vm_destroy(struct kvm *kvm)
>  	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>  }
>  
> +#define LAM_S57_EN_MASK (X86_CR4_LAM_SUP | X86_CR4_LA57)
> +static int lam_sign_extend_bit(struct kvm_vcpu *vcpu, gva_t addr)
> +{
> +	u64 cr3, cr4;
> +
> +	/*
> +	 * The LAM identification of a pointer as user or supervisor is
> +	 * based solely on the value of pointer bit 63.
> +	 */
> +	if (!(addr >> 63)) {

BIT_ULL(63)

> +		cr3 = kvm_read_cr3(vcpu);

Use the perfectly good helper added earlier in the series:

		cr3_lam = kvm_get_active_lam_bits();

That has the added bonus of avoiding a VMREAD of CR3 when LAM is disabled in CR4.

> +		if (cr3 & X86_CR3_LAM_U57)
> +			return 56;
> +		if (cr3 & X86_CR3_LAM_U48)
> +			return 47;
> +	} else {
> +		cr4 = kvm_read_cr4_bits(vcpu, LAM_S57_EN_MASK);
> +		if (cr4 == LAM_S57_EN_MASK)
> +			return 56;
> +		if (cr4 & X86_CR4_LAM_SUP)
> +			return 47;

This is way too complicated for a simple thing.  Burying multiple bits in a #define
and then relying on specific bits being in the mask is unnecessarily subtle.

And this whole helper shouldn't exist.  There's one caller, and will only ever
be one caller.  Defining magic numbers, i.e. using -1 to signal "disabled", makes
it that much harder to read the code.

More below.

> +	}
> +	return -1;
> +}
> +
> +/*
> + * Only called in 64-bit mode.
> + *
> + * LAM has a modified canonical check when applicable:
> + * LAM_S48                : [ 1 ][ metadata ][ 1 ]
> + *                            63               47
> + * LAM_U48                : [ 0 ][ metadata ][ 0 ]
> + *                            63               47
> + * LAM_S57                : [ 1 ][ metadata ][ 1 ]
> + *                            63               56
> + * LAM_U57 + 5-lvl paging : [ 0 ][ metadata ][ 0 ]
> + *                            63               56
> + * LAM_U57 + 4-lvl paging : [ 0 ][ metadata ][ 0...0 ]
> + *                            63               56..47
> + *
> + * Untag the metadata bits by sign-extending the value of bit 47 (LAM48) or
> + * bit 56 (LAM57). The resulting address after untag isn't guaranteed to be
> + * canonical. Callers should perform the original canonical check and raise
> + * #GP/#SS if the address is non-canonical.
> + *
> + * Note that KVM masks the metadata in addresses, performs the (original)
> + * canonicality checking and then walks page table. This is slightly
> + * different from hardware behavior but achieves the same effect.
> + * Specifically, if LAM is enabled, the processor performs a modified
> + * canonicality checking where the metadata are ignored instead of
> + * masked. After the modified canonicality checking, the processor masks
> + * the metadata before passing addresses for paging translation.
> + */
> +void vmx_untag_addr(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags)

Rather than modify the pointer, return the untagged address.  That's more flexible
as it allows using the result in if-statements and whatnot.  That might not ever
come into play, but there's no good reason to use an in/out param in a void
function.

> +{
> +	int sign_ext_bit;
> +
> +	/*
> +	 * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
> +	 * If not set, vCPU doesn't supports LAM.
> +	 */
> +	if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||

This is unnecessary, KVM should never allow the LAM bits in CR3 to be set if LAM
isn't supported.

> +	    (flags & X86EMUL_F_SKIPLAM) || WARN_ON_ONCE(!is_64_bit_mode(vcpu)))

Same comments as the LASS series, don't WARN, just put the check here.

> +		return;
> +
> +	sign_ext_bit = lam_sign_extend_bit(vcpu, *gva);
> +	if (sign_ext_bit > 0)
> +		*gva = (sign_extend64(*gva, sign_ext_bit) & ~BIT_ULL(63)) |
> +		       (*gva & BIT_ULL(63));


Something like this?  The early return in the user path is a bit forced, e.g. it
could also be:

	if (cr3 & X86_CR3_LAM_U57)
		lam_bit = 56;
	else if (cr3 & X86_CR3_LAM_U48)
		lam_bit = 47;
	else
		return gva;

but IMO making the CR3 and CR4 paths somewhat symmetrical is valuable.

gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva,
			    unsigned int flags)
{
	unsigned long cr3_bits;
	int lam_bit;

	if (flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH_INVLPG | X86EMUL_F_IMPLICIT))
		return gva;

	if (!is_64_bit_mode(vcpu))
		return gva;

	/*
	 * Bit 63 determines if the address should be treated as user address
	 * or a supervisor address.
	 */
	if (!(gva & BIT_ULL(63))) {
		cr3_bits = kvm_get_active_lam_bits(vcpu);
		if (!(cr3_bits & (X86_CR3_LAM_U57 | X86_CR3_LAM_U48)))
			return gva;

		/* LAM_U48 is ignored if LAM_U57 is set. */
		lam_bit = cr3_bits & X86_CR3_LAM_U57 ? 56 : 47;
	} else {
		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LAM_SUP))
			return gva;

		lam_bit = kvm_is_cr4_bit_set(vcpu, X86_CR4_LA57) ? 56 : 47;
	}
	return (sign_extend64(gva, lam_bit) & ~BIT_ULL(63)) | (gva & BIT_ULL(63));
}


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable
  2023-06-06  9:18 ` [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable Binbin Wu
@ 2023-06-28  0:19   ` Sean Christopherson
  0 siblings, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-06-28  0:19 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu

On Tue, Jun 06, 2023, Binbin Wu wrote:
> Untag address for 64-bit memory/MMIO operand in instruction emulations
> and VMExit handlers when LAM is applicable.
> 
> For instruction emulation, untag address in __linearize() before
> canonical check. LAM doesn't apply to addresses used for instruction
> fetches or to those that specify the targets of jump and call instructions,
> use X86EMUL_F_SKIPLAM to skip LAM untag.
> 
> For VMExit handlers related to 64-bit linear address:
> - Cases need to untag address
>   Operand(s) of VMX instructions and INVPCID.
>   Operand(s) of SGX ENCLS.
> - Cases LAM doesn't apply to
>   Operand of INVLPG.
>   Linear address in INVPCID descriptor (no change needed).
>   Linear address in INVVPID descriptor (it has been confirmed, although it is
>   not called out in LAM spec, no change needed).
>   BASEADDR specified in SESC of ECREATE (no change needed).
> 
> Note:
> LAM doesn't apply to the writes to control registers or MSRs.
> LAM masking applies before paging, so the faulting linear address in CR2
> doesn't contain the metadata.
> The guest linear address saved in VMCS doesn't contain metadata.
> 
> Co-developed-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Reviewed-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>  arch/x86/kvm/emulate.c     | 16 +++++++++++++---
>  arch/x86/kvm/kvm_emulate.h |  2 ++
>  arch/x86/kvm/vmx/nested.c  |  2 ++
>  arch/x86/kvm/vmx/sgx.c     |  1 +
>  arch/x86/kvm/x86.c         |  7 +++++++
>  5 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index e89afc39e56f..c135adb26f1e 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -701,6 +701,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  	*max_size = 0;
>  	switch (mode) {
>  	case X86EMUL_MODE_PROT64:
> +		ctxt->ops->untag_addr(ctxt, &la, flags);
>  		*linear = la;

Ha!  Returning the untagged address does help:

		*linear = ctxt->ops->get_untagged_address(ctxt, la, flags);

>  		va_bits = ctxt_virt_addr_bits(ctxt);
>  		if (!__is_canonical_address(la, va_bits))
> @@ -771,8 +772,12 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>  
>  	if (ctxt->op_bytes != sizeof(unsigned long))
>  		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
> +	/*
> +	 * LAM doesn't apply to addresses that specify the targets of jump and
> +	 * call instructions.
> +	 */
>  	rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
> -			 X86EMUL_F_FETCH);
> +			 X86EMUL_F_FETCH | X86EMUL_F_SKIPLAM);

No need for anything LAM specific here, just skip all FETCH access (unlike LASS
which skips checks only for branch targets).

> -	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> +	/* LAM doesn't apply to invlpg */

Comment unneeded if X86EMUL_F_INVLPG is added.
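
A sketch of em_invlpg() with such a flag, assuming the __linearize() signature
introduced by patch 1/6 of this series and a new X86EMUL_F_INVLPG flag (both
are assumptions about how the next revision shapes up):

static int em_invlpg(struct x86_emulate_ctxt *ctxt)
{
	unsigned int max_size;
	ulong linear;
	int rc;

	/* The flag lets the untag hook skip LAM without a LAM-specific comment. */
	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
			 &linear, X86EMUL_F_INVLPG);
	if (rc == X86EMUL_CONTINUE)
		ctxt->ops->invlpg(ctxt, linear);
	/* Disable writeback. */
	ctxt->dst.type = OP_NONE;
	return X86EMUL_CONTINUE;
}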

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
  2023-06-27 23:40   ` Sean Christopherson
@ 2023-06-28  3:05     ` Binbin Wu
  2023-06-28 17:40       ` Sean Christopherson
  0 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-28  3:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu



On 6/28/2023 7:40 AM, Sean Christopherson wrote:
> On Tue, Jun 06, 2023, Binbin Wu wrote:
>> Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.
> This are not the type of changes to do opportunstically.   Opportunstic changes
> are things like fixing comment typos, dropping unnecessary semi-colons, fixing
> coding styles violations, etc.

OK, thanks for the education.
>
>> Opportunistically use kvm_vcpu_is_legal_cr3() to check CR3 in SVM nested code,
>> to provide a clear distinction b/t CR3 and GPA checks.
> This *shouldn't* be an opportunsitic thing.  That you felt compelled to call it
> out is a symptom of this patch doing too much.
>
> In short, split this into three patches:
>
>    1. Do the __PT_BASE_ADDR_MASK() changes
>    2. Add and use kvm_vcpu_is_legal_cr3()
>    3. Add support for CR3.LAM bits
Will do that, thanks.

>
>> Suggested-by: Sean Christopherson <seanjc@google.com>
>> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>> Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>> Reviewed-by: Kai Huang <kai.huang@intel.com>
>> Reviewed-by: Chao Gao <chao.gao@intel.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h | 5 +++++
>>   arch/x86/kvm/cpuid.h            | 5 +++++
>>   arch/x86/kvm/mmu.h              | 5 +++++
>>   arch/x86/kvm/mmu/mmu.c          | 8 +++++++-
>>   arch/x86/kvm/mmu/mmu_internal.h | 1 +
>>   arch/x86/kvm/mmu/paging_tmpl.h  | 3 ++-
>>   arch/x86/kvm/mmu/spte.h         | 2 +-
>>   arch/x86/kvm/svm/nested.c       | 4 ++--
>>   arch/x86/kvm/vmx/nested.c       | 4 ++--
>>   arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>>   arch/x86/kvm/x86.c              | 4 ++--
>>   11 files changed, 39 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index c6f03d151c31..46471dd9cc1b 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -727,6 +727,11 @@ struct kvm_vcpu_arch {
>>   	unsigned long cr0_guest_owned_bits;
>>   	unsigned long cr2;
>>   	unsigned long cr3;
>> +	/*
>> +	 * CR3 non-address feature control bits.
>> +	 * Guest CR3 may contain any of those bits at runtime.
>> +	 */
>> +	u64 cr3_ctrl_bits;
> This should be an "unsigned long".
>
> Hmm, "ctrl_bits" is unnecessarily generic at this point.  It's also arguably wrong,
> because X86_CR3_PCID_NOFLUSH is also a control bit, it's just allowed in CR3 itself.
>
> I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
> only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
> guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
> feature" framework.
Thanks for the suggestion.

Is the patch below the latest version of the "governed feature" framework
support?
https://lore.kernel.org/kvm/20230217231022.816138-2-seanjc@google.com/

Do you have plan to apply it to kvm-x86 repo?

>
> More below.
>
>>   	unsigned long cr4;
>>   	unsigned long cr4_guest_owned_bits;
>>   	unsigned long cr4_guest_rsvd_bits;
>> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
>> index b1658c0de847..ef8e1b912d7d 100644
>> --- a/arch/x86/kvm/cpuid.h
>> +++ b/arch/x86/kvm/cpuid.h
>> @@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
>>   	return vcpu->arch.maxphyaddr;
>>   }
>>   
>> +static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> Heh, I think it makes sense to wrap this one.  I'll probably tell you differently
> tomorrow, but today, let's wrap.
>
>> +{
>> +	return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);
> Don't open code something for which there is a perfect helper, i.e. use
> kvm_vcpu_is_legal_gpa().
I didn't use the helper because after masking the control bits (LAM bits),
CR3 is not a GPA conceptually, i.e., it contains PCID or PWT/PCD in the lower
bits.
But maybe this is my overthinking? Will use the helper instead.

>
> If we go the governed feature route, this becomes:
>
> static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu,
> 					 unsigned long cr3)
> {
> 	if (guest_can_use(vcpu, X86_FEATURE_LAM))
> 		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
>
> 	return kvm_vcpu_is_legal_gpa(cr3);
> }
>
>> +}
>> +
>>   static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
>>   {
>>   	return !(gpa & vcpu->arch.reserved_gpa_bits);
>> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>> index 92d5a1924fc1..81d8a433dae1 100644
>> --- a/arch/x86/kvm/mmu.h
>> +++ b/arch/x86/kvm/mmu.h
>> @@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
>>   	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
>>   }
>>   
>> +static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)
> And then this becomes:
>
> static inline u64 kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
> {
> 	if (!guest_can_use(vcpu, X86_FEATURE_LAM))
> 		return 0;
>
> 	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> }
>
>> +{
>> +	return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
>> +}
>> +
>>   static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
>>   {
>>   	u64 root_hpa = vcpu->arch.mmu->root.hpa;
>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>> index c8961f45e3b1..deea9a9f0c75 100644
>> --- a/arch/x86/kvm/mmu/mmu.c
>> +++ b/arch/x86/kvm/mmu/mmu.c
>> @@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>>   	hpa_t root;
>>   
>>   	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
>> -	root_gfn = root_pgd >> PAGE_SHIFT;
>> +	/*
>> +	 * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
>> +	 * additional control bits (e.g. LAM control bits). To be generic,
>> +	 * unconditionally strip non-address bits when computing the GFN since
>> +	 * the guest PGD has already been checked for validity.
>> +	 */
> Drop this comment, the code is self-explanatory, and the comment is incomplete,
> e.g. it can also be nCR3.
I was trying to use CR3 for both nested/non-nested cases. Sorry for the 
confusion.
Anyway, will drop the comment.


>
>> +	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
>>   
>>   	if (mmu_check_root(vcpu, root_gfn))
>>   		return 1;
>> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
>> index d39af5639ce9..7d2105432d66 100644
>> --- a/arch/x86/kvm/mmu/mmu_internal.h
>> +++ b/arch/x86/kvm/mmu/mmu_internal.h
>> @@ -21,6 +21,7 @@ extern bool dbg;
>>   #endif
>>   
>>   /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
>> +#define __PT_BASE_ADDR_MASK GENMASK_ULL(51, 12)
>>   #define __PT_LEVEL_SHIFT(level, bits_per_level)	\
>>   	(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
>>   #define __PT_INDEX(address, level, bits_per_level) \
>> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
>> index 0662e0278e70..394733ac9088 100644
>> --- a/arch/x86/kvm/mmu/paging_tmpl.h
>> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
>> @@ -62,7 +62,7 @@
>>   #endif
>>   
>>   /* Common logic, but per-type values.  These also need to be undefined. */
>> -#define PT_BASE_ADDR_MASK	((pt_element_t)(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)))
>> +#define PT_BASE_ADDR_MASK	((pt_element_t)__PT_BASE_ADDR_MASK)
>>   #define PT_LVL_ADDR_MASK(lvl)	__PT_LVL_ADDR_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>>   #define PT_LVL_OFFSET_MASK(lvl)	__PT_LVL_OFFSET_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>>   #define PT_INDEX(addr, lvl)	__PT_INDEX(addr, lvl, PT_LEVEL_BITS)
>> @@ -324,6 +324,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>>   	trace_kvm_mmu_pagetable_walk(addr, access);
>>   retry_walk:
>>   	walker->level = mmu->cpu_role.base.level;
>> +	/* gpte_to_gfn() will strip non-address bits. */
> Drop this comment too, it's not relevant to the immediate code, i.e. it'd be
> better suited about this code:
>
> 	table_gfn = gpte_to_gfn(pte);
>
> but IMO that code is quite self-explanatory too.

OK, thanks.
>
>> @@ -7740,6 +7741,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>   		vmx->msr_ia32_feature_control_valid_bits &=
>>   			~FEAT_CTL_SGX_LC_ENABLED;
>>   
>> +	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))
> This is wrong, KVM needs to check that the host supports LAM too, otherwise KVM
> will allow userspace to shove garbage into guest CR3 and induce VM-Entry failures
> and whatnot.
Right, will fix it.

>    If we go the guest_can_use() route, this problem solves itself.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
  2023-06-28  3:05     ` Binbin Wu
@ 2023-06-28 17:40       ` Sean Christopherson
  2023-07-03  7:56         ` Binbin Wu
  0 siblings, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-06-28 17:40 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu

On Wed, Jun 28, 2023, Binbin Wu wrote:
> 
> 
> On 6/28/2023 7:40 AM, Sean Christopherson wrote:
> > I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
> > only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
> > guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
> > feature" framework.
> Thanks for the suggestion.
> 
> Is the below patch the lastest patch of "governed feature" framework
> support?
> https://lore.kernel.org/kvm/20230217231022.816138-2-seanjc@google.com/

Yes, I haven't refreshed it since the original posting.

> Do you have plan to apply it to kvm-x86 repo?

I'm leaning more and more towards pushing it through sooner rather than later, as
this isn't the first time in recent memory that a patch/series has done somewhat
odd things to work around guest_cpuid_has() being slow.  I was hoping to get
feedback before applying, but that's not looking likely at this point.

> > More below.
> > 
> > >   	unsigned long cr4;
> > >   	unsigned long cr4_guest_owned_bits;
> > >   	unsigned long cr4_guest_rsvd_bits;
> > > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> > > index b1658c0de847..ef8e1b912d7d 100644
> > > --- a/arch/x86/kvm/cpuid.h
> > > +++ b/arch/x86/kvm/cpuid.h
> > > @@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
> > >   	return vcpu->arch.maxphyaddr;
> > >   }
> > > +static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> > Heh, I think it makes sense to wrap this one.  I'll probably tell you differently
> > tomorrow, but today, let's wrap.
> > 
> > > +{
> > > +	return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);
> > Don't open code something for which there is a perfect helper, i.e. use
> > kvm_vcpu_is_legal_gpa().
> I didn't use the helper because after masking the control bits (LAM bits),
> CR3 is not  a GPA conceptally, i.e, it contains PCID or PWT/PCD in lower
> bits.
> But maybe this is my overthinking?Will use the helper instead.

You're not overthinking it, I had the exact same reaction.  However, the above
also directly looks at arch.reserved_gpa_bits, i.e. treats CR3 like a GPA for
all intents and purposes, so it's not any better than using kvm_vcpu_is_legal_gpa().
And I couldn't bring myself to suggest adding a "reserved CR3 bits" mask because
CR3 *does* contain a GPA, i.e. we'd still have to check kvm_vcpu_is_legal_gpa(),
and realistically the "reserved CR3 bits" will never be a superset of "illegal
GPA bits".

> > If we go the governed feature route, this becomes:
> > 
> > static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu,
> > 					 unsigned long cr3)
> > {
> > 	if (guest_can_use(vcpu, X86_FEATURE_LAM))
> > 		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> > 
> > 	return kvm_vcpu_is_legal_gpa(cr3);
> > }
> > 
> > > +}
> > > +
> > >   static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
> > >   {
> > >   	return !(gpa & vcpu->arch.reserved_gpa_bits);
> > > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > > index 92d5a1924fc1..81d8a433dae1 100644
> > > --- a/arch/x86/kvm/mmu.h
> > > +++ b/arch/x86/kvm/mmu.h
> > > @@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
> > >   	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
> > >   }
> > > +static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)
> > And then this becomes:
> > 
> > static inline u64 kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
> > {
> > 	if (!guest_can_use(vcpu, X86_FEATURE_LAM))
> > 		return 0;
> > 
> > 	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> > }
> > 
> > > +{
> > > +	return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
> > > +}
> > > +
> > >   static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
> > >   {
> > >   	u64 root_hpa = vcpu->arch.mmu->root.hpa;
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index c8961f45e3b1..deea9a9f0c75 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
> > >   	hpa_t root;
> > >   	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
> > > -	root_gfn = root_pgd >> PAGE_SHIFT;
> > > +	/*
> > > +	 * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
> > > +	 * additional control bits (e.g. LAM control bits). To be generic,
> > > +	 * unconditionally strip non-address bits when computing the GFN since
> > > +	 * the guest PGD has already been checked for validity.
> > > +	 */
> > Drop this comment, the code is self-explanatory, and the comment is incomplete,
> > e.g. it can also be nCR3.
> I was trying to use CR3 for both nested/non-nested cases. Sorry for the
> confusion.
> Anyway, will drop the comment.

FWIW, EPTP also has non-address bits.  But the real reason I don't think this
warrants a comment is that "pgd" is specifically not an address, i.e. it's
fully expected and intuitive that retrieving the gfn from a pgd would need to
mask off bits.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-28  0:15   ` Sean Christopherson
@ 2023-06-29  6:12     ` Binbin Wu
  2023-06-29  6:57       ` Chao Gao
                         ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Binbin Wu @ 2023-06-29  6:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu



On 6/28/2023 8:15 AM, Sean Christopherson wrote:
> On Tue, Jun 06, 2023, Binbin Wu wrote:
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index 5b9ec610b2cb..c2091e24a6b9 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>>   /* x86-specific emulation flags */
>>   #define X86EMUL_F_FETCH			BIT(0)
>>   #define X86EMUL_F_WRITE			BIT(1)
>> +#define X86EMUL_F_SKIPLAM		BIT(2)
> See my comments in the LASS series about describing the access, not dictating
> the end behavior.

The suggestion does decouple the common emulator code from the
feature-specific behavior, thanks.

>
>>   
>>   struct x86_emulate_ops {
>>   	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 52dcf3c00bb8..82a225d1000e 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -8133,6 +8133,77 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>>   }
>>   
>> +#define LAM_S57_EN_MASK (X86_CR4_LAM_SUP | X86_CR4_LA57)
>> +static int lam_sign_extend_bit(struct kvm_vcpu *vcpu, gva_t addr)
>> +{
>> +	u64 cr3, cr4;
>> +
>> +	/*
>> +	 * The LAM identification of a pointer as user or supervisor is
>> +	 * based solely on the value of pointer bit 63.
>> +	 */
>> +	if (!(addr >> 63)) {
> BIT_ULL(63)
>
>> +		cr3 = kvm_read_cr3(vcpu);
> Use the perfectly good helper added earlier in the series:
>
> 		cr3_lam = kvm_get_active_lam_bits();
Good suggestion. Thanks.

>
> That has the added bonus of avoiding a VMREAD of CR3 when LAM is disabled in CR4.
Why? I don't get the point.

>
>> +		if (cr3 & X86_CR3_LAM_U57)
>> +			return 56;
>> +		if (cr3 & X86_CR3_LAM_U48)
>> +			return 47;
>> +	} else {
>> +		cr4 = kvm_read_cr4_bits(vcpu, LAM_S57_EN_MASK);
>> +		if (cr4 == LAM_S57_EN_MASK)
>> +			return 56;
>> +		if (cr4 & X86_CR4_LAM_SUP)
>> +			return 47;
> This is way too complicated for a simple thing.  Burying multiple bits in a #define
> and then relying on specific bits being in the mask is unnecessarily subtle.
>
> And this whole helper shouldn't exist.  There's one caller, and will only ever
> be one caller.  Defining magic numbers, i.e. using -1 to signal "disabled", makes
> it that much harder to read the code.
>
> More below.
>
>> +	}
>> +	return -1;
>> +}
>> +
>> +/*
>> + * Only called in 64-bit mode.
>> + *
>> + * LAM has a modified canonical check when applicable:
>> + * LAM_S48                : [ 1 ][ metadata ][ 1 ]
>> + *                            63               47
>> + * LAM_U48                : [ 0 ][ metadata ][ 0 ]
>> + *                            63               47
>> + * LAM_S57                : [ 1 ][ metadata ][ 1 ]
>> + *                            63               56
>> + * LAM_U57 + 5-lvl paging : [ 0 ][ metadata ][ 0 ]
>> + *                            63               56
>> + * LAM_U57 + 4-lvl paging : [ 0 ][ metadata ][ 0...0 ]
>> + *                            63               56..47
>> + *
>> + * Untag the metadata bits by sign-extending the value of bit 47 (LAM48) or
>> + * bit 56 (LAM57). The resulting address after untag isn't guaranteed to be
>> + * canonical. Callers should perform the original canonical check and raise
>> + * #GP/#SS if the address is non-canonical.
>> + *
>> + * Note that KVM masks the metadata in addresses, performs the (original)
>> + * canonicality checking and then walks page table. This is slightly
>> + * different from hardware behavior but achieves the same effect.
>> + * Specifically, if LAM is enabled, the processor performs a modified
>> + * canonicality checking where the metadata are ignored instead of
>> + * masked. After the modified canonicality checking, the processor masks
>> + * the metadata before passing addresses for paging translation.
>> + */
>> +void vmx_untag_addr(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags)
> Rather than modify the pointer, return the untagged address.  That's more flexible
> as it allows using the result in if-statements and whatnot.  That might not ever
> come into play, but there's no good reason to use an in/out param in a void
> function.
In an earlier version, it did return the untagged address.
In this version, I changed it to an in/out param to make the interface
conditional and avoid adding a dummy one in SVM.
Can that be a reason?


>
>> +{
>> +	int sign_ext_bit;
>> +
>> +	/*
>> +	 * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
>> +	 * If not set, vCPU doesn't supports LAM.
>> +	 */
>> +	if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||
> This is unnecessary, KVM should never allow the LAM bits in CR3 to be set if LAM
> isn't supported.

OK.
>
>> +	    (flags & X86EMUL_F_SKIPLAM) || WARN_ON_ONCE(!is_64_bit_mode(vcpu)))
> Same comments as the LASS series, don't WARN, just put the check here.
OK.

>
>> +		return;
>> +
>> +	sign_ext_bit = lam_sign_extend_bit(vcpu, *gva);
>> +	if (sign_ext_bit > 0)
>> +		*gva = (sign_extend64(*gva, sign_ext_bit) & ~BIT_ULL(63)) |
>> +		       (*gva & BIT_ULL(63));
>
> Something like this?  The early return in the user path is a bit forced, e.g. it
> could also be:
>
> 	if (cr3 & X86_CR3_LAM_U57)
> 		lam_bit = 56;
> 	else if (X86_CR3_LAM_U48)
> 		lam_bit = 48;
> 	else
> 		return gva;
>
> but IMO making the CR3 and CR4 paths somewhat symmetrical is valuable.
>
> gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva,
> 			    unsigned int flags)
> {
> 	unsigned long cr3_bits, cr4_bits;
> 	int lam_bit;
>
> 	if (flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH_INVLPG | X86EMUL_F_IMPLICIT))
Thanks for the suggestion. Overall, it looks good to me.

Suppose "X86EMUL_F_BRANCH_INVLPG "  should be two flags for branch and 
invlpg, right?

And for LAM, X86EMUL_F_IMPLICIT will not be used because in the implicit 
access to memory management registers or descriptors,
the linear base addresses still need to be canonical and no hooks will 
be added to untag the addresses in these pathes.
So I probably will remove the check for X86EMUL_F_IMPLICIT here.

> 		return gva;
>
> 	if (!is_64_bit_mode(vcpu))
> 		return gva;
>
> 	/*
> 	 * Bit 63 determines if the address should be treated as user address
> 	 * or a supervisor address.
> 	 */
> 	if (!(gva & BIT_ULL(63))) {
> 		cr3_bits = kvm_get_active_lam_bits(vcpu);
> 		if (!(cr3 & (X86_CR3_LAM_U57 | X86_CR3_LAM_U48))
> 			return gva;
>
> 		/* LAM_U48 is ignored if LAM_U57 is set. */
> 		lam_bit = cr3_bits & X86_CR3_LAM_U57 ? 56 : 47;
> 	} else {
> 		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LAM_SUP_BIT))
> 			return gva;
>
> 		lam_bit = kvm_is_cr4_bit_set(vcpu, X86_CR4_LA57) ? 56 : 47;
> 	}
> 	return (sign_extend64(gva, lam_bit) & ~BIT_ULL(63)) | (gva & BIT_ULL(63));
> }
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-29  6:12     ` Binbin Wu
@ 2023-06-29  6:57       ` Chao Gao
  2023-06-29  7:22         ` Binbin Wu
  2023-06-29  8:30       ` David Laight
  2023-06-29 15:16       ` Sean Christopherson
  2 siblings, 1 reply; 26+ messages in thread
From: Chao Gao @ 2023-06-29  6:57 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Sean Christopherson, kvm, linux-kernel, pbonzini, kai.huang,
	David.Laight, robert.hu

On Thu, Jun 29, 2023 at 02:12:27PM +0800, Binbin Wu wrote:
>> > +	/*
>> > +	 * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
>> > +	 * If not set, vCPU doesn't supports LAM.
>> > +	 */
>> > +	if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||
>> This is unnecessary, KVM should never allow the LAM bits in CR3 to be set if LAM
>> isn't supported.

A corner case is:

If EPT is enabled, CR3 writes are not trapped, so guests can set the
LAM bits in CR3 if the hardware supports LAM, regardless of whether the
guest enumerates LAM.
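
For reference, the reason CR3 writes aren't trapped with EPT is the existing
setup in vmx_exec_control(), roughly (paraphrased sketch of current behavior,
not something this series changes):

	if (enable_ept)
		exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
				  CPU_BASED_CR3_STORE_EXITING |
				  CPU_BASED_INVLPG_EXITING);

So a guest MOV to CR3 never reaches kvm_set_cr3() and its legality check.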

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-29  6:57       ` Chao Gao
@ 2023-06-29  7:22         ` Binbin Wu
  2023-06-29 15:33           ` Sean Christopherson
  0 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-06-29  7:22 UTC (permalink / raw)
  To: Chao Gao, Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, kai.huang, David.Laight, robert.hu



On 6/29/2023 2:57 PM, Chao Gao wrote:
> On Thu, Jun 29, 2023 at 02:12:27PM +0800, Binbin Wu wrote:
>>>> +	/*
>>>> +	 * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
>>>> +	 * If not set, vCPU doesn't supports LAM.
>>>> +	 */
>>>> +	if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||
>>> This is unnecessary, KVM should never allow the LAM bits in CR3 to be set if LAM
>>> isn't supported.
> A corner case is:
>
> If EPT is enabled, CR3 writes are not trapped. then guests can set the
> LAM bits in CR3 if hardware supports LAM regardless whether or not guest
> enumerates LAM.
I recalled the main reason why I added the check.
It's used to avoid the subsequent checks on CR3 & CR4, which may cause
an additional VMREAD.

Also, about the virtualization hole: if the guest can enable the LAM bits in
CR3 in non-root mode without causing any problem,
that means the hardware supports LAM. Should KVM continue to untag
addresses according to that CR3 setting?
Skipping the untagging will probably cause a guest failure, and of course
the guest itself is to blame for that.
But untagging the addresses seems to do no harm?




^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-29  6:12     ` Binbin Wu
  2023-06-29  6:57       ` Chao Gao
@ 2023-06-29  8:30       ` David Laight
  2023-06-29 15:16       ` Sean Christopherson
  2 siblings, 0 replies; 26+ messages in thread
From: David Laight @ 2023-06-29  8:30 UTC (permalink / raw)
  To: 'Binbin Wu', Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, robert.hu

From: Binbin Wu
> Sent: 29 June 2023 07:12
...
> >> +void vmx_untag_addr(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags)
> >
> > Rather than modify the pointer, return the untagged address.  That's more flexible
> > as it allows using the result in if-statements and whatnot.  That might not ever
> > come into play, but there's no good reason to use an in/out param in a void
> > function.
>
> In earlier version, it did return the untagged address.
> In this version, I changed it as an in/out param to make the interface
> conditional and avoid to add a dummy one in SVM.
> Is it can be a reason?

You are always going to need a 'dummy' version.
If it ends up being 'x = x' the compiler will just optimise
it away.

But for a real function you'll get much better code from:
	x = fn(x);
than
	fn(&x);

It also lets you use 'void *' (etc) to avoid casts which
can easily hide bugs.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-29  6:12     ` Binbin Wu
  2023-06-29  6:57       ` Chao Gao
  2023-06-29  8:30       ` David Laight
@ 2023-06-29 15:16       ` Sean Christopherson
  2023-06-29 17:26         ` Binbin Wu
  2 siblings, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-06-29 15:16 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu

On Thu, Jun 29, 2023, Binbin Wu wrote:
> On 6/28/2023 8:15 AM, Sean Christopherson wrote:
> > On Tue, Jun 06, 2023, Binbin Wu wrote:
> > Use the perfectly good helper added earlier in the series:
> > 
> > 		cr3_lam = kvm_get_active_lam_bits();
> Good suggestion. Thanks.
> 
> > 
> > That has the added bonus of avoiding a VMREAD of CR3 when LAM is disabled in CR4.
> Why? I don't get the point.

Sorry, typo on my end.  When LAM is disabled in guest CPUID, not CR4.

> > > +void vmx_untag_addr(struct kvm_vcpu *vcpu, gva_t *gva, u32 flags)
> > Rather than modify the pointer, return the untagged address.  That's more flexible
> > as it allows using the result in if-statements and whatnot.  That might not ever
> > come into play, but there's no good reason to use an in/out param in a void
> > function.
> In earlier version, it did return the untagged address.
> In this version, I changed it as an in/out param to make the interface
> conditional and avoid to add a dummy one in SVM.
> Is it can be a reason?

Hmm, no.  You can achieve the same by doing:

	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);

	if (!kvm_x86_ops.get_untagged_addr)
		return addr;

	return static_call(kvm_x86_get_untagged_addr)(vcpu, addr, flags);
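
And to avoid a dummy SVM implementation entirely, the hook can be declared
optional; a sketch, assuming the hook ends up named get_untagged_addr as
suggested above:

	/* arch/x86/include/asm/kvm-x86-ops.h: optional, so SVM needs no stub. */
	KVM_X86_OP_OPTIONAL(get_untagged_addr)

	/* arch/x86/include/asm/kvm_host.h: new kvm_x86_ops member. */
	gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva,
				   unsigned int flags);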

> > gva_t vmx_get_untagged_addr(struct kvm_vcpu *vcpu, gva_t gva,
> > 			    unsigned int flags)
> > {
> > 	unsigned long cr3_bits, cr4_bits;
> > 	int lam_bit;
> > 
> > 	if (flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH_INVLPG | X86EMUL_F_IMPLICIT))
> Thanks for the suggestion. Overall, it looks good to me.
> 
> Suppose "X86EMUL_F_BRANCH_INVLPG "  should be two flags for branch and
> invlpg, right?

Yeah, typo again.  Should just be X86EMUL_F_INVLPG, because unlike LASS, LAM
ignores all FETCH types.

> And for LAM, X86EMUL_F_IMPLICIT will not be used because in the implicit
> access to memory management registers or descriptors,
> the linear base addresses still need to be canonical and no hooks will be
> added to untag the addresses in these pathes.
> So I probably will remove the check for X86EMUL_F_IMPLICIT here.

No, please keep it, e.g. so that changes in the emulator don't lead to breakage,
and to document that they are exempt.

If you want, you could do WARN_ON_ONCE() for the IMPLICIT case, but I don't know
that that's worthwhile, e.g. nothing will go wrong if KVM tries to untag an
implicit access, and deliberately avoiding the call would make it annoying to
consolidate code in the future.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-29  7:22         ` Binbin Wu
@ 2023-06-29 15:33           ` Sean Christopherson
  0 siblings, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-06-29 15:33 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Chao Gao, kvm, linux-kernel, pbonzini, kai.huang, David.Laight,
	robert.hu

On Thu, Jun 29, 2023, Binbin Wu wrote:
> On 6/29/2023 2:57 PM, Chao Gao wrote:
> > On Thu, Jun 29, 2023 at 02:12:27PM +0800, Binbin Wu wrote:
> > > > > +	/*
> > > > > +	 * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
> > > > > +	 * If not set, vCPU doesn't supports LAM.
> > > > > +	 */
> > > > > +	if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||
> > > > This is unnecessary, KVM should never allow the LAM bits in CR3 to be set if LAM
> > > > isn't supported.
> > A corner case is:
> > 
> > If EPT is enabled, CR3 writes are not trapped. then guests can set the
> > LAM bits in CR3 if hardware supports LAM regardless whether or not guest
> > enumerates LAM.

Argh, that's a really obnoxious virtualization hole.

> I recalled the main reason why I added the check.
> It's used to avoid the following checking on CR3 & CR4, which may cause an
> additional VMREAD.

FWIW, that will (and should) be handled by kvm_get_active_lam_bits().  Hmm, though
since CR4.LAM_SUP is a separate thing, that should probably be
kvm_get_active_cr3_lam_bits().

> Also, about the virtualization hole, if guest can enable LAM bits in CR3 in
> non-root mode without cause any problem, that means the hardware supports
> LAM, should KVM continue to untag the address following CR3 setting?

Hrm, no, KVM should honor the architecture.  The virtualization hole is bad enough
as it is, I don't want KVM to actively make it worse.

> Because skip untag the address probably will cause guest failure, and of
> cause, this is the guest itself to blame.

Yeah, guest's fault.  The fact that the guest won't get all the #GPs it should
is unfortunate, but intercepting all writes to CR3 just to close the hole is sadly
a really bad tradeoff.

> But untag the address seems do no harm?

In and of itself, not really.  But I don't want to set the precedent in KVM that
user LAM is supported regardless of guest CPUID.

Another problem with the virtualization hole is that the guest will be able
to induce VM-Fail when KVM is running on L1, because L0 will likely enforce the
CR3 checks on VM-Enter but not intercept MOV CR3.  I.e. the guest can get an
illegal value into vmcs.GUEST_CR3.  We could add code to explicitly detect that
case to help triage such failures, but I don't know that it's worth the code, e.g.

	if (exit_reason.failed_vmentry) {
		if (boot_cpu_has(X86_FEATURE_LAM) &&
		    !guest_can_use(vcpu, X86_FEATURE_LAM) &&
		    (kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
		    	pr_warn_ratelimited("Guest abused LAM virtualization hole\n");
		else
			dump_vmcs(vcpu);
		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
		vcpu->run->fail_entry.hardware_entry_failure_reason
			= exit_reason.full;
		vcpu->run->fail_entry.cpu = vcpu->arch.last_vmentry_cpu;
		return 0;
	}

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops
  2023-06-29 15:16       ` Sean Christopherson
@ 2023-06-29 17:26         ` Binbin Wu
  0 siblings, 0 replies; 26+ messages in thread
From: Binbin Wu @ 2023-06-29 17:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu



On 6/29/2023 11:16 PM, Sean Christopherson wrote:
>> And for LAM, X86EMUL_F_IMPLICIT will not be used because in the implicit
>> access to memory management registers or descriptors,
>> the linear base addresses still need to be canonical and no hooks will be
>> added to untag the addresses in these pathes.
>> So I probably will remove the check for X86EMUL_F_IMPLICIT here.
> No, please keep it, e.g. so that changes in the emulator don't lead to breakage,
> and to document that they are exempt.
>
> If you want, you could do WARN_ON_ONCE() for the IMPLICIT case, but I don't know
> that that's worthwhile, e.g. nothing will go wrong if KVM tries to untag an
> implicit access, and deliberately avoiding the call make make it annoying to
> consolidate code in the future.
Right.
On second thought, X86EMUL_F_IMPLICIT should be kept in case SVM has
a different implementation and needs to untag addresses for IMPLICIT cases.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
  2023-06-28 17:40       ` Sean Christopherson
@ 2023-07-03  7:56         ` Binbin Wu
  2023-07-22  1:28           ` Sean Christopherson
  0 siblings, 1 reply; 26+ messages in thread
From: Binbin Wu @ 2023-07-03  7:56 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu



On 6/29/2023 1:40 AM, Sean Christopherson wrote:
> On Wed, Jun 28, 2023, Binbin Wu wrote:
>>
>> On 6/28/2023 7:40 AM, Sean Christopherson wrote:
>>> I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
>>> only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
>>> guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
>>> feature" framework.
>> Thanks for the suggestion.
>>
>> Is the below patch the lastest patch of "governed feature" framework
>> support?
>> https://lore.kernel.org/kvm/20230217231022.816138-2-seanjc@google.com/
> Yes, I haven't refreshed it since the original posting.
>
>> Do you have plan to apply it to kvm-x86 repo?
> I'm leaning more and more towards pushing it through sooner than later as this
> isn't the first time in recent memory that a patch/series has done somewhat odd
> things to workaround guest_cpuid_has() being slow.  I was hoping to get feedback
> before applying, but that's not looking likely at this point.
Hi Sean,

I plan to adopt the "KVM-governed feature framework" to track whether
the guest can use the LAM feature.
Because your patchset is not applied yet, there are two ways to do it.
Which one do you prefer?

Option 1:
Base the KVM LAM patchset on your "KVM-governed feature framework"
patchset.

Option 2:
Temporarily add a bool in kvm_vcpu_arch as follows, and use the bool
"can_use_lam" instead of guest_can_use(vcpu, X86_FEATURE_LAM).
Then provide a cleanup patch to use the "KVM-governed feature framework",
which can be applied along with or after your patchset.

index fb9d1f2d6136..74c0c70b0a44 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -748,6 +748,7 @@ struct kvm_vcpu_arch {
         bool tpr_access_reporting;
         bool xsaves_enabled;
         bool xfd_no_write_intercept;
+       bool can_use_lam;
         u64 ia32_xss;
         u64 microcode_version;
         u64 arch_capabilities;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2d9d155691a7..5b2db5daebb3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7737,6 +7737,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
                 vmx->msr_ia32_feature_control_valid_bits &=
                         ~FEAT_CTL_SGX_LC_ENABLED;

+       vcpu->arch.can_use_lam = boot_cpu_has(X86_FEATURE_LAM) &&
+                                guest_cpuid_has(vcpu, X86_FEATURE_LAM);
+
         /* Refresh #PF interception to account for MAXPHYADDR changes. */
         vmx_update_exception_bitmap(vcpu);
  }

[...]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
  2023-07-03  7:56         ` Binbin Wu
@ 2023-07-22  1:28           ` Sean Christopherson
  0 siblings, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-07-22  1:28 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, linux-kernel, pbonzini, chao.gao, kai.huang, David.Laight,
	robert.hu

On Mon, Jul 03, 2023, Binbin Wu wrote:
> 
> On 6/29/2023 1:40 AM, Sean Christopherson wrote:
> > On Wed, Jun 28, 2023, Binbin Wu wrote:
> > > 
> > > On 6/28/2023 7:40 AM, Sean Christopherson wrote:
> > > > I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
> > > > only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
> > > > guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
> > > > feature" framework.
> > > Thanks for the suggestion.
> > > 
> > > Is the below patch the lastest patch of "governed feature" framework
> > > support?
> > > https://lore.kernel.org/kvm/20230217231022.816138-2-seanjc@google.com/
> > Yes, I haven't refreshed it since the original posting.
> > 
> > > Do you have plan to apply it to kvm-x86 repo?
> > I'm leaning more and more towards pushing it through sooner than later as this
> > isn't the first time in recent memory that a patch/series has done somewhat odd
> > things to workaround guest_cpuid_has() being slow.  I was hoping to get feedback
> > before applying, but that's not looking likely at this point.
> Hi Sean,
> 
> I plan to adopt the "KVM-governed feature framework" to track whether the
> guest can use LAM feature.
> Because your patchset is not applied yet, there are two ways to do it. Which
> one do you prefer?
> 
> Option 1:
> Make KVM LAM patchset base on your "KVM-governed feature framework"
> patchset.
> 
> Option 2:
> Temporarily add a bool in kvm_vcpu_arch as following, and use the bool
> "can_use_lam" instead of guest_can_use(vcpu, X86_FEATURE_LAM).
> And provide a cleanup patch to use "KVM-governed feature framework", which
> can be applied along with or after your patchset.

Sorry for not responding.  I was hoping I could get v2 posted before advising on
a direction, but long story short, I made a few goofs and got delayed (I won't get
v2 out until next week).  Belatedly, either option is fine by me (I see you posted
v10 on top of the governed feature stuff).

Thanks!  And again, sorry.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-07-22  1:28 UTC | newest]

Thread overview: 26+ messages
2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
2023-06-06  9:18 ` [PATCH v9 1/6] KVM: x86: Consolidate flags for __linearize() Binbin Wu
2023-06-06  9:18 ` [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP Binbin Wu
2023-06-07  3:40   ` Huang, Kai
2023-06-07  4:55     ` Binbin Wu
2023-06-07  9:20       ` Huang, Kai
2023-06-06  9:18 ` [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57} Binbin Wu
2023-06-27 23:40   ` Sean Christopherson
2023-06-28  3:05     ` Binbin Wu
2023-06-28 17:40       ` Sean Christopherson
2023-07-03  7:56         ` Binbin Wu
2023-07-22  1:28           ` Sean Christopherson
2023-06-06  9:18 ` [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops Binbin Wu
2023-06-28  0:15   ` Sean Christopherson
2023-06-29  6:12     ` Binbin Wu
2023-06-29  6:57       ` Chao Gao
2023-06-29  7:22         ` Binbin Wu
2023-06-29 15:33           ` Sean Christopherson
2023-06-29  8:30       ` David Laight
2023-06-29 15:16       ` Sean Christopherson
2023-06-29 17:26         ` Binbin Wu
2023-06-06  9:18 ` [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable Binbin Wu
2023-06-28  0:19   ` Sean Christopherson
2023-06-06  9:18 ` [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM Binbin Wu
2023-06-07  3:52   ` Huang, Kai
2023-06-16  1:45     ` Binbin Wu
