* [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
@ 2023-02-09  2:40 Robert Hoo
  2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
                   ` (10 more replies)
  0 siblings, 11 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

===Feature Introduction===

Linear-address masking (LAM) [1] modifies the checking that is applied to
*64-bit* linear addresses, allowing software to use the untranslated upper
address bits for metadata.
As for which upper bits of the linear address can be borrowed, LAM has 2 modes:
LAM48 (bits 62:48, i.e. a LAM width of 15) and LAM57 (bits 62:57, i.e. a LAM
width of 6), controlled by these new bits: CR3[62] (LAM_U48), CR3[61]
(LAM_U57), and CR4[28] (LAM_SUP).

* The LAM_U48 and LAM_U57 bits control LAM for user-mode addresses, i.e. if
  CR3.LAM_U57 = 1, LAM57 is applied; if CR3.LAM_U48 = 1 and CR3.LAM_U57 = 0,
  LAM48 is applied.
* The LAM_SUP bit, combined with the paging mode (4-level or 5-level), determines
  LAM for supervisor-mode addresses, i.e. when CR4.LAM_SUP = 1, 4-level paging
  applies LAM48 to supervisor-mode addresses while 5-level paging applies LAM57.

Note:
1. LAM applies only to data addresses, not to instruction fetches.
2. LAM identification of an address as user or supervisor is based solely on the
   value of pointer bit 63 and does not, for the purposes of LAM, depend on the CPL.
3. For user-mode addresses, it is possible that 5-level paging and LAM_U48 are both
   set; in this case, the effective usable linear-address width is 48, i.e. bits
   56:47 are reserved by LAM. [2]
4. On VM exit, the problematic address saved in the VMCS field is clean, i.e.
   the metadata is cleared and the address is in canonical form.
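
For illustration, untagging amounts to sign-extending the address from the
highest effective address bit (bit 47 for LAM48, bit 56 for LAM57); the
series reuses the kernel's existing __canonical_address() helper for this.
A minimal sketch of the rule (illustration only, not the patch code):

	/* Sign-extend 'addr' from bit (width - 1); width is 48 or 57. */
	static inline unsigned long untag(unsigned long addr, unsigned int width)
	{
		return (long)(addr << (64 - width)) >> (64 - width);
	}

E.g. for a user pointer ptr whose bit 56 is clear, with metadata 0x2a
stuffed into bits 62:57, untag(ptr | (0x2aUL << 57), 57) recovers ptr.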


===LAM KVM Design===

KVM intercepts CR4.LAM_SUP, to avoid reading the VMCS field every time, with
the expectation that the guest won't toggle this bit frequently.

Under EPT mode, CR3 is fully under guest control, so guest LAM is transparent to
KVM; nothing more needs to be done.

For shadow paging (EPT = off), KVM needs to handle guest CR3.LAM_U48 and CR3.LAM_U57
toggles.
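
For reference, the new CR3 bits handled throughout this series, as defined
in the base kernel tree (a sketch of the definitions, for readers without
Kirill's series at hand):

	#define X86_CR3_LAM_U57	_BITULL(61)	/* LAM for user pointers, bits 62:57 masked */
	#define X86_CR3_LAM_U48	_BITULL(62)	/* LAM for user pointers, bits 62:48 masked */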

[1] ISE Chap10 https://cdrdv2.intel.com/v1/dl/getContent/671368 (Section 10.6 VMX interaction)
[2] Thus the current kernel enabling patch only enables LAM_U57. https://lore.kernel.org/lkml/20230123220500.21077-1-kirill.shutemov@linux.intel.com/

---
Changelog
v3 --> v4:
Drop the unrelated Patch 1 of v3 (Binbin, Sean, Xiaoyao)
Intercept CR4.LAM_SUP instead of passing it through to the guest (Sean)
Just filter out CR3.LAM_{U48, U57}, instead of all reserved high bits
(Sean, Yuan)
Use the existing __canonical_address() helper instead of writing a new one
(Weijiang)
Add LAM handling in KVM emulation (Yu, Yuan)
Add Jingqi's Reviewed-by on Patch 7
Rebased to Kirill's latest code, which is based on 6.2-rc1.

v2 --> v3:
As LAM Kernel patches are in tip tree now, rebase to it.
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/

v1 --> v2:
1. Fix i386-allyesconfig build error on get_pgd(), where
   CR3_HIGH_RSVD_MASK isn't applicable.
   (Reported-by: kernel test robot <lkp@intel.com>)
2. In kvm_set_cr3(), be conservative about skipping the TLB flush when only
   LAM bits toggle. (Kirill)


Robert Hoo (9):
  KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  [Trivial] KVM: x86: MMU: Comments update
  KVM: x86: MMU: Integrate LAM bits when build guest CR3
  KVM: x86: Untag LAM bits when applicable
  KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  KVM: x86: When guest set CR3, handle LAM bits semantics
  KVM: x86: LAM: Expose LAM CPUID to user space VMM
  KVM: x86: emulation: Apply LAM when emulating data access

 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/cpuid.c            |  2 +-
 arch/x86/kvm/emulate.c          |  6 +++++
 arch/x86/kvm/mmu.h              |  5 ++++
 arch/x86/kvm/mmu/mmu.c          | 11 ++++++--
 arch/x86/kvm/vmx/vmx.c          |  6 ++++-
 arch/x86/kvm/x86.c              | 38 ++++++++++++++++++++++----
 arch/x86/kvm/x86.h              | 47 +++++++++++++++++++++++++++++++++
 8 files changed, 108 insertions(+), 10 deletions(-)

base-commit: 03334443640f226f56f71b5dfa3b1be6d4a1a1bc
(https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git lam)
-- 
2.31.1



* [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-09  9:21   ` Chao Gao
                     ` (2 more replies)
  2023-02-09  2:40 ` [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root Robert Hoo
                   ` (9 subsequent siblings)
  10 siblings, 3 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

Remove CR4.LAM_SUP (bit 28) from the default CR4_RESERVED_BITS, while still
reserving it in __cr4_reserved_bits() via feature testing.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 3 ++-
 arch/x86/kvm/x86.h              | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f35f1ff4427b..4684896698f4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_LAM_SUP))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 9de72586f406..8ec5cc983062 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -475,6 +475,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
 		__reserved_bits |= X86_CR4_VMXE;        \
 	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
 		__reserved_bits |= X86_CR4_PCIDE;       \
+	if (!__cpu_has(__c, X86_FEATURE_LAM))		\
+		__reserved_bits |= X86_CR4_LAM_SUP;	\
 	__reserved_bits;                                \
 })
 
-- 
2.31.1



* [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
  2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-09  9:55   ` Chao Gao
  2023-02-10  3:38   ` Yang, Weijiang
  2023-02-09  2:40 ` [PATCH v4 3/9] KVM: x86: MMU: Comments update Robert Hoo
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

mmu->get_guest_pgd()'s implementation is get_cr3(). Clear the LAM bits from
root_pgd, which needs a pure address plus (possibly) PCID info in the low 12
bits.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
---
 arch/x86/kvm/mmu/mmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 835426254e76..1d61dfe37c77 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3698,8 +3698,11 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	gfn_t root_gfn, root_pgd;
 	int quadrant, i, r;
 	hpa_t root;
-
+#ifdef CONFIG_X86_64
+	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+#else
 	root_pgd = mmu->get_guest_pgd(vcpu);
+#endif
 	root_gfn = root_pgd >> PAGE_SHIFT;
 
 	if (mmu_check_root(vcpu, root_gfn))
-- 
2.31.1



* [PATCH v4 3/9] KVM: x86: MMU: Comments update
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
  2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
  2023-02-09  2:40 ` [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-10  6:59   ` Chao Gao
  2023-02-09  2:40 ` [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3 Robert Hoo
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

The comment about kvm_mmu_ensure_valid_pgd() is stale. Update the comments
according to the latest code.

No functional changes.

P.S. Sean first noticed this in
https://lore.kernel.org/kvm/Yg%2FguAXFLJBmDflh@google.com/.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
---
 arch/x86/kvm/mmu/mmu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1d61dfe37c77..9a780445a88e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4497,8 +4497,12 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	union kvm_mmu_page_role new_role = mmu->root_role;
 
+	/*
+	 * If no root is found in the cache, the current active root.hpa will
+	 * be set to INVALID_PAGE, and a new root will be set up during
+	 * vcpu_enter_guest() --> kvm_mmu_reload().
+	 */
 	if (!fast_pgd_switch(vcpu->kvm, mmu, new_pgd, new_role)) {
-		/* kvm_mmu_ensure_valid_pgd will set up a new root.  */
 		return;
 	}
 
-- 
2.31.1



* [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (2 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 3/9] KVM: x86: MMU: Comments update Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-10 14:04   ` Chao Gao
  2023-02-09  2:40 ` [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable Robert Hoo
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

When calculating the new CR3 value, take the LAM bits in.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
---
 arch/x86/kvm/mmu.h     | 5 +++++
 arch/x86/kvm/vmx/vmx.c | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 6bdaacb6faa0..866f2b7cb509 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -142,6 +142,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
 	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
 }
 
+static inline u64 kvm_get_active_lam(struct kvm_vcpu *vcpu)
+{
+	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+}
+
 static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
 {
 	u64 root_hpa = vcpu->arch.mmu->root.hpa;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fe5615fd8295..66edd091f145 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3289,7 +3289,8 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
 			update_guest_cr3 = false;
 		vmx_ept_load_pdptrs(vcpu);
 	} else {
-		guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu);
+		guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu) |
+			    kvm_get_active_lam(vcpu);
 	}
 
 	if (update_guest_cr3)
-- 
2.31.1



* [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (3 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3 Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-10 15:04   ` Chao Gao
  2023-02-09  2:40 ` [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits Robert Hoo
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

Define kvm_untagged_addr() per the LAM feature spec: address high bits are
sign-extended from the highest effective address bit.
Note that LAM_U48 and LA57 have some effective-bit overlap. This patch
gives a WARN() on that case.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
---
 arch/x86/kvm/vmx/vmx.c |  3 +++
 arch/x86/kvm/x86.c     |  4 ++++
 arch/x86/kvm/x86.h     | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 66edd091f145..e4f14d1bdd2f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2163,6 +2163,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		    (!msr_info->host_initiated &&
 		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
 			return 1;
+
+		data = kvm_untagged_addr(data, vcpu);
+
 		if (is_noncanonical_address(data & PAGE_MASK, vcpu) ||
 		    (data & MSR_IA32_BNDCFGS_RSVD))
 			return 1;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 312aea1854ae..1bdc8c0c80c0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1809,6 +1809,10 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
 	case MSR_KERNEL_GS_BASE:
 	case MSR_CSTAR:
 	case MSR_LSTAR:
+		/*
+		 * The strict canonical checking still applies to MSR
+		 * writes even when LAM is enabled.
+		 */
 		if (is_noncanonical_address(data, vcpu))
 			return 1;
 		break;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8ec5cc983062..7228895d4a6f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -201,6 +201,38 @@ static inline bool is_noncanonical_address(u64 la, struct kvm_vcpu *vcpu)
 	return !__is_canonical_address(la, vcpu_virt_addr_bits(vcpu));
 }
 
+#ifdef CONFIG_X86_64
+/* untag addr for guest, according to vCPU CR3 and CR4 settings */
+static inline u64 kvm_untagged_addr(u64 addr, struct kvm_vcpu *vcpu)
+{
+	if (addr >> 63 == 0) {
+		/* User pointers */
+		if (kvm_read_cr3(vcpu) & X86_CR3_LAM_U57)
+			addr = __canonical_address(addr, 57);
+		else if (kvm_read_cr3(vcpu) & X86_CR3_LAM_U48) {
+			/*
+			 * If the guest enabled 5-level paging and LAM_U48,
+			 * bit 47 should be 0 and bits 56:48 contain metadata,
+			 * although bits 56:47 are valid 5-level address
+			 * bits.
+			 * With LAM_U48 and 4-level paging, bit 47 is 0.
+			 */
+			WARN_ON(addr & _BITUL(47));
+			addr = __canonical_address(addr, 48);
+		}
+	} else if (kvm_read_cr4(vcpu) & X86_CR4_LAM_SUP) { /* Supervisor pointers */
+		if (kvm_read_cr4(vcpu) & X86_CR4_LA57)
+			addr = __canonical_address(addr, 57);
+		else
+			addr = __canonical_address(addr, 48);
+	}
+
+	return addr;
+}
+#else
+#define kvm_untagged_addr(addr, vcpu)	(addr)
+#endif
+
 static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
 					gva_t gva, gfn_t gfn, unsigned access)
 {
-- 
2.31.1



* [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (4 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-13  2:01   ` Chao Gao
  2023-02-09  2:40 ` [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics Robert Hoo
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

Before the kvm_vcpu_is_legal_gpa() check, clear the LAM bits from CR3 if the
guest supports LAM.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
---
 arch/x86/kvm/x86.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1bdc8c0c80c0..3218f465ae71 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1231,6 +1231,14 @@ static void kvm_invalidate_pcid(struct kvm_vcpu *vcpu, unsigned long pcid)
 	kvm_mmu_free_roots(vcpu->kvm, mmu, roots_to_free);
 }
 
+static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
+{
+	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))
+		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+
+	return kvm_vcpu_is_legal_gpa(vcpu, cr3);
+}
+
 int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	bool skip_tlb_flush = false;
@@ -1254,7 +1262,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
 	 * the current vCPU mode is accurate.
 	 */
-	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
+	if (!kvm_is_valid_cr3(vcpu, cr3))
 		return 1;
 
 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
-- 
2.31.1



* [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (5 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-13  3:31   ` Chao Gao
  2023-02-09  2:40 ` [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access Robert Hoo
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

When a CR3 write changes only the LAM bits, ask the next vcpu run to load the
MMU pgd, so that it builds the new CR3 with the LAM bit updates.
When the CR3 write changes the effective address bits, no matter whether the
LAM bits change or not, go through the normal pgd update process.

Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
---
 arch/x86/kvm/x86.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3218f465ae71..7df6c9dd12a5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1242,9 +1242,9 @@ static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	bool skip_tlb_flush = false;
-	unsigned long pcid = 0;
+	unsigned long pcid = 0, old_cr3;
 #ifdef CONFIG_X86_64
-	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
+	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
 
 	if (pcid_enabled) {
 		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
@@ -1257,6 +1257,10 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
 		goto handle_tlb_flush;
 
+	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
+	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
+		return	1;
+
 	/*
 	 * Do not condition the GPA check on long mode, this helper is used to
 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
@@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
 		return 1;
 
-	if (cr3 != kvm_read_cr3(vcpu))
-		kvm_mmu_new_pgd(vcpu, cr3);
+	old_cr3 = kvm_read_cr3(vcpu);
+	if (cr3 != old_cr3) {
+		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
+			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
+					X86_CR3_LAM_U57));
+		} else {
+			/*
+			 * Though the effective address doesn't change,
+			 * mark the request so that the LAM bits will take
+			 * effect when entering the guest.
+			 */
+			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
+		}
+	}
 
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
-- 
2.31.1



* [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (6 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-13  3:53   ` Chao Gao
  2023-02-09  2:40 ` [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM Robert Hoo
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

When KVM emulation calculates a linear address for a data access, apply LAM
untagging if the guest is LAM-active at that moment, so that the following
canonical check can pass.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
---
 arch/x86/kvm/emulate.c |  6 ++++++
 arch/x86/kvm/x86.h     | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5cc3efa0e21c..d52037151133 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -700,6 +700,12 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 	*max_size = 0;
 	switch (mode) {
 	case X86EMUL_MODE_PROT64:
+		/*
+		 * LAM applies only to data accesses.
+		 */
+		if (!fetch && is_lam_active(ctxt->vcpu))
+			la = kvm_untagged_addr(la, ctxt->vcpu);
+
 		*linear = la;
 		va_bits = ctxt_virt_addr_bits(ctxt);
 		if (!__is_canonical_address(la, va_bits))
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 7228895d4a6f..9397e9f4e061 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -135,6 +135,19 @@ static inline int is_long_mode(struct kvm_vcpu *vcpu)
 #endif
 }
 
+#ifdef CONFIG_X86_64
+static inline bool is_lam_active(struct kvm_vcpu *vcpu)
+{
+	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57) ||
+	       kvm_read_cr4_bits(vcpu, X86_CR4_LAM_SUP);
+}
+#else
+static inline bool is_lam_active(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+#endif
+
 static inline bool is_64_bit_mode(struct kvm_vcpu *vcpu)
 {
 	int cs_db, cs_l;
-- 
2.31.1



* [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (7 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access Robert Hoo
@ 2023-02-09  2:40 ` Robert Hoo
  2023-02-21  5:47   ` Binbin Wu
  2023-02-09  6:15 ` [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Chao Gao
  2023-02-13  9:02 ` Binbin Wu
  10 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09  2:40 UTC (permalink / raw)
  To: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm, Robert Hoo

The LAM feature is enumerated by CPUID.(EAX=07H, ECX=01H):EAX.LAM[bit 26].

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b14653b61470..79f45cbe587e 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
 		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
-		F(AVX_IFMA)
+		F(AVX_IFMA) | F(LAM)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-- 
2.31.1
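
For completeness, a minimal sketch of how a userspace VMM could confirm
that KVM now advertises LAM via KVM_GET_SUPPORTED_CPUID (illustration only:
kvm_supports_lam() is a made-up helper and error handling is minimal):

	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	static int kvm_supports_lam(void)
	{
		struct {
			struct kvm_cpuid2 hdr;
			struct kvm_cpuid_entry2 entries[128];
		} cpuid = { .hdr.nent = 128 };
		int i, lam = 0, kvm = open("/dev/kvm", O_RDWR);

		if (kvm < 0)
			return 0;
		if (ioctl(kvm, KVM_GET_SUPPORTED_CPUID, &cpuid) == 0) {
			for (i = 0; i < cpuid.hdr.nent; i++) {
				struct kvm_cpuid_entry2 *e = &cpuid.hdr.entries[i];

				/* LAM is CPUID.(EAX=07H, ECX=01H):EAX[26]. */
				if (e->function == 7 && e->index == 1)
					lam = !!(e->eax & (1u << 26));
			}
		}
		close(kvm);
		return lam;
	}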



* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (8 preceding siblings ...)
  2023-02-09  2:40 ` [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM Robert Hoo
@ 2023-02-09  6:15 ` Chao Gao
  2023-02-09 12:25   ` Robert Hoo
  2023-02-13  9:02 ` Binbin Wu
  10 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-09  6:15 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
>KVM intercepts CR4.LAM_SUP, to avoid reading the VMCS field every time, with
>the expectation that the guest won't toggle this bit frequently.
>
>Under EPT mode, CR3 is fully under guest control, so guest LAM is transparent to
>KVM; nothing more needs to be done.

I don't think this is correct. You have to strip LAM_U57/U48 from CR3 when
walking the guest page tables, and strip metadata from pointers when
emulating instructions.

>
>For shadow paging (EPT = off), KVM needs to handle guest CR3.LAM_U48 and CR3.LAM_U57
>toggles.
>
>[1] ISE Chap10 https://cdrdv2.intel.com/v1/dl/getContent/671368 (Section 10.6 VMX interaction)
>[2] Thus the current kernel enabling patch only enables LAM_U57. https://lore.kernel.org/lkml/20230123220500.21077-1-kirill.shutemov@linux.intel.com/

Please add a kvm-unit-test or kselftest for LAM, particularly for
operations (e.g., canonical check for supervisor pointers, toggle
CR4.LAM_SUP) which aren't covered by the test in Kirill's series.


* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
@ 2023-02-09  9:21   ` Chao Gao
  2023-02-09 12:48     ` Robert Hoo
  2023-02-10  3:29   ` Yang, Weijiang
  2023-02-14  1:27   ` Binbin Wu
  2 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-09  9:21 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

The subject doesn't match what the patch does; intercepting 
CR4.LAM_SUP isn't done by this patch. How about:

	Virtualize CR4.LAM_SUP

and in the changelog, explain a bit why CR4.LAM_SUP isn't
pass-thru'd and why no change to kvm/vmx_set_cr4() is needed.

On Thu, Feb 09, 2023 at 10:40:14AM +0800, Robert Hoo wrote:
>Remove CR4.LAM_SUP (bit 28) from the default CR4_RESERVED_BITS, while still reserving

s/(bit 28)//

>it in __cr4_reserved_bits() via feature testing.
>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>


* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-09  2:40 ` [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root Robert Hoo
@ 2023-02-09  9:55   ` Chao Gao
  2023-02-09 13:02     ` Robert Hoo
  2023-02-10  3:38   ` Yang, Weijiang
  1 sibling, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-09  9:55 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:15AM +0800, Robert Hoo wrote:
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -3698,8 +3698,11 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
> 	gfn_t root_gfn, root_pgd;
> 	int quadrant, i, r;
> 	hpa_t root;
>-

The blank line should be kept.

>+#ifdef CONFIG_X86_64
>+	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
>+#else
> 	root_pgd = mmu->get_guest_pgd(vcpu);
>+#endif

Why are other call sites of mmu->get_guest_pgd() not changed? And what's
the value of the #ifdef?

> 	root_gfn = root_pgd >> PAGE_SHIFT;
> 
> 	if (mmu_check_root(vcpu, root_gfn))
>-- 
>2.31.1
>


* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-09  6:15 ` [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Chao Gao
@ 2023-02-09 12:25   ` Robert Hoo
  2023-02-09 17:27     ` Sean Christopherson
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09 12:25 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, 2023-02-09 at 14:15 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
> > KVM intercepts CR4.LAM_SUP, to avoid reading the VMCS field every time,
> > with the expectation that the guest won't toggle this bit frequently.
> > 
> > Under EPT mode, CR3 is fully under guest control, so guest LAM is
> > transparent to
> > KVM; nothing more needs to be done.
> 
> I don't think it is correct. You have to strip LAM_U57/U48 from CR3
> when
> walking guest page table and strip metadata from pointers when
> emulating
> instructions.
> 
Yes, I have added patch 8 for the emulation case. I didn't explicitly note
it in the cover letter.
> > 
> > For shadow paging (EPT = off), KVM needs to handle guest CR3.LAM_U48
> > and CR3.LAM_U57
> > toggles.
> > 
> > [1] ISE Chap10 https://cdrdv2.intel.com/v1/dl/getContent/671368
> > (Section 10.6 VMX interaction)
> > [2] Thus the current kernel enabling patch only enables LAM_U57.
> > https://lore.kernel.org/lkml/20230123220500.21077-1-kirill.shutemov@linux.intel.com/
> >  
> 
> Please add a kvm-unit-test or kselftest for LAM, particularly for
> operations (e.g., canonical check for supervisor pointers, toggle
> CR4.LAM_SUP) which aren't covered by the test in Kirill's series.

OK, I can explore a kvm-unit-test in a separate patch set.
BTW, this patch set has passed Kirill's kselftests running in a guest.



* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-09  9:21   ` Chao Gao
@ 2023-02-09 12:48     ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-09 12:48 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, 2023-02-09 at 17:21 +0800, Chao Gao wrote:
> The subject doesn't match what the patch does; intercepting 
> CR4.LAM_SUP isn't done by this patch. How about:
> 
> 	Virtualize CR4.LAM_SUP

All right, although I think this patch is all about intercepting
CR4.LAM_SUP. Additional handling of CR4 bit interception in
kvm/vmx_set_cr4() isn't always necessary.

> 
> and in the changelog, 

Do you mean in the cover letter, or in this patch's description here?

> explain a bit why CR4.LAM_SUP isn't
> pass-thru'd and why no change to kvm/vmx_set_cr4() is needed.

OK.

Existing kvm/vmx_set_cr4() can handle CR4.LAM_SUP well; no additional
code needs to be added.
If we take a look at the kvm/vmx_set_cr4() body, besides the ultimate goal
of
	vmcs_writel(CR4_READ_SHADOW, cr4);
	vmcs_writel(GUEST_CR4, hw_cr4);

the other hunks are about constructing/adjusting cr4/hw_cr4. Those are for
the CR4 bits that have dependencies on other features/system status (e.g.
paging), while CR4.LAM_SUP doesn't.

> 
> On Thu, Feb 09, 2023 at 10:40:14AM +0800, Robert Hoo wrote:
> > Remove CR4.LAM_SUP (bit 28) from the default CR4_RESERVED_BITS, while
> > still reserving
> 
> s/(bit 28)//
> 
> > it in __cr4_reserved_bits() via feature testing.
> > 
> > Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> > Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>



* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-09  9:55   ` Chao Gao
@ 2023-02-09 13:02     ` Robert Hoo
  2023-02-14  2:55       ` Binbin Wu
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-09 13:02 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, 2023-02-09 at 17:55 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:15AM +0800, Robert Hoo wrote:
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3698,8 +3698,11 @@ static int mmu_alloc_shadow_roots(struct
> > kvm_vcpu *vcpu)
> > 	gfn_t root_gfn, root_pgd;
> > 	int quadrant, i, r;
> > 	hpa_t root;
> > -
> 
> The blank line should be kept.

OK
> 
> > +#ifdef CONFIG_X86_64
> > +	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 |
> > X86_CR3_LAM_U57);
> > +#else
> > 	root_pgd = mmu->get_guest_pgd(vcpu);
> > +#endif
> 
> Why are other call sites of mmu->get_guest_pgd() not changed? 

Emm, the other 3 are:
FNAME(walk_addr_generic)()
kvm_arch_setup_async_pf()
kvm_arch_async_page_ready()

In a former version, I cleared the CR3.LAM bits for guest_pgd inside
mmu->get_guest_pgd(). I think this is generic. Perhaps I should still do it
that way. Let's wait for others' comments on this.
Thanks for pointing this out.

> And what's
> the value of the #ifdef?

LAM is only available in 64-bit mode.
> 
> > 	root_gfn = root_pgd >> PAGE_SHIFT;
> > 
> > 	if (mmu_check_root(vcpu, root_gfn))
> > -- 
> > 2.31.1
> > 



* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-09 12:25   ` Robert Hoo
@ 2023-02-09 17:27     ` Sean Christopherson
  2023-02-10  2:07       ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Sean Christopherson @ 2023-02-09 17:27 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Chao Gao, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023, Robert Hoo wrote:
> On Thu, 2023-02-09 at 14:15 +0800, Chao Gao wrote:
> > On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
> > Please add a kvm-unit-test or kselftest for LAM, particularly for
> > operations (e.g., canonical check for supervisor pointers, toggle
> > CR4.LAM_SUP) which aren't covered by the test in Kirill's series.
> 
> OK, I can explore for kvm-unit-test in separate patch set.

Please make tests your top priority.  Without tests, I am not going to spend any
time reviewing this series, or any other hardware enabling series[*].  I don't
expect KVM specific tests for everything, i.e. it's ok to rely on things like
running VMs that utilize LAM and/or running LAM selftests in the guest, but I do
want a reasonably thorough explanation of how all the test pieces fit together to
validate KVM's implementation.

[*] https://lore.kernel.org/all/Y+Uq0JOEmmdI0YwA@google.com


* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-09 17:27     ` Sean Christopherson
@ 2023-02-10  2:07       ` Robert Hoo
  2023-02-10  3:17         ` Chao Gao
  2023-02-10  8:39         ` Robert Hoo
  0 siblings, 2 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-10  2:07 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Chao Gao, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, 2023-02-09 at 17:27 +0000, Sean Christopherson wrote:
> On Thu, Feb 09, 2023, Robert Hoo wrote:
> > On Thu, 2023-02-09 at 14:15 +0800, Chao Gao wrote:
> > > On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
> > > Please add a kvm-unit-test or kselftest for LAM, particularly for
> > > operations (e.g., canonical check for supervisor pointers, toggle
> > > CR4.LAM_SUP) which aren't covered by the test in Kirill's series.
> > 
> > OK, I can explore for kvm-unit-test in separate patch set.
> 
> Please make tests your top priority.  Without tests, I am not going
> to spend any
> time reviewing this series, or any other hardware enabling
> series[*].  I don't
> expect KVM specific tests for everything, i.e. it's ok to rely on
> things like
> running VMs that utilize LAM and/or running LAM selftests in the
> guest, but I do
> want a reasonably thorough explanation of how all the test pieces fit
> together to
> validate KVM's implementation.

Sure, and ack that unit testing is part of the development work.

This patch set has always been unit tested before being sent out, i.e.
"running LAM selftests in the guest" with both ept=Y and ept=N.

CR4.LAM_SUP, as Chao pointed out, could not be covered by kselftest; I
may explore it in kvm-unit-test.
Or, would you mind separating CR4.LAM_SUP enabling into another patch
set?
> 
> [*] https://lore.kernel.org/all/Y+Uq0JOEmmdI0YwA@google.com



* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-10  2:07       ` Robert Hoo
@ 2023-02-10  3:17         ` Chao Gao
  2023-02-10  8:41           ` Robert Hoo
  2023-02-10  8:39         ` Robert Hoo
  1 sibling, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-10  3:17 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Sean Christopherson, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, Feb 10, 2023 at 10:07:42AM +0800, Robert Hoo wrote:
>On Thu, 2023-02-09 at 17:27 +0000, Sean Christopherson wrote:
>> On Thu, Feb 09, 2023, Robert Hoo wrote:
>> > On Thu, 2023-02-09 at 14:15 +0800, Chao Gao wrote:
>> > > On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
>> > > Please add a kvm-unit-test or kselftest for LAM, particularly for
>> > > operations (e.g., canonical check for supervisor pointers, toggle
>> > > CR4.LAM_SUP) which aren't covered by the test in Kirill's series.
>> > 
>> > OK, I can explore for kvm-unit-test in separate patch set.
>> 
>> Please make tests your top priority.  Without tests, I am not going
>> to spend any
>> time reviewing this series, or any other hardware enabling
>> series[*].  I don't
>> expect KVM specific tests for everything, i.e. it's ok to to rely
>> things like
>> running VMs that utilize LAM and/or running LAM selftests in the
>> guest, but I do
>> want a reasonably thorough explanation of how all the test pieces fit
>> together to
>> validate KVM's implementation.
>
>Sure, and ack on unit test is part of development work.
>
>This patch set had always been unit tested before sent out, i.e.
>"running LAM selftests in guest" on both ept=Y/N.
>
>CR4.LAM_SUP, as Chao pointed out, could not be covered by kselftest, I
>may explore it in kvm-unit-test.

Alternatively, add another kselftest for LAM under kselftests/kvm.

>Or, would you mind that separate CR4.LAM_SUP enabling in another patch
>set?

This isn't a good idea. KVM shouldn't advertise LAM to the userspace VMM
without CR4.LAM_SUP handling, given that LAM for supervisor pointers isn't
enumerated by a separate CPUID bit. Then, without the "another patch set",
this series just adds some dead code to KVM, which, IMO, is unacceptable.

>> 
>> [*] https://lore.kernel.org/all/Y+Uq0JOEmmdI0YwA@google.com
>


* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
  2023-02-09  9:21   ` Chao Gao
@ 2023-02-10  3:29   ` Yang, Weijiang
  2023-02-10  5:02     ` Robert Hoo
  2023-02-14  1:27   ` Binbin Wu
  2 siblings, 1 reply; 64+ messages in thread
From: Yang, Weijiang @ 2023-02-10  3:29 UTC (permalink / raw)
  To: Robert Hoo
  Cc: kirill.shutemov, kvm, seanjc, pbonzini, yu.c.zhang, yuan.yao,
	jingqi.liu, chao.gao, isaku.yamahata


On 2/9/2023 10:40 AM, Robert Hoo wrote:
> Remove CR4.LAM_SUP (bit 28) from the default CR4_RESERVED_BITS, while still
> reserving it in __cr4_reserved_bits() via feature testing.
>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>

As Sean pointed out in [*], this Reviewed-by is for another purpose; please
remove all of them in this series.

[*] Re: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest - Sean
Christopherson (kernel.org)
https://lore.kernel.org/all/Y+Uq0JOEmmdI0YwA@google.com/

[...]



* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-09  2:40 ` [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root Robert Hoo
  2023-02-09  9:55   ` Chao Gao
@ 2023-02-10  3:38   ` Yang, Weijiang
  2023-02-11  3:12     ` Robert Hoo
  1 sibling, 1 reply; 64+ messages in thread
From: Yang, Weijiang @ 2023-02-10  3:38 UTC (permalink / raw)
  To: Robert Hoo
  Cc: kirill.shutemov, kvm, seanjc, pbonzini, yu.c.zhang, yuan.yao,
	jingqi.liu, chao.gao, isaku.yamahata


On 2/9/2023 10:40 AM, Robert Hoo wrote:

[...]


> -
> +#ifdef CONFIG_X86_64
> +	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> +#else
>   	root_pgd = mmu->get_guest_pgd(vcpu);
> +#endif

I prefer using:

	root_pgd = mmu->get_guest_pgd(vcpu);

	if (IS_ENABLED(CONFIG_X86_64))
		root_pgd &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);

It looks more structured.


>   	root_gfn = root_pgd >> PAGE_SHIFT;
>   
>   	if (mmu_check_root(vcpu, root_gfn))


* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-10  3:29   ` Yang, Weijiang
@ 2023-02-10  5:02     ` Robert Hoo
  2023-02-10 16:30       ` Sean Christopherson
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-10  5:02 UTC (permalink / raw)
  To: Yang, Weijiang
  Cc: kirill.shutemov, kvm, seanjc, pbonzini, yu.c.zhang, yuan.yao,
	jingqi.liu, chao.gao, isaku.yamahata

On Fri, 2023-02-10 at 11:29 +0800, Yang, Weijiang wrote:
> On 2/9/2023 10:40 AM, Robert Hoo wrote:
> > Remove CR4.LAM_SUP (bit 28) from the default CR4_RESERVED_BITS, while
> > still reserving
> > it in __cr4_reserved_bits() via feature testing.
> > 
> > Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> > Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
> 
> As Sean pointed out in[*], this Reviewed-by is for other purpose,
> please 
> remove all of
> 
> them in this series.

No. Sean meant a different thing.
This Reviewed-by is from Jingqi as usual.

But for this patch specifically, I think I can drop Jingqi's Reviewed-by,
as this patch changed fundamentally from the last version.
> 
> [*] Re: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest - Sean
> Christopherson (kernel.org)
> https://lore.kernel.org/all/Y+Uq0JOEmmdI0YwA@google.com/
> 
> [...]
> 



* Re: [PATCH v4 3/9] KVM: x86: MMU: Comments update
  2023-02-09  2:40 ` [PATCH v4 3/9] KVM: x86: MMU: Comments update Robert Hoo
@ 2023-02-10  6:59   ` Chao Gao
  2023-02-10  7:55     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-10  6:59 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:16AM +0800, Robert Hoo wrote:
>The comment about kvm_mmu_ensure_valid_pgd() is stale. Update the comments
>according to the latest code.
>
>No functional changes.
>
>P.S. Sean first noticed this in

Reported-by:?

Should this not be posted separately? This patch has nothing to do with LAM.

>https://lore.kernel.org/kvm/Yg%2FguAXFLJBmDflh@google.com/.
>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>---
> arch/x86/kvm/mmu/mmu.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>index 1d61dfe37c77..9a780445a88e 100644
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -4497,8 +4497,12 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
> 	struct kvm_mmu *mmu = vcpu->arch.mmu;
> 	union kvm_mmu_page_role new_role = mmu->root_role;
> 
>+	/*
>+	 * If no root is found in the cache, the current active root.hpa will
>+	 * be set to INVALID_PAGE, and a new root will be set up during
>+	 * vcpu_enter_guest() --> kvm_mmu_reload().
>+	 */
> 	if (!fast_pgd_switch(vcpu->kvm, mmu, new_pgd, new_role)) {
>-		/* kvm_mmu_ensure_valid_pgd will set up a new root.  */
> 		return;
> 	}
> 
>-- 
>2.31.1
>


* Re: [PATCH v4 3/9] KVM: x86: MMU: Comments update
  2023-02-10  6:59   ` Chao Gao
@ 2023-02-10  7:55     ` Robert Hoo
  2023-02-10 16:54       ` Sean Christopherson
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-10  7:55 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, 2023-02-10 at 14:59 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:16AM +0800, Robert Hoo wrote:
> > The comment about kvm_mmu_ensure_valid_pgd() is stale. Update the
> > comments according to the latest code.
> > 
> > No functional changes.
> > 
> > P.S. Sean first noticed this in
> 
> Reported-by:?

OK. Sean, do you agree?
> 
> Should this not be posted separately? This patch has nothing to do with
> LAM.

It's too trivial to post separately, I think; it's just a comment update.
And it is on the code path of LAM KVM enabling, so I noticed and updated it
in passing, although no code change happens.
> 
> > https://lore.kernel.org/kvm/Yg%2FguAXFLJBmDflh@google.com/.
> > 
> > Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 1d61dfe37c77..9a780445a88e 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4497,8 +4497,12 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu,
> > gpa_t new_pgd)
> > 	struct kvm_mmu *mmu = vcpu->arch.mmu;
> > 	union kvm_mmu_page_role new_role = mmu->root_role;
> > 
> > +	/*
> > +	 * If no root is found in the cache, the current active root.hpa
> > +	 * will be set to INVALID_PAGE, and a new root will be set up
> > +	 * during vcpu_enter_guest() --> kvm_mmu_reload().
> > +	 */
> > 	if (!fast_pgd_switch(vcpu->kvm, mmu, new_pgd, new_role)) {
> > -		/* kvm_mmu_ensure_valid_pgd will set up a new root.  */
> > 		return;
> > 	}
> > 
> > -- 
> > 2.31.1
> > 



* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-10  2:07       ` Robert Hoo
  2023-02-10  3:17         ` Chao Gao
@ 2023-02-10  8:39         ` Robert Hoo
  2023-02-10  9:22           ` Chao Gao
  1 sibling, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-10  8:39 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Chao Gao, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, 2023-02-10 at 10:07 +0800, Robert Hoo wrote:
> On Thu, 2023-02-09 at 17:27 +0000, Sean Christopherson wrote:
> > On Thu, Feb 09, 2023, Robert Hoo wrote:
> > > On Thu, 2023-02-09 at 14:15 +0800, Chao Gao wrote:
> > > > On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
> > > > Please add a kvm-unit-test or kselftest for LAM, particularly
> > > > for
> > > > operations (e.g., canonical check for supervisor pointers,
> > > > toggle
> > > > CR4.LAM_SUP) which aren't covered by the test in Kirill's
> > > > series.
> > > 
> > > OK, I can explore for kvm-unit-test in separate patch set.
> > 
> > Please make tests your top priority.  Without tests, I am not going
> > to spend any
> > time reviewing this series, or any other hardware enabling
> > series[*].  I don't
>> expect KVM specific tests for everything, i.e. it's ok to rely on
> > things like
> > running VMs that utilize LAM and/or running LAM selftests in the
> > guest, but I do
> > want a reasonably thorough explanation of how all the test pieces
> > fit
> > together to
> > validate KVM's implementation.
> 
> Sure, and ack on unit test is part of development work.
> 
> This patch set had always been unit tested before sent out, i.e.
> "running LAM selftests in guest" on both ept=Y/N.
> 
> CR4.LAM_SUP, as Chao pointed out, could not be covered by kselftest,
> I
> may explore it in kvm-unit-test.
> 
When I came to kvm-unit-test, I found that I had already developed a
test case for CR4.LAM_SUP toggling and ran it against this patch set.
I just forgot about it.

Is it all right? If so, I will include it in the next version.

diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 3d58ef7..c6b1db6 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -105,6 +105,8 @@
 #define X86_CR4_CET            BIT(X86_CR4_CET_BIT)
 #define X86_CR4_PKS_BIT                (24)
 #define X86_CR4_PKS            BIT(X86_CR4_PKS_BIT)
+#define X86_CR4_LAM_SUP_BIT    (28)
+#define X86_CR4_LAM_SUP        BIT(X86_CR4_LAM_SUP_BIT)
 
 #define X86_EFLAGS_CF_BIT      (0)
 #define X86_EFLAGS_CF          BIT(X86_EFLAGS_CF_BIT)
@@ -248,6 +250,7 @@ static inline bool is_intel(void)
 #define        X86_FEATURE_SPEC_CTRL           (CPUID(0x7, 0, EDX, 26))
 #define        X86_FEATURE_ARCH_CAPABILITIES   (CPUID(0x7, 0, EDX, 29))
 #define        X86_FEATURE_PKS                 (CPUID(0x7, 0, ECX, 31))
+#define        X86_FEATURE_LAM                 (CPUID(0x7, 1, EAX, 26))
 
 /*
  * Extended Leafs, a.k.a. AMD defined
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index f483dea..af626cc 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -34,6 +34,7 @@ tests += $(TEST_DIR)/rdpru.$(exe)
 tests += $(TEST_DIR)/pks.$(exe)
 tests += $(TEST_DIR)/pmu_lbr.$(exe)
 tests += $(TEST_DIR)/pmu_pebs.$(exe)
+tests += $(TEST_DIR)/lam_sup.$(exe)
 
 ifeq ($(CONFIG_EFI),y)
 tests += $(TEST_DIR)/amd_sev.$(exe)
diff --git a/x86/lam_sup.c b/x86/lam_sup.c
new file mode 100644
index 0000000..3e05129
--- /dev/null
+++ b/x86/lam_sup.c
@@ -0,0 +1,37 @@
+#include "libcflat.h"
+#include "processor.h"
+#include "desc.h"
+
+int main(int ac, char **av)
+{
+       unsigned long cr4;
+       struct cpuid c = {0, 0, 0, 0};
+
+       c = cpuid_indexed(7, 1);
+
+       if (!(c.a & (1 << 26))) {
+               report_skip("LAM is not supported. EAX = 0x%0x", c.a);
+               abort();
+       }
+
+       report_info("set CR4.LAM_SUP(bit 28)");
+
+       cr4 = read_cr4();
+       write_cr4(cr4 | X86_CR4_LAM_SUP);
+
+       report((cr4 | X86_CR4_LAM_SUP) == read_cr4(), "Set CR4.LAM_SUP succeeded.");
+
+       report_info("clear CR4.LAM_SUP(bit 28)");
+
+       cr4 = read_cr4();
+       write_cr4(cr4 & ~X86_CR4_LAM_SUP);
+
+       report((cr4 & ~X86_CR4_LAM_SUP) == read_cr4(), "Clear CR4.LAM_SUP succeeded.");
+
+       return report_summary();
+}
+
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index f324e32..0c90dfe 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -478,3 +478,8 @@ file = cet.flat
 arch = x86_64
 smp = 2
 extra_params = -enable-kvm -m 2048 -cpu host
+
+[intel-lam]
+file = lam_sup.flat
+arch = x86_64
+extra_params = -enable-kvm -cpu host



* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-10  3:17         ` Chao Gao
@ 2023-02-10  8:41           ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-10  8:41 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sean Christopherson, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, 2023-02-10 at 11:17 +0800, Chao Gao wrote:
> 
> Alternatively, add another kselftest for LAM under kselftests/kvm.
> 
Let me explore.



* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-10  8:39         ` Robert Hoo
@ 2023-02-10  9:22           ` Chao Gao
  0 siblings, 0 replies; 64+ messages in thread
From: Chao Gao @ 2023-02-10  9:22 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Sean Christopherson, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, Feb 10, 2023 at 04:39:49PM +0800, Robert Hoo wrote:
>On Fri, 2023-02-10 at 10:07 +0800, Robert Hoo wrote:
>> On Thu, 2023-02-09 at 17:27 +0000, Sean Christopherson wrote:
>> > On Thu, Feb 09, 2023, Robert Hoo wrote:
>> > > On Thu, 2023-02-09 at 14:15 +0800, Chao Gao wrote:
>> > > > On Thu, Feb 09, 2023 at 10:40:13AM +0800, Robert Hoo wrote:
>> > > > Please add a kvm-unit-test or kselftest for LAM, particularly
>> > > > for
>> > > > operations (e.g., canonical check for supervisor pointers,
>> > > > toggle
>> > > > CR4.LAM_SUP) which aren't covered by the test in Kirill's
>> > > > series.
>> > > 
>> > > OK, I can explore for kvm-unit-test in separate patch set.
>> > 
>> > Please make tests your top priority.  Without tests, I am not going
>> > to spend any
>> > time reviewing this series, or any other hardware enabling
>> > series[*].  I don't
>> > expect KVM specific tests for everything, i.e. it's ok to rely on
>> > things like
>> > running VMs that utilize LAM and/or running LAM selftests in the
>> > guest, but I do
>> > want a reasonably thorough explanation of how all the test pieces
>> > fit
>> > together to
>> > validate KVM's implementation.
>> 
>> Sure, and ack on unit test is part of development work.
>> 
>> This patch set had always been unit tested before sent out, i.e.
>> "running LAM selftests in guest" on both ept=Y/N.
>> 
>> CR4.LAM_SUP, as Chao pointed out, could not be covered by kselftest,
>> I
>> may explore it in kvm-unit-test.
>> 
>When I come to kvm-unit-test, just find that I had already developed
>some test case on CR4.LAM_SUP toggle and carried out on this patch set.
>Just forgot about it.
>
>Is it all right? if so, I will include it in next version.

You can add more steps to the test, e.g.,

1. check if the CR4.LAM_SUP setting takes effect by storing metadata into
   a supervisor pointer and dereferencing the pointer (a sketch of this step
   follows below).
2. turn 5-level paging on/off and check if the LAM width complies with
   the spec.
3. add some negative tests, e.g., if LAM isn't advertised to the guest,
   setting CR4.LAM_SUP isn't allowed.

...
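
To make step 1 concrete, here is a minimal sketch in kvm-unit-test style
(illustration only: it assumes the test data is also reachable through a
bit-63-set alias, sup_alias, which the test would have to map itself, since
LAM classifies pointers by bit 63 rather than by CPL):

	/* Sketch of step 1: stuff metadata into a supervisor pointer, dereference it. */
	static void test_lam_sup_deref(u64 *sup_alias)
	{
		u64 va = (u64)sup_alias;

		/* With CR4.LAM_SUP=1 and 4-level paging, bits 62:48 hold metadata. */
		va = (va & ~(0x7fffULL << 48)) | (0x5aULL << 48);

		write_cr4(read_cr4() | X86_CR4_LAM_SUP);
		*sup_alias = 0x1234;
		/* Hardware must mask the metadata bits before translation. */
		report(*(volatile u64 *)va == 0x1234, "LAM_SUP-tagged supervisor dereference");
	}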


* Re: [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3
  2023-02-09  2:40 ` [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3 Robert Hoo
@ 2023-02-10 14:04   ` Chao Gao
  2023-02-11  6:24     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-10 14:04 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:17AM +0800, Robert Hoo wrote:
>When calculating the new CR3 value, take the LAM bits in.

I prefer to merge this one into patch 2 because both are related to
CR3_LAM_U48/U57 handling. Merging them can give us the whole picture of
how the new LAM bits are handled:
* strip them from CR3 when allocating/finding a shadow root
* stitch them with other fields to form a shadow CR3

I have a couple questions:
1. in kvm_set_cr3(), 

        /* PDPTRs are always reloaded for PAE paging. */
        if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
                goto handle_tlb_flush;

Shouldn't we strip off CR3_LAM_U48/U57 and do the comparison?
It depends on whether toggling CR3_LAM_U48/U57 causes a TLB flush.

2. also in kvm_set_cr3(),

        if (cr3 != kvm_read_cr3(vcpu))
                kvm_mmu_new_pgd(vcpu, cr3);

is it necessary to use a new pgd if only CR3_LAM_U48/U57 were changed?

>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
>---
> arch/x86/kvm/mmu.h     | 5 +++++
> arch/x86/kvm/vmx/vmx.c | 3 ++-
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>index 6bdaacb6faa0..866f2b7cb509 100644
>--- a/arch/x86/kvm/mmu.h
>+++ b/arch/x86/kvm/mmu.h
>@@ -142,6 +142,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
> 	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
> }
> 
>+static inline u64 kvm_get_active_lam(struct kvm_vcpu *vcpu)
>+{
>+	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
>+}
>+
> static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
> {
> 	u64 root_hpa = vcpu->arch.mmu->root.hpa;
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index fe5615fd8295..66edd091f145 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -3289,7 +3289,8 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
> 			update_guest_cr3 = false;
> 		vmx_ept_load_pdptrs(vcpu);
> 	} else {
>-		guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu);
>+		guest_cr3 = root_hpa | kvm_get_active_pcid(vcpu) |
>+			    kvm_get_active_lam(vcpu);
> 	}
> 
> 	if (update_guest_cr3)
>-- 
>2.31.1
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable
  2023-02-09  2:40 ` [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable Robert Hoo
@ 2023-02-10 15:04   ` Chao Gao
  2023-02-11  5:57     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-10 15:04 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:18AM +0800, Robert Hoo wrote:
>Define kvm_untagged_addr() per the LAM feature spec: address high bits are
>sign-extended from the highest effective address bit.

>Note that LAM_U48 and LA57 have some effective-bit overlap. This patch
>emits a WARN() in that case.

Why emit a WARN()? Is the behavior undefined, or is it something KVM
cannot emulate?

>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
>---
> arch/x86/kvm/vmx/vmx.c |  3 +++
> arch/x86/kvm/x86.c     |  4 ++++
> arch/x86/kvm/x86.h     | 32 ++++++++++++++++++++++++++++++++
> 3 files changed, 39 insertions(+)
>
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index 66edd091f145..e4f14d1bdd2f 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -2163,6 +2163,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> 		    (!msr_info->host_initiated &&
> 		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
> 			return 1;
>+
>+		data = kvm_untagged_addr(data, vcpu);

This is an MSR write, so LAM masking isn't performed on this write
according to the LAM spec. Am I misunderstanding something?

>+
> 		if (is_noncanonical_address(data & PAGE_MASK, vcpu) ||
> 		    (data & MSR_IA32_BNDCFGS_RSVD))
> 			return 1;
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 312aea1854ae..1bdc8c0c80c0 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -1809,6 +1809,10 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
> 	case MSR_KERNEL_GS_BASE:
> 	case MSR_CSTAR:
> 	case MSR_LSTAR:
>+		/*
>+		 * The strict canonical checking still applies to MSR
>+		 * writing even LAM is enabled.
>+		 */
> 		if (is_noncanonical_address(data, vcpu))

LAM spec says:

	Processors that support LAM continue to require the addresses written to
	control registers or MSRs be 57-bit canonical if the processor supports
	5-level paging or 48-bit canonical if it supports only 4-level paging

My understanding is that the 57-bit canonical check is performed if the
processor __supports__ 5-level paging. Then the is_noncanonical_address()
here is arguably wrong. Could you double-check and fix it?
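That is, key the width off what the vCPU supports rather than what
CR4.LA57 currently enables; a sketch of the idea (not the actual fix):

	/* 57-bit canonical if 5-level paging is *supported*, else 48-bit. */
	static inline u8 vcpu_supported_virt_addr_bits(struct kvm_vcpu *vcpu)
	{
		return guest_cpuid_has(vcpu, X86_FEATURE_LA57) ? 57 : 48;
	}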

> 			return 1;
> 		break;
>diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>index 8ec5cc983062..7228895d4a6f 100644
>--- a/arch/x86/kvm/x86.h
>+++ b/arch/x86/kvm/x86.h
>@@ -201,6 +201,38 @@ static inline bool is_noncanonical_address(u64 la, struct kvm_vcpu *vcpu)
> 	return !__is_canonical_address(la, vcpu_virt_addr_bits(vcpu));
> }
> 
>+#ifdef CONFIG_X86_64

I don't get the purpose of the #ifdef. Shouldn't you check if the vcpu
is in 64-bit long mode?

>+/* untag addr for guest, according to vCPU CR3 and CR4 settings */
>+static inline u64 kvm_untagged_addr(u64 addr, struct kvm_vcpu *vcpu)
>+{
>+	if (addr >> 63 == 0) {
>+		/* User pointers */
>+		if (kvm_read_cr3(vcpu) & X86_CR3_LAM_U57)
>+			addr = __canonical_address(addr, 57);

braces are missing.

https://www.kernel.org/doc/html/latest/process/coding-style.html#placing-braces-and-spaces

>+		else if (kvm_read_cr3(vcpu) & X86_CR3_LAM_U48) {
>+			/*
>+			 * If guest enabled 5-level paging and LAM_U48,
>+			 * bit 47 should be 0, bit 48:56 contains meta data
>+			 * although bit 47:56 are valid 5-level address
>+			 * bits.
>+			 * If LAM_U48 and 4-level paging, bit47 is 0.
>+			 */
>+			WARN_ON(addr & _BITUL(47));
>+			addr = __canonical_address(addr, 48);
>+		}
>+	} else if (kvm_read_cr4(vcpu) & X86_CR4_LAM_SUP) { /* Supervisor pointers */
>+		if (kvm_read_cr4(vcpu) & X86_CR4_LA57)

use kvm_read_cr4_bits here to save potential VMCS_READs.

>+			addr = __canonical_address(addr, 57);
>+		else
>+			addr = __canonical_address(addr, 48);
>+	}
>+
>+	return addr;
>+}
>+#else
>+#define kvm_untagged_addr(addr, vcpu)	(addr)
>+#endif
>+
> static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
> 					gva_t gva, gfn_t gfn, unsigned access)
> {
>-- 
>2.31.1
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-10  5:02     ` Robert Hoo
@ 2023-02-10 16:30       ` Sean Christopherson
  0 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2023-02-10 16:30 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Yang, Weijiang, kirill.shutemov, kvm, pbonzini, yu.c.zhang,
	yuan.yao, jingqi.liu, chao.gao, isaku.yamahata

On Fri, Feb 10, 2023, Robert Hoo wrote:
> On Fri, 2023-02-10 at 11:29 +0800, Yang, Weijiang wrote:
> > On 2/9/2023 10:40 AM, Robert Hoo wrote:
> > > Remove CR4.LAM_SUP (bit 28) from default CR4_RESERVED_BITS, while
> > > reserve
> > > it in __cr4_reserved_bits() by feature testing.
> > > 
> > > Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> > > Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
> > 
> > As Sean pointed out in[*], this Reviewed-by is for other purpose,
> > please 
> > remove all of
> > 
> > them in this series.
> 
> No, Sean meant something else.

Correct, what I object to is Intel _requiring_ a Reviewed-by before posting.

And while I'm certainly not going to refuse patches that have been reviewed
internally, I _strongly_ prefer reviews be on-list so that they are public and
recorded.  Being able to go back and look at the history and evolution of patches
is valuable, and the discussion itself is often beneficial to non-participants,
e.g. people that are new-ish to KVM and/or aren't familiar with the feature being
enabled can often learn new things and avoid similar pitfalls of their own.

Rather than spend cycles getting through internal review, I would much prefer
developers spend their time writing tests and validating their code before posting.
Obviously there's a risk that foregoing internal review will result in low quality
submissions, but I think the LASS series proves that mandatory reviews don't
necessarily help on that front.  On the other hand, writing and running tests
naturally enforces a minimum level of quality.

I am happy to help with changelogs, comments, naming, etc.  E.g. I don't get
frustrated when someone who is new to kernel development or for whom English is
not their first language writes an imperfect changelog.  I get frustrated when
there's seemingly no attempt to justify _why_ a patch/KVM does something, and I
get really grumpy when blatantly buggy code is posted with no tests.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 3/9] KVM: x86: MMU: Commets update
  2023-02-10  7:55     ` Robert Hoo
@ 2023-02-10 16:54       ` Sean Christopherson
  0 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2023-02-10 16:54 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Chao Gao, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, Feb 10, 2023, Robert Hoo wrote:
> On Fri, 2023-02-10 at 14:59 +0800, Chao Gao wrote:
> > On Thu, Feb 09, 2023 at 10:40:16AM +0800, Robert Hoo wrote:
> > > kvm_mmu_ensure_valid_pgd() is stale. Update the comments according
> > > to
> > > latest code.
> > > 
> > > No function changes.
> > > 
> > > P.S. Sean firstly noticed this in
> > 
> > Reported-by:?
> 
> OK. Sean, do you agree?
> > 
> > Should not this be post separately? This patch has nothing to do with
> > LAM.
> 
> It's too trivial to post separately, I think; it's just a comment update.
> And it is on the code path of LAM KVM enabling, so I noticed it and
> updated it in passing, although no code change happens.

No need, I already applied a similar patch:

https://lore.kernel.org/all/20221128214709.224710-1-wei.liu@kernel.org

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-10  3:38   ` Yang, Weijiang
@ 2023-02-11  3:12     ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-11  3:12 UTC (permalink / raw)
  To: Yang, Weijiang
  Cc: kirill.shutemov, kvm, seanjc, pbonzini, yu.c.zhang, yuan.yao,
	jingqi.liu, chao.gao, isaku.yamahata

On Fri, 2023-02-10 at 11:38 +0800, Yang, Weijiang wrote:
> On 2/9/2023 10:40 AM, Robert Hoo wrote:
> 
> [...]
> 
> 
> > -
> > +#ifdef CONFIG_X86_64
> > +	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 |
> > X86_CR3_LAM_U57);
> > +#else
> >   	root_pgd = mmu->get_guest_pgd(vcpu);
> > +#endif
> 
> I prefer using:
> 
> root_pgd = mmu->get_guest_pgd(vcpu);
> 
> if (IS_ENABLED(CONFIG_X86_64))
> 
>      root_pgd &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> 
> It looks more structured.
> 
OK, thanks, both look all right to me.
I just don't quite understand what "structured" means here; would you
elaborate? Do you mean the same style as 8d5265b1016 ("KVM: x86/mmu: Use
IS_ENABLED() to avoid RETPOLINE for TDP page faults")?


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable
  2023-02-10 15:04   ` Chao Gao
@ 2023-02-11  5:57     ` Robert Hoo
  2023-02-16  6:37       ` Binbin Wu
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-11  5:57 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, 2023-02-10 at 23:04 +0800, Chao Gao wrote:
> Why emit a WARN()? Is the behavior undefined, or is it something KVM
> cannot emulate?
> 
Oh, it should have been removed per Yuan's comment on the last version;
sorry, I forgot to.
Originally I wrote a WARN_ON() (perhaps WARN_ON_ONCE() is better) here as
an assertion that this should never happen in KVM. But now, considering
the emulation part, it is possible.

> > 
[...]
> 
> This is an MSR write, so LAM masking isn't performed on this write
> according to the LAM spec.

Right.

> Am I misunderstanding something?
> 
> > +
[...]
> > @@ -1809,6 +1809,10 @@ static int __kvm_set_msr(struct kvm_vcpu
> > *vcpu, u32 index, u64 data,
> > 	case MSR_KERNEL_GS_BASE:
> > 	case MSR_CSTAR:
> > 	case MSR_LSTAR:
> > +		/*
> > +		 * The strict canonical checking still applies to MSR
> > +		 * writing even LAM is enabled.
> > +		 */
> > 		if (is_noncanonical_address(data, vcpu))
> 
> LAM spec says:
> 
> 	Processors that support LAM continue to require the addresses written to
> 	control registers or MSRs be 57-bit canonical if the processor supports
> 	5-level paging or 48-bit canonical if it supports only 4-level paging
> 
> My understanding is that the 57-bit canonical check is performed if the
> processor __supports__ 5-level paging. Then the is_noncanonical_address()
> here is arguably wrong. Could you double-check and fix it?
> 
Emm, regarding the condition of "supported" vs. "enabled": you mean
vcpu_virt_addr_bits(), right?
{
	return kvm_read_cr4_bits(vcpu, X86_CR4_LA57) ? 57 : 48;
}

This is worth double-checking. Let me check with Yu. Thanks.

> > 			return 1;
> > 		break;
> > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > index 8ec5cc983062..7228895d4a6f 100644
> > --- a/arch/x86/kvm/x86.h
> > +++ b/arch/x86/kvm/x86.h
> > @@ -201,6 +201,38 @@ static inline bool is_noncanonical_address(u64
> > la, struct kvm_vcpu *vcpu)
> > 	return !__is_canonical_address(la, vcpu_virt_addr_bits(vcpu));
> > }
> > 
> > +#ifdef CONFIG_X86_64
> 
> I don't get the purpose of the #ifdef. Shouldn't you check if the vcpu
> is in 64-bit long mode?
> 
Good question, thanks. It raises two more questions for me:
1) can a 32-bit host/KVM support a 64-bit guest kernel?
2) if the guest is not in 64-bit mode, should the CPUID enumeration be
gone? Or should guest attempts to set the CR3/CR4 LAM bits #GP?

> > +/* untag addr for guest, according to vCPU CR3 and CR4 settings */
> > +static inline u64 kvm_untagged_addr(u64 addr, struct kvm_vcpu
> > *vcpu)
> > +{
> > +	if (addr >> 63 == 0) {
> > +		/* User pointers */
> > +		if (kvm_read_cr3(vcpu) & X86_CR3_LAM_U57)
> > +			addr = __canonical_address(addr, 57);
> 
> braces are missing.

Thanks for the careful review.
Curious that this escaped checkpatch.pl.
> 
> https://www.kernel.org/doc/html/latest/process/coding-style.html#placing-braces-and-spaces
> 
> > +		else if (kvm_read_cr3(vcpu) & X86_CR3_LAM_U48) {
> > +			/*
> > +			 * If guest enabled 5-level paging and LAM_U48,
> > +			 * bit 47 should be 0, bit 48:56 contains meta
> > data
> > +			 * although bit 47:56 are valid 5-level address
> > +			 * bits.
> > +			 * If LAM_U48 and 4-level paging, bit47 is 0.
> > +			 */
> > +			WARN_ON(addr & _BITUL(47));
> > +			addr = __canonical_address(addr, 48);
> > +		}
> > +	} else if (kvm_read_cr4(vcpu) & X86_CR4_LAM_SUP) { /*
> > Supervisor pointers */
> > +		if (kvm_read_cr4(vcpu) & X86_CR4_LA57)
> 
> use kvm_read_cr4_bits here to save potential VMCS_READs.

Right, thanks.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3
  2023-02-10 14:04   ` Chao Gao
@ 2023-02-11  6:24     ` Robert Hoo
  2023-02-11  6:29       ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-11  6:24 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Fri, 2023-02-10 at 22:04 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:17AM +0800, Robert Hoo wrote:
> > When calculating the new CR3 value, take the LAM bits in.
> 
> I prefer to merge this one into patch 2 because both are related to
> CR3_LAM_U48/U57 handling. Merging them can give us the whole picture of
> how the new LAM bits are handled:
> * strip them from CR3 when allocating/finding a shadow root
> * stitch them with other fields to form a shadow CR3

OK
> 
> I have a couple questions:
> 1. in kvm_set_cr3(), 
> 
>         /* PDPTRs are always reloaded for PAE paging. */
>         if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
>                 goto handle_tlb_flush;
> 
> Shouldn't we strip off CR3_LAM_U48/U57 and do the comparison?

Here is the stringent check, i.e. including the LAM bits.
Below, we check whether setting the LAM bits is allowed at all:

	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
		return	1;

The case where only the LAM bits toggle will be handled in:

	if (cr3 != old_cr3) {
		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
					X86_CR3_LAM_U57));
		} else {
			/*
			 * Though effective addr no change, mark the
			 * request so that LAM bits will take effect
			 * when enter guest.
			 */
			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
		}
	}

> It depends on whether toggling CR3_LAM_U48/U57 causes a TLB flush.

In v1, Kirill and I discussed this. He leans toward being
conservative about TLB flushes.
> 
> 2. also in kvm_set_cr3(),
> 
>         if (cr3 != kvm_read_cr3(vcpu))
>                 kvm_mmu_new_pgd(vcpu, cr3);
> 
> is it necessary to use a new pgd if only CR3_LAM_U48/U57 were
> changed?
> 
Hasn't my patch been applied? It isn't like this; it's like below:

	if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
		kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
					      X86_CR3_LAM_U57));
	}

Only if the effective pgd address bits change is kvm_mmu_new_pgd() called.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3
  2023-02-11  6:24     ` Robert Hoo
@ 2023-02-11  6:29       ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-11  6:29 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Sat, 2023-02-11 at 14:24 +0800, Robert Hoo wrote:
> In v1, Kirill and I discussed this. He leans toward being
> conservative about TLB flushes.
> > 
https://lore.kernel.org/kvm/20221103024001.wtrj77ekycleq4vc@box.shutemov.name/


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  2023-02-09  2:40 ` [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits Robert Hoo
@ 2023-02-13  2:01   ` Chao Gao
  2023-02-13 13:25     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-13  2:01 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:19AM +0800, Robert Hoo wrote:
>Before applying kvm_vcpu_is_illegal_gpa(), clear the LAM bits if they are valid.

I prefer to squash this patch into patch 2 because it is also related to
CR3 LAM bits handling.

>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
>---
> arch/x86/kvm/x86.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 1bdc8c0c80c0..3218f465ae71 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -1231,6 +1231,14 @@ static void kvm_invalidate_pcid(struct kvm_vcpu *vcpu, unsigned long pcid)
> 	kvm_mmu_free_roots(vcpu->kvm, mmu, roots_to_free);
> }
> 
>+static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)

Since this function takes a "vcpu" argument, probably
kvm_vcpu_is_valid_cr3() is slightly better.

>+{
>+	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))

check if the vcpu is in the 64 bit long mode?

>+		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
>+
>+	return kvm_vcpu_is_legal_gpa(vcpu, cr3);
>+}
>+
> int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> {
> 	bool skip_tlb_flush = false;
>@@ -1254,7 +1262,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
> 	 * the current vCPU mode is accurate.
> 	 */
>-	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
>+	if (!kvm_is_valid_cr3(vcpu, cr3))

There are other call sites of kvm_vcpu_is_illegal_gpa() to validate cr3.
Do you need to modify them?

> 		return 1;
> 
> 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
>-- 
>2.31.1
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics
  2023-02-09  2:40 ` [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics Robert Hoo
@ 2023-02-13  3:31   ` Chao Gao
  2023-02-14  5:28     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-13  3:31 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:20AM +0800, Robert Hoo wrote:
>When only the LAM bits change, ask the next vcpu run to load the mmu pgd,
>so that it will build a new CR3 with the LAM bit updates.
>When CR3's effective address bits change, no matter whether the LAM bits
>change or not, go through the normal pgd update process.

Please squash this into patch 2.

>
>Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>---
> arch/x86/kvm/x86.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 3218f465ae71..7df6c9dd12a5 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -1242,9 +1242,9 @@ static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> {
> 	bool skip_tlb_flush = false;
>-	unsigned long pcid = 0;
>+	unsigned long pcid = 0, old_cr3;
> #ifdef CONFIG_X86_64

>-	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
>+	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);

This change isn't related. Please drop it or post it separately.

> 
> 	if (pcid_enabled) {
> 		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
>@@ -1257,6 +1257,10 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> 	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
> 		goto handle_tlb_flush;
> 
>+	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
>+	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
>+		return	1;

can you move this check to kvm_vcpu_is_valid_cr3(), i.e., return false in
that function if any LAM bit is toggled while LAM isn't exposed to the guest?

>+
> 	/*
> 	 * Do not condition the GPA check on long mode, this helper is used to
> 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
>@@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
> 		return 1;
> 
>-	if (cr3 != kvm_read_cr3(vcpu))
>-		kvm_mmu_new_pgd(vcpu, cr3);
>+	old_cr3 = kvm_read_cr3(vcpu);
>+	if (cr3 != old_cr3) {
>+		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {

Does this check against CR3_ADDR_MASK necessarily mean LAM bits are
toggled, i.e., CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57)?

Why not check if LAM bits are changed? This way the patch only changes
cases related to LAM, keeping other cases intact.

>+			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
>+					X86_CR3_LAM_U57));

Do you need to touch kvm_mmu_new_pgd() in nested_vmx_load_cr3()?

>+		} else {
>+			/*
>+			 * Though effective addr no change, mark the
>+			 * request so that LAM bits will take effect
>+			 * when enter guest.
>+			 */
>+			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
>+		}
>+	}
> 
> 	vcpu->arch.cr3 = cr3;
> 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
>-- 
>2.31.1
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access
  2023-02-09  2:40 ` [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access Robert Hoo
@ 2023-02-13  3:53   ` Chao Gao
  2023-02-14  5:38     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Chao Gao @ 2023-02-13  3:53 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Thu, Feb 09, 2023 at 10:40:21AM +0800, Robert Hoo wrote:
>When in KVM emulation, calculated a LA for data access, apply LAM if
>guest is at that moment LAM active, so that the following canonical check
>can pass.

This sounds weird. Passing the canonical checking isn't the goal. Emulating
the behavior of a LAM-capable processor on memory accesses is.

>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>---
> arch/x86/kvm/emulate.c |  6 ++++++
> arch/x86/kvm/x86.h     | 13 +++++++++++++
> 2 files changed, 19 insertions(+)
>
>diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>index 5cc3efa0e21c..d52037151133 100644
>--- a/arch/x86/kvm/emulate.c
>+++ b/arch/x86/kvm/emulate.c
>@@ -700,6 +700,12 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
> 	*max_size = 0;
> 	switch (mode) {
> 	case X86EMUL_MODE_PROT64:
>+		/*
>+		 * LAM applies only on data access
>+		 */

one-line comments look like /* Bla bla bla */

>+		if (!fetch && is_lam_active(ctxt->vcpu))
>+			la = kvm_untagged_addr(la, ctxt->vcpu);
>+
> 		*linear = la;
> 		va_bits = ctxt_virt_addr_bits(ctxt);
> 		if (!__is_canonical_address(la, va_bits))
>diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>index 7228895d4a6f..9397e9f4e061 100644
>--- a/arch/x86/kvm/x86.h
>+++ b/arch/x86/kvm/x86.h
>@@ -135,6 +135,19 @@ static inline int is_long_mode(struct kvm_vcpu *vcpu)
> #endif
> }
> 
>+#ifdef CONFIG_X86_64
>+static inline bool is_lam_active(struct kvm_vcpu *vcpu)

Drop this function because kvm_untagged_addr() already does these checks
(taking user/supervisor pointers into consideration).

>+{
>+	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57) ||
>+	       kvm_read_cr4_bits(vcpu, X86_CR4_LAM_SUP);
>+}
>+#else
>+static inline bool is_lam_active(struct kvm_vcpu *vcpu)
>+{
>+	return false;
>+}
>+#endif
>+
> static inline bool is_64_bit_mode(struct kvm_vcpu *vcpu)
> {
> 	int cs_db, cs_l;
>-- 
>2.31.1
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
                   ` (9 preceding siblings ...)
  2023-02-09  6:15 ` [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Chao Gao
@ 2023-02-13  9:02 ` Binbin Wu
  2023-02-13 13:16   ` Robert Hoo
  10 siblings, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-13  9:02 UTC (permalink / raw)
  To: Robert Hoo, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm


On 2/9/2023 10:40 AM, Robert Hoo wrote:
> ===Feature Introduction===
>
> Linear-address masking (LAM) [1], modifies the checking that is applied to
> *64-bit* linear addresses, allowing software to use of the untranslated
> address (upper) bits for metadata.
> As for which upper bits of linear address can be borrowed, LAM has 2 modes:
> LAM_48 (bits 62:48, i.e. LAM width of 15) and LAM_57 (bits 62:57, i.e. LAM
> width of 6), controlled by these new bits: CR3[62] (LAM_U48), CR3[61]
> (LAM_U57), and CR4[28] (LAM_SUP).
>
> * LAM_U48 and LAM_U57 bits controls LAM for user mode address. I.e. if
>    CR3.LAM_U57 = 1, LAM57 is applied; if CR3.LAM_U48 = 1 and CR3.LAM_U57 = 0,
>    LAM48 is applied.
> * LAM_SUP bit, combined with paging mode (4-level or 5-level), determines
>    LAM status for supervisor mode address. I.e. when CR4.LAM_SUP =1, 4-level
>    paging mode will have LAM48 for supervisor mode address while 5-level paging
>    will have LAM57.
>
> Note:
> 1. LAM applies to only data address, not to instructions.
> 2. LAM identification of an address as user or supervisor is based solely on the
>     value of pointer bit 63 and does not, for the purposes of LAM, depend on the CPL.
> 3. For user mode address, it is possible that 5-level paging and LAM_U48 are both
>     set, in this case, the effective usable linear address width is 48, i.e. bit
>     56:47 is reserved by LAM. [2]

How should "reserved by LAM" be understood?

According to the spec, bits 56:48 of the pointer contain the metadata.


> 4. When VM exit, the problematic address saved in VMCS field is clean, i.e.
>     metadata cleared with canonical form.
>
>
> ===LAM KVM Design===
>
> Intercept CR4.LAM_SUP by KVM, to avoid read VMCS field every time, with
> expectation that guest won't toggle this bit frequently.
>
> Under EPT mode, CR3 is fully under guest control, guest LAM is thus transparent to
> KVM. Nothing more need to do.
>
> For Shadow paging (EPT = off), KVM need to handle guest CR3.LAM_U48 and CR3.LAM_U57
> toggles.
>
> [1] ISE Chap10 https://cdrdv2.intel.com/v1/dl/getContent/671368 (Section 10.6 VMX interaction)
> [2] Thus currently, Kernel enabling patch only enables LAM_U57. https://lore.kernel.org/lkml/20230123220500.21077-1-kirill.shutemov@linux.intel.com/
>
> ---
> Changelog
> v3 --> v4:
> Drop unrelated Patch 1 in v3 (Binbin, Sean, Xiaoyao)
> Intercept CR4.LAM_SUP instead of pass through to guest (Sean)
> Just filter out CR3.LAM_{U48, U57}, instead of all reserved high bits
> (Sean, Yuan)
> Use existing __canonical_address() helper instead write a new one (Weijiang)
> Add LAM handling in KVM emulation (Yu, Yuan)
> Add Jingqi's reviwed-by on Patch 7
> Rebased to Kirill's latest code, which is 6.2-rc1 base.
>
> v2 --> v3:
> As LAM Kernel patches are in tip tree now, rebase to it.
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/
>
> v1 --> v2:
> 1. Fixes i386-allyesconfig build error on get_pgd(), where
>     CR3_HIGH_RSVD_MASK isn't applicable.
>     (Reported-by: kernel test robot <lkp@intel.com>)
> 2. In kvm_set_cr3(), be conservative on skip tlb flush when only LAM bits
>     toggles. (Kirill)
>
>
> Robert Hoo (9):
>    KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
>    KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
>    [Trivial] KVM: x86: MMU: Commets update
>    KVM: x86: MMU: Integrate LAM bits when build guest CR3
>    KVM: x86: Untag LAM bits when applicable
>    KVM: x86: When judging setting CR3 valid or not, consider LAM bits
>    KVM: x86: When guest set CR3, handle LAM bits semantics
>    KVM: x86: LAM: Expose LAM CPUID to user space VMM
>    KVM: x86: emulation: Apply LAM when emulating data access
>
>   arch/x86/include/asm/kvm_host.h |  3 ++-
>   arch/x86/kvm/cpuid.c            |  2 +-
>   arch/x86/kvm/emulate.c          |  6 +++++
>   arch/x86/kvm/mmu.h              |  5 ++++
>   arch/x86/kvm/mmu/mmu.c          | 11 ++++++--
>   arch/x86/kvm/vmx/vmx.c          |  6 ++++-
>   arch/x86/kvm/x86.c              | 38 ++++++++++++++++++++++----
>   arch/x86/kvm/x86.h              | 47 +++++++++++++++++++++++++++++++++
>   8 files changed, 108 insertions(+), 10 deletions(-)
>
> base-commit: 03334443640f226f56f71b5dfa3b1be6d4a1a1bc
> (https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git lam)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling
  2023-02-13  9:02 ` Binbin Wu
@ 2023-02-13 13:16   ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-13 13:16 UTC (permalink / raw)
  To: Binbin Wu, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

On Mon, 2023-02-13 at 17:02 +0800, Binbin Wu wrote:
> > 3. For user mode address, it is possible that 5-level paging and
> > LAM_U48 are both
> >     set, in this case, the effective usable linear address width is
> > 48, i.e. bit
> >     56:47 is reserved by LAM. [2]
> 
> How to understand "reserved by LAM"?
> 
> According to the spec, bits 56:48 of the pointer contained metadata.
> 
Emm, it looks like superfluous wording; I'll remove it in the next version.
I meant the same as your understanding: bits 56:48 can be used for
metadata and are cleared before translation, therefore they are not
usable by 5-level paging, i.e. "reserved by LAM".


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  2023-02-13  2:01   ` Chao Gao
@ 2023-02-13 13:25     ` Robert Hoo
  2023-02-14  6:18       ` Chao Gao
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-13 13:25 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Mon, 2023-02-13 at 10:01 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:19AM +0800, Robert Hoo wrote:
> > Before apply to kvm_vcpu_is_illegal_gpa(), clear LAM bits if it's
> > valid.
> 
> I prefer to squash this patch into patch 2 because it is also related to
> CR3 LAM bits handling.
> 
Though they all concern CR3, I would prefer to split them into pieces so
that they are easier to review and accept. I can change their order to
group them together. Is that all right with you?
> > 
> > 
> > +static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long
> > cr3)
> 
> Since this function takes a "vcpu" argument, probably
> kvm_vcpu_is_valid_cr3() is slightly better.

OK, to align with kvm_vcpu_is_legal_gpa().
> 
> > +{
> > +	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))
> 
> check if the vcpu is in the 64 bit long mode?

Emm, it looks necessary:
	(guest_cpuid_has(vcpu, X86_FEATURE_LAM) && is_long_mode(vcpu))
Let me ponder more, e.g. the case when the guest has the LAM feature but
is not in long mode...
> 
> > +		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> > +
> > +	return kvm_vcpu_is_legal_gpa(vcpu, cr3);
> > +}
> > +
> > int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> > {
> > 	bool skip_tlb_flush = false;
> > @@ -1254,7 +1262,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > unsigned long cr3)
> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
> > that
> > 	 * the current vCPU mode is accurate.
> > 	 */
> > -	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
> > +	if (!kvm_is_valid_cr3(vcpu, cr3))
> 
> There are other call sites of kvm_vcpu_is_illegal_gpa() to validate
> cr3.
> Do you need to modify them?

I don't think so. The others are for gpa validation and need no change;
this one is for CR3.
> 
> > 		return 1;
> > 
> > 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
> > -- 
> > 2.31.1
> > 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
  2023-02-09  9:21   ` Chao Gao
  2023-02-10  3:29   ` Yang, Weijiang
@ 2023-02-14  1:27   ` Binbin Wu
  2023-02-14  6:11     ` Robert Hoo
  2 siblings, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-14  1:27 UTC (permalink / raw)
  To: Robert Hoo, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

This patch removes CR4.LAM_SUP from cr4_reserved_bits to allow the
setting of X86_CR4_LAM_SUP by the guest if the hardware platform supports
the feature.

The interception of CR4 is decided by the CR4 guest/host mask and the CR4
read shadow.

The patch title is not accurate.


On 2/9/2023 10:40 AM, Robert Hoo wrote:
> Remove CR4.LAM_SUP (bit 28) from default CR4_RESERVED_BITS, while reserve
> it in __cr4_reserved_bits() by feature testing.
>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
> ---
>   arch/x86/include/asm/kvm_host.h | 3 ++-
>   arch/x86/kvm/x86.h              | 2 ++
>   2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f35f1ff4427b..4684896698f4 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -125,7 +125,8 @@
>   			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>   			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>   			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +			  | X86_CR4_LAM_SUP))
>   
>   #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>   
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 9de72586f406..8ec5cc983062 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -475,6 +475,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
>   		__reserved_bits |= X86_CR4_VMXE;        \
>   	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
>   		__reserved_bits |= X86_CR4_PCIDE;       \
> +	if (!__cpu_has(__c, X86_FEATURE_LAM))		\
> +		__reserved_bits |= X86_CR4_LAM_SUP;	\
>   	__reserved_bits;                                \
>   })
>   

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-09 13:02     ` Robert Hoo
@ 2023-02-14  2:55       ` Binbin Wu
  2023-02-15  1:17         ` Robert Hoo
  2023-02-16  2:14         ` Robert Hoo
  0 siblings, 2 replies; 64+ messages in thread
From: Binbin Wu @ 2023-02-14  2:55 UTC (permalink / raw)
  To: Robert Hoo, Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm


On 2/9/2023 9:02 PM, Robert Hoo wrote:
> On Thu, 2023-02-09 at 17:55 +0800, Chao Gao wrote:
>> On Thu, Feb 09, 2023 at 10:40:15AM +0800, Robert Hoo wrote:
>>> --- a/arch/x86/kvm/mmu/mmu.c
>>> +++ b/arch/x86/kvm/mmu/mmu.c
>>> @@ -3698,8 +3698,11 @@ static int mmu_alloc_shadow_roots(struct
>>> kvm_vcpu *vcpu)
>>> 	gfn_t root_gfn, root_pgd;
>>> 	int quadrant, i, r;
>>> 	hpa_t root;
>>> -
>> The blank line should be kept.
> OK
>>> +#ifdef CONFIG_X86_64
>>> +	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 |
>>> X86_CR3_LAM_U57);
>>> +#else
>>> 	root_pgd = mmu->get_guest_pgd(vcpu);
>>> +#endif
>> Why are other call sites of mmu->get_guest_pgd() not changed?
> Emm, the other 3 are
> FNAME(walk_addr_generic)()
> kvm_arch_setup_async_pf()
> kvm_arch_async_page_ready
>
> In the former version, I cleared the CR3.LAM bits for guest_pgd inside
> mmu->get_guest_pgd(). I think this is generic. Perhaps I should still do it
> in that way. Let's wait for other's comments on this.
> Thanks for pointing out.

I also prefer to handle it inside get_guest_pgd,
but in kvm_arch_setup_async_pf()/kvm_arch_async_page_ready() the value
is assigned to the cr3 field of struct kvm_arch_async_pf; does that
require all fields of cr3?


>
>> And what's
>> the value of the #ifdef?
> LAM is only available in 64 bit mode.

In other modes, the two bits are either ignored or not defined, so it
seems safe to clear the two bits unconditionally.


>>> 	root_gfn = root_pgd >> PAGE_SHIFT;
>>>
>>> 	if (mmu_check_root(vcpu, root_gfn))
>>> -- 
>>> 2.31.1
>>>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics
  2023-02-13  3:31   ` Chao Gao
@ 2023-02-14  5:28     ` Robert Hoo
  2023-02-14  6:48       ` Chao Gao
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-14  5:28 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Mon, 2023-02-13 at 11:31 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:20AM +0800, Robert Hoo wrote:
> > When only the LAM bits change, ask the next vcpu run to load the mmu
> > pgd, so that it will build a new CR3 with the LAM bit updates.
> > When CR3's effective address bits change, no matter whether the LAM
> > bits change or not, go through the normal pgd update process.
> 
> Please squash this into patch 2.
> 
Though they all concern CR3, I would prefer to split them into pieces so
that they are easier to review and accept. I can change their order to
group them together. Is that all right with you?
> > 
> > -	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
> > +	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
> 
> This change isn't related. Please drop it or post it separately.
> 
Yu also commented on this on the last version. I thought it was too
trivial to be separated.
Now that both of you suggest this, let me split it into a separate patch
in this set. Is this all right?
I do think separating it out into a patch set of its own would be too
trivial.
> > 
> > 	if (pcid_enabled) {
> > 		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
> > @@ -1257,6 +1257,10 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > unsigned long cr3)
> > 	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
> > 		goto handle_tlb_flush;
> > 
> > +	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
> > +	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
> > +		return	1;
> 
> can you move this check to kvm_vcpu_is_valid_cr3(), i.e., return
> false in
> that function if any LAM bit is toggled while LAM isn't exposed to
> the guest?
> 
OK
> > +
> > 	/*
> > 	 * Do not condition the GPA check on long mode, this helper is
> > used to
> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
> > that
> > @@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > unsigned long cr3)
> > 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
> > 		return 1;
> > 
> > -	if (cr3 != kvm_read_cr3(vcpu))
> > -		kvm_mmu_new_pgd(vcpu, cr3);
> > +	old_cr3 = kvm_read_cr3(vcpu);
> > +	if (cr3 != old_cr3) {
> > +		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
> 
This means the effective address bits changed, so no matter whether the
LAM bits toggled or not, it needs a new pgd.

> Does this check against CR3_ADDR_MASK necessarily mean LAM bits are
> toggled, i.e., CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57)?
> 
> Why not check if LAM bits are changed? This way the patch only
> changes
> cases related to LAM, keeping other cases intact.

Yes, it would be better to add a check in the "else" branch that the LAM
bits changed.
But in fact the kvm_is_valid_cr3() above has guaranteed that no other
high-order bits changed.
Emm, now you might ask me to melt the LAM bits into
vcpu->arch.reserved_gpa_bits? ;)
> 
> > +			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
> > +					X86_CR3_LAM_U57));
> 
> Do you need to touch kvm_mmu_new_pgd() in nested_vmx_load_cr3()?

Didn't scope nested LAM case in this patch set.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access
  2023-02-13  3:53   ` Chao Gao
@ 2023-02-14  5:38     ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-14  5:38 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Mon, 2023-02-13 at 11:53 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:21AM +0800, Robert Hoo wrote:
> > When in KVM emulation, calculated a LA for data access, apply LAM
> > if
> > guest is at that moment LAM active, so that the following canonical
> > check
> > can pass.
> 
> This sounds weird. Passing the canonical checking isn't the goal.
> Emulating the behavior of a LAM-capable processor on memory accesses is.
> 
Emm, how about describing it like this:
In KVM emulation, apply the LAM rule to the calculated linear address
(data accesses only), i.e. clear possible metadata in the LA before doing
the canonical check.
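For reference, the untagging itself is just a sign-extension from the
highest effective address bit, the same operation as the kernel's
__canonical_address() helper; a minimal stand-alone illustration (not KVM
code):

	#include <stdint.h>
	#include <stdio.h>

	/* Sign-extend from bit (vaddr_bits - 1), dropping metadata above it. */
	static uint64_t canonical(uint64_t la, unsigned int vaddr_bits)
	{
		return (uint64_t)((int64_t)(la << (64 - vaddr_bits)) >>
				  (64 - vaddr_bits));
	}

	int main(void)
	{
		uint64_t la = 0x7e00123456789abcULL;	/* metadata in bits 62:57 */

		/* LAM_U57: keep bits 56:0 and sign-extend bit 56 upward. */
		printf("%#llx -> %#llx\n", (unsigned long long)la,
		       (unsigned long long)canonical(la, 57));
		return 0;
	}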

> > +#ifdef CONFIG_X86_64
> > +static inline bool is_lam_active(struct kvm_vcpu *vcpu)
> 
> Drop this function because kvm_untagged_addr() already does these
> checks (taking user/supervisor pointers into consideration).
> 
OK


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-14  1:27   ` Binbin Wu
@ 2023-02-14  6:11     ` Robert Hoo
  2023-02-14  9:00       ` Binbin Wu
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-14  6:11 UTC (permalink / raw)
  To: Binbin Wu, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

On Tue, 2023-02-14 at 09:27 +0800, Binbin Wu wrote:
> This patch removes CR4.LAM_SUP from cr4_reserved_bits to allows the 
> setting of X86_CR4_LAM_SUP by guest 

Yes

> if the hardware platform supports 
> the feature.

More precisely, if guest_cpuid_has() reports the LAM feature. QEMU could
turn the feature off even if the underlying host/KVM reports supporting it.
> 
> The interception of CR4 is decided by CR4 guest/host mask and CR4
> read 
> shadow.
> 
My interpretation is that "intercept CR4.x bit" is the opposite of
"guest owns CR4.x bit".
Both are implemented via the CR4 guest/host mask and the CR4 shadow,
whose combination decides whether access to the corresponding CR4.x bit
causes a VM exit or not.
When we change some bits in CR4_RESERVED_BITS and __cr4_reserved_bits(),
we change vcpu->arch.cr4_guest_owned_bits, which eventually forms the
effective vmcs_writel(CR4_GUEST_HOST_MASK).



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  2023-02-13 13:25     ` Robert Hoo
@ 2023-02-14  6:18       ` Chao Gao
  2023-02-14  7:00         ` Chao Gao
  2023-02-18  5:44         ` Robert Hoo
  0 siblings, 2 replies; 64+ messages in thread
From: Chao Gao @ 2023-02-14  6:18 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Mon, Feb 13, 2023 at 09:25:50PM +0800, Robert Hoo wrote:
>On Mon, 2023-02-13 at 10:01 +0800, Chao Gao wrote:
>> On Thu, Feb 09, 2023 at 10:40:19AM +0800, Robert Hoo wrote:
>> > Before apply to kvm_vcpu_is_illegal_gpa(), clear LAM bits if it's
>> > valid.
>> 
>> I prefer to squash this patch into patch 2 because it is also related
>> to
>> CR3 LAM bits handling.
>> 
>Though they all concern CR3, I would prefer to split them into pieces so
>that they are easier to review and accept. I can change their order to
>group them together. Is that all right with you?

No. To me, it doesn't make review easier at all.

This patch itself is incomplete and lacks proper justification. Merging
related patches together can show the whole picture and makes it easier
to write the justification.

There are two ways that make sense to me:
option 1:

patch 1: virtualize CR4.LAM_SUP
patch 2: virtualize CR3.LAM_U48/U57
patch 3: virtualize LAM masking on applicable data accesses
patch 4: expose LAM CPUID bit to user sapce VMM

option 2:
given the series has only 100 LoC, you can merge all patches into a
single patch.


>> > {
>> > 	bool skip_tlb_flush = false;
>> > @@ -1254,7 +1262,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
>> > unsigned long cr3)
>> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
>> > that
>> > 	 * the current vCPU mode is accurate.
>> > 	 */
>> > -	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
>> > +	if (!kvm_is_valid_cr3(vcpu, cr3))
>> 
>> There are other call sites of kvm_vcpu_is_illegal_gpa() to validate
>> cr3.
>> Do you need to modify them?
>
>I don't think so. Others are for gpa validation, no need to change.
>Here is for CR3.

How about the call in kvm_is_valid_sregs()? If you don't change it, when
the user space VMM tries to set a CR3 with any LAM bits, KVM thinks the
CR3 is illegal and returns an error. To me this means live migration is
probably broken.

And the call in nested_vmx_check_host_state()? L1 VMM should be allowed to
program a CR3 with LAM bits set to VMCS's HOST_CR3 field. Actually, it
is exactly what this patch 6 is doing.

Please inspect other call sites carefully.

>> 
>> > 		return 1;
>> > 
>> > 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
>> > -- 
>> > 2.31.1
>> > 
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics
  2023-02-14  5:28     ` Robert Hoo
@ 2023-02-14  6:48       ` Chao Gao
  0 siblings, 0 replies; 64+ messages in thread
From: Chao Gao @ 2023-02-14  6:48 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Tue, Feb 14, 2023 at 01:28:33PM +0800, Robert Hoo wrote:
>> > +
>> > 	/*
>> > 	 * Do not condition the GPA check on long mode, this helper is
>> > used to
>> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
>> > that
>> > @@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
>> > unsigned long cr3)
>> > 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
>> > 		return 1;
>> > 
>> > -	if (cr3 != kvm_read_cr3(vcpu))
>> > -		kvm_mmu_new_pgd(vcpu, cr3);
>> > +	old_cr3 = kvm_read_cr3(vcpu);
>> > +	if (cr3 != old_cr3) {
>> > +		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
>> 
>This means the effective address bits changed, so no matter whether the
>LAM bits toggled or not, it needs a new pgd.
>
>> Does this check against CR3_ADDR_MASK necessarily mean LAM bits are
>> toggled, i.e., CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57)?
>> 
>> Why not check if LAM bits are changed? This way the patch only
>> changes
>> cases related to LAM, keeping other cases intact.
>
>Yes, it would be better to add a check in the "else" branch that the LAM
>bits changed.
>But in fact the kvm_is_valid_cr3() above has guaranteed that no other
>high-order bits changed.
>Emm, now you might ask me to melt the LAM bits into
>vcpu->arch.reserved_gpa_bits? ;)

no. I am not asking for that.

My point is: suppose, for example, that bit X isn't in CR3_ADDR_MASK.
Then toggling bit X will go into the else{} branch, which is meant
specifically for the LAM bits. So the change is correct iff

	CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57).

I didn't check if that is true on your code base. If it isn't, replace
CR3_ADDR_MASK with ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57).
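Or just test the LAM bits explicitly, which sidesteps the question
entirely; a sketch:

	unsigned long lam_bits = X86_CR3_LAM_U48 | X86_CR3_LAM_U57;

	if ((cr3 ^ old_cr3) & ~lam_bits)
		/* Effective address bits changed: normal pgd switch. */
		kvm_mmu_new_pgd(vcpu, cr3 & ~lam_bits);
	else if ((cr3 ^ old_cr3) & lam_bits)
		/* Only the LAM bits toggled: rebuild CR3 on next entry. */
		kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);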

>> 
>> > +			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
>> > +					X86_CR3_LAM_U57));
>> 
>> Do you need to touch kvm_mmu_new_pgd() in nested_vmx_load_cr3()?
>
>Didn't scope nested LAM case in this patch set.

Is there any justification for not considering nested virtualization?
Won't nested virtualization be broken by this series?


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  2023-02-14  6:18       ` Chao Gao
@ 2023-02-14  7:00         ` Chao Gao
  2023-02-18  5:44         ` Robert Hoo
  1 sibling, 0 replies; 64+ messages in thread
From: Chao Gao @ 2023-02-14  7:00 UTC (permalink / raw)
  To: Robert Hoo
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Tue, Feb 14, 2023 at 02:18:59PM +0800, Chao Gao wrote:
>>> > 	bool skip_tlb_flush = false;
>>> > @@ -1254,7 +1262,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
>>> > unsigned long cr3)
>>> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
>>> > that
>>> > 	 * the current vCPU mode is accurate.
>>> > 	 */
>>> > -	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
>>> > +	if (!kvm_is_valid_cr3(vcpu, cr3))
>>> 
>>> There are other call sites of kvm_vcpu_is_illegal_gpa() to validate
>>> cr3.
>>> Do you need to modify them?
>>
>>I don't think so. Others are for gpa validation, no need to change.
>>Here is for CR3.
>
>How about the call in kvm_is_valid_sregs()? If you don't change it, when
>the user space VMM tries to set a CR3 with any LAM bits, KVM thinks the
>CR3 is illegal and returns an error. To me this means live migration is
>probably broken.
>
>And the call in nested_vmx_check_host_state()? L1 VMM should be allowed to
>program a CR3 with LAM bits set to VMCS's HOST_CR3 field. Actually, it
>is exactly what this patch 6 is doing.

Please disregard "Actually, it is exactly what this patch 6 is doing".
My brain just disconnected.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-14  6:11     ` Robert Hoo
@ 2023-02-14  9:00       ` Binbin Wu
  2023-02-14 12:24         ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-14  9:00 UTC (permalink / raw)
  To: Robert Hoo, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm


On 2/14/2023 2:11 PM, Robert Hoo wrote:
> On Tue, 2023-02-14 at 09:27 +0800, Binbin Wu wrote:
>> The interception of CR4 is decided by CR4 guest/host mask and CR4
>> read
>> shadow.
>>
> My interpretation is that "intercept CR4.x bit" is the opposite of
> "guest owns CR4.x bit".
> Both are implemented via the CR4 guest/host mask and the CR4 shadow,
> whose combination decides whether access to the corresponding CR4.x bit
> causes a VM exit or not.
> When we change some bits in CR4_RESERVED_BITS and __cr4_reserved_bits(),
> we change vcpu->arch.cr4_guest_owned_bits, which eventually forms the
> effective vmcs_writel(CR4_GUEST_HOST_MASK).
>
According to the code of set_cr4_guest_host_mask(),
vcpu->arch.cr4_guest_owned_bits is a subset of KVM_POSSIBLE_CR4_GUEST_BITS,
and X86_CR4_LAM_SUP is not included in KVM_POSSIBLE_CR4_GUEST_BITS.
No matter whether CR4_RESERVED_BITS is changed or not, X86_CR4_LAM_SUP will
always be set in CR4_GUEST_HOST_MASK.
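Roughly, paraphrasing the mask computation in vmx.c (not the exact code):

	owned = KVM_POSSIBLE_CR4_GUEST_BITS & ~vcpu->arch.cr4_guest_rsvd_bits;
	vmcs_writel(CR4_GUEST_HOST_MASK, ~owned);	/* 1 == host-owned */

A bit absent from KVM_POSSIBLE_CR4_GUEST_BITS therefore can never become
guest-owned, whatever the reserved-bit masks say.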



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-14  9:00       ` Binbin Wu
@ 2023-02-14 12:24         ` Robert Hoo
  2023-02-14 12:36           ` Robert Hoo
  2023-02-16  5:31           ` Binbin Wu
  0 siblings, 2 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-14 12:24 UTC (permalink / raw)
  To: Binbin Wu, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

On Tue, 2023-02-14 at 17:00 +0800, Binbin Wu wrote:
> According to the code of set_cr4_guest_host_mask(),
> vcpu->arch.cr4_guest_owned_bits is a subset of
> KVM_POSSIBLE_CR4_GUEST_BITS,
> and X86_CR4_LAM_SUP is not included in KVM_POSSIBLE_CR4_GUEST_BITS.
> No matter whether CR4_RESERVED_BITS is changed or not, X86_CR4_LAM_SUP
> will always be set in CR4_GUEST_HOST_MASK.
> 
> 
set_cr4_guest_host_mask():
	vcpu->arch.cr4_guest_owned_bits = KVM_POSSIBLE_CR4_GUEST_BITS &
			~vcpu->arch.cr4_guest_rsvd_bits;

kvm_vcpu_after_set_cpuid():
	vcpu->arch.cr4_guest_rsvd_bits =
	    __cr4_reserved_bits(guest_cpuid_has, vcpu);


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-14 12:24         ` Robert Hoo
@ 2023-02-14 12:36           ` Robert Hoo
  2023-02-16  5:31           ` Binbin Wu
  1 sibling, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-14 12:36 UTC (permalink / raw)
  To: Binbin Wu, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

On Tue, 2023-02-14 at 20:24 +0800, Robert Hoo wrote:
> On Tue, 2023-02-14 at 17:00 +0800, Binbin Wu wrote:
> > According to the code of set_cr4_guest_host_mask,
> > vcpu->arch.cr4_guest_owned_bits is a subset of
> > KVM_POSSIBLE_CR4_GUEST_BITS,
> > and X86_CR4_LAM_SUP is not included in KVM_POSSIBLE_CR4_GUEST_BITS.
> > No matter change CR4_RESERVED_BITS or not, X86_CR4_LAM_SUP will
> > always be set in CR4_GUEST_HOST_MASK.

yes, bits set to 1 in a guest/host mask means it's “owned” by the host.
> > 
> > 
> 
> set_cr4_guest_host_mask():
> 	vcpu->arch.cr4_guest_owned_bits = KVM_POSSIBLE_CR4_GUEST_BITS &
> 			~vcpu->arch.cr4_guest_rsvd_bits;
> 
> kvm_vcpu_after_set_cpuid():
> 	vcpu->arch.cr4_guest_rsvd_bits =
> 	    __cr4_reserved_bits(guest_cpuid_has, vcpu);


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-14  2:55       ` Binbin Wu
@ 2023-02-15  1:17         ` Robert Hoo
  2023-02-16  2:14         ` Robert Hoo
  1 sibling, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-15  1:17 UTC (permalink / raw)
  To: Binbin Wu, Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Tue, 2023-02-14 at 10:55 +0800, Binbin Wu wrote:
> > Emm, the other 3 are
> > FNAME(walk_addr_generic)()
> > kvm_arch_setup_async_pf()
> > kvm_arch_async_page_ready
> > 
> > In the former version, I cleared the CR3.LAM bits for guest_pgd inside
> > mmu->get_guest_pgd(). I think this is generic. Perhaps I should still
> > do it
> > 
> > in that way. Let's wait for other's comments on this.
> > Thanks for pointing out.
> 
> I also prefer to handle it inside get_guest_pgd,
> but in kvm_arch_setup_async_pf()/kvm_arch_async_page_ready() the value
> is assigned to the cr3 field of struct kvm_arch_async_pf; does that
> require all fields of cr3?
> 
I'm looking into the async PF. Can anyone who is familiar with async PF
shed some light?

If kvm_arch_setup_async_pf()/kvm_arch_async_page_ready() need the whole
CR3, would you agree we should have 2 methods: get_cr3() and get_pgd()?



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root
  2023-02-14  2:55       ` Binbin Wu
  2023-02-15  1:17         ` Robert Hoo
@ 2023-02-16  2:14         ` Robert Hoo
  1 sibling, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-16  2:14 UTC (permalink / raw)
  To: Binbin Wu, Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Tue, 2023-02-14 at 10:55 +0800, Binbin Wu wrote:
> On 2/9/2023 9:02 PM, Robert Hoo wrote:
> > On Thu, 2023-02-09 at 17:55 +0800, Chao Gao wrote:
> > > On Thu, Feb 09, 2023 at 10:40:15AM +0800, Robert Hoo wrote:
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -3698,8 +3698,11 @@ static int mmu_alloc_shadow_roots(struct
> > > > kvm_vcpu *vcpu)
> > > > 	gfn_t root_gfn, root_pgd;
> > > > 	int quadrant, i, r;
> > > > 	hpa_t root;
> > > > -
> > > 
> > > The blank line should be kept.
> > 
> > OK
> > > > +#ifdef CONFIG_X86_64
> > > > +	root_pgd = mmu->get_guest_pgd(vcpu) & ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> > > > +#else
> > > > 	root_pgd = mmu->get_guest_pgd(vcpu);
> > > > +#endif
> > > 
> > > Why are other call sites of mmu->get_guest_pgd() not changed?
> > 
> > Emm, the other 3 are
> > FNAME(walk_addr_generic)()
> > kvm_arch_setup_async_pf()
> > kvm_arch_async_page_ready
> > 
> > In the former version, I cleared the CR3.LAM bits for guest_pgd inside
> > mmu->get_guest_pgd(). I think this is generic. Perhaps I should still
> > do it that way. Let's wait for others' comments on this.
> > Thanks for pointing it out.
> 
> I also prefer to handle it inside get_guest_pgd(),
> but in kvm_arch_setup_async_pf()/kvm_arch_async_page_ready(), the
> value is assigned to the cr3 field of struct kvm_arch_async_pf;
> does it require all fields of CR3?
> 
I took a rough look at the code. It looks like
kvm_arch_setup_async_pf() preserves the current CR3, and
kvm_arch_async_page_ready() confirms that the ready (resolved) PF
corresponds to the current vCPU context.
To be conservative, I prefer to keep
kvm_arch_setup_async_pf()/kvm_arch_async_page_ready() as is, i.e. the LAM
bits remain part of the check.

As for FNAME(walk_addr_generic)():
	pte           = mmu->get_guest_pgd(vcpu);
I think it's like mmu_alloc_shadow_roots(), i.e. the LAM bits should be
cleared.

If there is no objection, I'll update this in the next version.
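
For reference, a minimal sketch of the central stripping (the helper
name kvm_lam_untag_pgd() is hypothetical; the bit positions follow the
LAM spec, CR3[62] = LAM_U48 and CR3[61] = LAM_U57):

	#define X86_CR3_LAM_U57		(1ULL << 61)
	#define X86_CR3_LAM_U48		(1ULL << 62)

	static inline unsigned long kvm_lam_untag_pgd(unsigned long cr3)
	{
	#ifdef CONFIG_X86_64
		/* LAM control bits are not part of the root GPA. */
		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
	#endif
		return cr3;
	}

mmu_alloc_shadow_roots() and FNAME(walk_addr_generic)() would wrap
mmu->get_guest_pgd(vcpu) with it, while the async-PF paths keep the
raw value.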


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-14 12:24         ` Robert Hoo
  2023-02-14 12:36           ` Robert Hoo
@ 2023-02-16  5:31           ` Binbin Wu
  2023-02-16  5:54             ` Robert Hoo
  1 sibling, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-16  5:31 UTC (permalink / raw)
  To: Robert Hoo, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm


On 2/14/2023 8:24 PM, Robert Hoo wrote:
> On Tue, 2023-02-14 at 17:00 +0800, Binbin Wu wrote:
>> According to the code of set_cr4_guest_host_mask,
>> vcpu->arch.cr4_guest_owned_bits is a subset of
>> KVM_POSSIBLE_CR4_GUEST_BITS,
>> and X86_CR4_LAM_SUP is not included in KVM_POSSIBLE_CR4_GUEST_BITS.
>> Whether CR4_RESERVED_BITS is changed or not, X86_CR4_LAM_SUP will
>> always be set in CR4_GUEST_HOST_MASK.
>>
>>
> set_cr4_guest_host_mask():
> 	vcpu->arch.cr4_guest_owned_bits = KVM_POSSIBLE_CR4_GUEST_BITS &
> 			~vcpu->arch.cr4_guest_rsvd_bits;

My point is that when X86_CR4_LAM_SUP is not set in
KVM_POSSIBLE_CR4_GUEST_BITS, CR4.LAM_SUP is definitely owned by the
host, regardless of the value of cr4_guest_rsvd_bits.
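
A tiny user-space sketch of that arithmetic (the value of
KVM_POSSIBLE_CR4_GUEST_BITS below is an illustrative stand-in, not the
real definition; the point is only that LAM_SUP is absent from it):

	#include <stdint.h>
	#include <stdio.h>

	#define X86_CR4_LAM_SUP			(1ULL << 28)
	#define KVM_POSSIBLE_CR4_GUEST_BITS	((1ULL << 2) | (1ULL << 3))

	int main(void)
	{
		uint64_t cr4_guest_rsvd_bits = 0; /* even if LAM_SUP is not reserved */
		uint64_t owned = KVM_POSSIBLE_CR4_GUEST_BITS & ~cr4_guest_rsvd_bits;
		uint64_t guest_host_mask = ~owned; /* set bits are host-owned */

		printf("CR4.LAM_SUP intercepted: %s\n",
		       (guest_host_mask & X86_CR4_LAM_SUP) ? "yes" : "no");
		return 0;
	}

This always prints "yes", which is exactly the point above.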


>
> kvm_vcpu_after_set_cpuid():
> 	vcpu->arch.cr4_guest_rsvd_bits =
> 	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest
  2023-02-16  5:31           ` Binbin Wu
@ 2023-02-16  5:54             ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-16  5:54 UTC (permalink / raw)
  To: Binbin Wu, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

On Thu, 2023-02-16 at 13:31 +0800, Binbin Wu wrote:
> On 2/14/2023 8:24 PM, Robert Hoo wrote:
> > On Tue, 2023-02-14 at 17:00 +0800, Binbin Wu wrote:
> > > According to the code of set_cr4_guest_host_mask,
> > > vcpu->arch.cr4_guest_owned_bits is a subset of
> > > KVM_POSSIBLE_CR4_GUEST_BITS,
> > > and X86_CR4_LAM_SUP is not included in
> > > KVM_POSSIBLE_CR4_GUEST_BITS.
> > > Whether CR4_RESERVED_BITS is changed or not, X86_CR4_LAM_SUP will
> > > always be set in CR4_GUEST_HOST_MASK.
> > > 
> > > 
> > 
> > set_cr4_guest_host_mask():
> > 	vcpu->arch.cr4_guest_owned_bits = KVM_POSSIBLE_CR4_GUEST_BITS &
> > 			~vcpu->arch.cr4_guest_rsvd_bits;
> 
> My point is that when X86_CR4_LAM_SUP is not set in
> KVM_POSSIBLE_CR4_GUEST_BITS,
> CR4.LAM_SUP is definitely owned by the host, regardless of the value of
> cr4_guest_rsvd_bits.
> 
Yes, you can disregard that reply.
We were each talking about our own point :) Neither is wrong.

Chao talked to me afterwards; your point is that even without this
patch, CR4.LAM_SUP is already intercepted by default, so why redundantly
name this patch "Intercept CR4.LAM_SUP".
That's true, but it is intercepted as a reserved bit.

I'll revise the subject in v5.
> 
> > 
> > kvm_vcpu_after_set_cpuid():
> > 	vcpu->arch.cr4_guest_rsvd_bits =
> > 	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
> > 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable
  2023-02-11  5:57     ` Robert Hoo
@ 2023-02-16  6:37       ` Binbin Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Binbin Wu @ 2023-02-16  6:37 UTC (permalink / raw)
  To: Robert Hoo, Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm


On 2/11/2023 1:57 PM, Robert Hoo wrote:
> On Fri, 2023-02-10 at 23:04 +0800, Chao Gao wrote:
>
>>> @@ -1809,6 +1809,10 @@ static int __kvm_set_msr(struct kvm_vcpu
>>> *vcpu, u32 index, u64 data,
>>> 	case MSR_KERNEL_GS_BASE:
>>> 	case MSR_CSTAR:
>>> 	case MSR_LSTAR:
>>> +		/*
>>> +		 * The strict canonical checking still applies to MSR
>>> +		 * writing even when LAM is enabled.
>>> +		 */
>>> 		if (is_noncanonical_address(data, vcpu))
>> LAM spec says:
>>
>> 	Processors that support LAM continue to require the addresses
>> written to
>> 	control registers or MSRs be 57-bit canonical if the processor
>> supports
>> 	5-level paging or 48-bit canonical if it supports only 4-level
>> paging
>>
>> My understanding is 57-bit canonical checking is performed if the
>> processor
>> __supports__ 5-level paging. Then the is_noncanonical_address() here
>> is
>> arguably wrong. Could you double-confirm and fix it?
>>
> Emm, the condition of "supported" vs. "enabled": you mean
> vcpu_virt_addr_bits(), right?
> {
> 	return kvm_read_cr4_bits(vcpu, X86_CR4_LA57) ? 57 : 48;
> }
>
> This is worth double-confirming. Let me check with Yu. Thanks.

I checked the definition of canonical address checking in the
Intel SDM.
It says: “In 64-bit mode, an address is considered to be in canonical
form if address bits 63 through the most-significant implemented bit
by the microarchitecture are set to either all ones or all zeros”.
So the bit width the processor uses for canonical address checking
depends on the "support" of 5-level paging, not on its "enable status".
This description is aligned with the LAM spec.

The current code for the canonical check is not strictly aligned with
the hardware behavior.
IMHO, the current code still works properly because, even on processors
supporting 5-level paging, the virtual memory maps of OSes running in
4-level paging mode keep bits 63:47 identical for canonical addresses
(which is reasonable and necessary for backward compatibility).
So for functionality, the current implementation should be OK.

I'd like to hear more opinions about whether we need to modify the code
to strictly align with the hardware behavior.
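
To make the difference concrete, a small user-space sketch (names are
illustrative; is_canonical() has the same shape as the kernel's
__canonical_address() helper):

	#include <stdbool.h>
	#include <stdint.h>

	/* Sign-extend from the given width and check nothing was lost. */
	static bool is_canonical(uint64_t la, unsigned int vaddr_bits)
	{
		return (uint64_t)((int64_t)(la << (64 - vaddr_bits)) >>
				  (64 - vaddr_bits)) == la;
	}

	/* What KVM does today: width follows whether CR4.LA57 is
	 * *enabled* in the guest. */
	static unsigned int width_current(bool guest_cr4_la57)
	{
		return guest_cr4_la57 ? 57 : 48;
	}

	/* What the SDM/LAM text describes: width follows whether 5-level
	 * paging is *supported* by the processor. */
	static unsigned int width_per_spec(bool cpu_supports_la57)
	{
		return cpu_supports_la57 ? 57 : 48;
	}

The two widths differ only for a guest running 4-level paging on a
processor that supports LA57, which is exactly the case discussed above.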



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits
  2023-02-14  6:18       ` Chao Gao
  2023-02-14  7:00         ` Chao Gao
@ 2023-02-18  5:44         ` Robert Hoo
  1 sibling, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-18  5:44 UTC (permalink / raw)
  To: Chao Gao
  Cc: seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, isaku.yamahata, kirill.shutemov, kvm

On Tue, 2023-02-14 at 14:18 +0800, Chao Gao wrote:
> There are two ways that make sense to me:
> option 1:
> 
> patch 1: virtualize CR4.LAM_SUP
> patch 2: virtualize CR3.LAM_U48/U57
> patch 3: virtualize LAM masking on applicable data accesses
And patch 4: KVM emulation: Apply LAM when emulating data access
> patch 4: expose LAM CPUID bit to user space VMM

> > > > 	 * the current vCPU mode is accurate.
> > > > 	 */
> > > > -	if (kvm_vcpu_is_illegal_gpa(vcpu, cr3))
> > > > +	if (!kvm_is_valid_cr3(vcpu, cr3))
> > > 
> > > There are other call sites of kvm_vcpu_is_illegal_gpa() to
> > > validate
> > > cr3.
> > > Do you need to modify them?
> > 
> > I don't think so. The others are for GPA validation and need no
> > change; this one is for CR3.
> 
> How about the call in kvm_is_valid_sregs()? If you don't change it,
> when the user-space VMM tries to set a CR3 with any LAM bits, KVM
> thinks the CR3 is illegal and returns an error. To me this means live
> migration is probably broken.

Agree. Will add this check in v5.
> 
> And the call in nested_vmx_check_host_state()? The L1 VMM should be
> allowed to program a CR3 with LAM bits set into the VMCS HOST_CR3
> field.

Right, per the spec, it should be allowed.
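
A rough sketch of that direction (helper usage is illustrative, not a
final patch):

	static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
	{
		/* CR3.LAM_U48/U57 are architecturally legal when the guest
		 * has LAM, so strip them before the reserved-GPA check. */
		if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))
			cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);

		return !kvm_vcpu_is_illegal_gpa(vcpu, cr3);
	}

kvm_is_valid_sregs() and nested_vmx_check_host_state() would then call
kvm_is_valid_cr3() instead of checking the GPA directly.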



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-09  2:40 ` [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM Robert Hoo
@ 2023-02-21  5:47   ` Binbin Wu
  2023-02-21  7:26     ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-21  5:47 UTC (permalink / raw)
  To: Robert Hoo, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm


On 2/9/2023 10:40 AM, Robert Hoo wrote:
> LAM feature is enumerated by (EAX=07H, ECX=01H):EAX.LAM[bit26].
>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
> ---
>   arch/x86/kvm/cpuid.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b14653b61470..79f45cbe587e 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
>   
>   	kvm_cpu_cap_mask(CPUID_7_1_EAX,
>   		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
> -		F(AVX_IFMA)
> +		F(AVX_IFMA) | F(LAM)

Do we allow exposing the LAM capability to the guest when the host
kernel has disabled the LAM feature (e.g. via clearcpuid)?

Maybe it should be handled similarly to LA57.


>   	);
>   
>   	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-21  5:47   ` Binbin Wu
@ 2023-02-21  7:26     ` Robert Hoo
  2023-02-21  8:26       ` Binbin Wu
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Hoo @ 2023-02-21  7:26 UTC (permalink / raw)
  To: Binbin Wu, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm

On Tue, 2023-02-21 at 13:47 +0800, Binbin Wu wrote:
> On 2/9/2023 10:40 AM, Robert Hoo wrote:
> > LAM feature is enumerated by (EAX=07H, ECX=01H):EAX.LAM[bit26].
> > 
> > Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> > Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
> > ---
> >   arch/x86/kvm/cpuid.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index b14653b61470..79f45cbe587e 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
> >   
> >   	kvm_cpu_cap_mask(CPUID_7_1_EAX,
> >   		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
> > F(AMX_FP16) |
> > -		F(AVX_IFMA)
> > +		F(AVX_IFMA) | F(LAM)
> 
> Do we allow exposing the LAM capability to the guest when the host
> kernel has disabled the LAM feature (e.g. via clearcpuid)?

No.
> 
> Maybe it should be handled similarly to LA57.
> 
You mean expose LAM to the guest based on HW capability rather than
host status?
Why is LA57 exposed like this, unlike most features?

Without an explicit rationale, I would tend to follow the prevailing
convention.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-21  7:26     ` Robert Hoo
@ 2023-02-21  8:26       ` Binbin Wu
  2023-02-21 11:13         ` Yu Zhang
  0 siblings, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-21  8:26 UTC (permalink / raw)
  To: Robert Hoo, seanjc, pbonzini, yu.c.zhang, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata
  Cc: kirill.shutemov, kvm


On 2/21/2023 3:26 PM, Robert Hoo wrote:
> On Tue, 2023-02-21 at 13:47 +0800, Binbin Wu wrote:
>> On 2/9/2023 10:40 AM, Robert Hoo wrote:
>>> LAM feature is enumerated by (EAX=07H, ECX=01H):EAX.LAM[bit26].
>>>
>>> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>>> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
>>> ---
>>>    arch/x86/kvm/cpuid.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index b14653b61470..79f45cbe587e 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
>>>    
>>>    	kvm_cpu_cap_mask(CPUID_7_1_EAX,
>>>    		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
>>> F(AMX_FP16) |
>>> -		F(AVX_IFMA)
>>> +		F(AVX_IFMA) | F(LAM)
>> Do we allow exposing the LAM capability to the guest when the host
>> kernel has disabled the LAM feature (e.g. via clearcpuid)?
> No.
>> Maybe it should be handled similarly to LA57.
>>
> You mean expose LAM to the guest based on HW capability rather than
> host status?
> Why is LA57 exposed like this, unlike most features?

The special handling for LA57 is from the patch "kvm: x86: Return LA57 
feature based on hardware capability".
https://lore.kernel.org/lkml/1548950983-18458-1-git-send-email-yu.c.zhang@linux.intel.com/

The reason is that the host kernel may disable 5-level paging using the
cmdline parameter 'no5lvl', which clears the feature bit for LA57 in
boot_cpu_data; boot_cpu_data is queried in kvm_set_cpu_caps() to derive
the KVM CPU cap masks.

" VMs can still benefit from extended linear address width, e.g. to 
enhance features like ASLR" even when host  doesn't use 5-level paging.
So, the patch sets LA57 based on hardware capability.
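
For context, the LA57 handling in kvm_set_cpu_caps() is roughly (quoted
from memory, please check the tree for the exact form):

	/* Set LA57 based on hardware capability, i.e. raw CPUID. */
	if (cpuid_ecx(7) & F(LA57))
		kvm_cpu_cap_set(X86_FEATURE_LA57);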

I was just wondering whether LAM could be a similar case, where the
host disabled the feature somehow (e.g. via clearcpuid) but the guest
still wants to use it.



>
> Without an explicit rationale, I would tend to follow the prevailing convention.
>
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-21  8:26       ` Binbin Wu
@ 2023-02-21 11:13         ` Yu Zhang
  2023-02-21 13:18           ` Binbin Wu
  0 siblings, 1 reply; 64+ messages in thread
From: Yu Zhang @ 2023-02-21 11:13 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Robert Hoo, seanjc, pbonzini, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata, kirill.shutemov, kvm

> 
> The special handling for LA57 is from the patch "kvm: x86: Return LA57
> feature based on hardware capability".
> https://lore.kernel.org/lkml/1548950983-18458-1-git-send-email-yu.c.zhang@linux.intel.com/
> 
> The reason is that the host kernel may disable 5-level paging using the
> cmdline parameter 'no5lvl', which clears the feature bit for LA57 in
> boot_cpu_data; boot_cpu_data is queried in kvm_set_cpu_caps() to derive
> the KVM CPU cap masks.
> 
> "VMs can still benefit from extended linear address width, e.g. to enhance
> features like ASLR", even when the host doesn't use 5-level paging.
> So, the patch sets LA57 based on hardware capability.
> 
> I was just wondering whether LAM could be a similar case, where the host
> disabled the feature somehow (e.g. via clearcpuid) but the guest still
> wants to use it.

Paging modes in root and non-root operation are orthogonal; so should LAM be.

B.R.
Yu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-21 11:13         ` Yu Zhang
@ 2023-02-21 13:18           ` Binbin Wu
  2023-02-21 14:36             ` Robert Hoo
  0 siblings, 1 reply; 64+ messages in thread
From: Binbin Wu @ 2023-02-21 13:18 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Robert Hoo, seanjc, pbonzini, yuan.yao, jingqi.liu,
	weijiang.yang, chao.gao, isaku.yamahata, kirill.shutemov, kvm


On 2/21/2023 7:13 PM, Yu Zhang wrote:
>> The special handling for LA57 is from the patch "kvm: x86: Return LA57
>> feature based on hardware capability".
>> https://lore.kernel.org/lkml/1548950983-18458-1-git-send-email-yu.c.zhang@linux.intel.com/
>>
>> The reason is that the host kernel may disable 5-level paging using the
>> cmdline parameter 'no5lvl', which clears the feature bit for LA57 in
>> boot_cpu_data; boot_cpu_data is queried in kvm_set_cpu_caps() to derive
>> the KVM CPU cap masks.
>>
>> "VMs can still benefit from extended linear address width, e.g. to
>> enhance features like ASLR", even when the host doesn't use 5-level
>> paging.
>> So, the patch sets LA57 based on hardware capability.
>>
>> I was just wondering whether LAM could be a similar case, where the host
>> disabled the feature somehow (e.g. via clearcpuid) but the guest still
>> wants to use it.
> Paging modes in root and non-root operation are orthogonal; so should LAM be.

Agree.


>
> B.R.
> Yu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM
  2023-02-21 13:18           ` Binbin Wu
@ 2023-02-21 14:36             ` Robert Hoo
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Hoo @ 2023-02-21 14:36 UTC (permalink / raw)
  To: Binbin Wu, Yu Zhang
  Cc: seanjc, pbonzini, yuan.yao, jingqi.liu, weijiang.yang, chao.gao,
	isaku.yamahata, kirill.shutemov, kvm

On Tue, 2023-02-21 at 21:18 +0800, Binbin Wu wrote:
> On 2/21/2023 7:13 PM, Yu Zhang wrote:
> > > The special handling for LA57 is from the patch "kvm: x86: Return
> > > LA57 feature based on hardware capability".
> > > https://lore.kernel.org/lkml/1548950983-18458-1-git-send-email-yu.c.zhang@linux.intel.com/
> > > 
> > > The reason is that the host kernel may disable 5-level paging using
> > > the cmdline parameter 'no5lvl', which clears the feature bit for
> > > LA57 in boot_cpu_data; boot_cpu_data is queried in
> > > kvm_set_cpu_caps() to derive the KVM CPU cap masks.
> > > 
> > > "VMs can still benefit from extended linear address width, e.g. to
> > > enhance features like ASLR", even when the host doesn't use 5-level
> > > paging.
> > > So, the patch sets LA57 based on hardware capability.
> > > 
> > > I was just wondering whether LAM could be a similar case, where the
> > > host disabled the feature somehow (e.g. via clearcpuid) but the
> > > guest still wants to use it.
> > 
> > Paging modes in root and non-root operation are orthogonal; so should LAM be.
> 
> Agree.
> 
Understood.

In
https://lore.kernel.org/lkml/1548950983-18458-1-git-send-email-yu.c.zhang@linux.intel.com/
it is mentioned:
"As discussed earlier, VMs can still benefit from extended linear
address width, e.g. to enhance features like ASLR. So we would like to
fix this..."

Apparently something was "discussed earlier", and some request was made
for it (perhaps related to ASLR).
Reading through kvm_set_cpu_caps(), this kind of handling, i.e.
bypassing host/KVM status and exposing the feature as long as the HW
supports it, is the exception.
Therefore, though LAM could be handled like LA57, I hesitate to make LAM
another exception that breaks the existing framework, unless an
analogous discussion/request for it comes up.


^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2023-02-21 14:37 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09  2:40 [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Robert Hoo
2023-02-09  2:40 ` [PATCH v4 1/9] KVM: x86: Intercept CR4.LAM_SUP when LAM feature is enabled in guest Robert Hoo
2023-02-09  9:21   ` Chao Gao
2023-02-09 12:48     ` Robert Hoo
2023-02-10  3:29   ` Yang, Weijiang
2023-02-10  5:02     ` Robert Hoo
2023-02-10 16:30       ` Sean Christopherson
2023-02-14  1:27   ` Binbin Wu
2023-02-14  6:11     ` Robert Hoo
2023-02-14  9:00       ` Binbin Wu
2023-02-14 12:24         ` Robert Hoo
2023-02-14 12:36           ` Robert Hoo
2023-02-16  5:31           ` Binbin Wu
2023-02-16  5:54             ` Robert Hoo
2023-02-09  2:40 ` [PATCH v4 2/9] KVM: x86: MMU: Clear CR3 LAM bits when allocate shadow root Robert Hoo
2023-02-09  9:55   ` Chao Gao
2023-02-09 13:02     ` Robert Hoo
2023-02-14  2:55       ` Binbin Wu
2023-02-15  1:17         ` Robert Hoo
2023-02-16  2:14         ` Robert Hoo
2023-02-10  3:38   ` Yang, Weijiang
2023-02-11  3:12     ` Robert Hoo
2023-02-09  2:40 ` [PATCH v4 3/9] KVM: x86: MMU: Comments update Robert Hoo
2023-02-10  6:59   ` Chao Gao
2023-02-10  7:55     ` Robert Hoo
2023-02-10 16:54       ` Sean Christopherson
2023-02-09  2:40 ` [PATCH v4 4/9] KVM: x86: MMU: Integrate LAM bits when build guest CR3 Robert Hoo
2023-02-10 14:04   ` Chao Gao
2023-02-11  6:24     ` Robert Hoo
2023-02-11  6:29       ` Robert Hoo
2023-02-09  2:40 ` [PATCH v4 5/9] KVM: x86: Untag LAM bits when applicable Robert Hoo
2023-02-10 15:04   ` Chao Gao
2023-02-11  5:57     ` Robert Hoo
2023-02-16  6:37       ` Binbin Wu
2023-02-09  2:40 ` [PATCH v4 6/9] KVM: x86: When KVM judges CR3 valid or not, consider LAM bits Robert Hoo
2023-02-13  2:01   ` Chao Gao
2023-02-13 13:25     ` Robert Hoo
2023-02-14  6:18       ` Chao Gao
2023-02-14  7:00         ` Chao Gao
2023-02-18  5:44         ` Robert Hoo
2023-02-09  2:40 ` [PATCH v4 7/9] KVM: x86: When guest set CR3, handle LAM bits semantics Robert Hoo
2023-02-13  3:31   ` Chao Gao
2023-02-14  5:28     ` Robert Hoo
2023-02-14  6:48       ` Chao Gao
2023-02-09  2:40 ` [PATCH v4 8/9] KVM: x86: emulation: Apply LAM when emulating data access Robert Hoo
2023-02-13  3:53   ` Chao Gao
2023-02-14  5:38     ` Robert Hoo
2023-02-09  2:40 ` [PATCH v4 9/9] KVM: x86: LAM: Expose LAM CPUID to user space VMM Robert Hoo
2023-02-21  5:47   ` Binbin Wu
2023-02-21  7:26     ` Robert Hoo
2023-02-21  8:26       ` Binbin Wu
2023-02-21 11:13         ` Yu Zhang
2023-02-21 13:18           ` Binbin Wu
2023-02-21 14:36             ` Robert Hoo
2023-02-09  6:15 ` [PATCH v4 0/9] Linear Address Masking (LAM) KVM Enabling Chao Gao
2023-02-09 12:25   ` Robert Hoo
2023-02-09 17:27     ` Sean Christopherson
2023-02-10  2:07       ` Robert Hoo
2023-02-10  3:17         ` Chao Gao
2023-02-10  8:41           ` Robert Hoo
2023-02-10  8:39         ` Robert Hoo
2023-02-10  9:22           ` Chao Gao
2023-02-13  9:02 ` Binbin Wu
2023-02-13 13:16   ` Robert Hoo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).