All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs
@ 2022-05-20 13:37 Roger Pau Monne
  2022-05-20 13:37 ` [PATCH 1/5] x86/ler: use feature flag to check if option is enabled Roger Pau Monne
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Roger Pau Monne @ 2022-05-20 13:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu,
	Jun Nakajima, Kevin Tian

Hello,

Intel Sapphire Rapids CPUs doesn't have model-specific MSRs, and hence
only architectural LBRs are available.

Firstly implement some changes so Xen knows how to enable arch LBRs so
that the ler option can also work in such scenario (first two patches).

The lack of model-specific LBRs also affects guests, as setting
DEBUGCTLMSR.LBR is now ignored (value hardwired to 0, writes ignored) by
the hardware due to the lack of model-specific LBRs.  The LBR format
reported in PERF_CAPABILITIES also need to be exposed, as that's a way
for guests to detect lack of model-specific LBRs presence (patches 3
and 4).

Patch 5 is an indentation fix that can be merged into patch 4: done
separately to help readability of patch 4.

Thanks, Roger.

Roger Pau Monne (5):
  x86/ler: use feature flag to check if option is enabled
  x86/lbr: enable hypervisor LER with arch LBR
  x86/perf: expose LBR format in PERF_CAPABILITIES
  x86/vmx: handle no model-specific LBR presence
  x86/vmx: fix indentation of LBR

 xen/arch/x86/hvm/vmx/vmx.c                  | 59 ++++++++++++++-------
 xen/arch/x86/include/asm/msr-index.h        | 18 +++++++
 xen/arch/x86/msr.c                          |  9 ++++
 xen/arch/x86/traps.c                        | 29 ++++++++--
 xen/arch/x86/x86_64/traps.c                 |  2 +-
 xen/include/public/arch-x86/cpufeatureset.h |  3 +-
 6 files changed, 97 insertions(+), 23 deletions(-)

-- 
2.36.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/5] x86/ler: use feature flag to check if option is enabled
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
@ 2022-05-20 13:37 ` Roger Pau Monne
  2022-05-30 15:00   ` Jan Beulich
  2022-05-20 13:37 ` [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR Roger Pau Monne
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2022-05-20 13:37 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

It's more consistent with the rest of the usages of cpu_has_xen_lbr.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/x86_64/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 9d7f1f818b..24c5067ca2 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -156,7 +156,7 @@ void show_registers(const struct cpu_user_regs *regs)
     printk("CPU:    %d\n", smp_processor_id());
     _show_registers(&fault_regs, fault_crs, context, v);
 
-    if ( ler_msr && !guest_mode(regs) )
+    if ( cpu_has_xen_lbr && !guest_mode(regs) )
     {
         u64 from, to;
 
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
  2022-05-20 13:37 ` [PATCH 1/5] x86/ler: use feature flag to check if option is enabled Roger Pau Monne
@ 2022-05-20 13:37 ` Roger Pau Monne
  2022-05-30 15:31   ` Jan Beulich
  2022-05-20 13:37 ` [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES Roger Pau Monne
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2022-05-20 13:37 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

CPUs having no model-specific LBRs don't implement DEBUGCTLMSR.LBR
and LBRs can only be enabled if the processor supports architectural
LBRs.

Split the logic to enable LBRs into a separate function and expand the
logic to also implement support for arch LBRs if model-specific LBRs
are not supported.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/include/asm/msr-index.h        | 18 +++++++++++++
 xen/arch/x86/traps.c                        | 29 ++++++++++++++++++---
 xen/include/public/arch-x86/cpufeatureset.h |  1 +
 3 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 3e038db618..7b08e1804b 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -139,6 +139,24 @@
 #define  PASID_PASID_MASK                   0x000fffff
 #define  PASID_VALID                        (_AC(1, ULL) << 31)
 
+#define MSR_ARCH_LBR_CTL                    0x000014ce
+#define  ARCH_LBR_CTL_LBREN                 (_AC(1, ULL) <<  0)
+#define  ARCH_LBR_CTL_OS                    (_AC(1, ULL) <<  1)
+#define  ARCH_LBR_CTL_COND                  (_AC(1, ULL) << 16)
+#define  ARCH_LBR_CTL_NEAR_REL_JMP          (_AC(1, ULL) << 17)
+#define  ARCH_LBR_CTL_NEAR_IND_JMP          (_AC(1, ULL) << 18)
+#define  ARCH_LBR_CTL_NEAR_REL_CALL         (_AC(1, ULL) << 19)
+#define  ARCH_LBR_CTL_NEAR_IND_CALL         (_AC(1, ULL) << 20)
+#define  ARCH_LBR_CTL_NEAR_RET              (_AC(1, ULL) << 21)
+#define  ARCH_LBR_CTL_OTHER_BRANCH          (_AC(1, ULL) << 22)
+#define  ARCH_LBR_CTL_RECORD_ALL            (ARCH_LBR_CTL_COND | \
+                                             ARCH_LBR_CTL_NEAR_REL_JMP | \
+                                             ARCH_LBR_CTL_NEAR_IND_JMP | \
+                                             ARCH_LBR_CTL_NEAR_REL_CALL | \
+                                             ARCH_LBR_CTL_NEAR_IND_CALL | \
+                                             ARCH_LBR_CTL_NEAR_RET | \
+                                             ARCH_LBR_CTL_OTHER_BRANCH)
+
 #define MSR_EFER                            0xc0000080 /* Extended Feature Enable Register */
 #define  EFER_SCE                           (_AC(1, ULL) <<  0) /* SYSCALL Enable */
 #define  EFER_LME                           (_AC(1, ULL) <<  8) /* Long Mode Enable */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 4c38f6c015..133348d9f9 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1963,6 +1963,29 @@ void do_device_not_available(struct cpu_user_regs *regs)
 #endif
 }
 
+static bool enable_lbr(void)
+{
+    uint64_t debugctl;
+
+    wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
+    rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
+    if ( !(debugctl & IA32_DEBUGCTLMSR_LBR) )
+    {
+        /*
+         * CPUs with no model-specific LBRs always return DEBUGCTLMSR.LBR
+         * == 0, attempt to set arch LBR if available.
+         */
+        if ( !boot_cpu_has(X86_FEATURE_ARCH_LBR) )
+            return false;
+
+        /* Note that LASTINT{FROMIP,TOIP} matches LER_{FROM_IP,TO_IP} */
+        wrmsrl(MSR_ARCH_LBR_CTL, ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_OS |
+                                 ARCH_LBR_CTL_RECORD_ALL);
+    }
+
+    return true;
+}
+
 void do_debug(struct cpu_user_regs *regs)
 {
     unsigned long dr6;
@@ -1997,7 +2020,7 @@ void do_debug(struct cpu_user_regs *regs)
 
     /* #DB automatically disabled LBR.  Reinstate it if debugging Xen. */
     if ( cpu_has_xen_lbr )
-        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
+        enable_lbr();
 
     if ( !guest_mode(regs) )
     {
@@ -2179,8 +2202,8 @@ void percpu_traps_init(void)
     if ( !ler_msr && (ler_msr = calc_ler_msr()) )
         setup_force_cpu_cap(X86_FEATURE_XEN_LBR);
 
-    if ( cpu_has_xen_lbr )
-        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
+    if ( cpu_has_xen_lbr && !enable_lbr() )
+        printk(XENLOG_ERR "CPU#%u: failed to enable LBR\n", smp_processor_id());
 }
 
 void __init init_idt_traps(void)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 9cee4b439e..cd6409f9f3 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -280,6 +280,7 @@ XEN_CPUFEATURE(RTM_ALWAYS_ABORT, 9*32+11) /*! June 2021 TSX defeaturing in micro
 XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */
 XEN_CPUFEATURE(SERIALIZE,     9*32+14) /*A  SERIALIZE insn */
 XEN_CPUFEATURE(TSXLDTRK,      9*32+16) /*a  TSX load tracking suspend/resume insns */
+XEN_CPUFEATURE(ARCH_LBR,      9*32+19) /*   Intel ARCH LBR */
 XEN_CPUFEATURE(CET_IBT,       9*32+20) /*   CET - Indirect Branch Tracking */
 XEN_CPUFEATURE(IBRSB,         9*32+26) /*A  IBRS and IBPB support (used by Intel) */
 XEN_CPUFEATURE(STIBP,         9*32+27) /*A  STIBP */
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
  2022-05-20 13:37 ` [PATCH 1/5] x86/ler: use feature flag to check if option is enabled Roger Pau Monne
  2022-05-20 13:37 ` [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR Roger Pau Monne
@ 2022-05-20 13:37 ` Roger Pau Monne
  2022-05-20 14:10   ` Andrew Cooper
  2022-05-20 13:37 ` [PATCH 4/5] x86/vmx: handle no model-specific LBR presence Roger Pau Monne
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2022-05-20 13:37 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

Allow exposing the PDCM bit in CPUID for HVM guests if present on the
platform, which in turn allows exposing PERF_CAPABILITIES.  Limit the
information exposed in PERF_CAPABILITIES to the LBR format only.

This is helpful as hardware without model-specific LBRs set format to
0x3f in order to notify the feature is not present.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Seeing as we have never exposed PDCM in CPUID I wonder whether there's
something that I'm missing that makes exposing PERF_CAPABILITIES LBR
format not as trivial as it looks.
---
 xen/arch/x86/msr.c                          | 9 +++++++++
 xen/include/public/arch-x86/cpufeatureset.h | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 01a15857b7..423a795d1d 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -316,6 +316,15 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = 0;
         break;
 
+    case MSR_IA32_PERF_CAPABILITIES:
+        if ( !cp->basic.pdcm )
+            goto gp_fault;
+
+        /* Only report LBR format. */
+        rdmsrl(MSR_IA32_PERF_CAPABILITIES, *val);
+        *val &= MSR_IA32_PERF_CAP_LBR_FORMAT;
+        break;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index cd6409f9f3..5fdaec43c5 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
 XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
 XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
 XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
-XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
+XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
 XEN_CPUFEATURE(PCID,          1*32+17) /*H  Process Context ID */
 XEN_CPUFEATURE(DCA,           1*32+18) /*   Direct Cache Access */
 XEN_CPUFEATURE(SSE4_1,        1*32+19) /*A  Streaming SIMD Extensions 4.1 */
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/5] x86/vmx: handle no model-specific LBR presence
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
                   ` (2 preceding siblings ...)
  2022-05-20 13:37 ` [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES Roger Pau Monne
@ 2022-05-20 13:37 ` Roger Pau Monne
  2022-05-30 16:02   ` Jan Beulich
  2022-05-20 13:37 ` [PATCH 5/5] x86/vmx: fix indentation of LBR Roger Pau Monne
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2022-05-20 13:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jun Nakajima, Kevin Tian, Jan Beulich,
	Andrew Cooper, Wei Liu

Sapphire Rapids have no model-specific LBRs, and instead only expose
architectural LBRs.  As documented in the Architectural Last Branch
Records specification, processors not supporting model-specific LBRs
MSR_IA32_DEBUGCTLMSR.LBR has no meaning, and can be written to 0 or 1,
but reads will always return 0.

Implement support in vmx_msr_write_intercept() by adding generic
detection of lack of model-specific LBRs by checking if the LBR format
reported in PERF_CAPABILITIES matches 0x3f, which is explicitly listed
in the manual as a way to signal lack of model-specific LBRs
presence.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Note the indentation change in vmx_msr_write_intercept() as a result
of the addition of a new condition is left for a following patch in
order to aid readability of the change.
---
 xen/arch/x86/hvm/vmx/vmx.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index cf428a4849..3f45ac05c6 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3007,6 +3007,8 @@ static const struct lbr_info {
     { MSR_GM_LASTBRANCH_0_FROM_IP,  NUM_MSR_GM_LASTBRANCH_FROM_TO },
     { MSR_GM_LASTBRANCH_0_TO_IP,    NUM_MSR_GM_LASTBRANCH_FROM_TO },
     { 0, 0 }
+}, no_lbr[] = {
+    {0, 0}
 };
 
 static const struct lbr_info *last_branch_msr_get(void)
@@ -3070,6 +3072,21 @@ static const struct lbr_info *last_branch_msr_get(void)
         /* Goldmont */
         case 0x5c: case 0x5f:
             return gm_lbr;
+
+        default:
+            if ( cpu_has_pdcm )
+            {
+                uint64_t cap;
+
+                rdmsrl(MSR_IA32_PERF_CAPABILITIES, cap);
+                if ( (cap & MSR_IA32_PERF_CAP_LBR_FORMAT) == 0x3f )
+                    /*
+                     * On processors that do not support model-specific LBRs,
+                     * PERF_CAPABILITIES.LBR_FMT will have the value 0x3f.
+                     */
+                    return no_lbr;
+            }
+            break;
         }
         break;
 
@@ -3521,6 +3538,8 @@ static int cf_check vmx_msr_write_intercept(
                 return X86EMUL_OKAY;
             }
 
+            if ( lbr->count )
+            {
             for ( ; lbr->count; lbr++ )
             {
                 unsigned int i;
@@ -3546,6 +3565,10 @@ static int cf_check vmx_msr_write_intercept(
                 v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_TSX;
             if ( ler_to_fixup_needed )
                 v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_LER_TO;
+            }
+            else
+                /* No model specific LBRs, ignore DEBUGCTLMSR.LBR. */
+                msr_content &= ~IA32_DEBUGCTLMSR_LBR;
         }
 
         __vmwrite(GUEST_IA32_DEBUGCTL, msr_content);
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/5] x86/vmx: fix indentation of LBR
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
                   ` (3 preceding siblings ...)
  2022-05-20 13:37 ` [PATCH 4/5] x86/vmx: handle no model-specific LBR presence Roger Pau Monne
@ 2022-05-20 13:37 ` Roger Pau Monne
  2022-06-29  6:40   ` Tian, Kevin
  2022-06-17  3:24 ` [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Henry Wang
  2022-07-06  7:30 ` Henry Wang
  6 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2022-05-20 13:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jun Nakajima, Kevin Tian, Jan Beulich,
	Andrew Cooper, Wei Liu

Properly indent the handling of LBR enable in MSR_IA32_DEBUGCTLMSR
vmx_msr_write_intercept().

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Feel free to squash onto the previous patch, did separately to aid the
readability of the previous change.
---
 xen/arch/x86/hvm/vmx/vmx.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 3f45ac05c6..ff10b293a4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3540,31 +3540,31 @@ static int cf_check vmx_msr_write_intercept(
 
             if ( lbr->count )
             {
-            for ( ; lbr->count; lbr++ )
-            {
-                unsigned int i;
-
-                for ( i = 0; i < lbr->count; i++ )
+                for ( ; lbr->count; lbr++ )
                 {
-                    int rc = vmx_add_guest_msr(v, lbr->base + i, 0);
+                    unsigned int i;
 
-                    if ( unlikely(rc) )
+                    for ( i = 0; i < lbr->count; i++ )
                     {
-                        gprintk(XENLOG_ERR,
-                                "Guest load/save list error %d\n", rc);
-                        domain_crash(v->domain);
-                        return X86EMUL_OKAY;
-                    }
+                        int rc = vmx_add_guest_msr(v, lbr->base + i, 0);
 
-                    vmx_clear_msr_intercept(v, lbr->base + i, VMX_MSR_RW);
+                        if ( unlikely(rc) )
+                        {
+                            gprintk(XENLOG_ERR,
+                                    "Guest load/save list error %d\n", rc);
+                            domain_crash(v->domain);
+                            return X86EMUL_OKAY;
+                        }
+
+                        vmx_clear_msr_intercept(v, lbr->base + i, VMX_MSR_RW);
+                    }
                 }
-            }
 
-            v->arch.hvm.vmx.lbr_flags |= LBR_MSRS_INSERTED;
-            if ( lbr_tsx_fixup_needed )
-                v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_TSX;
-            if ( ler_to_fixup_needed )
-                v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_LER_TO;
+                v->arch.hvm.vmx.lbr_flags |= LBR_MSRS_INSERTED;
+                if ( lbr_tsx_fixup_needed )
+                    v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_TSX;
+                if ( ler_to_fixup_needed )
+                    v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_LER_TO;
             }
             else
                 /* No model specific LBRs, ignore DEBUGCTLMSR.LBR. */
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 13:37 ` [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES Roger Pau Monne
@ 2022-05-20 14:10   ` Andrew Cooper
  2022-05-20 14:19     ` Jan Beulich
  2022-05-20 14:52     ` Roger Pau Monné
  0 siblings, 2 replies; 22+ messages in thread
From: Andrew Cooper @ 2022-05-20 14:10 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich, Wei Liu

On 20/05/2022 14:37, Roger Pau Monne wrote:
> Allow exposing the PDCM bit in CPUID for HVM guests if present on the
> platform, which in turn allows exposing PERF_CAPABILITIES.  Limit the
> information exposed in PERF_CAPABILITIES to the LBR format only.
>
> This is helpful as hardware without model-specific LBRs set format to
> 0x3f in order to notify the feature is not present.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Seeing as we have never exposed PDCM in CPUID I wonder whether there's
> something that I'm missing that makes exposing PERF_CAPABILITIES LBR
> format not as trivial as it looks.
> ---
>  xen/arch/x86/msr.c                          | 9 +++++++++
>  xen/include/public/arch-x86/cpufeatureset.h | 2 +-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 01a15857b7..423a795d1d 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -316,6 +316,15 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>          *val = 0;
>          break;
>  
> +    case MSR_IA32_PERF_CAPABILITIES:
> +        if ( !cp->basic.pdcm )
> +            goto gp_fault;
> +
> +        /* Only report LBR format. */
> +        rdmsrl(MSR_IA32_PERF_CAPABILITIES, *val);
> +        *val &= MSR_IA32_PERF_CAP_LBR_FORMAT;

Urgh.  We should have this info cached from boot.  Except caching on
boot is broken on hybrid cpus.  The P and E cores of an AlderLake report
a different format iirc (they differ between linear, and effective addr).

Given the other pain points with hybrid cpus, I think we can ignore it
in the short term.

> +        break;
> +
>      case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
>          if ( !is_hvm_domain(d) || v != curr )
>              goto gp_fault;
> diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
> index cd6409f9f3..5fdaec43c5 100644
> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
>  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
>  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
>  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
> -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
> +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */

This is the bit which requires more toolstack logic to safely enable. 
Using 's' for off-by-default is fine if we want to get the series in now.

But before we expose the MSR generally, we need to:

1) Put the configuration in msr_policy so the toolstack can reason about it
2) Reject migration attempts to destinations where the LBR format changes
3) Actually put the lBR registers in the migration stream

~Andrew

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 14:10   ` Andrew Cooper
@ 2022-05-20 14:19     ` Jan Beulich
  2022-05-20 14:58       ` Andrew Cooper
  2022-05-20 14:52     ` Roger Pau Monné
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-20 14:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monne, xen-devel

On 20.05.2022 16:10, Andrew Cooper wrote:
> On 20/05/2022 14:37, Roger Pau Monne wrote:
>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>> @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
>>  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
>>  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
>>  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
>> -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
>> +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
> 
> This is the bit which requires more toolstack logic to safely enable. 
> Using 's' for off-by-default is fine if we want to get the series in now.
> 
> But before we expose the MSR generally, we need to:
> 
> 1) Put the configuration in msr_policy so the toolstack can reason about it
> 2) Reject migration attempts to destinations where the LBR format changes

Since this could be quite restrictive, and since people needing to know
they need to hide this feature for migration to work, I guess this would
further want qualifying by "did the guest actually use LBRs so far"?

Jan

> 3) Actually put the lBR registers in the migration stream
> 
> ~Andrew



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 14:10   ` Andrew Cooper
  2022-05-20 14:19     ` Jan Beulich
@ 2022-05-20 14:52     ` Roger Pau Monné
  1 sibling, 0 replies; 22+ messages in thread
From: Roger Pau Monné @ 2022-05-20 14:52 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Jan Beulich, Wei Liu

On Fri, May 20, 2022 at 02:10:31PM +0000, Andrew Cooper wrote:
> On 20/05/2022 14:37, Roger Pau Monne wrote:
> > Allow exposing the PDCM bit in CPUID for HVM guests if present on the
> > platform, which in turn allows exposing PERF_CAPABILITIES.  Limit the
> > information exposed in PERF_CAPABILITIES to the LBR format only.
> >
> > This is helpful as hardware without model-specific LBRs set format to
> > 0x3f in order to notify the feature is not present.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Seeing as we have never exposed PDCM in CPUID I wonder whether there's
> > something that I'm missing that makes exposing PERF_CAPABILITIES LBR
> > format not as trivial as it looks.
> > ---
> >  xen/arch/x86/msr.c                          | 9 +++++++++
> >  xen/include/public/arch-x86/cpufeatureset.h | 2 +-
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> > index 01a15857b7..423a795d1d 100644
> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -316,6 +316,15 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >          *val = 0;
> >          break;
> >  
> > +    case MSR_IA32_PERF_CAPABILITIES:
> > +        if ( !cp->basic.pdcm )
> > +            goto gp_fault;
> > +
> > +        /* Only report LBR format. */
> > +        rdmsrl(MSR_IA32_PERF_CAPABILITIES, *val);
> > +        *val &= MSR_IA32_PERF_CAP_LBR_FORMAT;
> 
> Urgh.  We should have this info cached from boot.  Except caching on
> boot is broken on hybrid cpus.  The P and E cores of an AlderLake report
> a different format iirc (they differ between linear, and effective addr).
> 
> Given the other pain points with hybrid cpus, I think we can ignore it
> in the short term.
> 
> > +        break;
> > +
> >      case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
> >          if ( !is_hvm_domain(d) || v != curr )
> >              goto gp_fault;
> > diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
> > index cd6409f9f3..5fdaec43c5 100644
> > --- a/xen/include/public/arch-x86/cpufeatureset.h
> > +++ b/xen/include/public/arch-x86/cpufeatureset.h
> > @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
> >  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
> >  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
> >  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
> > -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
> > +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
> 
> This is the bit which requires more toolstack logic to safely enable. 
> Using 's' for off-by-default is fine if we want to get the series in now.
> 
> But before we expose the MSR generally, we need to:
> 
> 1) Put the configuration in msr_policy so the toolstack can reason about it
> 2) Reject migration attempts to destinations where the LBR format changes
> 3) Actually put the lBR registers in the migration stream

So far we have allowed guests to enable LBRs (DEBUGCTLMSR.LBR) freely
without any restrictions, and migration of guests using LBRs certainly
won't work currently, hence I wonder why exposing the LBR format makes
it worse as to require all this extra handling.

I'm not saying it's not worth having, but IMO we should better spend
the time in getting architectural LBRs available to guests and
migration safe, and for architectural LBRs we don't really care about
PERF_CAPABILITIES.LBR_FORMAT other than hardcoding it to 0x3f.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 14:19     ` Jan Beulich
@ 2022-05-20 14:58       ` Andrew Cooper
  2022-05-23  8:04         ` Roger Pau Monné
  2022-05-23  8:12         ` Jan Beulich
  0 siblings, 2 replies; 22+ messages in thread
From: Andrew Cooper @ 2022-05-20 14:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Roger Pau Monne, xen-devel

On 20/05/2022 15:19, Jan Beulich wrote:
> On 20.05.2022 16:10, Andrew Cooper wrote:
>> On 20/05/2022 14:37, Roger Pau Monne wrote:
>>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>>> @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
>>>  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
>>>  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
>>>  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
>>> -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
>>> +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
>> This is the bit which requires more toolstack logic to safely enable. 
>> Using 's' for off-by-default is fine if we want to get the series in now.
>>
>> But before we expose the MSR generally, we need to:
>>
>> 1) Put the configuration in msr_policy so the toolstack can reason about it
>> 2) Reject migration attempts to destinations where the LBR format changes
> Since this could be quite restrictive, and since people needing to know
> they need to hide this feature for migration to work, I guess this would
> further want qualifying by "did the guest actually use LBRs so far"?

In practice, it's every major generation ("tock" on Intel's old model),
so isn't actually limiting the kinds of heterogeneous setups used in
production.  (Migration gets steadily less stable the further apart the
two CPUs are.)

As to dynamic, no - that would be a security bug in a cloud scenario,
because there must not be anything the guest can do to interfere with
the manageability.

Use of LBR is rare, as demonstrated by the fact that noone has
complained about the fact that migrating such a VM will malfunction.

As we now have a way of reporting "no model-specific LBR", I'm tempted
to suggest that VMs get no LBR by default, and someone wanting LBR has
to opt in, which is also an explicit agreement to the migration limitation.

~Andrew

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 14:58       ` Andrew Cooper
@ 2022-05-23  8:04         ` Roger Pau Monné
  2022-05-23  8:12         ` Jan Beulich
  1 sibling, 0 replies; 22+ messages in thread
From: Roger Pau Monné @ 2022-05-23  8:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Wei Liu, xen-devel

On Fri, May 20, 2022 at 02:58:01PM +0000, Andrew Cooper wrote:
> On 20/05/2022 15:19, Jan Beulich wrote:
> > On 20.05.2022 16:10, Andrew Cooper wrote:
> >> On 20/05/2022 14:37, Roger Pau Monne wrote:
> >>> --- a/xen/include/public/arch-x86/cpufeatureset.h
> >>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> >>> @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
> >>>  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
> >>>  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
> >>>  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
> >>> -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
> >>> +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
> >> This is the bit which requires more toolstack logic to safely enable. 
> >> Using 's' for off-by-default is fine if we want to get the series in now.
> >>
> >> But before we expose the MSR generally, we need to:
> >>
> >> 1) Put the configuration in msr_policy so the toolstack can reason about it
> >> 2) Reject migration attempts to destinations where the LBR format changes
> > Since this could be quite restrictive, and since people needing to know
> > they need to hide this feature for migration to work, I guess this would
> > further want qualifying by "did the guest actually use LBRs so far"?
> 
> In practice, it's every major generation ("tock" on Intel's old model),
> so isn't actually limiting the kinds of heterogeneous setups used in
> production.  (Migration gets steadily less stable the further apart the
> two CPUs are.)
> 
> As to dynamic, no - that would be a security bug in a cloud scenario,
> because there must not be anything the guest can do to interfere with
> the manageability.
> 
> Use of LBR is rare, as demonstrated by the fact that noone has
> complained about the fact that migrating such a VM will malfunction.
> 
> As we now have a way of reporting "no model-specific LBR", I'm tempted
> to suggest that VMs get no LBR by default, and someone wanting LBR has
> to opt in, which is also an explicit agreement to the migration limitation.

I did also consider exposing "no model-specific LBR" in
PERF_CAPABILITIES unconditionally, but I was worried whether that
would break existing setups.

If we think that providing an option to expose the native LBR format
in PERF_CAPABILITIES is fine that could be a sensible solution IMO.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-20 14:58       ` Andrew Cooper
  2022-05-23  8:04         ` Roger Pau Monné
@ 2022-05-23  8:12         ` Jan Beulich
  2022-05-23  9:53           ` Roger Pau Monné
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-23  8:12 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monne, xen-devel

On 20.05.2022 16:58, Andrew Cooper wrote:
> On 20/05/2022 15:19, Jan Beulich wrote:
>> On 20.05.2022 16:10, Andrew Cooper wrote:
>>> On 20/05/2022 14:37, Roger Pau Monne wrote:
>>>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>>>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>>>> @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
>>>>  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
>>>>  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
>>>>  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
>>>> -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
>>>> +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
>>> This is the bit which requires more toolstack logic to safely enable. 
>>> Using 's' for off-by-default is fine if we want to get the series in now.
>>>
>>> But before we expose the MSR generally, we need to:
>>>
>>> 1) Put the configuration in msr_policy so the toolstack can reason about it
>>> 2) Reject migration attempts to destinations where the LBR format changes
>> Since this could be quite restrictive, and since people needing to know
>> they need to hide this feature for migration to work, I guess this would
>> further want qualifying by "did the guest actually use LBRs so far"?
> 
> In practice, it's every major generation ("tock" on Intel's old model),
> so isn't actually limiting the kinds of heterogeneous setups used in
> production.  (Migration gets steadily less stable the further apart the
> two CPUs are.)
> 
> As to dynamic, no - that would be a security bug in a cloud scenario,
> because there must not be anything the guest can do to interfere with
> the manageability.
> 
> Use of LBR is rare, as demonstrated by the fact that noone has
> complained about the fact that migrating such a VM will malfunction.
> 
> As we now have a way of reporting "no model-specific LBR",

Which only rather new guest kernels will know to look for. Hence ...

> I'm tempted
> to suggest that VMs get no LBR by default, and someone wanting LBR has
> to opt in, which is also an explicit agreement to the migration limitation.

... while in principle I agree with this, I see a practical issue.

Jan



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES
  2022-05-23  8:12         ` Jan Beulich
@ 2022-05-23  9:53           ` Roger Pau Monné
  0 siblings, 0 replies; 22+ messages in thread
From: Roger Pau Monné @ 2022-05-23  9:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, May 23, 2022 at 10:12:55AM +0200, Jan Beulich wrote:
> On 20.05.2022 16:58, Andrew Cooper wrote:
> > On 20/05/2022 15:19, Jan Beulich wrote:
> >> On 20.05.2022 16:10, Andrew Cooper wrote:
> >>> On 20/05/2022 14:37, Roger Pau Monne wrote:
> >>>> --- a/xen/include/public/arch-x86/cpufeatureset.h
> >>>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> >>>> @@ -135,7 +135,7 @@ XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensio
> >>>>  XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
> >>>>  XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
> >>>>  XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
> >>>> -XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
> >>>> +XEN_CPUFEATURE(PDCM,          1*32+15) /*S  Perf/Debug Capability MSR */
> >>> This is the bit which requires more toolstack logic to safely enable. 
> >>> Using 's' for off-by-default is fine if we want to get the series in now.
> >>>
> >>> But before we expose the MSR generally, we need to:
> >>>
> >>> 1) Put the configuration in msr_policy so the toolstack can reason about it
> >>> 2) Reject migration attempts to destinations where the LBR format changes
> >> Since this could be quite restrictive, and since people needing to know
> >> they need to hide this feature for migration to work, I guess this would
> >> further want qualifying by "did the guest actually use LBRs so far"?
> > 
> > In practice, it's every major generation ("tock" on Intel's old model),
> > so isn't actually limiting the kinds of heterogeneous setups used in
> > production.  (Migration gets steadily less stable the further apart the
> > two CPUs are.)
> > 
> > As to dynamic, no - that would be a security bug in a cloud scenario,
> > because there must not be anything the guest can do to interfere with
> > the manageability.
> > 
> > Use of LBR is rare, as demonstrated by the fact that noone has
> > complained about the fact that migrating such a VM will malfunction.
> > 
> > As we now have a way of reporting "no model-specific LBR",
> 
> Which only rather new guest kernels will know to look for. Hence ...
> 
> > I'm tempted
> > to suggest that VMs get no LBR by default, and someone wanting LBR has
> > to opt in, which is also an explicit agreement to the migration limitation.
> 
> ... while in principle I agree with this, I see a practical issue.

I think it should be fine to expose no model-specific LBR support in
PERF_CAPABILITIES, but we shouldn't change the behavior of
DEBUGCTLMSR.LBR exposed to guests if the underlying platform has
model-specific LBRs and those are known to Xen.

That way old guest kernels that ignore PERF_CAPABILITIES.LBR_FORMAT
will continue to work, while newish kernels that check the format will
avoid using LBRs.

In case we introduce a guest config option to enable LBR, should we
then expose the native LBR format in PERF_CAPABILITIES?  Or would it
be better to just keep the current model and not expose
PERF_CAPABILITIES at all (and don't report PDCM in CPUID in that
case).

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/5] x86/ler: use feature flag to check if option is enabled
  2022-05-20 13:37 ` [PATCH 1/5] x86/ler: use feature flag to check if option is enabled Roger Pau Monne
@ 2022-05-30 15:00   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2022-05-30 15:00 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 20.05.2022 15:37, Roger Pau Monne wrote:
> It's more consistent with the rest of the usages of cpu_has_xen_lbr.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

I wonder though whether the use as a conditional in percpu_traps_init()
wouldn't then better also be replaced, for being consistent.

Jan



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR
  2022-05-20 13:37 ` [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR Roger Pau Monne
@ 2022-05-30 15:31   ` Jan Beulich
  2022-07-01 15:39     ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-30 15:31 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 20.05.2022 15:37, Roger Pau Monne wrote:
> --- a/xen/arch/x86/include/asm/msr-index.h
> +++ b/xen/arch/x86/include/asm/msr-index.h
> @@ -139,6 +139,24 @@
>  #define  PASID_PASID_MASK                   0x000fffff
>  #define  PASID_VALID                        (_AC(1, ULL) << 31)
>  
> +#define MSR_ARCH_LBR_CTL                    0x000014ce
> +#define  ARCH_LBR_CTL_LBREN                 (_AC(1, ULL) <<  0)
> +#define  ARCH_LBR_CTL_OS                    (_AC(1, ULL) <<  1)

Bits 2 and 3 also have meaning (USR and CALL_STACK) according to
ISE version 44. If it was intentional that you omitted those
(perhaps you intended to add only the bits you actually use right
away), it would have been nice if you said so in the description.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1963,6 +1963,29 @@ void do_device_not_available(struct cpu_user_regs *regs)
>  #endif
>  }
>  
> +static bool enable_lbr(void)
> +{
> +    uint64_t debugctl;
> +
> +    wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> +    rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> +    if ( !(debugctl & IA32_DEBUGCTLMSR_LBR) )
> +    {
> +        /*
> +         * CPUs with no model-specific LBRs always return DEBUGCTLMSR.LBR
> +         * == 0, attempt to set arch LBR if available.
> +         */
> +        if ( !boot_cpu_has(X86_FEATURE_ARCH_LBR) )
> +            return false;
> +
> +        /* Note that LASTINT{FROMIP,TOIP} matches LER_{FROM_IP,TO_IP} */
> +        wrmsrl(MSR_ARCH_LBR_CTL, ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_OS |
> +                                 ARCH_LBR_CTL_RECORD_ALL);
> +    }
> +
> +    return true;
> +}

Would it make sense to try architectural LBRs first?

> @@ -1997,7 +2020,7 @@ void do_debug(struct cpu_user_regs *regs)
>  
>      /* #DB automatically disabled LBR.  Reinstate it if debugging Xen. */
>      if ( cpu_has_xen_lbr )
> -        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> +        enable_lbr();
>  
>      if ( !guest_mode(regs) )
>      {
> @@ -2179,8 +2202,8 @@ void percpu_traps_init(void)
>      if ( !ler_msr && (ler_msr = calc_ler_msr()) )
>          setup_force_cpu_cap(X86_FEATURE_XEN_LBR);
>  
> -    if ( cpu_has_xen_lbr )
> -        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> +    if ( cpu_has_xen_lbr && !enable_lbr() )
> +        printk(XENLOG_ERR "CPU#%u: failed to enable LBR\n", smp_processor_id());
>  }

Isn't enable_lbr() failing a strong indication that we shouldn't have
set XEN_LBR just before this? IOW doesn't this want re-arranging such
that the feature bit and maybe also ler_msr (albeit some care would
be required there; in fact I think this is broken for the case of
running on non-{Intel,AMD,Hygon} CPUs [or unrecognized models] but
opt_ler being true) remain unset in that case?

As there's no good place to ask the VMX-related question, it needs to
go here: Aiui with this patch in place VMX guests will be run with
Xen's choice of LBR_CTL. That's different from DebugCtl, which - being
part of the VMCS - would be loaded by the CPU. Such a difference, if
intended, would imo again want pointing out in the description.

Jan



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/5] x86/vmx: handle no model-specific LBR presence
  2022-05-20 13:37 ` [PATCH 4/5] x86/vmx: handle no model-specific LBR presence Roger Pau Monne
@ 2022-05-30 16:02   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2022-05-30 16:02 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Jun Nakajima, Kevin Tian, Andrew Cooper, Wei Liu, xen-devel

On 20.05.2022 15:37, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3007,6 +3007,8 @@ static const struct lbr_info {
>      { MSR_GM_LASTBRANCH_0_FROM_IP,  NUM_MSR_GM_LASTBRANCH_FROM_TO },
>      { MSR_GM_LASTBRANCH_0_TO_IP,    NUM_MSR_GM_LASTBRANCH_FROM_TO },
>      { 0, 0 }
> +}, no_lbr[] = {
> +    {0, 0}
>  };

Instead of introducing this and ...

> @@ -3070,6 +3072,21 @@ static const struct lbr_info *last_branch_msr_get(void)
>          /* Goldmont */
>          case 0x5c: case 0x5f:
>              return gm_lbr;
> +
> +        default:
> +            if ( cpu_has_pdcm )
> +            {
> +                uint64_t cap;
> +
> +                rdmsrl(MSR_IA32_PERF_CAPABILITIES, cap);
> +                if ( (cap & MSR_IA32_PERF_CAP_LBR_FORMAT) == 0x3f )
> +                    /*
> +                     * On processors that do not support model-specific LBRs,
> +                     * PERF_CAPABILITIES.LBR_FMT will have the value 0x3f.
> +                     */
> +                    return no_lbr;

... doing this MSR read every time, can't you store a mask value
once during boot, which you apply to msr_content ...

> @@ -3521,6 +3538,8 @@ static int cf_check vmx_msr_write_intercept(
>                  return X86EMUL_OKAY;
>              }
>  
> +            if ( lbr->count )
> +            {
>              for ( ; lbr->count; lbr++ )
>              {
>                  unsigned int i;

... ahead of the bigger if() enclosing this code (thus also avoiding
the need to re-indent)?

Jan



^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
                   ` (4 preceding siblings ...)
  2022-05-20 13:37 ` [PATCH 5/5] x86/vmx: fix indentation of LBR Roger Pau Monne
@ 2022-06-17  3:24 ` Henry Wang
  2022-07-06  7:30 ` Henry Wang
  6 siblings, 0 replies; 22+ messages in thread
From: Henry Wang @ 2022-06-17  3:24 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Wei Liu, Jun Nakajima, Kevin Tian

Hi,

It seems that this series [1] has been stale for a while, with actions needed
from the author for the first 4 patches and comments needed from the
maintainers for the patch#5. So sending this email as a gentle reminder.
Thanks!

[1] https://patchwork.kernel.org/project/xen-devel/list/?series=643625

Kind regards,
Henry

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Roger Pau Monne
> Subject: [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs
> 
> Hello,
> 
> Intel Sapphire Rapids CPUs doesn't have model-specific MSRs, and hence
> only architectural LBRs are available.
> 
> Firstly implement some changes so Xen knows how to enable arch LBRs so
> that the ler option can also work in such scenario (first two patches).
> 
> The lack of model-specific LBRs also affects guests, as setting
> DEBUGCTLMSR.LBR is now ignored (value hardwired to 0, writes ignored)
> by
> the hardware due to the lack of model-specific LBRs.  The LBR format
> reported in PERF_CAPABILITIES also need to be exposed, as that's a way
> for guests to detect lack of model-specific LBRs presence (patches 3
> and 4).
> 
> Patch 5 is an indentation fix that can be merged into patch 4: done
> separately to help readability of patch 4.
> 
> Thanks, Roger.
> 
> Roger Pau Monne (5):
>   x86/ler: use feature flag to check if option is enabled
>   x86/lbr: enable hypervisor LER with arch LBR
>   x86/perf: expose LBR format in PERF_CAPABILITIES
>   x86/vmx: handle no model-specific LBR presence
>   x86/vmx: fix indentation of LBR
> 
>  xen/arch/x86/hvm/vmx/vmx.c                  | 59 ++++++++++++++-------
>  xen/arch/x86/include/asm/msr-index.h        | 18 +++++++
>  xen/arch/x86/msr.c                          |  9 ++++
>  xen/arch/x86/traps.c                        | 29 ++++++++--
>  xen/arch/x86/x86_64/traps.c                 |  2 +-
>  xen/include/public/arch-x86/cpufeatureset.h |  3 +-
>  6 files changed, 97 insertions(+), 23 deletions(-)
> 
> --
> 2.36.0
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 5/5] x86/vmx: fix indentation of LBR
  2022-05-20 13:37 ` [PATCH 5/5] x86/vmx: fix indentation of LBR Roger Pau Monne
@ 2022-06-29  6:40   ` Tian, Kevin
  0 siblings, 0 replies; 22+ messages in thread
From: Tian, Kevin @ 2022-06-29  6:40 UTC (permalink / raw)
  To: Pau Monné, Roger, xen-devel
  Cc: Pau Monné,
	Roger, Nakajima, Jun, Beulich, Jan, Cooper, Andrew, Wei Liu

> From: Roger Pau Monne
> Sent: Friday, May 20, 2022 9:38 PM
> 
> Properly indent the handling of LBR enable in MSR_IA32_DEBUGCTLMSR
> vmx_msr_write_intercept().
> 
> No functional change.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> Feel free to squash onto the previous patch, did separately to aid the
> readability of the previous change.
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 38 +++++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 3f45ac05c6..ff10b293a4 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3540,31 +3540,31 @@ static int cf_check vmx_msr_write_intercept(
> 
>              if ( lbr->count )
>              {
> -            for ( ; lbr->count; lbr++ )
> -            {
> -                unsigned int i;
> -
> -                for ( i = 0; i < lbr->count; i++ )
> +                for ( ; lbr->count; lbr++ )
>                  {
> -                    int rc = vmx_add_guest_msr(v, lbr->base + i, 0);
> +                    unsigned int i;
> 
> -                    if ( unlikely(rc) )
> +                    for ( i = 0; i < lbr->count; i++ )
>                      {
> -                        gprintk(XENLOG_ERR,
> -                                "Guest load/save list error %d\n", rc);
> -                        domain_crash(v->domain);
> -                        return X86EMUL_OKAY;
> -                    }
> +                        int rc = vmx_add_guest_msr(v, lbr->base + i, 0);
> 
> -                    vmx_clear_msr_intercept(v, lbr->base + i, VMX_MSR_RW);
> +                        if ( unlikely(rc) )
> +                        {
> +                            gprintk(XENLOG_ERR,
> +                                    "Guest load/save list error %d\n", rc);
> +                            domain_crash(v->domain);
> +                            return X86EMUL_OKAY;
> +                        }
> +
> +                        vmx_clear_msr_intercept(v, lbr->base + i, VMX_MSR_RW);
> +                    }
>                  }
> -            }
> 
> -            v->arch.hvm.vmx.lbr_flags |= LBR_MSRS_INSERTED;
> -            if ( lbr_tsx_fixup_needed )
> -                v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_TSX;
> -            if ( ler_to_fixup_needed )
> -                v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_LER_TO;
> +                v->arch.hvm.vmx.lbr_flags |= LBR_MSRS_INSERTED;
> +                if ( lbr_tsx_fixup_needed )
> +                    v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_TSX;
> +                if ( ler_to_fixup_needed )
> +                    v->arch.hvm.vmx.lbr_flags |= LBR_FIXUP_LER_TO;
>              }
>              else
>                  /* No model specific LBRs, ignore DEBUGCTLMSR.LBR. */
> --
> 2.36.0
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR
  2022-05-30 15:31   ` Jan Beulich
@ 2022-07-01 15:39     ` Roger Pau Monné
  2022-07-04  6:15       ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2022-07-01 15:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, May 30, 2022 at 05:31:18PM +0200, Jan Beulich wrote:
> On 20.05.2022 15:37, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/include/asm/msr-index.h
> > +++ b/xen/arch/x86/include/asm/msr-index.h
> > @@ -139,6 +139,24 @@
> >  #define  PASID_PASID_MASK                   0x000fffff
> >  #define  PASID_VALID                        (_AC(1, ULL) << 31)
> >  
> > +#define MSR_ARCH_LBR_CTL                    0x000014ce
> > +#define  ARCH_LBR_CTL_LBREN                 (_AC(1, ULL) <<  0)
> > +#define  ARCH_LBR_CTL_OS                    (_AC(1, ULL) <<  1)
> 
> Bits 2 and 3 also have meaning (USR and CALL_STACK) according to
> ISE version 44. If it was intentional that you omitted those
> (perhaps you intended to add only the bits you actually use right
> away), it would have been nice if you said so in the description.

Yes, I've only added the bits used.  I could add all if that's better.

> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1963,6 +1963,29 @@ void do_device_not_available(struct cpu_user_regs *regs)
> >  #endif
> >  }
> >  
> > +static bool enable_lbr(void)
> > +{
> > +    uint64_t debugctl;
> > +
> > +    wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> > +    rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> > +    if ( !(debugctl & IA32_DEBUGCTLMSR_LBR) )
> > +    {
> > +        /*
> > +         * CPUs with no model-specific LBRs always return DEBUGCTLMSR.LBR
> > +         * == 0, attempt to set arch LBR if available.
> > +         */
> > +        if ( !boot_cpu_has(X86_FEATURE_ARCH_LBR) )
> > +            return false;
> > +
> > +        /* Note that LASTINT{FROMIP,TOIP} matches LER_{FROM_IP,TO_IP} */
> > +        wrmsrl(MSR_ARCH_LBR_CTL, ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_OS |
> > +                                 ARCH_LBR_CTL_RECORD_ALL);
> > +    }
> > +
> > +    return true;
> > +}
> 
> Would it make sense to try architectural LBRs first?

I didn't want to change behavior for existing platforms that have
both architectural and model specific LBRs.

> > @@ -1997,7 +2020,7 @@ void do_debug(struct cpu_user_regs *regs)
> >  
> >      /* #DB automatically disabled LBR.  Reinstate it if debugging Xen. */
> >      if ( cpu_has_xen_lbr )
> > -        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> > +        enable_lbr();
> >  
> >      if ( !guest_mode(regs) )
> >      {
> > @@ -2179,8 +2202,8 @@ void percpu_traps_init(void)
> >      if ( !ler_msr && (ler_msr = calc_ler_msr()) )
> >          setup_force_cpu_cap(X86_FEATURE_XEN_LBR);
> >  
> > -    if ( cpu_has_xen_lbr )
> > -        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> > +    if ( cpu_has_xen_lbr && !enable_lbr() )
> > +        printk(XENLOG_ERR "CPU#%u: failed to enable LBR\n", smp_processor_id());
> >  }
> 
> Isn't enable_lbr() failing a strong indication that we shouldn't have
> set XEN_LBR just before this?

So I've now added extra checks in calc_ler_msr() so that it only
returns != 0 when there's LBR support (either model specific or
architectural).

> IOW doesn't this want re-arranging such
> that the feature bit and maybe also ler_msr (albeit some care would
> be required there; in fact I think this is broken for the case of
> running on non-{Intel,AMD,Hygon} CPUs [or unrecognized models] but
> opt_ler being true) remain unset in that case?

opt_ler will be set to false if calc_ler_msr() return 0, which is the
case for non-{Intel,AMD,Hygon} or unrecognized models.

> As there's no good place to ask the VMX-related question, it needs to
> go here: Aiui with this patch in place VMX guests will be run with
> Xen's choice of LBR_CTL. That's different from DebugCtl, which - being
> part of the VMCS - would be loaded by the CPU. Such a difference, if
> intended, would imo again want pointing out in the description.

LBR_CTL will only be set by Xen when the CPU only supports
architectural LBRs (no model specific LBR support at all), and in that
case LBR support won't be exposed to guests, as the ARCH_LBR CPUID is
not exposed, neither are guests allowed access to ARCH_LBR_CTL.

Note that in such scenario also setting DebugCtl.LBR has not effect, as
there's no model specific LBR support, and the hardware will just
ignore the bit.  Also none of the LBR MSRs are exposed to guests
either.

I can try to clarify all the above in the commit message.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR
  2022-07-01 15:39     ` Roger Pau Monné
@ 2022-07-04  6:15       ` Jan Beulich
  2022-07-04  8:23         ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-07-04  6:15 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 01.07.2022 17:39, Roger Pau Monné wrote:
> On Mon, May 30, 2022 at 05:31:18PM +0200, Jan Beulich wrote:
>> On 20.05.2022 15:37, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/include/asm/msr-index.h
>>> +++ b/xen/arch/x86/include/asm/msr-index.h
>>> @@ -139,6 +139,24 @@
>>>  #define  PASID_PASID_MASK                   0x000fffff
>>>  #define  PASID_VALID                        (_AC(1, ULL) << 31)
>>>  
>>> +#define MSR_ARCH_LBR_CTL                    0x000014ce
>>> +#define  ARCH_LBR_CTL_LBREN                 (_AC(1, ULL) <<  0)
>>> +#define  ARCH_LBR_CTL_OS                    (_AC(1, ULL) <<  1)
>>
>> Bits 2 and 3 also have meaning (USR and CALL_STACK) according to
>> ISE version 44. If it was intentional that you omitted those
>> (perhaps you intended to add only the bits you actually use right
>> away), it would have been nice if you said so in the description.
> 
> Yes, I've only added the bits used.  I could add all if that's better.

Personally I'd slightly prefer if you added all. But if you don't, which
is also okay, please make this explicit in the description.

>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -1963,6 +1963,29 @@ void do_device_not_available(struct cpu_user_regs *regs)
>>>  #endif
>>>  }
>>>  
>>> +static bool enable_lbr(void)
>>> +{
>>> +    uint64_t debugctl;
>>> +
>>> +    wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
>>> +    rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
>>> +    if ( !(debugctl & IA32_DEBUGCTLMSR_LBR) )
>>> +    {
>>> +        /*
>>> +         * CPUs with no model-specific LBRs always return DEBUGCTLMSR.LBR
>>> +         * == 0, attempt to set arch LBR if available.
>>> +         */
>>> +        if ( !boot_cpu_has(X86_FEATURE_ARCH_LBR) )
>>> +            return false;
>>> +
>>> +        /* Note that LASTINT{FROMIP,TOIP} matches LER_{FROM_IP,TO_IP} */
>>> +        wrmsrl(MSR_ARCH_LBR_CTL, ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_OS |
>>> +                                 ARCH_LBR_CTL_RECORD_ALL);
>>> +    }
>>> +
>>> +    return true;
>>> +}
>>
>> Would it make sense to try architectural LBRs first?
> 
> I didn't want to change behavior for existing platforms that have
> both architectural and model specific LBRs.

Are there such platforms? While it may not be said explicitly, so far
I took it that the LBR format indicator being 0x3f was connected to
arch LBR being available.

>> As there's no good place to ask the VMX-related question, it needs to
>> go here: Aiui with this patch in place VMX guests will be run with
>> Xen's choice of LBR_CTL. That's different from DebugCtl, which - being
>> part of the VMCS - would be loaded by the CPU. Such a difference, if
>> intended, would imo again want pointing out in the description.
> 
> LBR_CTL will only be set by Xen when the CPU only supports
> architectural LBRs (no model specific LBR support at all), and in that
> case LBR support won't be exposed to guests, as the ARCH_LBR CPUID is
> not exposed, neither are guests allowed access to ARCH_LBR_CTL.
> 
> Note that in such scenario also setting DebugCtl.LBR has not effect, as
> there's no model specific LBR support, and the hardware will just
> ignore the bit.  Also none of the LBR MSRs are exposed to guests
> either.

My question wasn't about guest support, but about us (perhaps mistakenly)
running guest code with the Xen setting in place. It cannot be excluded
that running with LBR enabled has a performance impact, after all.

> I can try to clarify all the above in the commit message.

Thanks.

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR
  2022-07-04  6:15       ` Jan Beulich
@ 2022-07-04  8:23         ` Roger Pau Monné
  0 siblings, 0 replies; 22+ messages in thread
From: Roger Pau Monné @ 2022-07-04  8:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Mon, Jul 04, 2022 at 08:15:15AM +0200, Jan Beulich wrote:
> On 01.07.2022 17:39, Roger Pau Monné wrote:
> > On Mon, May 30, 2022 at 05:31:18PM +0200, Jan Beulich wrote:
> >> On 20.05.2022 15:37, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/include/asm/msr-index.h
> >>> +++ b/xen/arch/x86/include/asm/msr-index.h
> >>> @@ -139,6 +139,24 @@
> >>>  #define  PASID_PASID_MASK                   0x000fffff
> >>>  #define  PASID_VALID                        (_AC(1, ULL) << 31)
> >>>  
> >>> +#define MSR_ARCH_LBR_CTL                    0x000014ce
> >>> +#define  ARCH_LBR_CTL_LBREN                 (_AC(1, ULL) <<  0)
> >>> +#define  ARCH_LBR_CTL_OS                    (_AC(1, ULL) <<  1)
> >>
> >> Bits 2 and 3 also have meaning (USR and CALL_STACK) according to
> >> ISE version 44. If it was intentional that you omitted those
> >> (perhaps you intended to add only the bits you actually use right
> >> away), it would have been nice if you said so in the description.
> > 
> > Yes, I've only added the bits used.  I could add all if that's better.
> 
> Personally I'd slightly prefer if you added all. But if you don't, which
> is also okay, please make this explicit in the description.
> 
> >>> --- a/xen/arch/x86/traps.c
> >>> +++ b/xen/arch/x86/traps.c
> >>> @@ -1963,6 +1963,29 @@ void do_device_not_available(struct cpu_user_regs *regs)
> >>>  #endif
> >>>  }
> >>>  
> >>> +static bool enable_lbr(void)
> >>> +{
> >>> +    uint64_t debugctl;
> >>> +
> >>> +    wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> >>> +    rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> >>> +    if ( !(debugctl & IA32_DEBUGCTLMSR_LBR) )
> >>> +    {
> >>> +        /*
> >>> +         * CPUs with no model-specific LBRs always return DEBUGCTLMSR.LBR
> >>> +         * == 0, attempt to set arch LBR if available.
> >>> +         */
> >>> +        if ( !boot_cpu_has(X86_FEATURE_ARCH_LBR) )
> >>> +            return false;
> >>> +
> >>> +        /* Note that LASTINT{FROMIP,TOIP} matches LER_{FROM_IP,TO_IP} */
> >>> +        wrmsrl(MSR_ARCH_LBR_CTL, ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_OS |
> >>> +                                 ARCH_LBR_CTL_RECORD_ALL);
> >>> +    }
> >>> +
> >>> +    return true;
> >>> +}
> >>
> >> Would it make sense to try architectural LBRs first?
> > 
> > I didn't want to change behavior for existing platforms that have
> > both architectural and model specific LBRs.
> 
> Are there such platforms? While it may not be said explicitly, so far
> I took it that the LBR format indicator being 0x3f was connected to
> arch LBR being available.

IIRC Ice Lake has both model-specific and architectural LBRs.

While format being 0x3f could indicate the likely presence of arch
LBRs, the CPUID bit need to be checked.

> >> As there's no good place to ask the VMX-related question, it needs to
> >> go here: Aiui with this patch in place VMX guests will be run with
> >> Xen's choice of LBR_CTL. That's different from DebugCtl, which - being
> >> part of the VMCS - would be loaded by the CPU. Such a difference, if
> >> intended, would imo again want pointing out in the description.
> > 
> > LBR_CTL will only be set by Xen when the CPU only supports
> > architectural LBRs (no model specific LBR support at all), and in that
> > case LBR support won't be exposed to guests, as the ARCH_LBR CPUID is
> > not exposed, neither are guests allowed access to ARCH_LBR_CTL.
> > 
> > Note that in such scenario also setting DebugCtl.LBR has not effect, as
> > there's no model specific LBR support, and the hardware will just
> > ignore the bit.  Also none of the LBR MSRs are exposed to guests
> > either.
> 
> My question wasn't about guest support, but about us (perhaps mistakenly)
> running guest code with the Xen setting in place. It cannot be excluded
> that running with LBR enabled has a performance impact, after all.

It's possible.  'ler' option already states that it should be used for
debugging purposes only, so I think it's fine if this results in
slower guest performance, as it's not a general purpose option after
all.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs
  2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
                   ` (5 preceding siblings ...)
  2022-06-17  3:24 ` [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Henry Wang
@ 2022-07-06  7:30 ` Henry Wang
  6 siblings, 0 replies; 22+ messages in thread
From: Henry Wang @ 2022-07-06  7:30 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Wei Liu, Jun Nakajima,
	Roger Pau Monne, Kevin Tian

Hi, 

It seems that this series has been stale for more than a month, so I am sending
this email as a gentle reminder with current status:

Patch #2, #3 might need further actions from the maintainers

Patch #4 might need further actions from the author.

Patch #1 #5 has been reviewed/acked.

Kind regards,
Henry

> -----Original Message-----
> Subject: [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs
> 
> Hello,
> 
> Intel Sapphire Rapids CPUs doesn't have model-specific MSRs, and hence
> only architectural LBRs are available.
> 
> Firstly implement some changes so Xen knows how to enable arch LBRs so
> that the ler option can also work in such scenario (first two patches).
> 
> The lack of model-specific LBRs also affects guests, as setting
> DEBUGCTLMSR.LBR is now ignored (value hardwired to 0, writes ignored) by
> the hardware due to the lack of model-specific LBRs.  The LBR format
> reported in PERF_CAPABILITIES also need to be exposed, as that's a way
> for guests to detect lack of model-specific LBRs presence (patches 3
> and 4).
> 
> Patch 5 is an indentation fix that can be merged into patch 4: done
> separately to help readability of patch 4.
> 
> Thanks, Roger.
> 
> Roger Pau Monne (5):
>   x86/ler: use feature flag to check if option is enabled
>   x86/lbr: enable hypervisor LER with arch LBR
>   x86/perf: expose LBR format in PERF_CAPABILITIES
>   x86/vmx: handle no model-specific LBR presence
>   x86/vmx: fix indentation of LBR
> 
>  xen/arch/x86/hvm/vmx/vmx.c                  | 59 ++++++++++++++-------
>  xen/arch/x86/include/asm/msr-index.h        | 18 +++++++
>  xen/arch/x86/msr.c                          |  9 ++++
>  xen/arch/x86/traps.c                        | 29 ++++++++--
>  xen/arch/x86/x86_64/traps.c                 |  2 +-
>  xen/include/public/arch-x86/cpufeatureset.h |  3 +-
>  6 files changed, 97 insertions(+), 23 deletions(-)
> 
> --
> 2.36.0
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-07-06  7:30 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-20 13:37 [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Roger Pau Monne
2022-05-20 13:37 ` [PATCH 1/5] x86/ler: use feature flag to check if option is enabled Roger Pau Monne
2022-05-30 15:00   ` Jan Beulich
2022-05-20 13:37 ` [PATCH 2/5] x86/lbr: enable hypervisor LER with arch LBR Roger Pau Monne
2022-05-30 15:31   ` Jan Beulich
2022-07-01 15:39     ` Roger Pau Monné
2022-07-04  6:15       ` Jan Beulich
2022-07-04  8:23         ` Roger Pau Monné
2022-05-20 13:37 ` [PATCH 3/5] x86/perf: expose LBR format in PERF_CAPABILITIES Roger Pau Monne
2022-05-20 14:10   ` Andrew Cooper
2022-05-20 14:19     ` Jan Beulich
2022-05-20 14:58       ` Andrew Cooper
2022-05-23  8:04         ` Roger Pau Monné
2022-05-23  8:12         ` Jan Beulich
2022-05-23  9:53           ` Roger Pau Monné
2022-05-20 14:52     ` Roger Pau Monné
2022-05-20 13:37 ` [PATCH 4/5] x86/vmx: handle no model-specific LBR presence Roger Pau Monne
2022-05-30 16:02   ` Jan Beulich
2022-05-20 13:37 ` [PATCH 5/5] x86/vmx: fix indentation of LBR Roger Pau Monne
2022-06-29  6:40   ` Tian, Kevin
2022-06-17  3:24 ` [PATCH 0/5] x86/lbr: handle lack of model-specific LBRs Henry Wang
2022-07-06  7:30 ` Henry Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.