* [PATCH 0/3] fix MSR_LAST_BRANCH_FROM Haswell support
@ 2016-06-02 2:42 David Carrillo-Cisneros
  2016-06-02 2:42 ` [PATCH 1/3] perf/x86/intel: output LBR support statement after validation David Carrillo-Cisneros
  ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: David Carrillo-Cisneros @ 2016-06-02 2:42 UTC (permalink / raw)
To: linux-kernel
Cc: x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Peter Zijlstra,
    David Carrillo-Cisneros, Stephane Eranian

The patch:
  "perf/x86/intel: Protect LBR and extra_regs against KVM lying"
introduced an extra test for LBR support but did not move the dmesg
output accordingly. This is fixed in the first patch in this series.

When a machine that used LBR is rebooted using kexec, the extra test for
LBR support may fail due to a hw bug/quirk in Haswell that generates a
#GPF when restoring the value of MSR_LAST_BRANCH_FROM_* msrs with
sign-extended values (e.g. kernel addresses). During normal execution,
this problem is masked by a work-around for another hw bug that
prevented FREEZE_LBRS_ON_PMI from being used with kernel branches,
introduced in "perf/x86/intel: Add Haswell LBR call stack support".

The second patch in this series contains a work-around for the
MSR_LAST_BRANCH_FROM_* hw bug/quirk.

The last patch is not to be committed, but to test the second patch. It
removes the effect of the FREEZE_LBRS_ON_PMI work-around by allowing LBR
call-stack for kernel addresses.

This series is rebased on torvalds/linux master.
David Carrillo-Cisneros (3): perf/x86/intel: output LBR support statement after validation perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX perf, perf/tool: trigger lbr_from signext bug arch/x86/events/intel/core.c | 20 +++++++++++ arch/x86/events/intel/lbr.c | 83 +++++++++++++++++++++++++++++++++++++------- arch/x86/events/perf_event.h | 2 ++ tools/perf/util/evsel.c | 17 +++++++-- 4 files changed, 106 insertions(+), 16 deletions(-) -- 2.8.0.rc3.226.g39d4020 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/3] perf/x86/intel: output LBR support statement after validation
  2016-06-02 2:42 [PATCH 0/3] fix MSR_LAST_BRANCH_FROM Haswell support David Carrillo-Cisneros
@ 2016-06-02 2:42 ` David Carrillo-Cisneros
  2016-06-02 8:21   ` Peter Zijlstra
  2016-06-02 16:04   ` Andi Kleen
  2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros
  2016-06-02 2:42 ` [PATCH 3/3] perf, perf/tool: trigger lbr_from signext bug David Carrillo-Cisneros
  2 siblings, 2 replies; 17+ messages in thread
From: David Carrillo-Cisneros @ 2016-06-02 2:42 UTC (permalink / raw)
To: linux-kernel
Cc: x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Peter Zijlstra,
    David Carrillo-Cisneros, Stephane Eranian

Commit "perf/x86/intel: Protect LBR and extra_regs against KVM lying"
added an additional test to LBR support detection that is performed
after printing the LBR support statement to dmesg. Move the LBR support
output after the very last test.

Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
---
 arch/x86/events/intel/core.c | 2 ++
 arch/x86/events/intel/lbr.c  | 9 ---------
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 7c66695..a5e52ad4 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3885,6 +3885,8 @@ __init int intel_pmu_init(void)
 		x86_pmu.lbr_nr = 0;
 	}
 
+	if (x86_pmu.lbr_nr)
+		pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);
 	/*
 	 * Access extra MSR may cause #GP under certain circumstances.
 	 * E.g. KVM doesn't support offcore event
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 9e2b40c..2dca66c 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -956,7 +956,6 @@ void __init intel_pmu_lbr_init_core(void)
 	 * SW branch filter usage:
 	 * - compensate for lack of HW filter
 	 */
-	pr_cont("4-deep LBR, ");
 }
 
 /* nehalem/westmere */
@@ -977,7 +976,6 @@ void __init intel_pmu_lbr_init_nhm(void)
 	 * That requires LBR_FAR but that means far
 	 * jmp need to be filtered out
 	 */
-	pr_cont("16-deep LBR, ");
 }
 
 /* sandy bridge */
@@ -997,7 +995,6 @@ void __init intel_pmu_lbr_init_snb(void)
 	 * That requires LBR_FAR but that means far
 	 * jmp need to be filtered out
 	 */
-	pr_cont("16-deep LBR, ");
 }
 
 /* haswell */
@@ -1010,8 +1007,6 @@ void intel_pmu_lbr_init_hsw(void)
 
 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = hsw_lbr_sel_map;
-
-	pr_cont("16-deep LBR, ");
 }
 
 /* skylake */
@@ -1031,7 +1026,6 @@ __init void intel_pmu_lbr_init_skl(void)
 	 * That requires LBR_FAR but that means far
 	 * jmp need to be filtered out
 	 */
-	pr_cont("32-deep LBR, ");
 }
 
 /* atom */
@@ -1057,7 +1051,6 @@ void __init intel_pmu_lbr_init_atom(void)
 	 * SW branch filter usage:
 	 * - compensate for lack of HW filter
 	 */
-	pr_cont("8-deep LBR, ");
 }
 
 /* slm */
@@ -1088,6 +1081,4 @@ void intel_pmu_lbr_init_knl(void)
 
 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
-
-	pr_cont("8-deep LBR, ");
 }
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel: output LBR support statement after validation 2016-06-02 2:42 ` [PATCH 1/3] perf/x86/intel: output LBR support statement after validation David Carrillo-Cisneros @ 2016-06-02 8:21 ` Peter Zijlstra 2016-06-02 16:04 ` Andi Kleen 1 sibling, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2016-06-02 8:21 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Stephane Eranian On Wed, Jun 01, 2016 at 07:42:01PM -0700, David Carrillo-Cisneros wrote: > Commit "perf/x86/intel: Protect LBR and extra_regs against KVM lying" Commit references are done like: 338b522ca43c ("perf/x86/intel: Protect LBR and extra_regs against KVM lying") Please use that, looking them up on names is a royal pain. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel: output LBR support statement after validation 2016-06-02 2:42 ` [PATCH 1/3] perf/x86/intel: output LBR support statement after validation David Carrillo-Cisneros 2016-06-02 8:21 ` Peter Zijlstra @ 2016-06-02 16:04 ` Andi Kleen 2016-06-02 17:28 ` Stephane Eranian 1 sibling, 1 reply; 17+ messages in thread From: Andi Kleen @ 2016-06-02 16:04 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Kan Liang, Peter Zijlstra, Stephane Eranian I don't think the context switch support is really needed. It's only needed for saving/restoring LBRs, and we only do that with LBR callstacks. In any other LBR mode that LBRs are only flushed on context switch But LBR callstacks will never put kernel addresses into the LBRs because they are forced to set a ring 3 filter. So you can't have kernel addresses in the LBR when saving/restoring them (unless I missed some case) Dropping that will likely simplify the patch somewhat. -Andi ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel: output LBR support statement after validation 2016-06-02 16:04 ` Andi Kleen @ 2016-06-02 17:28 ` Stephane Eranian 2016-06-06 1:59 ` Andi Kleen 0 siblings, 1 reply; 17+ messages in thread From: Stephane Eranian @ 2016-06-02 17:28 UTC (permalink / raw) To: Andi Kleen Cc: David Carrillo-Cisneros, LKML, x86, Ingo Molnar, Yan, Zheng, Kan Liang, Peter Zijlstra Andi, On Thu, Jun 2, 2016 at 9:04 AM, Andi Kleen <ak@linux.intel.com> wrote: > > > I don't think the context switch support is really needed. It's only > needed for saving/restoring LBRs, and we only do that with LBR callstacks. > In any other LBR mode that LBRs are only flushed on context switch > But LBR callstacks will never put kernel addresses into the LBRs > because they are forced to set a ring 3 filter. So you can't have > kernel addresses in the LBR when saving/restoring them > (unless I missed some case) > It is not because you force LBR to ring3 only that you do not capture kernel addresses in the FROM field. Keep in mind that LBR priv level filtering applies to the target of the branch and not the source. You might still get a kernel address if returning from kernel. Now, in callstack mode, I think the return branch is never actually recorded in the LBR, it just causes a pop, so theoretically this should not happen. I'd like to be 100% sure of that, though. > Dropping that will likely simplify the patch somewhat. > > -Andi ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel: output LBR support statement after validation 2016-06-02 17:28 ` Stephane Eranian @ 2016-06-06 1:59 ` Andi Kleen 2016-06-08 7:27 ` Stephane Eranian 0 siblings, 1 reply; 17+ messages in thread From: Andi Kleen @ 2016-06-06 1:59 UTC (permalink / raw) To: Stephane Eranian Cc: David Carrillo-Cisneros, LKML, x86, Ingo Molnar, Yan, Zheng, Kan Liang, Peter Zijlstra > It is not because you force LBR to ring3 only that you do not capture > kernel addresses in the FROM field. > Keep in mind that LBR priv level filtering applies to the target of > the branch and not the source. You might > still get a kernel address if returning from kernel. Now, in callstack > mode, I think the return branch is never > actually recorded in the LBR, it just causes a pop, so theoretically > this should not happen. I'd like to be > 100% sure of that, though. Far branches shouldn't be included in call stack LBR. Don't think there is any other situation where the ring 0 address could leak either. -Andi -- ak@linux.intel.com -- Speaking for myself only ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel: output LBR support statement after validation 2016-06-06 1:59 ` Andi Kleen @ 2016-06-08 7:27 ` Stephane Eranian 2016-06-08 9:08 ` Andi Kleen 0 siblings, 1 reply; 17+ messages in thread From: Stephane Eranian @ 2016-06-08 7:27 UTC (permalink / raw) To: Andi Kleen Cc: David Carrillo-Cisneros, LKML, x86, Ingo Molnar, Yan, Zheng, Kan Liang, Peter Zijlstra Andi, On Sun, Jun 5, 2016 at 6:59 PM, Andi Kleen <ak@linux.intel.com> wrote: > > > It is not because you force LBR to ring3 only that you do not capture > > kernel addresses in the FROM field. > > Keep in mind that LBR priv level filtering applies to the target of > > the branch and not the source. You might > > still get a kernel address if returning from kernel. Now, in callstack > > mode, I think the return branch is never > > actually recorded in the LBR, it just causes a pop, so theoretically > > this should not happen. I'd like to be > > 100% sure of that, though. > > Far branches shouldn't be included in call stack LBR. Don't think > there is any other situation where the ring 0 address could leak either. > Ok, so you're saying that syscall and int are not causing LBR callstack to record an entry. If that is the case, then a rfi should not cause a pop of the last LBR entry. Is that right? > > -Andi > -- > ak@linux.intel.com -- Speaking for myself only ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel: output LBR support statement after validation 2016-06-08 7:27 ` Stephane Eranian @ 2016-06-08 9:08 ` Andi Kleen 0 siblings, 0 replies; 17+ messages in thread From: Andi Kleen @ 2016-06-08 9:08 UTC (permalink / raw) To: Stephane Eranian Cc: David Carrillo-Cisneros, LKML, x86, Ingo Molnar, Yan, Zheng, Kan Liang, Peter Zijlstra > Ok, so you're saying that syscall and int are not causing LBR callstack to > record an entry. If that is the case, then a rfi should not cause a pop of the > last LBR entry. Is that right? rfi = ? But likely yes. There shouldn't be any ring 0 branches in the callstack LBR, as long as the patch forcing it to ring 3 is still there. -Andi -- ak@linux.intel.com -- Speaking for myself only ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX
  2016-06-02 2:42 [PATCH 0/3] fix MSR_LAST_BRANCH_FROM Haswell support David Carrillo-Cisneros
  2016-06-02 2:42 ` [PATCH 1/3] perf/x86/intel: output LBR support statement after validation David Carrillo-Cisneros
@ 2016-06-02 2:42 ` David Carrillo-Cisneros
  2016-06-02 8:23   ` Peter Zijlstra
  ` (5 more replies)
  2016-06-02 2:42 ` [PATCH 3/3] perf, perf/tool: trigger lbr_from signext bug David Carrillo-Cisneros
  2 siblings, 6 replies; 17+ messages in thread
From: David Carrillo-Cisneros @ 2016-06-02 2:42 UTC (permalink / raw)
To: linux-kernel
Cc: x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Peter Zijlstra,
    David Carrillo-Cisneros, Stephane Eranian

Intel's SDM states that bits 61:62 in MSR_LAST_BRANCH_FROM_x are the TSX
flags for formats with LBR_TSX flags (i.e. LBR_FORMAT_EIP_EFLAGS2).
However, when the CPU has TSX support deactivated, bits 61:62 actually
behave as follows:

  - For wrmsr, bits 61:62 are considered part of the sign-extension.
  - LBR hw updates to the MSR (not through wrmsr) will clear bits 61:62,
    regardless of the sign of the bit at position 47, i.e. bits 61:62
    are not part of the sign-extension.

Therefore, if the following conditions are all true:

  1) LBR has TSX format.
  2) CPU has no TSX support enabled.
  3) data in MSR (bits 0:48) is negative.

then any value passed to wrmsr must be sign-extended to 63 bits and any
value read with rdmsr must be converted to 61 bits, ignoring the TSX
flags. We also need the quirk at context switch, when saving/restoring
the value of MSR_LAST_BRANCH_FROM_x.

This bug was masked by the work-around for the bug:
  "LBR May Contain Incorrect Information When Using FREEZE_LBRS_ON_PMI"
That work-around works by forbidding the capture of kernel memory
addresses, filtering out all kernel branches in hw and therefore causing
all branches in MSR_LAST_BRANCH to be user addresses. User addresses
have bits 61:62 CLEAR and do not trigger the wrmsr bug in
MSR_LAST_BRANCH_FROM_x when saved/restored at context switch.

To verify the hw bug:

  $ perf record -b -e cycles sleep 1
  $ iotools/rdmsr 0 0x680
  0x1fffffffb0b9b0cc
  $ iotools/wrmsr 0 0x680 0x1fffffffb0b9b0cc
  write(): Input/output error

To test the work-around, use a perf tool and kernel built with the next
patch in this series. That patch removes the work-around that masked the
hw bug:

  $ ./lbr_perf record --call-graph lbr -e cycles:k ./cqm_easy

where lbr_perf is the patched perf tool, which allows specifying :k on
lbr mode. The above command will trigger a #GPF:

[  411.191445] ------------[ cut here ]------------
[  411.196015] WARNING: CPU: 28 PID: 14096 at arch/x86/mm/extable.c:65 ex_handler_wrmsr_unsafe+0x70/0x80
[  411.205123] unchecked MSR access error: WRMSR to 0x681 (tried to write 0x1fffffff81010794)
[  411.213286] Modules linked in: bonding w1_therm wire mpt3sas scsi_transport_sas raid_class cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
[  411.227611] CPU: 28 PID: 14096 Comm: cqm_easy Not tainted 4.7.0-smp-unfixed_lbr_from_bug #675
[  411.236033] Hardware name: Intel Grantley,Wellsburg/Ixion_QC_15, BIOS 2.50.0 01/21/2016
[  411.243940]  0000000000000000 ffff883ff0453b48 ffffffff8167af49 ffff883ff0453ba8
[  411.251279]  0000000000000000 ffff883ff0453b98 ffffffff810b9b15 ffff883ff0453b78
[  411.258619]  0000004100000001 ffff883ffee9c8f8 ffffffff8168c21c ffff883ff0453c78
[  411.265962] Call Trace:
[  411.268384]  [<ffffffff8167af49>] dump_stack+0x4d/0x63
[  411.273462]  [<ffffffff810b9b15>] __warn+0xe5/0x100
[  411.278278]  [<ffffffff810b9be9>] warn_slowpath_fmt+0x49/0x50
[  411.283955]  [<ffffffff810abb40>] ex_handler_wrmsr_unsafe+0x70/0x80
[  411.290144]  [<ffffffff810abc42>] fixup_exception+0x42/0x50
[  411.295658]  [<ffffffff81079d1a>] do_general_protection+0x8a/0x160
[  411.301764]  [<ffffffff81684ec2>] general_protection+0x22/0x30
[  411.307527]  [<ffffffff810101b9>] ?
intel_pmu_lbr_sched_task+0xc9/0x380 [ 411.314063] [<ffffffff81009d7c>] intel_pmu_sched_task+0x3c/0x60 [ 411.319996] [<ffffffff81003a2b>] x86_pmu_sched_task+0x1b/0x20 [ 411.325762] [<ffffffff81192a5b>] perf_pmu_sched_task+0x6b/0xb0 [ 411.331610] [<ffffffff8119746d>] __perf_event_task_sched_in+0x7d/0x150 [ 411.338145] [<ffffffff810dd9dc>] finish_task_switch+0x15c/0x200 [ 411.344078] [<ffffffff8167f894>] __schedule+0x274/0x6cc [ 411.349325] [<ffffffff8167fdd9>] schedule+0x39/0x90 [ 411.354229] [<ffffffff81675398>] exit_to_usermode_loop+0x39/0x89 [ 411.360246] [<ffffffff810028ce>] prepare_exit_to_usermode+0x2e/0x30 [ 411.366524] [<ffffffff81683c1b>] retint_user+0x8/0x10 [ 411.371599] ---[ end trace 1ed61b8a551e95d3 ]--- Reviewed-by: Stephane Eranian <eranian@google.com> Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> --- arch/x86/events/intel/core.c | 18 +++++++++++ arch/x86/events/intel/lbr.c | 73 +++++++++++++++++++++++++++++++++++++++++--- arch/x86/events/perf_event.h | 2 ++ 3 files changed, 89 insertions(+), 4 deletions(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index a5e52ad4..1ce172d 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3309,6 +3309,7 @@ static void intel_snb_check_microcode(void) static bool check_msr(unsigned long msr, u64 mask) { u64 val_old, val_new, val_tmp; + u64 (*wr_quirk)(u64); /* * Read the current value, change it and read it back to see if it @@ -3322,13 +3323,30 @@ static bool check_msr(unsigned long msr, u64 mask) * Only change the bits which can be updated by wrmsrl. */ val_tmp = val_old ^ mask; + + /* Use wr quirk for lbr msr's. 
*/ + if ((x86_pmu.lbr_from <= msr && + msr < x86_pmu.lbr_from + x86_pmu.lbr_nr) || + (x86_pmu.lbr_to <= msr && + msr < x86_pmu.lbr_to + x86_pmu.lbr_nr)) + wr_quirk = lbr_from_signext_quirk_wr; + + if (wr_quirk) + val_tmp = wr_quirk(val_tmp); + + if (wrmsrl_safe(msr, val_tmp) || rdmsrl_safe(msr, &val_new)) return false; + /* quirk only affects validation in wrmsr, so wrmsrl'value + * should equal rdmsrl's one even with the quirk. + */ if (val_new != val_tmp) return false; + if (wr_quirk) + val_old = wr_quirk(val_old); /* Here it's sure that the MSR can be safely accessed. * Restore the old value and return. */ diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c index 2dca66c..6aa2d8a 100644 --- a/arch/x86/events/intel/lbr.c +++ b/arch/x86/events/intel/lbr.c @@ -80,6 +80,7 @@ static enum { #define LBR_FROM_FLAG_MISPRED (1ULL << 63) #define LBR_FROM_FLAG_IN_TX (1ULL << 62) #define LBR_FROM_FLAG_ABORT (1ULL << 61) +#define LBR_FROM_SIGNEXT_MSB (1ULL << 60) /* * x86control flow change classification @@ -235,6 +236,62 @@ enum { LBR_VALID, }; +/* + * For formats with LBR_TSX flags (e.g. LBR_FORMAT_EIP_FLAGS2), bits 61:62 in + * MSR_LAST_BRANCH_FROM_x are the TSX flags when TSX is supported. + * When TSX support is disabled the behavior differs as follows: + * - For wrmsr, bits 61:62 are considered part of the sign-extension. + * - HW updates to the MSR (no through wrmsr) will clear bits 61:62, + * regardless of the sign of bit at position 47, i.e. bit 61:62 are not part + * of the sign-extension. + * + * Therefore, if the conditions: + * 1) LBR has TSX format. + * 2) CPU has no TSX support enabled. + * 3) data in MSR (bits 0:48) is negative. + * are all true, then any value passed to wrmsr must be signed-extended to + * 63 bits and any value from rdmsr must be converted to 61 bits, ignoring + * the TSX flags. 
+ */ + +static inline bool lbr_from_signext_quirk_on(void) +{ + int lbr_format = x86_pmu.intel_cap.lbr_format; + bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) || + boot_cpu_has(X86_FEATURE_RTM); + + return !tsx_support && (lbr_desc[lbr_format] & LBR_TSX); +} + +DEFINE_STATIC_KEY_FALSE(lbr_from_quirk_key); + +static inline bool lbr_from_signext_quirk_test(u64 val) +{ + return static_branch_unlikely(&lbr_from_quirk_key) && + (val & LBR_FROM_SIGNEXT_MSB); +} + +/* + * If quirk is needed, do sign extension to 63 bits. + */ +inline u64 lbr_from_signext_quirk_wr(u64 val) +{ + if (lbr_from_signext_quirk_test(val)) + val |= (LBR_FROM_FLAG_IN_TX | LBR_FROM_FLAG_ABORT); + return val; +} + +/* + * If quirk is needed, ensure sign extension is 61 bits. + */ + +u64 lbr_from_signext_quirk_rd(u64 val) +{ + if (lbr_from_signext_quirk_test(val)) + val &= ~(LBR_FROM_FLAG_IN_TX | LBR_FROM_FLAG_ABORT); + return val; +} + static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx) { int i; @@ -251,7 +308,8 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx) tos = task_ctx->tos; for (i = 0; i < tos; i++) { lbr_idx = (tos - i) & mask; - wrmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]); + wrmsrl(x86_pmu.lbr_from + lbr_idx, + lbr_from_signext_quirk_wr(task_ctx->lbr_from[i])); wrmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]); if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO) wrmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]); @@ -264,7 +322,7 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx) { int i; unsigned lbr_idx, mask; - u64 tos; + u64 tos, val; if (task_ctx->lbr_callstack_users == 0) { task_ctx->lbr_stack_state = LBR_NONE; @@ -275,7 +333,8 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx) tos = intel_pmu_lbr_tos(); for (i = 0; i < tos; i++) { lbr_idx = (tos - i) & mask; - rdmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]); + rdmsrl(x86_pmu.lbr_from + lbr_idx, 
val); + task_ctx->lbr_from[i] = lbr_from_signext_quirk_rd(val); rdmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]); if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO) rdmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]); @@ -316,7 +375,7 @@ void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in) * level (which could be a useful measurement in system-wide * mode). In that case, the risk is high of having a branch * stack with branch from multiple tasks. - */ + */ if (sched_in) { intel_pmu_lbr_reset(); cpuc->lbr_context = ctx; @@ -453,6 +512,8 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc) int lbr_flags = lbr_desc[lbr_format]; rdmsrl(x86_pmu.lbr_from + lbr_idx, from); + from = lbr_from_signext_quirk_rd(from); + rdmsrl(x86_pmu.lbr_to + lbr_idx, to); if (lbr_format == LBR_FORMAT_INFO && need_info) { @@ -536,6 +597,7 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event) u64 br_type = event->attr.branch_sample_type; int mask = 0; + if (br_type & PERF_SAMPLE_BRANCH_USER) mask |= X86_BR_USER; @@ -1007,6 +1069,9 @@ void intel_pmu_lbr_init_hsw(void) x86_pmu.lbr_sel_mask = LBR_SEL_MASK; x86_pmu.lbr_sel_map = hsw_lbr_sel_map; + + if (lbr_from_signext_quirk_on()) + static_branch_enable(&lbr_from_quirk_key); } /* skylake */ diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 8bd764d..c934931 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -892,6 +892,8 @@ void intel_ds_init(void); void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in); +u64 lbr_from_signext_quirk_wr(u64 val); + void intel_pmu_lbr_reset(void); void intel_pmu_lbr_enable(struct perf_event *event); -- 2.8.0.rc3.226.g39d4020 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX 2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros @ 2016-06-02 8:23 ` Peter Zijlstra 2016-06-02 8:37 ` Peter Zijlstra ` (4 subsequent siblings) 5 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2016-06-02 8:23 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Stephane Eranian On Wed, Jun 01, 2016 at 07:42:02PM -0700, David Carrillo-Cisneros wrote: > + /* quirk only affects validation in wrmsr, so wrmsrl'value > + * should equal rdmsrl's one even with the quirk. > + */ broken comment style ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX 2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros 2016-06-02 8:23 ` Peter Zijlstra @ 2016-06-02 8:37 ` Peter Zijlstra 2016-06-02 8:53 ` Peter Zijlstra ` (3 subsequent siblings) 5 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2016-06-02 8:37 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Stephane Eranian On Wed, Jun 01, 2016 at 07:42:02PM -0700, David Carrillo-Cisneros wrote: > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c > index a5e52ad4..1ce172d 100644 > --- a/arch/x86/events/intel/core.c > +++ b/arch/x86/events/intel/core.c > @@ -3309,6 +3309,7 @@ static void intel_snb_check_microcode(void) > static bool check_msr(unsigned long msr, u64 mask) > { > u64 val_old, val_new, val_tmp; > + u64 (*wr_quirk)(u64); > > /* > * Read the current value, change it and read it back to see if it > @@ -3322,13 +3323,30 @@ static bool check_msr(unsigned long msr, u64 mask) > * Only change the bits which can be updated by wrmsrl. > */ > val_tmp = val_old ^ mask; > + > + /* Use wr quirk for lbr msr's. */ > + if ((x86_pmu.lbr_from <= msr && > + msr < x86_pmu.lbr_from + x86_pmu.lbr_nr) || > + (x86_pmu.lbr_to <= msr && > + msr < x86_pmu.lbr_to + x86_pmu.lbr_nr)) > + wr_quirk = lbr_from_signext_quirk_wr; This is unreadable code.. static inline bool within(unsigned long msr, unsigned long base, unsigned long size) { return base <= msr && msr < base + size; } static inline bool is_lbr_msr(unsigned long msr) { if (within(msr, x86_pmu.lbr_from, x86_pmu.lbr_nr)) return true; if (within(msr, x86_pmu.lbr_to, x86_pmu.lbr_nr)) return true; return false; } Yes, its a few more lines, but its bloody obvious what it does at any time of the day. 
Also, seeing how the write quirk is for the LBR_FROM only, wth are you testing against _TO too? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX 2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros 2016-06-02 8:23 ` Peter Zijlstra 2016-06-02 8:37 ` Peter Zijlstra @ 2016-06-02 8:53 ` Peter Zijlstra [not found] ` <CALcN6mhMsNHgRVuYmYsE+O+Neq6PcaDy1-ARzBd=Vej=0rHr0Q@mail.gmail.com> 2016-06-02 8:54 ` Peter Zijlstra ` (2 subsequent siblings) 5 siblings, 1 reply; 17+ messages in thread From: Peter Zijlstra @ 2016-06-02 8:53 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Stephane Eranian On Wed, Jun 01, 2016 at 07:42:02PM -0700, David Carrillo-Cisneros wrote: > diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c > index 2dca66c..6aa2d8a 100644 > --- a/arch/x86/events/intel/lbr.c > +++ b/arch/x86/events/intel/lbr.c > @@ -80,6 +80,7 @@ static enum { > #define LBR_FROM_FLAG_MISPRED (1ULL << 63) > #define LBR_FROM_FLAG_IN_TX (1ULL << 62) > #define LBR_FROM_FLAG_ABORT (1ULL << 61) > +#define LBR_FROM_SIGNEXT_MSB (1ULL << 60) > > /* > * x86control flow change classification > @@ -235,6 +236,62 @@ enum { > LBR_VALID, > }; > > +/* > + * For formats with LBR_TSX flags (e.g. LBR_FORMAT_EIP_FLAGS2), bits 61:62 in > + * MSR_LAST_BRANCH_FROM_x are the TSX flags when TSX is supported. > + * When TSX support is disabled the behavior differs as follows: > + * - For wrmsr, bits 61:62 are considered part of the sign-extension. > + * - HW updates to the MSR (no through wrmsr) will clear bits 61:62, > + * regardless of the sign of bit at position 47, i.e. bit 61:62 are not part > + * of the sign-extension. > + * > + * Therefore, if the conditions: > + * 1) LBR has TSX format. > + * 2) CPU has no TSX support enabled. > + * 3) data in MSR (bits 0:48) is negative. I take issue with 3; sign extension has nothing to do with being negative or not. 
> + * are all true, then any value passed to wrmsr must be signed-extended to > + * 63 bits and any value from rdmsr must be converted to 61 bits, ignoring > + * the TSX flags. > + */ > + > +static inline bool lbr_from_signext_quirk_on(void) This function name is bad; it doesn't return if the quirk is on at all. It detects if the quirk is relevant and should be turned on, look at the bloody usage site. > +{ > + int lbr_format = x86_pmu.intel_cap.lbr_format; > + bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) || > + boot_cpu_has(X86_FEATURE_RTM); > + > + return !tsx_support && (lbr_desc[lbr_format] & LBR_TSX); > +} > + > +DEFINE_STATIC_KEY_FALSE(lbr_from_quirk_key); > + > +static inline bool lbr_from_signext_quirk_test(u64 val) > +{ > + return static_branch_unlikely(&lbr_from_quirk_key) && > + (val & LBR_FROM_SIGNEXT_MSB); > +} No, that's just wrong. > + > +/* > + * If quirk is needed, do sign extension to 63 bits. > + */ > +inline u64 lbr_from_signext_quirk_wr(u64 val) > +{ > + if (lbr_from_signext_quirk_test(val)) > + val |= (LBR_FROM_FLAG_IN_TX | LBR_FROM_FLAG_ABORT); That too, that's horrible. You don't set IN_TX and ABORT. What's wrong with: if (static_branch_unlikely(&lbr_from_quirk)) { val <<= 2; (s64)val >>= 2; } That actually sign extends and without branches. > + return val; > +} > + > +/* > + * If quirk is needed, ensure sign extension is 61 bits. You cannot extend shorter. Extend means to make longer/more. You can truncate sign bits, or just say you need to clear the 2 MSBs. > + */ > + > +u64 lbr_from_signext_quirk_rd(u64 val) > +{ > + if (lbr_from_signext_quirk_test(val)) Again, no point in wasting an extra branch, just clear the bits unconditional: if (static_branch_unlikely(&lbr_from_quirk)) > + val &= ~(LBR_FROM_FLAG_IN_TX | LBR_FROM_FLAG_ABORT); > + return val; > +} > + > static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx) > { > int i; ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <CALcN6mhMsNHgRVuYmYsE+O+Neq6PcaDy1-ARzBd=Vej=0rHr0Q@mail.gmail.com>]
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX [not found] ` <CALcN6mhMsNHgRVuYmYsE+O+Neq6PcaDy1-ARzBd=Vej=0rHr0Q@mail.gmail.com> @ 2016-06-03 9:21 ` Peter Zijlstra 0 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2016-06-03 9:21 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Stephane Eranian On Thu, Jun 02, 2016 at 05:00:19PM +0000, David Carrillo-Cisneros wrote: > LBR_FROM_FLAG_MISPRED is at bit 63 so the bitshift wouldnt work . But I can > clean the bits unconditionally of the value, just as you said for the read > case. Argh, missed that. Ok, something like so then: #define LBR_FROM_SIGN_MASK (BIT_ULL(61) | BIT_ULL(62)) /* * Sign extend into bits 61,62 while preserving bit 63. * * This works because bits 59 and 60 are guaranteed to be sign * bits themselves. */ val = (val & ~LBR_FROM_SIGN_MASK) | ((val << 2) & LBR_FROM_SIGN_MASK); A superscalar core can evaluate the left and right hand parts of the logical or concurrently. I've not generated the code so see what GCC does with the 64bit literals, ideally it would generate code using 32bit literals and only operate on the high word, but who knows if its smart enough for that. See also: https://graphics.stanford.edu/~seander/bithacks.html#ConditionalSetOrClearBitsWithoutBranching ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX 2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros ` (2 preceding siblings ...) 2016-06-02 8:53 ` Peter Zijlstra @ 2016-06-02 8:54 ` Peter Zijlstra 2016-06-02 8:55 ` Peter Zijlstra 2016-06-02 16:07 ` Andi Kleen 5 siblings, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2016-06-02 8:54 UTC (permalink / raw) To: David Carrillo-Cisneros Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Stephane Eranian On Wed, Jun 01, 2016 at 07:42:02PM -0700, David Carrillo-Cisneros wrote: > @@ -536,6 +597,7 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event) > u64 br_type = event->attr.branch_sample_type; > int mask = 0; > > + > if (br_type & PERF_SAMPLE_BRANCH_USER) > mask |= X86_BR_USER; > Do we really need this extra whitespace? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX
  2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros
                    ` (3 preceding siblings ...)
  2016-06-02 8:54 ` Peter Zijlstra
@ 2016-06-02 8:55 ` Peter Zijlstra
  2016-06-02 16:07 ` Andi Kleen
  5 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2016-06-02 8:55 UTC (permalink / raw)
  To: David Carrillo-Cisneros
  Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang,
	Stephane Eranian

On Wed, Jun 01, 2016 at 07:42:02PM -0700, David Carrillo-Cisneros wrote:
> Reviewed-by: Stephane Eranian <eranian@google.com>

Did you really? Don't just blanket issue these Stephane, look at the
bloody patches before they go out please!

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX
  2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros
                    ` (4 preceding siblings ...)
  2016-06-02 8:55 ` Peter Zijlstra
@ 2016-06-02 16:07 ` Andi Kleen
  5 siblings, 0 replies; 17+ messages in thread
From: Andi Kleen @ 2016-06-02 16:07 UTC (permalink / raw)
  To: David Carrillo-Cisneros
  Cc: linux-kernel, x86, Ingo Molnar, Yan, Zheng, Kan Liang,
	Peter Zijlstra, Stephane Eranian

> $ ./lbr_perf record --call-graph lbr -e cycles:k ./cqm_easy
>
> where lbr_perf is the patched perf tool, that allows to specify :k
> on lbr mode. The above command will trigger a #GPF:

Why would you want to do that? We usually have frame pointers for the
kernel, so callstack LBR for the kernel is pointless.

-Andi

^ permalink raw reply	[flat|nested] 17+ messages in thread
* [PATCH 3/3] perf, perf/tool: trigger lbr_from signext bug
  2016-06-02 2:42 [PATCH 0/3] fix MSR_LAST_BRANCH_FROM Haswell support David Carrillo-Cisneros
  2016-06-02 2:42 ` [PATCH 1/3] perf/x86/intel: output LBR support statement after validation David Carrillo-Cisneros
  2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros
@ 2016-06-02 2:42 ` David Carrillo-Cisneros
  2 siblings, 0 replies; 17+ messages in thread
From: David Carrillo-Cisneros @ 2016-06-02 2:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ingo Molnar, Yan, Zheng, Andi Kleen, Kan Liang, Peter Zijlstra,
	David Carrillo-Cisneros, Stephane Eranian

Change the kernel and the perf tool to activate tracking and context
switching for kernel branches.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
---
 arch/x86/events/intel/lbr.c |  3 ++-
 tools/perf/util/evsel.c     | 17 ++++++++++++++---
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 6aa2d8a..d808e54 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -384,7 +384,8 @@ void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
 
 static inline bool branch_user_callstack(unsigned br_sel)
 {
-	return (br_sel & X86_BR_USER) && (br_sel & X86_BR_CALL_STACK);
+	return (br_sel & (X86_BR_USER | X86_BR_KERNEL)) &&
+	       (br_sel & X86_BR_CALL_STACK);
 }
 
 void intel_pmu_lbr_enable(struct perf_event *event)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5d7037e..b39aafc 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -575,9 +575,20 @@ void perf_evsel__config_callchain(struct perf_evsel *evsel,
 	if (param->record_mode == CALLCHAIN_LBR) {
 		if (!opts->branch_stack) {
 			if (attr->exclude_user) {
-				pr_warning("LBR callstack option is only available "
-					   "to get user callchain information. "
-					   "Falling back to framepointers.\n");
+				/*
+				 * Sample kernel AND user branches. Do not
+				 * exclude user branches, because
+				 * task_ctx->lbr_callstack_users in
+				 * arch/x86/events/intel/lbr.c only counts
+				 * users for user branches.
+				 */
+				pr_warning("BRANCH_STACK with kernel and user\n");
+				perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
+				attr->branch_sample_type = PERF_SAMPLE_BRANCH_KERNEL |
+							   PERF_SAMPLE_BRANCH_USER |
+							   PERF_SAMPLE_BRANCH_CALL_STACK |
+							   PERF_SAMPLE_BRANCH_NO_CYCLES |
+							   PERF_SAMPLE_BRANCH_NO_FLAGS;
+			} else {
 				perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
 				attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 17+ messages in thread
end of thread, other threads:[~2016-06-08 9:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-02 2:42 [PATCH 0/3] fix MSR_LAST_BRANCH_FROM Haswell support David Carrillo-Cisneros
2016-06-02 2:42 ` [PATCH 1/3] perf/x86/intel: output LBR support statement after validation David Carrillo-Cisneros
2016-06-02 8:21   ` Peter Zijlstra
2016-06-02 16:04   ` Andi Kleen
2016-06-02 17:28     ` Stephane Eranian
2016-06-06 1:59       ` Andi Kleen
2016-06-08 7:27         ` Stephane Eranian
2016-06-08 9:08           ` Andi Kleen
2016-06-02 2:42 ` [PATCH 2/3] perf/x86/intel: fix for MSR_LAST_BRANCH_FROM_x quirk when no TSX David Carrillo-Cisneros
2016-06-02 8:23   ` Peter Zijlstra
2016-06-02 8:37   ` Peter Zijlstra
2016-06-02 8:53   ` Peter Zijlstra
     [not found]     ` <CALcN6mhMsNHgRVuYmYsE+O+Neq6PcaDy1-ARzBd=Vej=0rHr0Q@mail.gmail.com>
2016-06-03 9:21       ` Peter Zijlstra
2016-06-02 8:54   ` Peter Zijlstra
2016-06-02 8:55   ` Peter Zijlstra
2016-06-02 16:07   ` Andi Kleen
2016-06-02 2:42 ` [PATCH 3/3] perf, perf/tool: trigger lbr_from signext bug David Carrillo-Cisneros