From: Andi Kleen
To: peterz@infradead.org
Cc: acme@kernel.org, jolsa@kernel.org, linux-kernel@vger.kernel.org, Andi Kleen
Subject: [PATCH 5/5] x86, perf: Avoid context switching LBR_INFO when not needed
Date: Tue, 20 Oct 2015 11:46:37 -0700
Message-Id: <1445366797-30894-5-git-send-email-andi@firstfloor.org>
X-Mailer: git-send-email 2.4.3
In-Reply-To: <1445366797-30894-1-git-send-email-andi@firstfloor.org>
References: <1445366797-30894-1-git-send-email-andi@firstfloor.org>

From: Andi Kleen

The LBRs are context switched in call stack mode. Currently the
LBR_INFO MSRs are context switched as well, even though they are
normally not needed in call stack mode. Make the context switch code
check the NO_CYCLES|NO_FLAGS event flags added earlier in this series
and, when they are set, avoid reading and writing the LBR_INFO MSRs
unnecessarily. Apply the same check to the LBR reset code.

Signed-off-by: Andi Kleen
---
 arch/x86/kernel/cpu/perf_event.h           |  3 ++-
 arch/x86/kernel/cpu/perf_event_intel.c     |  2 +-
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 25 ++++++++++++++++---------
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 1b47164..4ae66e3 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -634,6 +634,7 @@ struct x86_perf_task_context {
 	int tos;
 	int lbr_callstack_users;
 	int lbr_stack_state;
+	int need_info;
 };
 
 #define x86_add_quirk(func_)						\
@@ -887,7 +888,7 @@ void intel_ds_init(void);
 
 void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in);
 
-void intel_pmu_lbr_reset(void);
+void intel_pmu_lbr_reset(bool need_info);
 
 void intel_pmu_lbr_enable(struct perf_event *event);
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index f17772a..42f21f0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2844,7 +2844,7 @@ static void intel_pmu_cpu_starting(int cpu)
 	/*
 	 * Deal with CPUs that don't clear their LBRs on power-up.
 	 */
-	intel_pmu_lbr_reset();
+	intel_pmu_lbr_reset(1);
 
 	cpuc->lbr_sel = NULL;
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index 60e71b7..7c21efb 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -195,27 +195,30 @@ static void intel_pmu_lbr_reset_32(void)
 		wrmsrl(x86_pmu.lbr_from + i, 0);
 }
 
-static void intel_pmu_lbr_reset_64(void)
+static void intel_pmu_lbr_reset_64(bool need_info)
 {
 	int i;
 
 	for (i = 0; i < x86_pmu.lbr_nr; i++) {
 		wrmsrl(x86_pmu.lbr_from + i, 0);
 		wrmsrl(x86_pmu.lbr_to + i, 0);
-		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
+		if (need_info)
 			wrmsrl(MSR_LBR_INFO_0 + i, 0);
 	}
 }
 
-void intel_pmu_lbr_reset(void)
+void intel_pmu_lbr_reset(bool need_info)
 {
 	if (!x86_pmu.lbr_nr)
 		return;
 
+	if (x86_pmu.intel_cap.lbr_format != LBR_FORMAT_INFO)
+		need_info = false;
+
 	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_32)
 		intel_pmu_lbr_reset_32();
 	else
-		intel_pmu_lbr_reset_64();
+		intel_pmu_lbr_reset_64(need_info);
 }
 
 /*
@@ -242,7 +245,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 
 	if (task_ctx->lbr_callstack_users == 0 ||
 	    task_ctx->lbr_stack_state == LBR_NONE) {
-		intel_pmu_lbr_reset();
+		intel_pmu_lbr_reset(task_ctx->need_info > 0);
 		return;
 	}
 
@@ -252,7 +255,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 		lbr_idx = (tos - i) & mask;
 		wrmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]);
 		wrmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]);
-		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
+		if (task_ctx->need_info)
 			wrmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]);
 	}
 	wrmsrl(x86_pmu.lbr_tos, tos);
@@ -276,7 +279,7 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
 		lbr_idx = (tos - i) & mask;
 		rdmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]);
 		rdmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]);
-		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
+		if (task_ctx->need_info)
 			rdmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]);
 	}
 	task_ctx->tos = tos;
@@ -317,7 +320,7 @@ void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
 	 * stack with branch from multiple tasks.
 	 */
 	if (sched_in) {
-		intel_pmu_lbr_reset();
+		intel_pmu_lbr_reset(!task_ctx || task_ctx->need_info > 0);
 		cpuc->lbr_context = ctx;
 	}
 }
@@ -340,7 +343,7 @@ void intel_pmu_lbr_enable(struct perf_event *event)
 	 * avoid data leaks.
 	 */
 	if (event->ctx->task && cpuc->lbr_context != event->ctx) {
-		intel_pmu_lbr_reset();
+		intel_pmu_lbr_reset(!(event->hw.branch_reg.reg & LBR_NO_INFO));
 		cpuc->lbr_context = event->ctx;
 	}
 	cpuc->br_sel = event->hw.branch_reg.reg;
@@ -349,6 +352,8 @@ void intel_pmu_lbr_enable(struct perf_event *event)
 	    event->ctx->task_ctx_data) {
 		task_ctx = event->ctx->task_ctx_data;
 		task_ctx->lbr_callstack_users++;
+		if (!(cpuc->br_sel & LBR_NO_INFO))
+			task_ctx->need_info++;
 	}
 
 	cpuc->lbr_users++;
@@ -367,6 +372,8 @@ void intel_pmu_lbr_disable(struct perf_event *event)
 	    event->ctx->task_ctx_data) {
 		task_ctx = event->ctx->task_ctx_data;
 		task_ctx->lbr_callstack_users--;
+		if (!(cpuc->br_sel & LBR_NO_INFO))
+			task_ctx->need_info--;
 	}
 
 	cpuc->lbr_users--;
-- 
2.4.3
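
For context, a minimal user-space sketch of an event configuration that
lets this optimization take effect. This is not part of the patch; it
assumes uapi headers that already export the PERF_SAMPLE_BRANCH_NO_FLAGS
and PERF_SAMPLE_BRANCH_NO_CYCLES bits added earlier in this series, and
the event choice and sample period are arbitrary:

/*
 * Not part of the patch: a user-space sketch showing an event setup
 * that lets the LBR_INFO optimization above take effect.  Assumes
 * headers that already carry the PERF_SAMPLE_BRANCH_NO_FLAGS /
 * PERF_SAMPLE_BRANCH_NO_CYCLES bits added earlier in this series.
 */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size           = sizeof(attr);
	attr.type           = PERF_TYPE_HARDWARE;
	attr.config         = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period  = 100000;
	attr.sample_type    = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
	attr.exclude_kernel = 1;

	/*
	 * Call-stack LBRs, but with flags and cycle decoration turned
	 * off.  With both NO_* bits set the event gets LBR_NO_INFO and
	 * the LBR_INFO MSRs are left alone on reset and context switch.
	 */
	attr.branch_sample_type = PERF_SAMPLE_BRANCH_USER |
				  PERF_SAMPLE_BRANCH_CALL_STACK |
				  PERF_SAMPLE_BRANCH_NO_FLAGS |
				  PERF_SAMPLE_BRANCH_NO_CYCLES;

	/* Profile the current task on any CPU. */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	close(fd);
	return 0;
}

With both flags and cycle decoration disabled this way, the earlier
patches in the series set LBR_NO_INFO in the event's branch_reg, so
need_info stays zero in the task context and the per-entry
MSR_LBR_INFO_0 accesses are skipped on reset and on every context
switch.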