linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "tip-bot2 for Kan Liang" <tip-bot2@linutronix.de>
To: linux-tip-commits@vger.kernel.org
Cc: Kan Liang <kan.liang@linux.intel.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	x86 <x86@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Subject: [tip: perf/core] perf/x86/intel/lbr: Factor out a new struct for generic optimization
Date: Wed, 08 Jul 2020 09:51:53 -0000	[thread overview]
Message-ID: <159420191352.4006.2167850556977785186.tip-bot2@tip-bot2> (raw)
In-Reply-To: <1593780569-62993-6-git-send-email-kan.liang@linux.intel.com>

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     530bfff6480307d210734222a54d56af7f908957
Gitweb:        https://git.kernel.org/tip/530bfff6480307d210734222a54d56af7f908957
Author:        Kan Liang <kan.liang@linux.intel.com>
AuthorDate:    Fri, 03 Jul 2020 05:49:11 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 08 Jul 2020 11:38:52 +02:00

perf/x86/intel/lbr: Factor out a new struct for generic optimization

To reduce the overhead of a context switch with LBR enabled, some
generic optimizations were introduced, e.g. avoiding restore LBR if no
one else touched them. The generic optimizations can also be used by
Architecture LBR later. Currently, the fields for the generic
optimizations are part of structure x86_perf_task_context, which will be
deprecated by Architecture LBR. A new structure should be introduced
for the common fields of generic optimization, which can be shared
between Architecture LBR and model-specific LBR.

Both 'valid_lbrs' and 'tos' are also used by the generic optimizations,
but they are not moved into the new structure, because Architecture LBR
is stack-like. The 'valid_lbrs' which records the index of the valid LBR
is not required anymore. The TOS MSR will be removed.

LBR registers may be cleared in the deep Cstate. If so, the generic
optimizations should not be applied. Perf has to unconditionally
restore the LBR registers. A generic function is required to detect the
reset due to the deep Cstate. lbr_is_reset_in_cstate() is introduced.
Currently, for the model-specific LBR, the TOS MSR is used to detect the
reset. There will be another method introduced for Architecture LBR
later.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1593780569-62993-6-git-send-email-kan.liang@linux.intel.com
---
 arch/x86/events/intel/lbr.c  | 38 +++++++++++++++++++----------------
 arch/x86/events/perf_event.h | 10 ++++++---
 2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index b2b8dc9..bba9939 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -355,33 +355,37 @@ void intel_pmu_lbr_restore(void *ctx)
 		wrmsrl(MSR_LBR_SELECT, task_ctx->lbr_sel);
 }
 
+static __always_inline bool
+lbr_is_reset_in_cstate(struct x86_perf_task_context *task_ctx)
+{
+	return !rdlbr_from(task_ctx->tos);
+}
+
 static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-	u64 tos;
 
-	if (task_ctx->lbr_callstack_users == 0 ||
-	    task_ctx->lbr_stack_state == LBR_NONE) {
+	if (task_ctx->opt.lbr_callstack_users == 0 ||
+	    task_ctx->opt.lbr_stack_state == LBR_NONE) {
 		intel_pmu_lbr_reset();
 		return;
 	}
 
-	tos = task_ctx->tos;
 	/*
 	 * Does not restore the LBR registers, if
 	 * - No one else touched them, and
-	 * - Did not enter C6
+	 * - Was not cleared in Cstate
 	 */
 	if ((task_ctx == cpuc->last_task_ctx) &&
-	    (task_ctx->log_id == cpuc->last_log_id) &&
-	    rdlbr_from(tos)) {
-		task_ctx->lbr_stack_state = LBR_NONE;
+	    (task_ctx->opt.log_id == cpuc->last_log_id) &&
+	    !lbr_is_reset_in_cstate(task_ctx)) {
+		task_ctx->opt.lbr_stack_state = LBR_NONE;
 		return;
 	}
 
 	x86_pmu.lbr_restore(task_ctx);
 
-	task_ctx->lbr_stack_state = LBR_NONE;
+	task_ctx->opt.lbr_stack_state = LBR_NONE;
 }
 
 void intel_pmu_lbr_save(void *ctx)
@@ -415,17 +419,17 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
-	if (task_ctx->lbr_callstack_users == 0) {
-		task_ctx->lbr_stack_state = LBR_NONE;
+	if (task_ctx->opt.lbr_callstack_users == 0) {
+		task_ctx->opt.lbr_stack_state = LBR_NONE;
 		return;
 	}
 
 	x86_pmu.lbr_save(task_ctx);
 
-	task_ctx->lbr_stack_state = LBR_VALID;
+	task_ctx->opt.lbr_stack_state = LBR_VALID;
 
 	cpuc->last_task_ctx = task_ctx;
-	cpuc->last_log_id = ++task_ctx->log_id;
+	cpuc->last_log_id = ++task_ctx->opt.log_id;
 }
 
 void intel_pmu_lbr_swap_task_ctx(struct perf_event_context *prev,
@@ -447,8 +451,8 @@ void intel_pmu_lbr_swap_task_ctx(struct perf_event_context *prev,
 	if (!prev_ctx_data || !next_ctx_data)
 		return;
 
-	swap(prev_ctx_data->lbr_callstack_users,
-	     next_ctx_data->lbr_callstack_users);
+	swap(prev_ctx_data->opt.lbr_callstack_users,
+	     next_ctx_data->opt.lbr_callstack_users);
 }
 
 void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
@@ -503,7 +507,7 @@ void intel_pmu_lbr_add(struct perf_event *event)
 
 	if (branch_user_callstack(cpuc->br_sel) && event->ctx->task_ctx_data) {
 		task_ctx = event->ctx->task_ctx_data;
-		task_ctx->lbr_callstack_users++;
+		task_ctx->opt.lbr_callstack_users++;
 	}
 
 	/*
@@ -543,7 +547,7 @@ void intel_pmu_lbr_del(struct perf_event *event)
 	if (branch_user_callstack(cpuc->br_sel) &&
 	    event->ctx->task_ctx_data) {
 		task_ctx = event->ctx->task_ctx_data;
-		task_ctx->lbr_callstack_users--;
+		task_ctx->opt.lbr_callstack_users--;
 	}
 
 	if (event->hw.flags & PERF_X86_EVENT_LBR_SELECT)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 6d11813..96d73cd 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -736,6 +736,12 @@ struct x86_pmu {
 	int (*aux_output_match) (struct perf_event *event);
 };
 
+struct x86_perf_task_context_opt {
+	int lbr_callstack_users;
+	int lbr_stack_state;
+	int log_id;
+};
+
 struct x86_perf_task_context {
 	u64 lbr_from[MAX_LBR_ENTRIES];
 	u64 lbr_to[MAX_LBR_ENTRIES];
@@ -743,9 +749,7 @@ struct x86_perf_task_context {
 	u64 lbr_sel;
 	int tos;
 	int valid_lbrs;
-	int lbr_callstack_users;
-	int lbr_stack_state;
-	int log_id;
+	struct x86_perf_task_context_opt opt;
 };
 
 #define x86_add_quirk(func_)						\

  reply	other threads:[~2020-07-08  9:52 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-03 12:49 [PATCH V3 00/23] Support Architectural LBR kan.liang
2020-07-03 12:49 ` [PATCH V3 01/23] x86/cpufeatures: Add Architectural LBRs feature bit kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-09 23:00     ` Dave Hansen
2020-07-10  9:51       ` Peter Zijlstra
2020-07-10 14:09       ` Liang, Kan
2020-07-03 12:49 ` [PATCH V3 02/23] perf/x86/intel/lbr: Add a function pointer for LBR reset kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 03/23] perf/x86/intel/lbr: Add a function pointer for LBR read kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 04/23] perf/x86/intel/lbr: Add the function pointers for LBR save and restore kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 05/23] perf/x86/intel/lbr: Factor out a new struct for generic optimization kan.liang
2020-07-08  9:51   ` tip-bot2 for Kan Liang [this message]
2020-07-03 12:49 ` [PATCH V3 06/23] perf/x86/intel/lbr: Use dynamic data structure for task_ctx kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 07/23] x86/msr-index: Add bunch of MSRs for Arch LBR kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 08/23] perf/x86: Expose CPUID enumeration bits for arch LBR kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 09/23] perf/x86/intel/lbr: Support LBR_CTL kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 10/23] perf/x86/intel/lbr: Unify the stored format of LBR information kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 11/23] perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 12/23] perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all() kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 13/23] perf/x86/intel/lbr: Factor out intel_pmu_store_lbr kan.liang
2020-07-03 19:50   ` Peter Zijlstra
2020-07-03 20:59     ` Liang, Kan
2020-07-06 10:25       ` Peter Zijlstra
2020-07-06 13:32         ` Liang, Kan
2020-07-06 14:25           ` Peter Zijlstra
2020-07-06 22:29       ` Liang, Kan
2020-07-07  7:40         ` Peter Zijlstra
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 14/23] perf/x86/intel/lbr: Support Architectural LBR kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 15/23] perf/core: Factor out functions to allocate/free the task_ctx_data kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 16/23] perf/core: Use kmem_cache to allocate the PMU specific data kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 17/23] perf/x86/intel/lbr: Create kmem_cache for the LBR context data kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 18/23] perf/x86: Remove task_ctx_size kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 19/23] x86/fpu: Use proper mask to replace full instruction mask kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 20/23] x86/fpu/xstate: Support dynamic supervisor feature for LBR kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2021-05-27 22:15     ` Thomas Gleixner
2020-07-03 12:49 ` [PATCH V3 21/23] x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 22/23] perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 12:49 ` [PATCH V3 23/23] perf/x86/intel/lbr: Support XSAVES for arch LBR read kan.liang
2020-07-08  9:51   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-03 19:34 ` [PATCH V3 00/23] Support Architectural LBR Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=159420191352.4006.2167850556977785186.tip-bot2@tip-bot2 \
    --to=tip-bot2@linutronix.de \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).