From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: from mail.linutronix.de (146.0.238.70:993)
	by crypto-ml.lab.linutronix.de with IMAP4-SSL
	for ; 06 Feb 2019 01:27:17 -0000
Received: from mga03.intel.com ([134.134.136.65])
	by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
	(Exim 4.80) (envelope-from )
	id 1grBwj-0003yA-Hq
	for speck@linutronix.de; Wed, 06 Feb 2019 02:24:15 +0100
From: Andi Kleen 
Subject: [MODERATED] [PATCH v2 5/8] PERFv2
Date: Tue, 5 Feb 2019 17:23:59 -0800
Message-Id: <9d56319fee38739b876946e171dd7be749e1a97b.1549416008.git.ak@linux.intel.com>
In-Reply-To: 
References: 
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
To: speck@linutronix.de
Cc: Andi Kleen 
List-ID: 

Add a global setting that allows disabling RTM when counter 3 is needed
for some perf event group; otherwise RTM is not impacted. This allows
existing programs that want to use groups with all four counters to run
without changes.

This patch sets the default to TSX enabled, but that could easily be
changed. It also adds a new allow_rtm attribute that lets perf event
users override the default (a user-space sketch follows below).

The trade-off for the option's default is: using 4 (or 8 with HT off)
events in perf versus allowing RTM usage while perf is active.

- Existing programs that use perf groups with all 4 counters may no
  longer be able to retrieve perfmon data. Perf usages that use fewer
  than four (or 7 with HT off) counters are not impacted. Perf usages
  that don't use groups will still work, but will see increased
  multiplexing.

- TSX programs should not functionally break from forcing RTM to abort
  because they always need a valid fallback path. However, they will see
  significantly lower performance if they rely on TSX for performance
  (all RTM transactions will run and only abort at the end), potentially
  slowing them down so much that it is equivalent to functional breakage.
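For illustration only, not part of the patch: a minimal user-space sketch of
how an event could keep RTM usable once the global enable_all_counters
default is flipped, by setting the new allow_rtm bit (config2:1). The cycles
event is an arbitrary choice; per the intel_pmu_hw_config() change below,
such an event is simply excluded from counter 3 instead of forcing RTM to
abort.

/* Sketch only: open one cycles event with the new allow_rtm bit set. */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.config2 = 1ULL << 1;	/* allow_rtm: keep RTM alive, give up counter 3 */

	fd = perf_event_open(&attr, 0, -1, -1, 0);	/* current task, any CPU */
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	close(fd);
	return 0;
}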
Signed-off-by: Andi Kleen 
---
 arch/x86/events/core.c       |  8 +++++++-
 arch/x86/events/intel/core.c | 13 ++++++++++++-
 arch/x86/events/perf_event.h |  3 +++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 4bad36dfcc8e..fe77a6f7b57c 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2258,11 +2258,17 @@ static ssize_t max_precise_show(struct device *cdev,
 
 static DEVICE_ATTR_RO(max_precise);
 
+bool perf_enable_all_counters __read_mostly;
+
 static ssize_t num_counter_show(struct device *cdev,
 				struct device_attribute *attr,
 				char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu.num_counters);
+	int num = x86_pmu.num_counters;
+	if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) &&
+	    !perf_enable_all_counters && num > 0)
+		num--;
+	return snprintf(buf, PAGE_SIZE, "%d\n", num);
 }
 
 static DEVICE_ATTR_RO(num_counter);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c197b2f1cfcc..5d2d5851d13d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3182,7 +3182,9 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		return ret;
 
 	if (static_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
-		if (event->attr.config2 & FORCE_RTM_ABORT)
+		if ((event->attr.config2 & FORCE_RTM_ABORT) ||
+		    (perf_enable_all_counters &&
+		     !(event->attr.config2 & ALLOW_RTM)))
 			event->hw.flags |= PERF_X86_EVENT_ABORT_TSX;
 		else
 			event->hw.exclude = BIT(3);
@@ -3641,6 +3643,7 @@ PMU_FORMAT_ATTR(ldlat, "config1:0-15");
 PMU_FORMAT_ATTR(frontend, "config1:0-23");
 
 PMU_FORMAT_ATTR(force_rtm_abort, "config2:0");
+PMU_FORMAT_ATTR(allow_rtm, "config2:1");
 
 static struct attribute *intel_arch3_formats_attr[] = {
 	&format_attr_event.attr,
@@ -3679,6 +3682,7 @@ static struct attribute *skl_format_attr[] = {
 
 static struct attribute *skl_extra_format_attr[] = {
 	&format_attr_force_rtm_abort.attr,
+	&format_attr_allow_rtm.attr,
 	NULL,
 };
 
@@ -4141,6 +4145,13 @@ get_events_attrs(struct attribute **base,
 	return attrs;
 }
 
+static DEVICE_BOOL_ATTR(enable_all_counters, 0644, perf_enable_all_counters);
+
+static struct attribute *skl_extra_attr[] = {
+	&dev_attr_enable_all_counters.attr.attr,
+	NULL,
+};
+
 __init int intel_pmu_init(void)
 {
 	struct attribute **extra_attr = NULL;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ed353f9970c8..67e1581df96f 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -776,6 +776,7 @@ int x86_pmu_hw_config(struct perf_event *event);
 void x86_pmu_disable_all(void);
 
 #define FORCE_RTM_ABORT (1U << 0)
+#define ALLOW_RTM (1U << 1)
 
 static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,
 					  u64 enable_mask)
@@ -863,6 +864,8 @@ static inline int amd_pmu_init(void)
 
 #endif /* CONFIG_CPU_SUP_AMD */
 
+extern bool perf_enable_all_counters;
+
 #ifdef CONFIG_CPU_SUP_INTEL
 
 static inline bool intel_pmu_has_bts(struct perf_event *event)
-- 
2.17.2
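Also for illustration only, not part of the patch: a sketch of flipping the
new enable_all_counters knob from user space and re-reading num_counter. The
sysfs path is an assumption; where the attribute actually lands depends on
how skl_extra_attr is attached to the cpu PMU device, which is not shown in
these hunks.

/*
 * Sketch only.  PMU_DIR is an assumed path; the exact location depends on
 * how skl_extra_attr is hooked up to the cpu PMU device, which this patch
 * does not show.
 */
#include <stdio.h>

#define PMU_DIR "/sys/bus/event_source/devices/cpu/"

int main(void)
{
	char buf[32];
	FILE *f;

	/* Let perf groups use all counters; events that do not set
	 * allow_rtm will then force RTM transactions to abort. */
	f = fopen(PMU_DIR "enable_all_counters", "w");
	if (!f) {
		perror("enable_all_counters");
		return 1;
	}
	fputs("1\n", f);
	fclose(f);

	/* num_counter should now report the full counter count again. */
	f = fopen(PMU_DIR "num_counter", "r");
	if (f && fgets(buf, sizeof(buf), f))
		printf("num_counter: %s", buf);
	if (f)
		fclose(f);
	return 0;
}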