From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v4] perf tools: Add ARM Statistical Profiling Extensions (SPE) support
To: Kim Phillips, Arnaldo Carvalho de Melo, Mark Rutland, Will Deacon
Cc: robh@kernel.org, mathieu.poirier@linaro.org, pawel.moll@arm.com,
	suzuki.poulose@arm.com, marc.zyngier@arm.com,
	linux-kernel@vger.kernel.org, alexander.shishkin@linux.intel.com,
	peterz@infradead.org, mingo@redhat.com, tglx@linutronix.de,
	linux-arm-kernel@lists.infradead.org, Jiri Olsa, Andi Kleen, Wang Nan
References: <20171121173302.bead17a1178ed4583e68014e@arm.com>
From: Adrian Hunter
Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki,
	Business Identity Code: 0357606 - 4, Domiciled in Helsinki
Message-ID: <0a07a10c-fc0c-3f97-ac01-59429aab0937@intel.com>
Date: Thu, 11 Jan 2018 16:17:51 +0200
In-Reply-To: <20171121173302.bead17a1178ed4583e68014e@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 22/11/17 01:33, Kim Phillips wrote:
> 'perf record' and 'perf report --dump-raw-trace' supported in this
> release.
> > Example usage: > > $ ./perf record -e arm_spe_0/ts_enable=1,pa_enable=1/ \ > dd if=/dev/zero of=/dev/null count=10000 > > perf report --dump-raw-trace > > Note that the perf.data file is portable, so the report can be run on > another architecture host if necessary. > > Output will contain raw SPE data and its textual representation, such > as: > > 0x550 [0x30]: PERF_RECORD_AUXTRACE size: 0xc408 offset: 0 ref: 0x30005619 idx: 3 tid: 2109 cpu: 3 > . > . ... ARM SPE data: size 50184 bytes > . 00000000: 49 00 LD > . 00000002: b2 00 9c 7b 7a 00 80 ff ff VA 0xffff80007a7b9c00 > . 0000000b: 9a 00 00 LAT 0 XLAT > . 0000000e: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS > . 00000010: b0 b0 c9 15 08 00 00 ff ff PC 0xff00000815c9b0 el3 ns=1 > . 00000019: 98 00 00 LAT 0 TOT > . 0000001c: 71 00 20 fa fd 16 00 00 00 TS 98750308352 > . 00000025: 49 01 ST > . 00000027: b2 60 bc 0c 0f 00 00 ff ff VA 0xffff00000f0cbc60 > . 00000030: 9a 00 00 LAT 0 XLAT > . 00000033: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS > . 00000035: b0 48 cc 15 08 00 00 ff ff PC 0xff00000815cc48 el3 ns=1 > . 0000003e: 98 00 00 LAT 0 TOT > . 00000041: 71 00 20 fa fd 16 00 00 00 TS 98750308352 > . 0000004a: 48 00 INSN-OTHER > . 0000004c: 42 02 EV RETIRED > . 0000004e: b0 ac 47 0c 08 00 00 ff ff PC 0xff0000080c47ac el3 ns=1 > . 00000057: 98 00 00 LAT 0 TOT > . 0000005a: 71 00 20 fa fd 16 00 00 00 TS 98750308352 > . 00000063: 49 00 LD > . 00000065: b2 18 48 e5 7a 00 80 ff ff VA 0xffff80007ae54818 > . 0000006e: 9a 00 00 LAT 0 XLAT > . 00000071: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS > . 00000073: b0 08 f8 15 08 00 00 ff ff PC 0xff00000815f808 el3 ns=1 > . 0000007c: 98 00 00 LAT 0 TOT > . 0000007f: 71 00 20 fa fd 16 00 00 00 TS 98750308352 > ... > > Other release notes: > > - applies to acme's perf/{core,urgent} branches, likely elsewhere > > - Report is self-contained within the tool. Record requires enabling > the kernel SPE driver by setting CONFIG_ARM_SPE_PMU. 
>
> - the intel-bts implementation was used as a starting point; its
>   min/default/max buffer sizes and power of 2 pages granularity need to be
>   revisited for ARM SPE
>
> - recording across multiple SPE clusters/domains not supported
>
> - snapshot support (record -S), and conversion to native perf events
>   (e.g., via 'perf inject --itrace'), are also not supported
>
> - technically both cs-etm and spe can be used simultaneously, however
>   disabled for simplicity in this release
>
> Signed-off-by: Kim Phillips

For what is there now, it looks fine from the auxtrace point of view. There
are a couple of minor points below but nevertheless:

Acked-by: Adrian Hunter

> ---
> v4: rebased onto acme's perf/core, whitespace fixes.
>
> v3: trying to address comments from v2:
>
> - despite adding a find_all_arm_spe_pmus() function to scan for all
>   arm_spe_ device instances, in order to ensure auxtrace_record__init
>   successfully matches the evsel type with the correct arm_spe_pmu type,
>   I am still having trouble running in multi-SPE PPI (heterogeneous)
>   environments (mmap fails with EOPNOTSUPP, as does running with
>   --per-thread on homogeneous systems).
>
> - arm_spe_reference: use gettime instead of direct cntvct register access
>
> - spe-decoder: add a comment for why SPE_EVENTS code sets packet->index.
>
> - added arm_spe_pmu_default_config that accesses the driver
>   caps/min_interval and sets the default sampling period to it. This way
>   users don't have to specify -c explicitly. Also set is_uncore to false.
>
> - set more sampling bits in the arm_spe and its tracking evsel. Still
>   unsure if too liberal, and not sure whether it needs another context
>   switch tracking evsel. Comments welcome!
> > - https://www.spinics.net/lists/arm-kernel/msg614361.html > > v2: mostly addressing Mark Rutland's comments as much as possible without his > feedback to my feedback: > > - decoder refactored with a get_payload, not extended to with-ext_len ones like > get_addr, named the constants > > - 0x-ified %x output formats, but decided to not sign extend the addresses in > the raw dump, rather do so if necessary in the synthesis stage: > SPE implementations differ in this area, and raw dump should reflect that. > > - CPU mask / new record behaviour bisected to commit e3ba76deef23064 "perf > tools: Force uncore events to system wide monitoring". Waiting to hear back > on why driver can't do system wide monitoring, even across PPIs, by e.g., > sharing the SPE interrupts in one handler (SPE's don't differ in this record > regard). > > - addressed off-list comment from M. Williams: > "Instruction Type" packet was renamed as "Operation Type". > so in the spe packet decoder: INSN_TYPE -> OP_TYPE > > - do_get_packet fixed to handle excessive, successive PADding from a new source > of raw SPE data, so instead of: > > . 000011ae: 00 PAD > . 000011af: 00 PAD > . 000011b0: 00 PAD > . 000011b1: 00 PAD > . 000011b2: 00 PAD > . 000011b3: 00 PAD > . 000011b4: 00 PAD > . 000011b5: 00 PAD > . 000011b6: 00 PAD > > we now get: > > . 000011ae: 00 00 00 00 00 00 00 00 00 PAD > > - fixed 52 00 00 decoded with an empty events clause, adding 'EV' for all events > clauses now. parser writers can detect for empty event clauses by finding > nothing after it. 
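
For anyone following the PAD change described in the v2 notes above, the coalescing idea can be sketched in isolation. This is only an illustration under stated assumptions, not the code from the patch: pad_run_length() and MAX_BYTES_PER_LINE are hypothetical names, and the 16-byte cap is taken from the comment in arm_spe_get_packet() further down in this patch.

```c
#include <stddef.h>

/* Hypothetical cap, mirroring the 16-bytes-per-line raw dump format. */
#define MAX_BYTES_PER_LINE 16

/*
 * Count the consecutive PAD (0x00) bytes starting at buf[0], so that a
 * dumper can print them as one line instead of one line per byte.
 * The count never exceeds len or MAX_BYTES_PER_LINE, and it is 0 when
 * buf[0] is not a PAD byte (or when len is 0).
 */
static size_t pad_run_length(const unsigned char *buf, size_t len)
{
	size_t n = 0;

	while (n < len && n < MAX_BYTES_PER_LINE && buf[n] == 0x00)
		n++;

	return n;
}
```

Given nine 0x00 bytes followed by a non-zero header byte, pad_run_length() returns 9, which corresponds to the single "00 00 00 00 00 00 00 00 00 PAD" line shown above. A longer run would be split into chunks of at most 16 bytes by calling the helper again on the remainder.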
> > tools/perf/arch/arm/util/auxtrace.c | 75 +++++- > tools/perf/arch/arm/util/pmu.c | 5 +- > tools/perf/arch/arm64/util/Build | 3 +- > tools/perf/arch/arm64/util/arm-spe.c | 235 +++++++++++++++++ > tools/perf/util/Build | 2 + > tools/perf/util/arm-spe-pkt-decoder.c | 471 ++++++++++++++++++++++++++++++++++ > tools/perf/util/arm-spe-pkt-decoder.h | 52 ++++ > tools/perf/util/arm-spe.c | 318 +++++++++++++++++++++++ > tools/perf/util/arm-spe.h | 42 +++ > tools/perf/util/auxtrace.c | 3 + > tools/perf/util/auxtrace.h | 1 + > 11 files changed, 1199 insertions(+), 8 deletions(-) > create mode 100644 tools/perf/arch/arm64/util/arm-spe.c > create mode 100644 tools/perf/util/arm-spe-pkt-decoder.c > create mode 100644 tools/perf/util/arm-spe-pkt-decoder.h > create mode 100644 tools/perf/util/arm-spe.c > create mode 100644 tools/perf/util/arm-spe.h > > diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c > index 8edf2cb71564..8e7c1ad18224 100644 > --- a/tools/perf/arch/arm/util/auxtrace.c > +++ b/tools/perf/arch/arm/util/auxtrace.c > @@ -22,6 +22,42 @@ > #include "../../util/evlist.h" > #include "../../util/pmu.h" > #include "cs-etm.h" > +#include "arm-spe.h" > + > +static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err) > +{ > + struct perf_pmu **arm_spe_pmus = NULL; > + int ret, i, nr_cpus = sysconf(_SC_NPROCESSORS_CONF); > + /* arm_spe_xxxxxxxxx\0 */ > + char arm_spe_pmu_name[sizeof(ARM_SPE_PMU_NAME) + 10]; > + > + arm_spe_pmus = zalloc(sizeof(struct perf_pmu *) * nr_cpus); > + if (!arm_spe_pmus) { > + pr_err("spes alloc failed\n"); > + *err = -ENOMEM; > + return NULL; > + } > + > + for (i = 0; i < nr_cpus; i++) { > + ret = sprintf(arm_spe_pmu_name, "%s%d", ARM_SPE_PMU_NAME, i); > + if (ret < 0) { > + pr_err("sprintf failed\n"); > + *err = -ENOMEM; > + return NULL; > + } > + > + arm_spe_pmus[*nr_spes] = perf_pmu__find(arm_spe_pmu_name); > + if (arm_spe_pmus[*nr_spes]) { > + pr_debug2("%s %d: arm_spe_pmu %d type %d name %s\n", 
> + __func__, __LINE__, *nr_spes, > + arm_spe_pmus[*nr_spes]->type, > + arm_spe_pmus[*nr_spes]->name); > + (*nr_spes)++; > + } > + } > + > + return arm_spe_pmus; > +} > > struct auxtrace_record > *auxtrace_record__init(struct perf_evlist *evlist, int *err) > @@ -29,22 +65,49 @@ struct auxtrace_record > struct perf_pmu *cs_etm_pmu; > struct perf_evsel *evsel; > bool found_etm = false; > + bool found_spe = false; > + static struct perf_pmu **arm_spe_pmus = NULL; > + static int nr_spes = 0; > + int i; > + > + if (!evlist) > + return NULL; > > cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME); > > - if (evlist) { > - evlist__for_each_entry(evlist, evsel) { > - if (cs_etm_pmu && > - evsel->attr.type == cs_etm_pmu->type) > - found_etm = true; > + if (!arm_spe_pmus) > + arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err); > + > + evlist__for_each_entry(evlist, evsel) { > + if (cs_etm_pmu && > + evsel->attr.type == cs_etm_pmu->type) > + found_etm = true; > + > + if (!nr_spes) > + continue; > + > + for (i = 0; i < nr_spes; i++) { > + if (evsel->attr.type == arm_spe_pmus[i]->type) { > + found_spe = true; > + break; > + } > } > } > > + if (found_etm && found_spe) { > + pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n"); > + *err = -EOPNOTSUPP; > + return NULL; > + } > + > if (found_etm) > return cs_etm_record_init(err); > > + if (found_spe) > + return arm_spe_recording_init(err, arm_spe_pmus[i]); > + > /* > - * Clear 'err' even if we haven't found a cs_etm event - that way perf > + * Clear 'err' even if we haven't found an event - that way perf > * record can still be used even if tracers aren't present. The NULL > * return value will take care of telling the infrastructure HW tracing > * isn't available. 
> diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
> index 98d67399a0d6..4c06a25ae6b1 100644
> --- a/tools/perf/arch/arm/util/pmu.c
> +++ b/tools/perf/arch/arm/util/pmu.c
> @@ -20,6 +20,7 @@
>  #include
>
>  #include "cs-etm.h"
> +#include "arm-spe.h"
>  #include "../../util/pmu.h"
>
>  struct perf_event_attr
> @@ -30,7 +31,9 @@ struct perf_event_attr
>  		/* add ETM default config here */
>  		pmu->selectable = true;
>  		pmu->set_drv_config = cs_etm_set_drv_config;
> -	}
> +	} else
> +	if (strstarts(pmu->name, ARM_SPE_PMU_NAME))
> +		return arm_spe_pmu_default_config(pmu);

More conventional kernel style would be:

	} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
		return arm_spe_pmu_default_config(pmu);
	}

Also, it looks like arm_spe_pmu_default_config() is only compiled for arm64,
so what happens if you build for arm?

>  #endif
>  	return NULL;
>  }
> diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
> index cef6fb38d17e..f9969bb88ccb 100644
> --- a/tools/perf/arch/arm64/util/Build
> +++ b/tools/perf/arch/arm64/util/Build
> @@ -3,4 +3,5 @@ libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
>
>  libperf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
>  			      ../../arm/util/auxtrace.o \
> -			      ../../arm/util/cs-etm.o
> +			      ../../arm/util/cs-etm.o \
> +			      arm-spe.o
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> new file mode 100644
> index 000000000000..ef576b52c850
> --- /dev/null
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -0,0 +1,235 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.

Might as well switch to SPDX license identifiers, here and elsewhere.

> + *
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "../../util/cpumap.h"
> +#include "../../util/evsel.h"
> +#include "../../util/evlist.h"
> +#include "../../util/session.h"
> +#include "../../util/util.h"
> +#include "../../util/pmu.h"
> +#include "../../util/debug.h"
> +#include "../../util/tsc.h"

tsc.h is not needed

> +#include "../../util/auxtrace.h"
> +#include "../../util/arm-spe.h"
> +
> +#define KiB(x) ((x) * 1024)
> +#define MiB(x) ((x) * 1024 * 1024)
> +
> +struct arm_spe_recording {
> +	struct auxtrace_record itr;
> +	struct perf_pmu *arm_spe_pmu;
> +	struct perf_evlist *evlist;
> +};
> +
> +static size_t
> +arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
> +		       struct perf_evlist *evlist __maybe_unused)
> +{
> +	return ARM_SPE_AUXTRACE_PRIV_SIZE;
> +}
> +
> +static int arm_spe_info_fill(struct auxtrace_record *itr,
> +			     struct perf_session *session,
> +			     struct auxtrace_info_event *auxtrace_info,
> +			     size_t priv_size)
> +{
> +	struct arm_spe_recording *sper =
> +			container_of(itr, struct arm_spe_recording, itr);
> +	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
> +
> +	if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
> +		return -EINVAL;
> +
> +	if (!session->evlist->nr_mmaps)
> +		return -EINVAL;
> +
> +	auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
> +	auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
> +
> +	return 0;
> +}
> +
> +static int arm_spe_recording_options(struct auxtrace_record *itr,
> +				     struct perf_evlist *evlist,
> +				     struct record_opts *opts)
> +{
> +	struct arm_spe_recording *sper =
> +			container_of(itr, struct arm_spe_recording, itr);
> +	struct
perf_pmu *arm_spe_pmu = sper->arm_spe_pmu; > + struct perf_evsel *evsel, *arm_spe_evsel = NULL; > + bool privileged = geteuid() == 0 || perf_event_paranoid() < 0; > + struct perf_evsel *tracking_evsel; > + int err; > + > + sper->evlist = evlist; > + > + evlist__for_each_entry(evlist, evsel) { > + if (evsel->attr.type == arm_spe_pmu->type) { > + if (arm_spe_evsel) { > + pr_err("There may be only one " ARM_SPE_PMU_NAME "x event\n"); > + return -EINVAL; > + } > + evsel->attr.freq = 0; > + evsel->attr.sample_period = 1; > + arm_spe_evsel = evsel; > + opts->full_auxtrace = true; > + } > + } > + > + if (!opts->full_auxtrace) > + return 0; > + > + /* We are in full trace mode but '-m,xyz' wasn't specified */ > + if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) { > + if (privileged) { > + opts->auxtrace_mmap_pages = MiB(4) / page_size; > + } else { > + opts->auxtrace_mmap_pages = KiB(128) / page_size; > + if (opts->mmap_pages == UINT_MAX) > + opts->mmap_pages = KiB(256) / page_size; > + } > + } > + > + /* Validate auxtrace_mmap_pages */ > + if (opts->auxtrace_mmap_pages) { > + size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size; > + size_t min_sz = KiB(8); > + > + if (sz < min_sz || !is_power_of_2(sz)) { > + pr_err("Invalid mmap size for ARM SPE: must be at least %zuKiB and a power of 2\n", > + min_sz / 1024); > + return -EINVAL; > + } > + } > + > + > + /* > + * To obtain the auxtrace buffer file descriptor, the auxtrace event > + * must come first. 
> + */ > + perf_evlist__to_front(evlist, arm_spe_evsel); > + > + perf_evsel__set_sample_bit(arm_spe_evsel, CPU); > + perf_evsel__set_sample_bit(arm_spe_evsel, TIME); > + perf_evsel__set_sample_bit(arm_spe_evsel, TID); > + > + /* Add dummy event to keep tracking */ > + err = parse_events(evlist, "dummy:u", NULL); > + if (err) > + return err; > + > + tracking_evsel = perf_evlist__last(evlist); > + perf_evlist__set_tracking_event(evlist, tracking_evsel); > + > + tracking_evsel->attr.freq = 0; > + tracking_evsel->attr.sample_period = 1; > + perf_evsel__set_sample_bit(tracking_evsel, TIME); > + perf_evsel__set_sample_bit(tracking_evsel, CPU); > + perf_evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK); > + > + return 0; > +} > + > +static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused) > +{ > + struct timespec ts; > + > + clock_gettime(CLOCK_MONOTONIC_RAW, &ts); > + > + return ts.tv_sec ^ ts.tv_nsec; > +} > + > +static void arm_spe_recording_free(struct auxtrace_record *itr) > +{ > + struct arm_spe_recording *sper = > + container_of(itr, struct arm_spe_recording, itr); > + > + free(sper); > +} > + > +static int arm_spe_read_finish(struct auxtrace_record *itr, int idx) > +{ > + struct arm_spe_recording *sper = > + container_of(itr, struct arm_spe_recording, itr); > + struct perf_evsel *evsel; > + > + evlist__for_each_entry(sper->evlist, evsel) { > + if (evsel->attr.type == sper->arm_spe_pmu->type) > + return perf_evlist__enable_event_idx(sper->evlist, > + evsel, idx); > + } > + return -EINVAL; > +} > + > +struct auxtrace_record *arm_spe_recording_init(int *err, > + struct perf_pmu *arm_spe_pmu) > +{ > + struct arm_spe_recording *sper; > + > + if (!arm_spe_pmu) { > + *err = -ENODEV; > + return NULL; > + } > + > + sper = zalloc(sizeof(struct arm_spe_recording)); > + if (!sper) { > + *err = -ENOMEM; > + return NULL; > + } > + > + sper->arm_spe_pmu = arm_spe_pmu; > + sper->itr.recording_options = arm_spe_recording_options; > + sper->itr.info_priv_size = 
arm_spe_info_priv_size; > + sper->itr.info_fill = arm_spe_info_fill; > + sper->itr.free = arm_spe_recording_free; > + sper->itr.reference = arm_spe_reference; > + sper->itr.read_finish = arm_spe_read_finish; > + sper->itr.alignment = 0; > + > + return &sper->itr; > +} > + > +struct perf_event_attr > +*arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu) > +{ > + struct perf_event_attr *attr; > + > + attr = zalloc(sizeof(struct perf_event_attr)); > + if (!attr) { > + pr_err("arm_spe default config cannot allocate a perf_event_attr\n"); > + return NULL; > + } > + > + /* > + * If kernel driver doesn't advertise a minimum, > + * use max allowable by PMSIDR_EL1.INTERVAL > + */ > + if (perf_pmu__scan_file(arm_spe_pmu, "caps/min_interval", "%llu", > + &attr->sample_period) != 1) { > + pr_debug("arm_spe driver doesn't advertise a min. interval. Using 4096\n"); > + attr->sample_period = 4096; > + } > + > + arm_spe_pmu->selectable = true; > + arm_spe_pmu->is_uncore = false; > + > + return attr; > +} > diff --git a/tools/perf/util/Build b/tools/perf/util/Build > index a3de7916fe63..7c6a8b461e24 100644 > --- a/tools/perf/util/Build > +++ b/tools/perf/util/Build > @@ -86,6 +86,8 @@ libperf-$(CONFIG_AUXTRACE) += auxtrace.o > libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/ > libperf-$(CONFIG_AUXTRACE) += intel-pt.o > libperf-$(CONFIG_AUXTRACE) += intel-bts.o > +libperf-$(CONFIG_AUXTRACE) += arm-spe.o > +libperf-$(CONFIG_AUXTRACE) += arm-spe-pkt-decoder.o > libperf-y += parse-branch-options.o > libperf-y += dump-insn.o > libperf-y += parse-regs-options.o > diff --git a/tools/perf/util/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-pkt-decoder.c > new file mode 100644 > index 000000000000..234943471d30 > --- /dev/null > +++ b/tools/perf/util/arm-spe-pkt-decoder.c > @@ -0,0 +1,471 @@ > +/* > + * ARM Statistical Profiling Extensions (SPE) support > + * Copyright (c) 2017, ARM Ltd. 
> + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + */ > + > +#include > +#include > +#include > +#include > + > +#include "arm-spe-pkt-decoder.h" > + > +#define BIT(n) (1ULL << (n)) > + > +#define NS_FLAG BIT(63) > +#define EL_FLAG (BIT(62) | BIT(61)) > + > +#define SPE_HEADER0_PAD 0x0 > +#define SPE_HEADER0_END 0x1 > +#define SPE_HEADER0_ADDRESS 0x30 /* address packet (short) */ > +#define SPE_HEADER0_ADDRESS_MASK 0x38 > +#define SPE_HEADER0_COUNTER 0x18 /* counter packet (short) */ > +#define SPE_HEADER0_COUNTER_MASK 0x38 > +#define SPE_HEADER0_TIMESTAMP 0x71 > +#define SPE_HEADER0_TIMESTAMP 0x71 > +#define SPE_HEADER0_EVENTS 0x2 > +#define SPE_HEADER0_EVENTS_MASK 0xf > +#define SPE_HEADER0_SOURCE 0x3 > +#define SPE_HEADER0_SOURCE_MASK 0xf > +#define SPE_HEADER0_CONTEXT 0x24 > +#define SPE_HEADER0_CONTEXT_MASK 0x3c > +#define SPE_HEADER0_OP_TYPE 0x8 > +#define SPE_HEADER0_OP_TYPE_MASK 0x3c > +#define SPE_HEADER1_ALIGNMENT 0x0 > +#define SPE_HEADER1_ADDRESS 0xb0 /* address packet (extended) */ > +#define SPE_HEADER1_ADDRESS_MASK 0xf8 > +#define SPE_HEADER1_COUNTER 0x98 /* counter packet (extended) */ > +#define SPE_HEADER1_COUNTER_MASK 0xf8 > + > +#if __BYTE_ORDER == __BIG_ENDIAN > +#define le16_to_cpu bswap_16 > +#define le32_to_cpu bswap_32 > +#define le64_to_cpu bswap_64 > +#define memcpy_le64(d, s, n) do { \ > + memcpy((d), (s), (n)); \ > + *(d) = le64_to_cpu(*(d)); \ > +} while (0) > +#else > +#define le16_to_cpu > +#define le32_to_cpu > +#define le64_to_cpu > +#define memcpy_le64 memcpy > +#endif > + > +static const char * const 
arm_spe_packet_name[] = { > + [ARM_SPE_PAD] = "PAD", > + [ARM_SPE_END] = "END", > + [ARM_SPE_TIMESTAMP] = "TS", > + [ARM_SPE_ADDRESS] = "ADDR", > + [ARM_SPE_COUNTER] = "LAT", > + [ARM_SPE_CONTEXT] = "CONTEXT", > + [ARM_SPE_OP_TYPE] = "OP-TYPE", > + [ARM_SPE_EVENTS] = "EVENTS", > + [ARM_SPE_DATA_SOURCE] = "DATA-SOURCE", > +}; > + > +const char *arm_spe_pkt_name(enum arm_spe_pkt_type type) > +{ > + return arm_spe_packet_name[type]; > +} > + > +/* return ARM SPE payload size from its encoding, > + * which is in bits 5:4 of the byte. > + * 00 : byte > + * 01 : halfword (2) > + * 10 : word (4) > + * 11 : doubleword (8) > + */ > +static int payloadlen(unsigned char byte) > +{ > + return 1 << ((byte & 0x30) >> 4); > +} > + > +static int arm_spe_get_payload(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + size_t payload_len = payloadlen(buf[0]); > + > + if (len < 1 + payload_len) > + return ARM_SPE_NEED_MORE_BYTES; > + > + buf++; > + > + switch (payload_len) { > + case 1: packet->payload = *(uint8_t *)buf; break; > + case 2: packet->payload = le16_to_cpu(*(uint16_t *)buf); break; > + case 4: packet->payload = le32_to_cpu(*(uint32_t *)buf); break; > + case 8: packet->payload = le64_to_cpu(*(uint64_t *)buf); break; > + default: return ARM_SPE_BAD_PACKET; > + } > + > + return 1 + payload_len; > +} > + > +static int arm_spe_get_pad(struct arm_spe_pkt *packet) > +{ > + packet->type = ARM_SPE_PAD; > + return 1; > +} > + > +static int arm_spe_get_alignment(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + unsigned int alignment = 1 << ((buf[0] & 0xf) + 1); > + > + if (len < alignment) > + return ARM_SPE_NEED_MORE_BYTES; > + > + packet->type = ARM_SPE_PAD; > + return alignment - (((uint64_t)buf) & (alignment - 1)); > +} > + > +static int arm_spe_get_end(struct arm_spe_pkt *packet) > +{ > + packet->type = ARM_SPE_END; > + return 1; > +} > + > +static int arm_spe_get_timestamp(const unsigned char *buf, size_t len, > + struct 
arm_spe_pkt *packet) > +{ > + packet->type = ARM_SPE_TIMESTAMP; > + return arm_spe_get_payload(buf, len, packet); > +} > + > +static int arm_spe_get_events(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + int ret = arm_spe_get_payload(buf, len, packet); > + > + packet->type = ARM_SPE_EVENTS; > + > + /* we use index to identify Events with a less number of > + * comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS, > + * LLC-REFILL, and REMOTE-ACCESS events are identified iff > + * index > 1. > + */ > + packet->index = ret - 1; > + > + return ret; > +} > + > +static int arm_spe_get_data_source(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + packet->type = ARM_SPE_DATA_SOURCE; > + return arm_spe_get_payload(buf, len, packet); > +} > + > +static int arm_spe_get_context(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + packet->type = ARM_SPE_CONTEXT; > + packet->index = buf[0] & 0x3; > + > + return arm_spe_get_payload(buf, len, packet); > +} > + > +static int arm_spe_get_op_type(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + packet->type = ARM_SPE_OP_TYPE; > + packet->index = buf[0] & 0x3; > + return arm_spe_get_payload(buf, len, packet); > +} > + > +static int arm_spe_get_counter(const unsigned char *buf, size_t len, > + const unsigned char ext_hdr, struct arm_spe_pkt *packet) > +{ > + if (len < 2) > + return ARM_SPE_NEED_MORE_BYTES; > + > + packet->type = ARM_SPE_COUNTER; > + if (ext_hdr) > + packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7); > + else > + packet->index = buf[0] & 0x7; > + > + packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1)); > + > + return 1 + ext_hdr + 2; > +} > + > +static int arm_spe_get_addr(const unsigned char *buf, size_t len, > + const unsigned char ext_hdr, struct arm_spe_pkt *packet) > +{ > + if (len < 8) > + return ARM_SPE_NEED_MORE_BYTES; > + > + packet->type = ARM_SPE_ADDRESS; > + if (ext_hdr) > + 
packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7); > + else > + packet->index = buf[0] & 0x7; > + > + memcpy_le64(&packet->payload, buf + 1, 8); > + > + return 1 + ext_hdr + 8; > +} > + > +static int arm_spe_do_get_packet(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + unsigned int byte; > + > + memset(packet, 0, sizeof(struct arm_spe_pkt)); > + > + if (!len) > + return ARM_SPE_NEED_MORE_BYTES; > + > + byte = buf[0]; > + if (byte == SPE_HEADER0_PAD) > + return arm_spe_get_pad(packet); > + else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */ > + return arm_spe_get_end(packet); > + else if (byte & 0xc0 /* 0y11xxxxxx */) { > + if (byte & 0x80) { > + if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS) > + return arm_spe_get_addr(buf, len, 0, packet); > + if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER) > + return arm_spe_get_counter(buf, len, 0, packet); > + } else > + if (byte == SPE_HEADER0_TIMESTAMP) > + return arm_spe_get_timestamp(buf, len, packet); > + else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS) > + return arm_spe_get_events(buf, len, packet); > + else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE) > + return arm_spe_get_data_source(buf, len, packet); > + else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT) > + return arm_spe_get_context(buf, len, packet); > + else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE) > + return arm_spe_get_op_type(buf, len, packet); > + } else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) { > + /* 16-bit header */ > + byte = buf[1]; > + if (byte == SPE_HEADER1_ALIGNMENT) > + return arm_spe_get_alignment(buf, len, packet); > + else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS) > + return arm_spe_get_addr(buf, len, 1, packet); > + else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER) > + return arm_spe_get_counter(buf, len, 1, packet); > + } > + > + return 
ARM_SPE_BAD_PACKET; > +} > + > +int arm_spe_get_packet(const unsigned char *buf, size_t len, > + struct arm_spe_pkt *packet) > +{ > + int ret; > + > + ret = arm_spe_do_get_packet(buf, len, packet); > + /* put multiple consecutive PADs on the same line, up to > + * the fixed-width output format of 16 bytes per line. > + */ > + if (ret > 0 && packet->type == ARM_SPE_PAD) { > + while (ret < 16 && len > (size_t)ret && !buf[ret]) > + ret += 1; > + } > + return ret; > +} > + > +int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, > + size_t buf_len) > +{ > + int ret, ns, el, index = packet->index; > + unsigned long long payload = packet->payload; > + const char *name = arm_spe_pkt_name(packet->type); > + > + switch (packet->type) { > + case ARM_SPE_BAD: > + case ARM_SPE_PAD: > + case ARM_SPE_END: > + return snprintf(buf, buf_len, "%s", name); > + case ARM_SPE_EVENTS: { > + size_t blen = buf_len; > + > + ret = 0; > + ret = snprintf(buf, buf_len, "EV"); > + buf += ret; > + blen -= ret; > + if (payload & 0x1) { > + ret = snprintf(buf, buf_len, " EXCEPTION-GEN"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x2) { > + ret = snprintf(buf, buf_len, " RETIRED"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x4) { > + ret = snprintf(buf, buf_len, " L1D-ACCESS"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x8) { > + ret = snprintf(buf, buf_len, " L1D-REFILL"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x10) { > + ret = snprintf(buf, buf_len, " TLB-ACCESS"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x20) { > + ret = snprintf(buf, buf_len, " TLB-REFILL"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x40) { > + ret = snprintf(buf, buf_len, " NOT-TAKEN"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x80) { > + ret = snprintf(buf, buf_len, " MISPRED"); > + buf += ret; > + blen -= ret; > + } > + if (index > 1) { > + if (payload & 0x100) { > + ret = snprintf(buf, buf_len, " 
LLC-ACCESS"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x200) { > + ret = snprintf(buf, buf_len, " LLC-REFILL"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x400) { > + ret = snprintf(buf, buf_len, " REMOTE-ACCESS"); > + buf += ret; > + blen -= ret; > + } > + } > + if (ret < 0) > + return ret; > + blen -= ret; > + return buf_len - blen; > + } > + case ARM_SPE_OP_TYPE: > + switch (index) { > + case 0: return snprintf(buf, buf_len, "%s", payload & 0x1 ? > + "COND-SELECT" : "INSN-OTHER"); > + case 1: { > + size_t blen = buf_len; > + > + if (payload & 0x1) > + ret = snprintf(buf, buf_len, "ST"); > + else > + ret = snprintf(buf, buf_len, "LD"); > + buf += ret; > + blen -= ret; > + if (payload & 0x2) { > + if (payload & 0x4) { > + ret = snprintf(buf, buf_len, " AT"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x8) { > + ret = snprintf(buf, buf_len, " EXCL"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x10) { > + ret = snprintf(buf, buf_len, " AR"); > + buf += ret; > + blen -= ret; > + } > + } else if (payload & 0x4) { > + ret = snprintf(buf, buf_len, " SIMD-FP"); > + buf += ret; > + blen -= ret; > + } > + if (ret < 0) > + return ret; > + blen -= ret; > + return buf_len - blen; > + } > + case 2: { > + size_t blen = buf_len; > + > + ret = snprintf(buf, buf_len, "B"); > + buf += ret; > + blen -= ret; > + if (payload & 0x1) { > + ret = snprintf(buf, buf_len, " COND"); > + buf += ret; > + blen -= ret; > + } > + if (payload & 0x2) { > + ret = snprintf(buf, buf_len, " IND"); > + buf += ret; > + blen -= ret; > + } > + if (ret < 0) > + return ret; > + blen -= ret; > + return buf_len - blen; > + } > + default: return 0; > + } > + case ARM_SPE_DATA_SOURCE: > + case ARM_SPE_TIMESTAMP: > + return snprintf(buf, buf_len, "%s %lld", name, payload); > + case ARM_SPE_ADDRESS: > + switch (index) { > + case 0: > + case 1: ns = !!(packet->payload & NS_FLAG); > + el = (packet->payload & EL_FLAG) >> 61; > + payload &= ~(0xffULL << 56); 
> +			return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
> +					(index == 1) ? "TGT" : "PC", payload, el, ns);
> +		case 2: return snprintf(buf, buf_len, "VA 0x%llx", payload);
> +		case 3: ns = !!(packet->payload & NS_FLAG);
> +			payload &= ~(0xffULL << 56);
> +			return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
> +					payload, ns);
> +		default: return 0;
> +		}
> +	case ARM_SPE_CONTEXT:
> +		return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
> +				(unsigned long)payload, index + 1);
> +	case ARM_SPE_COUNTER: {
> +		size_t blen = buf_len;
> +
> +		ret = snprintf(buf, buf_len, "%s %d ", name,
> +			       (unsigned short)payload);
> +		buf += ret;
> +		blen -= ret;
> +		switch (index) {
> +		case 0: ret = snprintf(buf, buf_len, "TOT"); break;
> +		case 1: ret = snprintf(buf, buf_len, "ISSUE"); break;
> +		case 2: ret = snprintf(buf, buf_len, "XLAT"); break;
> +		default: ret = 0;
> +		}
> +		if (ret < 0)
> +			return ret;
> +		blen -= ret;
> +		return buf_len - blen;
> +	}
> +	default:
> +		break;
> +	}
> +
> +	return snprintf(buf, buf_len, "%s 0x%llx (%d)",
> +			name, payload, packet->index);
> +}
> diff --git a/tools/perf/util/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-pkt-decoder.h
> new file mode 100644
> index 000000000000..f146f4143447
> --- /dev/null
> +++ b/tools/perf/util/arm-spe-pkt-decoder.h
> @@ -0,0 +1,52 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#ifndef INCLUDE__ARM_SPE_PKT_DECODER_H__
> +#define INCLUDE__ARM_SPE_PKT_DECODER_H__
> +
> +#include
> +#include
> +
> +#define ARM_SPE_PKT_DESC_MAX 256
> +
> +#define ARM_SPE_NEED_MORE_BYTES -1
> +#define ARM_SPE_BAD_PACKET -2
> +
> +enum arm_spe_pkt_type {
> +	ARM_SPE_BAD,
> +	ARM_SPE_PAD,
> +	ARM_SPE_END,
> +	ARM_SPE_TIMESTAMP,
> +	ARM_SPE_ADDRESS,
> +	ARM_SPE_COUNTER,
> +	ARM_SPE_CONTEXT,
> +	ARM_SPE_OP_TYPE,
> +	ARM_SPE_EVENTS,
> +	ARM_SPE_DATA_SOURCE,
> +};
> +
> +struct arm_spe_pkt {
> +	enum arm_spe_pkt_type type;
> +	unsigned char index;
> +	uint64_t payload;
> +};
> +
> +const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
> +
> +int arm_spe_get_packet(const unsigned char *buf, size_t len,
> +		       struct arm_spe_pkt *packet);
> +
> +int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, size_t len);
> +#endif
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> new file mode 100644
> index 000000000000..67965e26b5b1
> --- /dev/null
> +++ b/tools/perf/util/arm-spe.c
> @@ -0,0 +1,318 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "cpumap.h"
> +#include "color.h"
> +#include "evsel.h"
> +#include "evlist.h"
> +#include "machine.h"
> +#include "session.h"
> +#include "util.h"
> +#include "thread.h"
> +#include "debug.h"
> +#include "auxtrace.h"
> +#include "arm-spe.h"
> +#include "arm-spe-pkt-decoder.h"
> +
> +struct arm_spe {
> +	struct auxtrace auxtrace;
> +	struct auxtrace_queues queues;
> +	struct auxtrace_heap heap;
> +	u32 auxtrace_type;
> +	struct perf_session *session;
> +	struct machine *machine;
> +	u32 pmu_type;
> +};
> +
> +struct arm_spe_queue {
> +	struct arm_spe *spe;
> +	unsigned int queue_nr;
> +	struct auxtrace_buffer *buffer;
> +	bool on_heap;
> +	bool done;
> +	pid_t pid;
> +	pid_t tid;
> +	int cpu;
> +};
> +
> +static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
> +			 unsigned char *buf, size_t len)
> +{
> +	struct arm_spe_pkt packet;
> +	size_t pos = 0;
> +	int ret, pkt_len, i;
> +	char desc[ARM_SPE_PKT_DESC_MAX];
> +	const char *color = PERF_COLOR_BLUE;
> +
> +	color_fprintf(stdout, color,
> +		      ". ...
ARM SPE data: size %zu bytes\n",
> +		      len);
> +
> +	while (len) {
> +		ret = arm_spe_get_packet(buf, len, &packet);
> +		if (ret > 0)
> +			pkt_len = ret;
> +		else
> +			pkt_len = 1;
> +		printf(".");
> +		color_fprintf(stdout, color, " %08x: ", pos);
> +		for (i = 0; i < pkt_len; i++)
> +			color_fprintf(stdout, color, " %02x", buf[i]);
> +		for (; i < 16; i++)
> +			color_fprintf(stdout, color, " ");
> +		if (ret > 0) {
> +			ret = arm_spe_pkt_desc(&packet, desc,
> +					       ARM_SPE_PKT_DESC_MAX);
> +			if (ret > 0)
> +				color_fprintf(stdout, color, " %s\n", desc);
> +		} else {
> +			color_fprintf(stdout, color, " Bad packet!\n");
> +		}
> +		pos += pkt_len;
> +		buf += pkt_len;
> +		len -= pkt_len;
> +	}
> +}
> +
> +static void arm_spe_dump_event(struct arm_spe *spe, unsigned char *buf,
> +			       size_t len)
> +{
> +	printf(".\n");
> +	arm_spe_dump(spe, buf, len);
> +}
> +
> +static struct arm_spe_queue *arm_spe_alloc_queue(struct arm_spe *spe,
> +						 unsigned int queue_nr)
> +{
> +	struct arm_spe_queue *speq;
> +
> +	speq = zalloc(sizeof(struct arm_spe_queue));
> +	if (!speq)
> +		return NULL;
> +
> +	speq->spe = spe;
> +	speq->queue_nr = queue_nr;
> +	speq->pid = -1;
> +	speq->tid = -1;
> +	speq->cpu = -1;
> +
> +	return speq;
> +}
> +
> +static int arm_spe_setup_queue(struct arm_spe *spe,
> +			       struct auxtrace_queue *queue,
> +			       unsigned int queue_nr)
> +{
> +	struct arm_spe_queue *speq = queue->priv;
> +
> +	if (list_empty(&queue->head))
> +		return 0;
> +
> +	if (!speq) {
> +		speq = arm_spe_alloc_queue(spe, queue_nr);
> +		if (!speq)
> +			return -ENOMEM;
> +		queue->priv = speq;
> +
> +		if (queue->cpu != -1)
> +			speq->cpu = queue->cpu;
> +		speq->tid = queue->tid;
> +	}
> +
> +	if (!speq->on_heap && !speq->buffer) {
> +		int ret;
> +
> +		speq->buffer = auxtrace_buffer__next(queue, NULL);
> +		if (!speq->buffer)
> +			return 0;
> +
> +		ret = auxtrace_heap__add(&spe->heap, queue_nr,
> +					 speq->buffer->reference);
> +		if (ret)
> +			return ret;
> +		speq->on_heap = true;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
arm_spe_setup_queues(struct arm_spe *spe)
> +{
> +	unsigned int i;
> +	int ret;
> +
> +	for (i = 0; i < spe->queues.nr_queues; i++) {
> +		ret = arm_spe_setup_queue(spe, &spe->queues.queue_array[i],
> +					  i);
> +		if (ret)
> +			return ret;
> +	}
> +	return 0;
> +}
> +
> +static inline int arm_spe_update_queues(struct arm_spe *spe)
> +{
> +	if (spe->queues.new_data) {
> +		spe->queues.new_data = false;
> +		return arm_spe_setup_queues(spe);
> +	}
> +	return 0;
> +}
> +
> +static int arm_spe_process_event(struct perf_session *session __maybe_unused,
> +				 union perf_event *event __maybe_unused,
> +				 struct perf_sample *sample __maybe_unused,
> +				 struct perf_tool *tool __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static int arm_spe_process_auxtrace_event(struct perf_session *session,
> +					  union perf_event *event,
> +					  struct perf_tool *tool __maybe_unused)
> +{
> +	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +					   auxtrace);
> +	struct auxtrace_buffer *buffer;
> +	off_t data_offset;
> +	int fd = perf_data__fd(session->data);
> +	int err;
> +
> +	if (perf_data__is_pipe(session->data)) {
> +		data_offset = 0;
> +	} else {
> +		data_offset = lseek(fd, 0, SEEK_CUR);
> +		if (data_offset == -1)
> +			return -errno;
> +	}
> +
> +	err = auxtrace_queues__add_event(&spe->queues, session, event,
> +					 data_offset, &buffer);
> +	if (err)
> +		return err;
> +
> +	/* Dump here now we have copied a piped trace out of the pipe */
> +	if (dump_trace) {
> +		if (auxtrace_buffer__get_data(buffer, fd)) {
> +			arm_spe_dump_event(spe, buffer->data,
> +					   buffer->size);
> +			auxtrace_buffer__put_data(buffer);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int arm_spe_flush(struct perf_session *session __maybe_unused,
> +			 struct perf_tool *tool __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static void arm_spe_free_queue(void *priv)
> +{
> +	struct arm_spe_queue *speq = priv;
> +
> +	if (!speq)
> +		return;
> +	free(speq);
> +}
> +
> +static void arm_spe_free_events(struct perf_session
*session)
> +{
> +	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +					   auxtrace);
> +	struct auxtrace_queues *queues = &spe->queues;
> +	unsigned int i;
> +
> +	for (i = 0; i < queues->nr_queues; i++) {
> +		arm_spe_free_queue(queues->queue_array[i].priv);
> +		queues->queue_array[i].priv = NULL;
> +	}
> +	auxtrace_queues__free(queues);
> +}
> +
> +static void arm_spe_free(struct perf_session *session)
> +{
> +	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +					   auxtrace);
> +
> +	auxtrace_heap__free(&spe->heap);
> +	arm_spe_free_events(session);
> +	session->auxtrace = NULL;
> +	free(spe);
> +}
> +
> +static const char * const arm_spe_info_fmts[] = {
> +	[ARM_SPE_PMU_TYPE] = " PMU Type %"PRId64"\n",
> +};
> +
> +static void arm_spe_print_info(u64 *arr)
> +{
> +	if (!dump_trace)
> +		return;
> +
> +	fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
> +}
> +
> +int arm_spe_process_auxtrace_info(union perf_event *event,
> +				  struct perf_session *session)
> +{
> +	struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info;
> +	size_t min_sz = sizeof(u64) * ARM_SPE_PMU_TYPE;
> +	struct arm_spe *spe;
> +	int err;
> +
> +	if (auxtrace_info->header.size < sizeof(struct auxtrace_info_event) +
> +					min_sz)
> +		return -EINVAL;
> +
> +	spe = zalloc(sizeof(struct arm_spe));
> +	if (!spe)
> +		return -ENOMEM;
> +
> +	err = auxtrace_queues__init(&spe->queues);
> +	if (err)
> +		goto err_free;
> +
> +	spe->session = session;
> +	spe->machine = &session->machines.host; /* No kvm support */
> +	spe->auxtrace_type = auxtrace_info->type;
> +	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
> +
> +	spe->auxtrace.process_event = arm_spe_process_event;
> +	spe->auxtrace.process_auxtrace_event = arm_spe_process_auxtrace_event;
> +	spe->auxtrace.flush_events = arm_spe_flush;
> +	spe->auxtrace.free_events = arm_spe_free_events;
> +	spe->auxtrace.free = arm_spe_free;
> +	session->auxtrace = &spe->auxtrace;
> +
> +	
arm_spe_print_info(&auxtrace_info->priv[0]);
> +
> +	return 0;
> +
> +err_free:
> +	free(spe);
> +	return err;
> +}
> diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
> new file mode 100644
> index 000000000000..80752b20d850
> --- /dev/null
> +++ b/tools/perf/util/arm-spe.h
> @@ -0,0 +1,42 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#ifndef INCLUDE__PERF_ARM_SPE_H__
> +#define INCLUDE__PERF_ARM_SPE_H__
> +
> +#define ARM_SPE_PMU_NAME "arm_spe_"
> +
> +enum {
> +	ARM_SPE_PMU_TYPE,
> +	ARM_SPE_PER_CPU_MMAPS,
> +	ARM_SPE_AUXTRACE_PRIV_MAX,
> +};
> +
> +#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64))
> +
> +struct auxtrace_record;
> +struct perf_tool;

struct auxtrace_record and struct perf_tool are not used.

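For illustration, a minimal sketch of the header with those two forward
declarations dropped, assuming the prototypes stay as posted. Only tags
that first appear inside a parameter list need declaring up front; a tag
first seen in a return type at file scope declares itself. The stub
bodies and header_sketch_selfcheck() are placeholders made up so the
fragment builds on its own; they are not the patch's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declarations kept: these tags first appear in parameter lists. */
union perf_event;
struct perf_session;
struct perf_pmu;

/* Prototypes as in the quoted header. */
struct auxtrace_record *arm_spe_recording_init(int *err,
					       struct perf_pmu *arm_spe_pmu);
int arm_spe_process_auxtrace_info(union perf_event *event,
				  struct perf_session *session);

/* Placeholder definitions (assumptions, not the patch's code). */
struct auxtrace_record { int placeholder; };
static struct auxtrace_record stub_record;

struct auxtrace_record *arm_spe_recording_init(int *err,
					       struct perf_pmu *arm_spe_pmu)
{
	(void)arm_spe_pmu;
	*err = 0;
	return &stub_record;
}

int arm_spe_process_auxtrace_info(union perf_event *event,
				  struct perf_session *session)
{
	(void)event;
	(void)session;
	return 0;
}

/* Hypothetical self-check: calls both prototypes through the stubs. */
int header_sketch_selfcheck(void)
{
	int err = -1;

	assert(arm_spe_recording_init(&err, NULL) == &stub_record);
	assert(err == 0);
	return arm_spe_process_auxtrace_info(NULL, NULL);
}
```

struct perf_tool does not appear in any prototype in this header, so
nothing here refers to it.
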
> +union perf_event;
> +struct perf_session;
> +struct perf_pmu;
> +
> +struct auxtrace_record *arm_spe_recording_init(int *err,
> +					       struct perf_pmu *arm_spe_pmu);
> +
> +int arm_spe_process_auxtrace_info(union perf_event *event,
> +				  struct perf_session *session);
> +
> +struct perf_event_attr *arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu);
> +#endif
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index a33491416400..f682f7a58a02 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -57,6 +57,7 @@
>
>  #include "intel-pt.h"
>  #include "intel-bts.h"
> +#include "arm-spe.h"
>
>  #include "sane_ctype.h"
>  #include "symbol/kallsyms.h"
> @@ -913,6 +914,8 @@ int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused,
>  		return intel_pt_process_auxtrace_info(event, session);
>  	case PERF_AUXTRACE_INTEL_BTS:
>  		return intel_bts_process_auxtrace_info(event, session);
> +	case PERF_AUXTRACE_ARM_SPE:
> +		return arm_spe_process_auxtrace_info(event, session);
>  	case PERF_AUXTRACE_CS_ETM:
>  	case PERF_AUXTRACE_UNKNOWN:
>  	default:
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index d19e11b68de7..453c148d2158 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -43,6 +43,7 @@ enum auxtrace_type {
>  	PERF_AUXTRACE_INTEL_PT,
>  	PERF_AUXTRACE_INTEL_BTS,
>  	PERF_AUXTRACE_CS_ETM,
> +	PERF_AUXTRACE_ARM_SPE,
>  };
>
>  enum itrace_period_type {
>
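
Separately, on the flag decoding in arm_spe_pkt_desc(): the buf/blen
bookkeeping is repeated after every snprintf(). One possible way to keep
it in one place is a small append helper. A minimal sketch, assuming a
hypothetical helper called desc_append() and a cut-down desc_ldst() that
covers only the load/store op-type bits quoted above; neither name
exists in the patch or in perf, and the error convention (-1 on
truncation) is my assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the patch): format at *buf, then advance
 * *buf and shrink *len. Returns 0 on success, -1 on error or truncation.
 */
static int desc_append(char **buf, size_t *len, const char *fmt, ...)
{
	va_list ap;
	int ret;

	assert(buf && *buf && len);

	va_start(ap, fmt);
	ret = vsnprintf(*buf, *len, fmt, ap);
	va_end(ap);
	if (ret < 0 || (size_t)ret >= *len)
		return -1;
	*buf += ret;
	*len -= ret;
	return 0;
}

/*
 * Cut-down illustration covering only the load/store op-type bits from
 * the quoted switch. Returns the description length, or -1 on failure.
 */
static int desc_ldst(unsigned long long payload, char *buf, size_t buf_len)
{
	size_t blen = buf_len;
	int err;

	err = desc_append(&buf, &blen, "%s", (payload & 0x1) ? "ST" : "LD");
	if (!err && (payload & 0x2)) {
		if (payload & 0x4)
			err = desc_append(&buf, &blen, " AT");
		if (!err && (payload & 0x8))
			err = desc_append(&buf, &blen, " EXCL");
		if (!err && (payload & 0x10))
			err = desc_append(&buf, &blen, " AR");
	} else if (!err && (payload & 0x4)) {
		err = desc_append(&buf, &blen, " SIMD-FP");
	}
	return err ? -1 : (int)(buf_len - blen);
}
```

A caller would pass a buffer of ARM_SPE_PKT_DESC_MAX bytes, as
arm_spe_dump() does for arm_spe_pkt_desc(), and treat a negative return
as a failed description.
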