From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D8F2C74A5B for ; Wed, 29 Mar 2023 11:14:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229463AbjC2LOx (ORCPT ); Wed, 29 Mar 2023 07:14:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229703AbjC2LOe (ORCPT ); Wed, 29 Mar 2023 07:14:34 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 030BD44BC for ; Wed, 29 Mar 2023 04:14:32 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id i9so15259754wrp.3 for ; Wed, 29 Mar 2023 04:14:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1680088470; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Ln7VRUIrC0wCRHPvdMeetTqn0XRJ5U87hZk7+wKCcB8=; b=xWPo7WfdRtS2RaOMXYCdxeODZkIUjyFVVRba/Fl9EaDkrQJDLrqdBxpSBFiJhbmzIk rDPty37mmxdalNnHTcymIZlihtZds6D6mTn+0UwvalDoiQKp2yubiP+GM6wyHWkh3w+1 C9IZKXuALbrkqOZrREsYzzOW1yrMrFT7S5pglVw0vbXupNOdvRd1kGAEDqYrb6MGQsYy AxEgDQgEKRYnzlMqP/sXKt2rv+NZi8mMQFkvr4DaqTpURpfCxQcjGuvpNAs8ZcTPRCpX iIBynHWVSU6nWR55cYdYFKVhcbap52i9fnN6Q37pq2yu5BzZBhe6SeIp3Y5EXmkk78iF bULQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680088470; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Ln7VRUIrC0wCRHPvdMeetTqn0XRJ5U87hZk7+wKCcB8=; b=GkNeyVHW76PwhnKEhdp7lIpWQN395KXHZYu+90ZtbblkFI9DTnuW3wiMwgD/5oTga1 PK9nBG6oe89lzvSh7dYxH/4bBST2E48m/dQrPcAg8gU0IrKug9uz1Sll4GvtGfVRCBXG TPSv6Bo5mOsiBIx7amGEpbwgZufOO5wjBpkkCQYtkvmApMZ45xNtpOK2L+8aMWMo4Y1C TZTXZqcpCa88GBpBtaZT/4paYslELmnVtdGnuYVp3/sjTH44Zg+QEnUztxyNBS1K+AmR 5BvhwHg8/s6JpEQxK59Ntkl42GWXrjWIycFOlb/ljszgxPNjigIIKeh+XXfefU6O2YsQ ntYw== X-Gm-Message-State: AAQBX9cg7JLRbFYhgySD9H0asGcw4wDH3Qm+3aKH3l+08+Jpo9fpyyIr 1dRAaLW7wGOd8fgFdooV5D/HXim9MkdtdYi/Z+1x4Q== X-Google-Smtp-Source: AKy350Yw4pRx37jq/9l+Qpuw2zax3v3wujK3TTpdCGsbSsFK//cAObUwkcT5VPijmRludyN4D4i5aA== X-Received: by 2002:a5d:4746:0:b0:2ce:a6f2:bae0 with SMTP id o6-20020a5d4746000000b002cea6f2bae0mr14280837wrs.56.1680088470491; Wed, 29 Mar 2023 04:14:30 -0700 (PDT) Received: from linaro.org (host86-131-79-192.range86-131.btcentralplus.com. [86.131.79.192]) by smtp.gmail.com with ESMTPSA id m6-20020a056000008600b002cde25fba30sm30067163wrx.1.2023.03.29.04.14.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Mar 2023 04:14:29 -0700 (PDT) From: Mike Leach To: linux-perf-users@vger.kernel.org, linux-arm-kernel@lists.infradead.org, coresight@lists.linaro.org, linux-kernel@vger.kernel.org, acme@kernel.org, suzuki.poulose@arm.com Cc: leo.yan@linaro.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, john.garry@huawei.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, Mike Leach , James Clark Subject: [PATCH v8 3/3] perf: cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packet Date: Wed, 29 Mar 2023 12:14:22 +0100 Message-Id: <20230329111422.3693-4-mike.leach@linaro.org> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230329111422.3693-1-mike.leach@linaro.org> References: <20230329111422.3693-1-mike.leach@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-perf-users@vger.kernel.org When using dynamically assigned CoreSight trace IDs the drivers can output the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet. Update cs-etm decoder to handle this packet by setting the CPU/Trace ID mapping. Signed-off-by: Mike Leach Reviewed-by: James Clark Acked-by: Suzuki K Poulose --- tools/include/linux/coresight-pmu.h | 15 ++ .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 + tools/perf/util/cs-etm.c | 247 ++++++++++++++++-- 3 files changed, 250 insertions(+), 19 deletions(-) diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h index 1760f9a574b0..cef3b1c25335 100644 --- a/tools/include/linux/coresight-pmu.h +++ b/tools/include/linux/coresight-pmu.h @@ -31,6 +31,9 @@ */ #define CORESIGHT_TRACE_ID_UNUSED_FLAG BIT(31) +/* Value to set for unused trace ID values */ +#define CORESIGHT_TRACE_ID_UNUSED_VAL 0x7F + /* * Below are the definition of bit offsets for perf option, and works as * arbitrary values for all ETM versions. @@ -55,4 +58,16 @@ #define ETM4_CFG_BIT_RETSTK 12 #define ETM4_CFG_BIT_VMID_OPT 15 +/* + * Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload. + * Used to associate a CPU with the CoreSight Trace ID. + * [07:00] - Trace ID - uses 8 bits to make value easy to read in file. + * [59:08] - Unused (SBZ) + * [63:60] - Version + */ +#define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0) +#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60) + +#define CS_AUX_HW_ID_CURR_VERSION 0 + #endif diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index d0e521dfcf35..82a27ab90c8b 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -668,6 +668,7 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params, switch (t_params->protocol) { case CS_ETM_PROTO_ETMV3: case CS_ETM_PROTO_PTM: + csid = (t_params->etmv3.reg_idr & CORESIGHT_TRACE_ID_VAL_MASK); cs_etm_decoder__gen_etmv3_config(t_params, &config_etmv3); decoder->decoder_name = (t_params->protocol == CS_ETM_PROTO_ETMV3) ? OCSD_BUILTIN_DCD_ETMV3 : @@ -675,11 +676,13 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params, trace_config = &config_etmv3; break; case CS_ETM_PROTO_ETMV4i: + csid = (t_params->etmv4.reg_traceidr & CORESIGHT_TRACE_ID_VAL_MASK); cs_etm_decoder__gen_etmv4_config(t_params, &trace_config_etmv4); decoder->decoder_name = OCSD_BUILTIN_DCD_ETMV4I; trace_config = &trace_config_etmv4; break; case CS_ETM_PROTO_ETE: + csid = (t_params->ete.reg_traceidr & CORESIGHT_TRACE_ID_VAL_MASK); cs_etm_decoder__gen_ete_config(t_params, &trace_config_ete); decoder->decoder_name = OCSD_BUILTIN_DCD_ETE; trace_config = &trace_config_ete; @@ -688,6 +691,10 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params, return -1; } + /* if the CPU has no trace ID associated, no decoder needed */ + if (csid == CORESIGHT_TRACE_ID_UNUSED_VAL) + return 0; + if (d_params->operation == CS_ETM_OPERATION_DECODE) { if (ocsd_dt_create_decoder(decoder->dcd_tree, decoder->decoder_name, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f6ca07f68b25..94e2d02009eb 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -220,6 +220,143 @@ static int cs_etm__map_trace_id(u8 trace_chan_id, u64 *cpu_metadata) return 0; } +static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata) +{ + u64 cs_etm_magic = cpu_metadata[CS_ETM_MAGIC]; + + switch (cs_etm_magic) { + case __perf_cs_etmv3_magic: + *trace_chan_id = (u8)(cpu_metadata[CS_ETM_ETMTRACEIDR] & + CORESIGHT_TRACE_ID_VAL_MASK); + break; + case __perf_cs_etmv4_magic: + case __perf_cs_ete_magic: + *trace_chan_id = (u8)(cpu_metadata[CS_ETMV4_TRCTRACEIDR] & + CORESIGHT_TRACE_ID_VAL_MASK); + break; + default: + return -EINVAL; + } + return 0; +} + +/* + * update metadata trace ID from the value found in the AUX_HW_INFO packet. + * This will also clear the CORESIGHT_TRACE_ID_UNUSED_FLAG flag if present. + */ +static int cs_etm__metadata_set_trace_id(u8 trace_chan_id, u64 *cpu_metadata) +{ + u64 cs_etm_magic = cpu_metadata[CS_ETM_MAGIC]; + + switch (cs_etm_magic) { + case __perf_cs_etmv3_magic: + cpu_metadata[CS_ETM_ETMTRACEIDR] = trace_chan_id; + break; + case __perf_cs_etmv4_magic: + case __perf_cs_ete_magic: + cpu_metadata[CS_ETMV4_TRCTRACEIDR] = trace_chan_id; + break; + + default: + return -EINVAL; + } + return 0; +} + +/* + * FIELD_GET (linux/bitfield.h) not available outside kernel code, + * and the header contains too many dependencies to just copy over, + * so roll our own based on the original + */ +#define __bf_shf(x) (__builtin_ffsll(x) - 1) +#define FIELD_GET(_mask, _reg) \ + ({ \ + (typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)); \ + }) + +/* + * Handle the PERF_RECORD_AUX_OUTPUT_HW_ID event. + * + * The payload associates the Trace ID and the CPU. + * The routine is tolerant of seeing multiple packets with the same association, + * but a CPU / Trace ID association changing during a session is an error. + */ +static int cs_etm__process_aux_output_hw_id(struct perf_session *session, + union perf_event *event) +{ + struct cs_etm_auxtrace *etm; + struct perf_sample sample; + struct int_node *inode; + struct evsel *evsel; + u64 *cpu_data; + u64 hw_id; + int cpu, version, err; + u8 trace_chan_id, curr_chan_id; + + /* extract and parse the HW ID */ + hw_id = event->aux_output_hw_id.hw_id; + version = FIELD_GET(CS_AUX_HW_ID_VERSION_MASK, hw_id); + trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id); + + /* check that we can handle this version */ + if (version > CS_AUX_HW_ID_CURR_VERSION) + return -EINVAL; + + /* get access to the etm metadata */ + etm = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace); + if (!etm || !etm->metadata) + return -EINVAL; + + /* parse the sample to get the CPU */ + evsel = evlist__event2evsel(session->evlist, event); + if (!evsel) + return -EINVAL; + err = evsel__parse_sample(evsel, event, &sample); + if (err) + return err; + cpu = sample.cpu; + if (cpu == -1) { + /* no CPU in the sample - possibly recorded with an old version of perf */ + pr_err("CS_ETM: no CPU AUX_OUTPUT_HW_ID sample. Use compatible perf to record."); + return -EINVAL; + } + + /* See if the ID is mapped to a CPU, and it matches the current CPU */ + inode = intlist__find(traceid_list, trace_chan_id); + if (inode) { + cpu_data = inode->priv; + if ((int)cpu_data[CS_ETM_CPU] != cpu) { + pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n"); + return -EINVAL; + } + + /* check that the mapped ID matches */ + err = cs_etm__metadata_get_trace_id(&curr_chan_id, cpu_data); + if (err) + return err; + if (curr_chan_id != trace_chan_id) { + pr_err("CS_ETM: mismatch between CPU trace ID and HW_ID packet ID\n"); + return -EINVAL; + } + + /* mapped and matched - return OK */ + return 0; + } + + /* not one we've seen before - lets map it */ + cpu_data = etm->metadata[cpu]; + err = cs_etm__map_trace_id(trace_chan_id, cpu_data); + if (err) + return err; + + /* + * if we are picking up the association from the packet, need to plug + * the correct trace ID into the metadata for setting up decoders later. + */ + err = cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data); + return err; +} + void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq, u8 trace_chan_id) { @@ -2668,11 +2805,16 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o } /* - * In per-thread mode, CPU is set to -1, but TID will be set instead. See - * auxtrace_mmap_params__set_idx(). Return 'not found' if neither CPU nor TID match. + * In per-thread mode, auxtrace CPU is set to -1, but TID will be set instead. See + * auxtrace_mmap_params__set_idx(). However, the sample AUX event will contain a + * CPU as we set this always for the AUX_OUTPUT_HW_ID event. + * So now compare only TIDs if auxtrace CPU is -1, and CPUs if auxtrace CPU is not -1. + * Return 'not found' if mismatch. */ - if ((auxtrace_event->cpu == (__u32) -1 && auxtrace_event->tid != sample->tid) || - auxtrace_event->cpu != sample->cpu) + if (auxtrace_event->cpu == (__u32) -1) { + if (auxtrace_event->tid != sample->tid) + return 1; + } else if (auxtrace_event->cpu != sample->cpu) return 1; if (aux_event->flags & PERF_AUX_FLAG_OVERWRITE) { @@ -2721,6 +2863,17 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o return 1; } +static int cs_etm__process_aux_hw_id_cb(struct perf_session *session, union perf_event *event, + u64 offset __maybe_unused, void *data __maybe_unused) +{ + /* look to handle PERF_RECORD_AUX_OUTPUT_HW_ID early to ensure decoders can be set up */ + if (event->header.type == PERF_RECORD_AUX_OUTPUT_HW_ID) { + (*(int *)data)++; /* increment found count */ + return cs_etm__process_aux_output_hw_id(session, event); + } + return 0; +} + static int cs_etm__queue_aux_records_cb(struct perf_session *session, union perf_event *event, u64 offset __maybe_unused, void *data __maybe_unused) { @@ -2839,13 +2992,13 @@ static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata) cs_etm_magic = metadata[i][CS_ETM_MAGIC]; switch (cs_etm_magic) { case __perf_cs_etmv3_magic: - trace_chan_id = (u8)((metadata[i][CS_ETM_ETMTRACEIDR]) & - CORESIGHT_TRACE_ID_VAL_MASK); + metadata[i][CS_ETM_ETMTRACEIDR] &= CORESIGHT_TRACE_ID_VAL_MASK; + trace_chan_id = (u8)(metadata[i][CS_ETM_ETMTRACEIDR]); break; case __perf_cs_etmv4_magic: case __perf_cs_ete_magic: - trace_chan_id = (u8)((metadata[i][CS_ETMV4_TRCTRACEIDR]) & - CORESIGHT_TRACE_ID_VAL_MASK); + metadata[i][CS_ETMV4_TRCTRACEIDR] &= CORESIGHT_TRACE_ID_VAL_MASK; + trace_chan_id = (u8)(metadata[i][CS_ETMV4_TRCTRACEIDR]); break; default: /* unknown magic number */ @@ -2858,18 +3011,47 @@ static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata) return 0; } +/* + * If we found AUX_HW_ID packets, then set any metadata marked as unused to the + * unused value to reduce the number of unneeded decoders created. + */ +static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64 **metadata) +{ + u64 cs_etm_magic; + int i; + + for (i = 0; i < num_cpu; i++) { + cs_etm_magic = metadata[i][CS_ETM_MAGIC]; + switch (cs_etm_magic) { + case __perf_cs_etmv3_magic: + if (metadata[i][CS_ETM_ETMTRACEIDR] & CORESIGHT_TRACE_ID_UNUSED_FLAG) + metadata[i][CS_ETM_ETMTRACEIDR] = CORESIGHT_TRACE_ID_UNUSED_VAL; + break; + case __perf_cs_etmv4_magic: + case __perf_cs_ete_magic: + if (metadata[i][CS_ETMV4_TRCTRACEIDR] & CORESIGHT_TRACE_ID_UNUSED_FLAG) + metadata[i][CS_ETMV4_TRCTRACEIDR] = CORESIGHT_TRACE_ID_UNUSED_VAL; + break; + default: + /* unknown magic number */ + return -EINVAL; + } + } + return 0; +} + int cs_etm__process_auxtrace_info_full(union perf_event *event, struct perf_session *session) { struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info; struct cs_etm_auxtrace *etm = NULL; - struct int_node *inode; struct perf_record_time_conv *tc = &session->time_conv; int event_header_size = sizeof(struct perf_event_header); int total_size = auxtrace_info->header.size; int priv_size = 0; int num_cpu; int err = 0; + int aux_hw_id_found; int i, j; u64 *ptr = NULL; u64 **metadata = NULL; @@ -3018,8 +3200,43 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, if (err) goto err_delete_thread; - /* before aux records are queued, need to map metadata to trace IDs */ - err = cs_etm__map_trace_ids_metadata(num_cpu, metadata); + /* + * Map Trace ID values to CPU metadata. + * + * Trace metadata will always contain Trace ID values from the legacy algorithm. If the + * files has been recorded by a "new" perf updated to handle AUX_HW_ID then the metadata + * ID value will also have the CORESIGHT_TRACE_ID_UNUSED_FLAG set. + * + * The updated kernel drivers that use AUX_HW_ID to sent Trace IDs will attempt to use + * the same IDs as the old algorithm as far as is possible, unless there are clashes + * in which case a different value will be used. This means an older perf may still + * be able to record and read files generate on a newer system. + * + * For a perf able to interpret AUX_HW_ID packets we first check for the presence of + * those packets. If they are there then the values will be mapped and plugged into + * the metadata. We then set any remaining metadata values with the used flag to a + * value CORESIGHT_TRACE_ID_UNUSED_VAL - which indicates no decoder is required. + * + * If no AUX_HW_ID packets are present - which means a file recorded on an old kernel + * then we map Trace ID values to CPU directly from the metadata - clearing any unused + * flags if present. + */ + + /* first scan for AUX_OUTPUT_HW_ID records to map trace ID values to CPU metadata */ + aux_hw_id_found = 0; + err = perf_session__peek_events(session, session->header.data_offset, + session->header.data_size, + cs_etm__process_aux_hw_id_cb, &aux_hw_id_found); + if (err) + goto err_delete_thread; + + /* if HW ID found then clear any unused metadata ID values */ + if (aux_hw_id_found) + err = cs_etm__clear_unused_trace_ids_metadata(num_cpu, metadata); + /* otherwise, this is a file with metadata values only, map from metadata */ + else + err = cs_etm__map_trace_ids_metadata(num_cpu, metadata); + if (err) goto err_delete_thread; @@ -3028,14 +3245,6 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, goto err_delete_thread; etm->data_queued = etm->queues.populated; - /* - * Print warning in pipe mode, see cs_etm__process_auxtrace_event() and - * cs_etm__queue_aux_fragment() for details relating to limitations. - */ - if (!etm->data_queued) - pr_warning("CS ETM warning: Coresight decode and TRBE support requires random file access.\n" - "Continuing with best effort decoding in piped mode.\n\n"); - return 0; err_delete_thread: -- 2.32.0