From: Leo Yan <leo.yan@linaro.org>
To: German Gomez <german.gomez@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	John Garry <john.garry@huawei.com>, Will Deacon <will@kernel.org>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Mike Leach <mike.leach@linaro.org>,
	linux-arm-kernel@lists.infradead.org, coresight@lists.linaro.org
Subject: Re: [PATCH 4/5] perf arm-spe: Implement find_snapshot callback
Date: Thu, 23 Sep 2021 21:50:16 +0800
Message-ID: <20210923135016.GG400258@leoy-ThinkPad-X240s>
In-Reply-To: <20210916154635.1525-4-german.gomez@arm.com>

Hi German,

On Thu, Sep 16, 2021 at 04:46:34PM +0100, German Gomez wrote:
> The head pointer of the AUX buffer managed by the arm_spe_pmu.c driver
> is not monotonically increasing, therefore the find_snapshot callback is
> needed in order to find the trace data within the AUX buffer and avoid
> wasting space in the perf.data file.
> 
> The pointer is assumed to have wrapped if the buffer contains non-zero
> data at the end. If it has wrapped, the entire contents of the AUX
> buffer are stored in the perf.data file. Otherwise only the data up to
> the head pointer is stored.
> 
> Reviewed-by: James Clark <james.clark@arm.com>
> Signed-off-by: German Gomez <german.gomez@arm.com>
> ---
>  tools/perf/arch/arm64/util/arm-spe.c | 145 +++++++++++++++++++++++++++
>  1 file changed, 145 insertions(+)
> 
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index f8b03d164b42..56785034fc84 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -23,6 +23,7 @@
>  #include "../../../util/auxtrace.h"
>  #include "../../../util/record.h"
>  #include "../../../util/arm-spe.h"
> +#include <tools/libc_compat.h> // reallocarray
>  
>  #define KiB(x) ((x) * 1024)
>  #define MiB(x) ((x) * 1024 * 1024)
> @@ -31,6 +32,8 @@ struct arm_spe_recording {
>  	struct auxtrace_record		itr;
>  	struct perf_pmu			*arm_spe_pmu;
>  	struct evlist		*evlist;
> +	int			wrapped_cnt;
> +	bool			*wrapped;
>  };
>  
>  static void arm_spe_set_timestamp(struct auxtrace_record *itr,
> @@ -299,6 +302,146 @@ static int arm_spe_snapshot_finish(struct auxtrace_record *itr)
>  	return -EINVAL;
>  }
>  
> +static int arm_spe_alloc_wrapped_array(struct arm_spe_recording *ptr, int idx)
> +{
> +	bool *wrapped;
> +	int cnt = ptr->wrapped_cnt, new_cnt, i;
> +
> +	/*
> +	 * No need to allocate, so return early.
> +	 */
> +	if (idx < cnt)
> +		return 0;
> +
> +	/*
> +	 * Make ptr->wrapped as big as idx.
> +	 */
> +	new_cnt = idx + 1;
> +
> +	/*
> +	 * Free'ed in arm_spe_recording_free().
> +	 */
> +	wrapped = reallocarray(ptr->wrapped, new_cnt, sizeof(bool));
> +	if (!wrapped)
> +		return -ENOMEM;
> +
> +	/*
> +	 * init new allocated values.
> +	 */
> +	for (i = cnt; i < new_cnt; i++)
> +		wrapped[i] = false;
> +
> +	ptr->wrapped_cnt = new_cnt;
> +	ptr->wrapped = wrapped;
> +
> +	return 0;
> +}
> +
> +static bool arm_spe_buffer_has_wrapped(unsigned char *buffer,
> +				      size_t buffer_size, u64 head)
> +{
> +	u64 i, watermark;
> +	u64 *buf = (u64 *)buffer;
> +	size_t buf_size = buffer_size;
> +
> +	/*
> +	 * Defensively handle the case where head might be continually increasing - if its value is
> +	 * equal or greater than the size of the ring buffer, then we can safely determine it has
> +	 * wrapped around. Otherwise, continue to detect if head might have wrapped.
> +	 */
> +	if (head >= buffer_size)
> +		return true;
> +
> +	/*
> +	 * We want to look at the very last 512 bytes (chosen arbitrarily) in the ring buffer.
> +	 */
> +	watermark = buf_size - 512;
> +
> +	/*
> +	 * The value of head is somewhere within the size of the ring buffer. This can be that there
> +	 * hasn't been enough data to fill the ring buffer yet or the trace time was so long that
> +	 * head has numerically wrapped around.  To find we need to check if we have data at the
> +	 * very end of the ring buffer.  We can reliably do this because mmap'ed pages are zeroed
> +	 * out and there is a fresh mapping with every new session.
> +	 */
> +
> +	/*
> +	 * head is less than 512 byte from the end of the ring buffer.
> +	 */
> +	if (head > watermark)
> +		watermark = head;
> +
> +	/*
> +	 * Speed things up by using 64 bit transactions (see "u64 *buf" above)
> +	 */
> +	watermark /= sizeof(u64);
> +	buf_size /= sizeof(u64);
> +
> +	/*
> +	 * If we find trace data at the end of the ring buffer, head has been there and has
> +	 * numerically wrapped around at least once.
> +	 */
> +	for (i = watermark; i < buf_size; i++)
> +		if (buf[i])
> +			return true;
> +
> +	return false;
> +}
> +
> +static int arm_spe_find_snapshot(struct auxtrace_record *itr, int idx,
> +				  struct auxtrace_mmap *mm, unsigned char *data,
> +				  u64 *head, u64 *old)
> +{
> +	int err;
> +	bool wrapped;
> +	struct arm_spe_recording *ptr =
> +			container_of(itr, struct arm_spe_recording, itr);
> +
> +	/*
> +	 * Allocate memory to keep track of wrapping if this is the first
> +	 * time we deal with this *mm.
> +	 */
> +	if (idx >= ptr->wrapped_cnt) {
> +		err = arm_spe_alloc_wrapped_array(ptr, idx);
> +		if (err)
> +			return err;
> +	}
> +
> +	/*
> +	 * Check to see if *head has wrapped around.  If it hasn't, only the
> +	 * amount of data between *head and *old is snapshot'ed to avoid
> +	 * bloating the perf.data file with zeros.  But as soon as *head has
> +	 * wrapped around, the entire AUX ring buffer is taken.
> +	 */
> +	wrapped = ptr->wrapped[idx];
> +	if (!wrapped && arm_spe_buffer_has_wrapped(data, mm->len, *head)) {
> +		wrapped = true;
> +		ptr->wrapped[idx] = true;
> +	}
> +
> +	pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n",
> +		  __func__, idx, (size_t)*old, (size_t)*head, mm->len);
> +
> +	/*
> +	 * No wrap has occurred, we can just use *head and *old.
> +	 */
> +	if (!wrapped)
> +		return 0;
> +
> +	/*
> +	 * *head has wrapped around - adjust *head and *old to pickup the
> +	 * entire content of the AUX buffer.
> +	 */
> +	if (*head >= mm->len) {
> +		*old = *head - mm->len;
> +	} else {
> +		*head += mm->len;
> +		*old = *head - mm->len;
> +	}
> +
> +	return 0;
> +}
> +
>  static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused)
>  {
>  	struct timespec ts;
> @@ -313,6 +456,7 @@ static void arm_spe_recording_free(struct auxtrace_record *itr)
>  	struct arm_spe_recording *sper =
>  			container_of(itr, struct arm_spe_recording, itr);
>  
> +	free(sper->wrapped);
>  	free(sper);
>  }
>  
> @@ -336,6 +480,7 @@ struct auxtrace_record *arm_spe_recording_init(int *err,
>  	sper->itr.pmu = arm_spe_pmu;
>  	sper->itr.snapshot_start = arm_spe_snapshot_start;
>  	sper->itr.snapshot_finish = arm_spe_snapshot_finish;
> +	sper->itr.find_snapshot = arm_spe_find_snapshot;

If I understand correctly, this patch copies the snapshot handling
code from cs-etm.  About two months ago, we removed the cs-etm specific
snapshot callback function and now rely directly on perf's function
__auxtrace_mmap__read() to handle the 'head' and 'tail' pointers.
Please see the commit for details:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f01c200d4405c4562e45e8bb4de44a5ce37b217

Before I review the snapshot enabling in patches 03 and 04 in more
detail, could you confirm whether Arm SPE can handle snapshots the same
way as cs-etm?  From my understanding, that is a better way to handle
the AUX buffer's 'head' and 'tail'.
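
To make the 'head'/'old' handling concrete, here is a minimal
standalone sketch.  It is not the __auxtrace_mmap__read() code; the
names aux_range, take_whole_buffer() and aux_range_from_head_old() are
hypothetical and only for illustration.  It assumes 'head' and 'old'
are byte positions that only grow, so the position modulo the buffer
length gives the offset inside the ring buffer.  The first helper
mirrors the adjustment done at the end of arm_spe_find_snapshot()
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a snapshot is at most two contiguous chunks. */
struct aux_range {
	size_t off1, len1;	/* older data, may sit at the end of the buffer */
	size_t off2, len2;	/* newer data at the start; len2 == 0 if not split */
};

/*
 * Same arithmetic as the quoted patch once a wrap was detected: make
 * [old, head) span exactly one full buffer length.
 */
static void take_whole_buffer(uint64_t *head, uint64_t *old, size_t buf_len)
{
	if (*head < buf_len)
		*head += buf_len;
	*old = *head - buf_len;
}

/* Translate monotonic positions [old, head) into ring-buffer chunks. */
static struct aux_range aux_range_from_head_old(uint64_t head, uint64_t old,
						size_t buf_len)
{
	struct aux_range r = { 0, 0, 0, 0 };
	uint64_t size;
	size_t head_off;

	assert(buf_len > 0 && head >= old);

	size = head - old;
	if (size > buf_len)		/* older data was already overwritten */
		size = buf_len;
	head_off = (size_t)(head % buf_len);

	if (size > head_off) {
		/* Data crosses the end of the buffer: tail chunk, then start. */
		r.len1 = (size_t)size - head_off;
		r.off1 = buf_len - r.len1;
		r.len2 = head_off;
		r.off2 = 0;
	} else {
		/* Data is one contiguous chunk ending at head_off. */
		r.len1 = (size_t)size;
		r.off1 = head_off - r.len1;
	}
	return r;
}
```

With a 4096 byte buffer and a wrapped head of 904, the adjustment
gives head = 5000 and old = 904, and the range is the 3192 bytes from
offset 904 followed by the 904 bytes from offset 0, i.e. the whole
buffer in chronological order.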

Thanks,
Leo

>  	sper->itr.parse_snapshot_options = arm_spe_parse_snapshot_options;
>  	sper->itr.recording_options = arm_spe_recording_options;
>  	sper->itr.info_priv_size = arm_spe_info_priv_size;
> -- 
> 2.17.1
> 
