linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests
@ 2022-03-09 12:28 carsten.haitzler
  2022-03-09 12:28 ` [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files carsten.haitzler
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: carsten.haitzler @ 2022-03-09 12:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: coresight, suzuki.poulose, mathieu.poirier, mike.leach, leo.yan,
	linux-perf-users, acme

From: Carsten Haitzler <carsten.haitzler@arm.com>

Perf test's shell runner will just run everything in the tests
directory (as long as it's not another directory or does not begin
with a dot), but sometimes you find files in there that are not shell
scripts - perf.data output for example if you do some testing and then
the next time you run perf test it tries to run these. Check the files
are executable so they are actually intended to be test scripts and
not just some "random junk" files there.

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
---
 tools/perf/tests/builtin-test.c |  4 +++-
 tools/perf/util/path.c          | 14 +++++++++++++-
 tools/perf/util/path.h          |  1 +
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index fac3717d9ba1..3c34cb766724 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -296,7 +296,9 @@ static const char *shell_test__description(char *description, size_t size,
 
 #define for_each_shell_test(entlist, nr, base, ent)	                \
 	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
-		if (!is_directory(base, ent) && ent->d_name[0] != '.')
+		if (!is_directory(base, ent) && \
+			is_executable_file(base, ent) && \
+			ent->d_name[0] != '.')
 
 static const char *shell_tests__dir(char *path, size_t size)
 {
diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
index caed0336429f..ce80b79be103 100644
--- a/tools/perf/util/path.c
+++ b/tools/perf/util/path.c
@@ -86,9 +86,21 @@ bool is_directory(const char *base_path, const struct dirent *dent)
 	char path[PATH_MAX];
 	struct stat st;
 
-	sprintf(path, "%s/%s", base_path, dent->d_name);
+	snprintf(path, sizeof(path), "%s/%s", base_path, dent->d_name);
 	if (stat(path, &st))
 		return false;
 
 	return S_ISDIR(st.st_mode);
 }
+
+bool is_executable_file(const char *base_path, const struct dirent *dent)
+{
+	char path[PATH_MAX];
+	struct stat st;
+
+	snprintf(path, sizeof(path), "%s/%s", base_path, dent->d_name);
+	if (stat(path, &st))
+		return false;
+
+	return !S_ISDIR(st.st_mode) && (st.st_mode & S_IXUSR);
+}
diff --git a/tools/perf/util/path.h b/tools/perf/util/path.h
index 083429b7efa3..d94902c22222 100644
--- a/tools/perf/util/path.h
+++ b/tools/perf/util/path.h
@@ -12,5 +12,6 @@ int path__join3(char *bf, size_t size, const char *path1, const char *path2, con
 
 bool is_regular_file(const char *file);
 bool is_directory(const char *base_path, const struct dirent *dent);
+bool is_executable_file(const char *base_path, const struct dirent *dent);
 
 #endif /* _PERF_PATH_H */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files
  2022-03-09 12:28 [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests carsten.haitzler
@ 2022-03-09 12:28 ` carsten.haitzler
  2022-04-10  2:28   ` Leo Yan
  2022-03-09 12:28 ` [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated carsten.haitzler
  2022-04-10  1:24 ` [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests Leo Yan
  2 siblings, 1 reply; 19+ messages in thread
From: carsten.haitzler @ 2022-03-09 12:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: coresight, suzuki.poulose, mathieu.poirier, mike.leach, leo.yan,
	linux-perf-users, acme

From: Carsten Haitzler <carsten.haitzler@arm.com>

You edit your scripts in the tests and end up with your usual shell
backup files with ~ or .bak or something else at the end, but then your
next perf test run wants to run the backups too. You might also have perf
.data files in the directory or something else undesireable as well. You end
up chasing which test is the one you edited and the backup and have to keep
removing all the backup files, so automatically skip any files that are
not plain *.sh scripts to limit the time wasted in chasing ghosts.

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
---
 tools/perf/tests/builtin-test.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 3c34cb766724..3a02ba7a7a89 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -296,9 +296,22 @@ static const char *shell_test__description(char *description, size_t size,
 
 #define for_each_shell_test(entlist, nr, base, ent)	                \
 	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
-		if (!is_directory(base, ent) && \
+		if (ent->d_name[0] != '.' && \
+			!is_directory(base, ent) && \
 			is_executable_file(base, ent) && \
-			ent->d_name[0] != '.')
+			is_shell_script(ent->d_name))
+
+static bool is_shell_script(const char *file)
+{
+	const char *ext;
+
+	ext = strrchr(file, '.');
+	if (!ext)
+		return false;
+	if (!strcmp(ext, ".sh"))
+		return true;
+	return false;
+}
 
 static const char *shell_tests__dir(char *path, size_t size)
 {
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-03-09 12:28 [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests carsten.haitzler
  2022-03-09 12:28 ` [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files carsten.haitzler
@ 2022-03-09 12:28 ` carsten.haitzler
  2022-04-10  8:30   ` Leo Yan
  2022-05-30 16:27   ` Mathieu Poirier
  2022-04-10  1:24 ` [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests Leo Yan
  2 siblings, 2 replies; 19+ messages in thread
From: carsten.haitzler @ 2022-03-09 12:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: coresight, suzuki.poulose, mathieu.poirier, mike.leach, leo.yan,
	linux-perf-users, acme

From: Carsten Haitzler <carsten.haitzler@arm.com>

This adds a test harness and tests to run perf record and examine the
resuling output when coresight is enabled on arm64 and check the
resulting quality of the output as part of perf test. These tests use
various tools to produce output from perf record then measure some key
specific aspects of that data to see if the data exists at all and
contains key aspects such as measuring some data for every thread of
a test or produces sufficient data for large exeuction runs of a large
executable. etc.

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
---
 MAINTAINERS                                   |   4 +
 tools/perf/.gitignore                         |   6 +-
 tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
 tools/perf/Makefile.perf                      |  14 +-
 tools/perf/tests/shell/coresight/Makefile     |  30 ++++
 .../tests/shell/coresight/Makefile.miniconfig |  23 +++
 .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
 .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
 .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
 .../shell/coresight/memcpy_thread/.gitignore  |   1 +
 .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
 .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
 .../shell/coresight/thread_loop/.gitignore    |   1 +
 .../shell/coresight/thread_loop/Makefile      |  29 ++++
 .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
 .../coresight/unroll_loop_thread/.gitignore   |   1 +
 .../coresight/unroll_loop_thread/Makefile     |  29 ++++
 .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
 .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
 .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
 .../coresight_thread_loop_check_tid_10.sh     |  19 +++
 .../coresight_thread_loop_check_tid_2.sh      |  19 +++
 .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
 tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++
 24 files changed, 823 insertions(+), 4 deletions(-)
 create mode 100644 tools/perf/Documentation/arm-coresight.txt
 create mode 100644 tools/perf/tests/shell/coresight/Makefile
 create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
 create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
 create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
 create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
 create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
 create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
 create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
 create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
 create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
 create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
 create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
 create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
 create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
 create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
 create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
 create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
 create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
 create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
 create mode 100644 tools/perf/tests/shell/lib/coresight.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 673c7124ca82..18cc20609f2e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1918,10 +1918,14 @@ F:	drivers/hwtracing/coresight/*
 F:	include/dt-bindings/arm/coresight-cti-dt.h
 F:	include/linux/coresight*
 F:	samples/coresight/*
+F:	tools/perf/Documentation/arm-coresight.txt
 F:	tools/perf/arch/arm/util/auxtrace.c
 F:	tools/perf/arch/arm/util/cs-etm.c
 F:	tools/perf/arch/arm/util/cs-etm.h
 F:	tools/perf/arch/arm/util/pmu.c
+F:	tools/perf/tests/shell/coresight_*
+F:	tools/perf/tests/shell/tools/Makefile
+F:	tools/perf/tests/shell/tools/coresight/*
 F:	tools/perf/util/cs-etm-decoder/*
 F:	tools/perf/util/cs-etm.*
 
diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
index 20b8ab984d5f..138c679ecacd 100644
--- a/tools/perf/.gitignore
+++ b/tools/perf/.gitignore
@@ -15,8 +15,9 @@ perf*.1
 perf*.xml
 perf*.html
 common-cmds.h
-perf.data
-perf.data.old
+perf*.data
+perf*.data.old
+stats-*.csv
 output.svg
 perf-archive
 perf-with-kcore
@@ -30,6 +31,7 @@ config.mak.autogen
 *-flex.*
 *.pyc
 *.pyo
+*.stdout
 .config-detected
 util/intel-pt-decoder/inat-tables.c
 arch/*/include/generated/
diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
new file mode 100644
index 000000000000..3a9e6c573c58
--- /dev/null
+++ b/tools/perf/Documentation/arm-coresight.txt
@@ -0,0 +1,140 @@
+Arm Coresight Support
+=====================
+
+Coresight is a feature of some Arm based processors that allows for
+debugging. One of the things it can do is trace every instruction
+executed and remotely expose that information in a hardware compressed
+stream. Perf is able to locally access that stream and store it to the
+output perf data files. This stream can then be later decoded to give the
+instructions that were traced for debugging or profiling purposes. You
+can log such data with a perf record command like:
+
+    perf record -e cs_etm//u testbinary
+
+This would run some test binary (testbinary) until it exits and record
+a perf.data trace file. That file would have AUX sections if coresight
+is working correctly. You can dump the content of this file as
+readable text with a command like:
+
+    perf report --stdio --dump -i perf.data
+
+You should find some sections of this file have AUX data blocks like:
+
+    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
+
+    . ... CoreSight ETM Trace data: size 73168 bytes
+            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
+              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
+              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
+              Idx:26; ID:10;  I_TRACE_ON : Trace On.
+              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
+              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
+              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
+              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
+              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
+              ...
+
+If you see these above, then your system is tracing coresight data
+correctly.
+
+To compile perf with coresight support in the perf directory do
+
+    make CORESIGHT=1
+
+This will compile the perf tool with coresight support as well as
+build some small test binaries for perf test. This requires you also
+be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
+perf coresight tracing are in tests/shell/tools/coresight.
+
+You will also want coresight support enabled in your kernel config.
+Ensure it is enabled with:
+
+    CONFIG_CORESIGHT=y
+
+There are various other coresight options you probably also want
+enabled like:
+
+    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
+    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
+    CONFIG_CORESIGHT_CATU=y
+    CONFIG_CORESIGHT_SINK_TPIU=y
+    CONFIG_CORESIGHT_SINK_ETBV10=y
+    CONFIG_CORESIGHT_SOURCE_ETM4X=y
+    CONFIG_CORESIGHT_STM=y
+    CONFIG_CORESIGHT_CPU_DEBUG=y
+    CONFIG_CORESIGHT_CTI=y
+    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
+
+Please refer to the kernel configuration help for more information.
+
+Perf test - Verify kernel and userspace perf coresight work
+===========================================================
+
+When you run perf test, it will do a lot of self tests. Some of those
+tests will cover Coresight (only if enabled and on ARM64). You
+generally would run perf test from the tools/perf directory in the
+kernel tree. Some tests will check some internal perf support like:
+
+    Check Arm CoreSight trace data recording and synthesized samples
+
+Some others will actually use perf record and some test binaries that
+are in tests/shell/tools/coresight and will collect traces to ensure a
+minimum level of functionality is met. The scripts that launch these
+tests are in tests/shell. These will all look like:
+
+    Coresight / Memcpy 1M 25 Threads
+    Coresight / Unroll Loop Thread 2
+    ...
+
+These perf record tests will not run if the tool binaries do not exist
+in tests/shell/tools/coresight/*/ and will be skipped. If you do not
+have coresight support in hardware then either do not build perf with
+coresight support or remove these binaries in order to not have these
+tests fail and have them skip instead.
+
+These tests will log historical results in the current working
+directory (e.g. tools/perf) and will be named stats-*.csv like:
+
+    stats-asm_pure_loop-out.csv
+    stats-bubble_sort-random.csv
+    ...
+
+These statistic files log some aspects of the AUX data sections in
+the perf data output counting some numbers of certain encodings (a
+good way to know that it's working in a very simple way). One problem
+with coresight is that given a large enough amount of data needing to
+be logged, some of it can be lost due to the processor not waking up
+in time to read out all the data from buffers etc.. You will notice
+that the amount of data collected can vary a lot per run of perf test.
+If you wish to see how this changes over time, simply run perf test
+multiple times and all these csv files will have more and more data
+appended to it that you can later examine, graph and otherwise use to
+figure out if things have become worse or better.
+
+Be aware that amny of these tests take quite a while to run, specifically
+in processing the perf data file and dumping contents to then examine what
+is inside.
+
+You can change where these csv logs are stored by setting the
+PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
+test like:
+
+    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
+    perf test
+
+They will also store resulting perf output data in the current
+directory for later inspection like:
+
+    perf-memcpy-1m.data
+    perf-thread_loop-2th.data
+    ...
+
+You can alter where the perf data files are stored by setting the
+PERF_TEST_CORESIGHT_DATADIR environment variable such as:
+
+    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
+    perf test
+
+You may wish to set these above environment variables if you which to
+keep the output of tests outside of the current working directory for
+longer term storage and examination.
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index ac861e42c8f7..b97db83992e0 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
 $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
 	$(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
 
-all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
+TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
+
+tests-coresight-targets: FORCE
+	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
+
+tests-coresight-targets-clean:
+	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
+
+all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
 
 # Create python binding output directory if not already present
 _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
@@ -1020,6 +1028,7 @@ install-tests: all install-gtk
 		$(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
 		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
 		$(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
+	$(Q)$(MAKE) -C tests/shell/coresight install-tests
 
 install-bin: install-tools install-tests install-traceevent-plugins
 
@@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
 bpf-skel-clean:
 	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
 
-clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
+clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
 	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
 	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
@@ -1155,5 +1164,6 @@ FORCE:
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
 .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
 .PHONY: libtraceevent_plugins archheaders
+.PHONY: $(TESTS_CORESIGHT_TARGETS)
 
 endif # force_fixdep
diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
new file mode 100644
index 000000000000..dda99aeac158
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+include ../../../../../tools/scripts/Makefile.include
+include ../../../../../tools/scripts/Makefile.arch
+include ../../../../../tools/scripts/utilities.mak
+
+SUBDIRS = \
+	asm_pure_loop \
+	thread_loop \
+	memcpy_thread \
+	unroll_loop_thread
+
+all: $(SUBDIRS)
+$(SUBDIRS):
+	$(Q)$(MAKE) -C $@
+
+INSTALLDIRS = $(SUBDIRS:%=install-%)
+
+install-tests: $(INSTALLDIRS)
+$(INSTALLDIRS):
+	$(Q)$(MAKE) -C $(@:install-%=%) install-tests
+
+CLEANDIRS = $(SUBDIRS:%=clean-%)
+
+clean: $(CLEANDIRS)
+$(CLEANDIRS):
+	$(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
+
+.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
+
diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
new file mode 100644
index 000000000000..893c12685fed
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+ifndef DESTDIR
+prefix ?= $(HOME)
+endif
+
+DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
+perfexecdir = libexec/perf-core
+perfexec_instdir = $(perfexecdir)
+
+ifneq ($(filter /%,$(firstword $(perfexecdir))),)
+perfexec_instdir = $(perfexecdir)
+else
+perfexec_instdir = $(prefix)/$(perfexecdir)
+endif
+
+perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
+INSTALL = install
+
+include ../../../../../scripts/Makefile.include
+include ../../../../../scripts/Makefile.arch
+include ../../../../../scripts/utilities.mak
diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
new file mode 100644
index 000000000000..468673ac32e8
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
@@ -0,0 +1 @@
+asm_pure_loop
diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
new file mode 100644
index 000000000000..10c5a60cb71c
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+include ../Makefile.miniconfig
+
+BIN=asm_pure_loop
+LIB=
+
+all: $(BIN)
+
+$(BIN): $(BIN).S
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
+endif
+endif
+
+install-tests: all
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(call QUIET_INSTALL, tests) \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
+		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
+endif
+endif
+
+clean:
+	$(Q)$(RM) -f $(BIN)
+
+.PHONY: all clean install-tests
diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
new file mode 100644
index 000000000000..75cf084a927d
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
+
+.globl _start
+_start:
+	mov	x0, 0x0000ffff
+	mov	x1, xzr
+loop:
+	nop
+	nop
+	cbnz	x1, noskip
+	nop
+	nop
+	adrp	x2, skip
+	add 	x2, x2, :lo12:skip
+	br	x2
+	nop
+	nop
+noskip:
+	nop
+	nop
+skip:
+	sub	x0, x0, 1
+	cbnz	x0, loop
+
+	mov	x0, #0
+	mov	x8, #93 // __NR_exit syscall
+	svc	#0
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
new file mode 100644
index 000000000000..f8217e56091e
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
@@ -0,0 +1 @@
+memcpy_thread
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
new file mode 100644
index 000000000000..e2604cfae74b
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+include ../Makefile.miniconfig
+
+BIN=memcpy_thread
+LIB=-pthread
+
+all: $(BIN)
+
+$(BIN): $(BIN).c
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
+endif
+endif
+
+install-tests: all
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(call QUIET_INSTALL, tests) \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
+		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
+endif
+endif
+
+clean:
+	$(Q)$(RM) -f $(BIN)
+
+.PHONY: all clean install-tests
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
new file mode 100644
index 000000000000..a7e169d1bf64
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0
+// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+
+struct args {
+	unsigned long loops;
+	unsigned long size;
+	pthread_t th;
+	void *ret;
+};
+
+static void *thrfn(void *arg)
+{
+	struct args *a = arg;
+	unsigned long i, len = a->loops;
+	unsigned char *src, *dst;
+
+	src = malloc(a->size * 1024);
+	dst = malloc(a->size * 1024);
+	if ((!src) || (!dst)) {
+		printf("ERR: Can't allocate memory\n");
+		exit(1);
+	}
+	for (i = 0; i < len; i++)
+		memcpy(dst, src, a->size * 1024);
+}
+
+static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
+{
+	pthread_t t;
+	pthread_attr_t attr;
+
+	pthread_attr_init(&attr);
+	pthread_create(&t, &attr, fn, arg);
+	return t;
+}
+
+int main(int argc, char **argv)
+{
+	unsigned long i, len, size, thr;
+	pthread_t threads[256];
+	struct args args[256];
+	long long v;
+
+	if (argc < 4) {
+		printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
+		exit(1);
+	}
+
+	v = atoll(argv[1]);
+	if ((v < 1) || (v > (1024 * 1024))) {
+		printf("ERR: max memory 1GB (1048576 KB)\n");
+		exit(1);
+	}
+	size = v;
+	thr = atol(argv[2]);
+	if ((thr < 1) || (thr > 256)) {
+		printf("ERR: threads 1-256\n");
+		exit(1);
+	}
+	v = atoll(argv[3]);
+	if ((v < 1) || (v > 40000000000ll)) {
+		printf("ERR: loops 1-40000000000 (hundreds)\n");
+		exit(1);
+	}
+	len = v * 100;
+	for (i = 0; i < thr; i++) {
+		args[i].loops = len;
+		args[i].size = size;
+		args[i].th = new_thr(thrfn, &(args[i]));
+	}
+	for (i = 0; i < thr; i++)
+		pthread_join(args[i].th, &(args[i].ret));
+	return 0;
+}
diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
new file mode 100644
index 000000000000..6d4c33eaa9e8
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
@@ -0,0 +1 @@
+thread_loop
diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
new file mode 100644
index 000000000000..424df4e8b0e6
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+include ../Makefile.miniconfig
+
+BIN=thread_loop
+LIB=-pthread
+
+all: $(BIN)
+
+$(BIN): $(BIN).c
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
+endif
+endif
+
+install-tests: all
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(call QUIET_INSTALL, tests) \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
+		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
+endif
+endif
+
+clean:
+	$(Q)$(RM) -f $(BIN)
+
+.PHONY: all clean install-tests
diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
new file mode 100644
index 000000000000..c0158fac7d0b
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0
+// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+// define this for gettid()
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+#include <sys/syscall.h>
+#ifndef SYS_gettid
+// gettid is 178 on arm64
+# define SYS_gettid 178
+#endif
+#define gettid() syscall(SYS_gettid)
+
+struct args {
+	unsigned int loops;
+	pthread_t th;
+	void *ret;
+};
+
+static void *thrfn(void *arg)
+{
+	struct args *a = arg;
+	int i = 0, len = a->loops;
+
+	if (getenv("SHOW_TID")) {
+		unsigned long long tid = gettid();
+
+		printf("%llu\n", tid);
+	}
+	asm volatile(
+		"loop:\n"
+		"add %[i], %[i], #1\n"
+		"cmp %[i], %[len]\n"
+		"blt loop\n"
+		: /* out */
+		: /* in */ [i] "r" (i), [len] "r" (len)
+		: /* clobber */
+	);
+	return (void *)(long)i;
+}
+
+static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
+{
+	pthread_t t;
+	pthread_attr_t attr;
+
+	pthread_attr_init(&attr);
+	pthread_create(&t, &attr, fn, arg);
+	return t;
+}
+
+int main(int argc, char **argv)
+{
+	unsigned int i, len, thr;
+	pthread_t threads[256];
+	struct args args[256];
+
+	if (argc < 3) {
+		printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
+		exit(1);
+	}
+
+	thr = atoi(argv[1]);
+	if ((thr < 1) || (thr > 256)) {
+		printf("ERR: threads 1-256\n");
+		exit(1);
+	}
+	len = atoi(argv[2]);
+	if ((len < 1) || (len > 4000)) {
+		printf("ERR: max loops 4000 (millions)\n");
+		exit(1);
+	}
+	len *= 1000000;
+	for (i = 0; i < thr; i++) {
+		args[i].loops = len;
+		args[i].th = new_thr(thrfn, &(args[i]));
+	}
+	for (i = 0; i < thr; i++)
+		pthread_join(args[i].th, &(args[i].ret));
+	return 0;
+}
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
new file mode 100644
index 000000000000..2cb4e996dbf3
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
@@ -0,0 +1 @@
+unroll_loop_thread
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
new file mode 100644
index 000000000000..45ab2be8be92
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+include ../Makefile.miniconfig
+
+BIN=unroll_loop_thread
+LIB=-pthread
+
+all: $(BIN)
+
+$(BIN): $(BIN).c
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
+endif
+endif
+
+install-tests: all
+ifdef CORESIGHT
+ifeq ($(ARCH),arm64)
+	$(call QUIET_INSTALL, tests) \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
+		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
+endif
+endif
+
+clean:
+	$(Q)$(RM) -f $(BIN)
+
+.PHONY: all clean install-tests
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
new file mode 100644
index 000000000000..cb9d22c7dfb9
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+
+struct args {
+	pthread_t th;
+	unsigned int in, out;
+	void *ret;
+};
+
+static void *thrfn(void *arg)
+{
+	struct args *a = arg;
+	unsigned int i, in = a->in;
+
+	for (i = 0; i < 10000; i++) {
+		asm volatile (
+// force an unroll of thia add instruction so we can test long runs of code
+#define SNIP1 "add %[in], %[in], #1\n"
+// 10
+#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
+// 100
+#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
+// 1000
+#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
+// 10000
+#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
+// 100000
+			SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
+			: /* out */
+			: /* in */ [in] "r" (in)
+			: /* clobber */
+		);
+	}
+}
+
+static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
+{
+	pthread_t t;
+	pthread_attr_t attr;
+
+	pthread_attr_init(&attr);
+	pthread_create(&t, &attr, fn, arg);
+	return t;
+}
+
+int main(int argc, char **argv)
+{
+	unsigned int i, thr;
+	pthread_t threads[256];
+	struct args args[256];
+
+	if (argc < 2) {
+		printf("ERR: %s [numthreads]\n", argv[0]);
+		exit(1);
+	}
+
+	thr = atoi(argv[1]);
+	if ((thr > 256) || (thr < 1)) {
+		printf("ERR: threads 1-256\n");
+		exit(1);
+	}
+	for (i = 0; i < thr; i++) {
+		args[i].in = rand();
+		args[i].th = new_thr(thrfn, &(args[i]));
+	}
+	for (i = 0; i < thr; i++)
+		pthread_join(args[i].th, &(args[i].ret));
+	return 0;
+}
diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
new file mode 100755
index 000000000000..3f0dbefcad50
--- /dev/null
+++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
@@ -0,0 +1,18 @@
+#!/bin/sh -e
+# Coresight / ASM Pure Loop
+
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+TEST="asm_pure_loop"
+. $(dirname $0)/lib/coresight.sh
+ARGS=""
+DATV="out"
+DATA="$DATD/perf-$TEST-$DATV.data"
+
+perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
+
+perf_dump_aux_verify "$DATA" 10 10 10
+
+err=$?
+exit $err
diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
new file mode 100755
index 000000000000..8972af835016
--- /dev/null
+++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
@@ -0,0 +1,18 @@
+#!/bin/sh -e
+# Coresight / Memcpy 16k 10 Threads
+
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+TEST="memcpy_thread"
+. $(dirname $0)/lib/coresight.sh
+ARGS="16 10 1"
+DATV="16k_10"
+DATA="$DATD/perf-$TEST-$DATV.data"
+
+perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
+
+perf_dump_aux_verify "$DATA" 10 10 10
+
+err=$?
+exit $err
diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
new file mode 100755
index 000000000000..5b468901f89b
--- /dev/null
+++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
@@ -0,0 +1,19 @@
+#!/bin/sh -e
+# Coresight / Thread Loop 10 Threads - Check TID
+
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+TEST="thread_loop"
+. $(dirname $0)/lib/coresight.sh
+ARGS="10 1"
+DATV="check-tid-10th"
+DATA="$DATD/perf-$TEST-$DATV.data"
+STDO="$DATD/perf-$TEST-$DATV.stdout"
+
+SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
+
+perf_dump_aux_tid_verify "$DATA" "$STDO"
+
+err=$?
+exit $err
diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
new file mode 100755
index 000000000000..f8b7abd3aa03
--- /dev/null
+++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
@@ -0,0 +1,19 @@
+#!/bin/sh -e
+# Coresight / Thread Loop 2 Threads - Check TID
+
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+TEST="thread_loop"
+. $(dirname $0)/lib/coresight.sh
+ARGS="2 20"
+DATV="check-tid-2th"
+DATA="$DATD/perf-$TEST-$DATV.data"
+STDO="$DATD/perf-$TEST-$DATV.stdout"
+
+SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
+
+perf_dump_aux_tid_verify "$DATA" "$STDO"
+
+err=$?
+exit $err
diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
new file mode 100755
index 000000000000..c985dfb025c2
--- /dev/null
+++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
@@ -0,0 +1,18 @@
+#!/bin/sh -e
+# Coresight / Unroll Loop Thread 10
+
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+TEST="unroll_loop_thread"
+. $(dirname $0)/lib/coresight.sh
+ARGS="10"
+DATV="10"
+DATA="$DATD/perf-$TEST-$DATV.data"
+
+perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
+
+perf_dump_aux_verify "$DATA" 10 10 10
+
+err=$?
+exit $err
diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
new file mode 100644
index 000000000000..6a611b073f02
--- /dev/null
+++ b/tools/perf/tests/shell/lib/coresight.sh
@@ -0,0 +1,130 @@
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
+
+# This is sourced from a driver script so no need for #!/bin... etc. at the
+# top - the assumption below is that it runs as part of sourcing after the
+# test sets up some basic env vars to say what it is.
+
+# perf record options for the perf tests to use
+PERFRECMEM="-m ,128M"
+PERFRECOPT="$PERFRECMEM -e cs_etm//u"
+
+# These tests need to be run as root or coresight won't allow large buffers
+# and will not collect proper data
+UID=`id -u`
+if test "$UID" -ne 0; then
+	echo "Not running as root... skip"
+	exit 2
+fi
+
+TOOLS=$(dirname $0)
+DIR="$TOOLS/coresight/$TEST"
+BIN="$DIR/$TEST"
+# If the test tool/binary does not exist and is executable then skip the test
+if ! test -x "$BIN"; then exit 2; fi
+DATD="."
+# If the data dir env is set then make the data dir use that instead of ./
+if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
+	DATD="$PERF_TEST_CORESIGHT_DATADIR";
+fi
+# If the stat dir env is set then make the data dir use that instead of ./
+STATD="."
+if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
+	STATD="$PERF_TEST_CORESIGHT_STATDIR";
+fi
+
+# Called if the test fails - error code 2
+err() {
+	echo "$1"
+	exit 1
+}
+
+# Check that some statistics from our perf
+check_val_min() {
+	STATF="$4"
+	if test "$2" -lt "$3"; then
+		echo ", FAILED" >> "$STATF"
+		err "Sanity check number of $1 is too low ($2 < $3)"
+	fi
+}
+
+perf_dump_aux_verify() {
+	# Some basic checking that the AUX chunk contains some sensible data
+	# to see that we are recording something and at least a minimum
+	# amount of it. We should almost always see F3 atoms in just about
+	# anything but certainly we will see some trace info and async atom
+	# chunks.
+	DUMP="$DATD/perf-tmp-aux-dump.txt"
+	perf report --stdio --dump -i "$1" | \
+		grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
+	# Simply count how many of these atoms we find to see that we are
+	# producing a reasonable amount of data - exact checks are not sane
+	# as this is a lossy  process where we may lose some blocks and the
+	# compiler may produce different code depending on the compiler and
+	# optimization options, so this is rough  just to see if we're
+	# either missing almost all the data or all of it
+	ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
+	ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
+	ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
+	rm -f "$DUMP"
+
+	# Arguments provide minimums for a pass
+	CHECK_F3_MIN="$2"
+	CHECK_ASYNC_MIN="$3"
+	CHECK_TRACE_INFO_MIN="$4"
+
+	# Write out statistics, so over time you can track results to see if
+	# there is a pattern - for example we have less "noisy" results that
+	# produce more consistent amounts of data each run, to see if over
+	# time any techinques to  minimize data loss are having an effect or
+	# not
+	STATF="$STATD/stats-$TEST-$DATV.csv"
+	if ! test -f "$STATF"; then
+		echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
+	fi
+	echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
+
+	# Actually check to see if we passed or failed.
+	check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
+	check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
+	check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
+	echo ", Ok" >> "$STATF"
+}
+
+perf_dump_aux_tid_verify() {
+	# Specifically crafted test will produce a list of Tread ID's to
+	# stdout that need to be checked to  see that they have had trace
+	# info collected in AUX blocks in the perf data. This will go
+	# through all the TID's that are listed as CID=0xabcdef and see
+	# that all the Thread IDs the test tool reports are  in the perf
+	# data AUX chunks
+
+	# The TID test tools will print a TID per stdout line that are being
+	# tested
+	TIDS=`cat "$2"`
+	# Scan the perf report to find the TIDs that are actually CID in hex
+	# and build a list of the ones found
+	FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
+			grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
+			uniq | sort | uniq`
+
+	# Iterate over the list of TIDs that the test says it has and find
+	# them in the TIDs found in the perf report
+	MISSING=""
+	for TID2 in $TIDS; do
+		FOUND=""
+		for TIDHEX in $FOUND_TIDS; do
+			TID=`printf "%i" $TIDHEX`
+			if test "$TID" -eq "$TID2"; then
+				FOUND="y"
+				break
+			fi
+		done
+		if test -z "$FOUND"; then
+			MISSING="$MISSING $TID"
+		fi
+	done
+	if test -n "$MISSING"; then
+		err "Thread IDs $MISSING not found in perf AUX data"
+	fi
+}
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests
  2022-03-09 12:28 [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests carsten.haitzler
  2022-03-09 12:28 ` [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files carsten.haitzler
  2022-03-09 12:28 ` [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated carsten.haitzler
@ 2022-04-10  1:24 ` Leo Yan
  2022-04-11 19:08   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 19+ messages in thread
From: Leo Yan @ 2022-04-10  1:24 UTC (permalink / raw)
  To: carsten.haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

On Wed, Mar 09, 2022 at 12:28:57PM +0000, carsten.haitzler@foss.arm.com wrote:
> From: Carsten Haitzler <carsten.haitzler@arm.com>
> 
> Perf test's shell runner will just run everything in the tests
> directory (as long as it's not another directory or does not begin
> with a dot), but sometimes you find files in there that are not shell
> scripts - perf.data output for example if you do some testing and then
> the next time you run perf test it tries to run these. Check the files
> are executable so they are actually intended to be test scripts and
> not just some "random junk" files there.
> 
> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>

Reviewed-by: Leo Yan <leo.yan@linaro.org>

> ---
>  tools/perf/tests/builtin-test.c |  4 +++-
>  tools/perf/util/path.c          | 14 +++++++++++++-
>  tools/perf/util/path.h          |  1 +
>  3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index fac3717d9ba1..3c34cb766724 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
> @@ -296,7 +296,9 @@ static const char *shell_test__description(char *description, size_t size,
>  
>  #define for_each_shell_test(entlist, nr, base, ent)	                \
>  	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
> -		if (!is_directory(base, ent) && ent->d_name[0] != '.')
> +		if (!is_directory(base, ent) && \
> +			is_executable_file(base, ent) && \
> +			ent->d_name[0] != '.')
>  
>  static const char *shell_tests__dir(char *path, size_t size)
>  {
> diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
> index caed0336429f..ce80b79be103 100644
> --- a/tools/perf/util/path.c
> +++ b/tools/perf/util/path.c
> @@ -86,9 +86,21 @@ bool is_directory(const char *base_path, const struct dirent *dent)
>  	char path[PATH_MAX];
>  	struct stat st;
>  
> -	sprintf(path, "%s/%s", base_path, dent->d_name);
> +	snprintf(path, sizeof(path), "%s/%s", base_path, dent->d_name);
>  	if (stat(path, &st))
>  		return false;
>  
>  	return S_ISDIR(st.st_mode);
>  }
> +
> +bool is_executable_file(const char *base_path, const struct dirent *dent)
> +{
> +	char path[PATH_MAX];
> +	struct stat st;
> +
> +	snprintf(path, sizeof(path), "%s/%s", base_path, dent->d_name);
> +	if (stat(path, &st))
> +		return false;
> +
> +	return !S_ISDIR(st.st_mode) && (st.st_mode & S_IXUSR);
> +}
> diff --git a/tools/perf/util/path.h b/tools/perf/util/path.h
> index 083429b7efa3..d94902c22222 100644
> --- a/tools/perf/util/path.h
> +++ b/tools/perf/util/path.h
> @@ -12,5 +12,6 @@ int path__join3(char *bf, size_t size, const char *path1, const char *path2, con
>  
>  bool is_regular_file(const char *file);
>  bool is_directory(const char *base_path, const struct dirent *dent);
> +bool is_executable_file(const char *base_path, const struct dirent *dent);
>  
>  #endif /* _PERF_PATH_H */
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files
  2022-03-09 12:28 ` [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files carsten.haitzler
@ 2022-04-10  2:28   ` Leo Yan
  2022-04-21 16:21     ` Carsten Haitzler
  0 siblings, 1 reply; 19+ messages in thread
From: Leo Yan @ 2022-04-10  2:28 UTC (permalink / raw)
  To: carsten.haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

On Wed, Mar 09, 2022 at 12:28:58PM +0000, carsten.haitzler@foss.arm.com wrote:
> From: Carsten Haitzler <carsten.haitzler@arm.com>
> 
> You edit your scripts in the tests and end up with your usual shell
> backup files with ~ or .bak or something else at the end, but then your
> next perf test run wants to run the backups too. You might also have perf
> .data files in the directory or something else undesireable as well. You end
> up chasing which test is the one you edited and the backup and have to keep
> removing all the backup files, so automatically skip any files that are
> not plain *.sh scripts to limit the time wasted in chasing ghosts.
> 
> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
>
> ---
>  tools/perf/tests/builtin-test.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index 3c34cb766724..3a02ba7a7a89 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
> @@ -296,9 +296,22 @@ static const char *shell_test__description(char *description, size_t size,
>  
>  #define for_each_shell_test(entlist, nr, base, ent)	                \
>  	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
> -		if (!is_directory(base, ent) && \
> +		if (ent->d_name[0] != '.' && \
> +			!is_directory(base, ent) && \
>  			is_executable_file(base, ent) && \
> -			ent->d_name[0] != '.')
> +			is_shell_script(ent->d_name))

Just nitpick: since multiple conditions are added, seems to me it's good
to use a single function is_executable_shell_script() to make decision
if a file is an executable shell script.

And the condition checking 'ent->d_name[0] != '.'' would be redundant
after we have checked the file suffix '.sh'.

Thanks,
Leo

> +
> +static bool is_shell_script(const char *file)
> +{
> +	const char *ext;
> +
> +	ext = strrchr(file, '.');
> +	if (!ext)
> +		return false;
> +	if (!strcmp(ext, ".sh"))
> +		return true;
> +	return false;
> +}
>  
>  static const char *shell_tests__dir(char *path, size_t size)
>  {
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-03-09 12:28 ` [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated carsten.haitzler
@ 2022-04-10  8:30   ` Leo Yan
  2022-04-21 17:38     ` Carsten Haitzler
  2022-05-30 16:27   ` Mathieu Poirier
  1 sibling, 1 reply; 19+ messages in thread
From: Leo Yan @ 2022-04-10  8:30 UTC (permalink / raw)
  To: carsten.haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

Hi Carsten,

On Wed, Mar 09, 2022 at 12:28:59PM +0000, carsten.haitzler@foss.arm.com wrote:
> From: Carsten Haitzler <carsten.haitzler@arm.com>
> 
> This adds a test harness and tests to run perf record and examine the
> resuling output when coresight is enabled on arm64 and check the
> resulting quality of the output as part of perf test. These tests use
> various tools to produce output from perf record then measure some key
> specific aspects of that data to see if the data exists at all and
> contains key aspects such as measuring some data for every thread of
> a test or produces sufficient data for large exeuction runs of a large
> executable. etc.
> 
> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
> ---
>  MAINTAINERS                                   |   4 +
>  tools/perf/.gitignore                         |   6 +-
>  tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
>  tools/perf/Makefile.perf                      |  14 +-
>  tools/perf/tests/shell/coresight/Makefile     |  30 ++++
>  .../tests/shell/coresight/Makefile.miniconfig |  23 +++
>  .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
>  .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
>  .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
>  .../shell/coresight/memcpy_thread/.gitignore  |   1 +
>  .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
>  .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
>  .../shell/coresight/thread_loop/.gitignore    |   1 +
>  .../shell/coresight/thread_loop/Makefile      |  29 ++++
>  .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
>  .../coresight/unroll_loop_thread/.gitignore   |   1 +
>  .../coresight/unroll_loop_thread/Makefile     |  29 ++++
>  .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
>  .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
>  .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
>  .../coresight_thread_loop_check_tid_10.sh     |  19 +++
>  .../coresight_thread_loop_check_tid_2.sh      |  19 +++
>  .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
>  tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++

Very big change...  Why squash all patches form previous verion to this
single one big patch?  Usually the format with small patches is much
better for reviewing.

And I cannot apply cleanly on perf core branch:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git branch: perf/core.

>  24 files changed, 823 insertions(+), 4 deletions(-)
>  create mode 100644 tools/perf/Documentation/arm-coresight.txt
>  create mode 100644 tools/perf/tests/shell/coresight/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
>  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>  create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
>  create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>  create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>  create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>  create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>  create mode 100644 tools/perf/tests/shell/lib/coresight.sh
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 673c7124ca82..18cc20609f2e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1918,10 +1918,14 @@ F:	drivers/hwtracing/coresight/*
>  F:	include/dt-bindings/arm/coresight-cti-dt.h
>  F:	include/linux/coresight*
>  F:	samples/coresight/*
> +F:	tools/perf/Documentation/arm-coresight.txt
>  F:	tools/perf/arch/arm/util/auxtrace.c
>  F:	tools/perf/arch/arm/util/cs-etm.c
>  F:	tools/perf/arch/arm/util/cs-etm.h
>  F:	tools/perf/arch/arm/util/pmu.c
> +F:	tools/perf/tests/shell/coresight_*
> +F:	tools/perf/tests/shell/tools/Makefile
> +F:	tools/perf/tests/shell/tools/coresight/*
>  F:	tools/perf/util/cs-etm-decoder/*
>  F:	tools/perf/util/cs-etm.*
>  
> diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
> index 20b8ab984d5f..138c679ecacd 100644
> --- a/tools/perf/.gitignore
> +++ b/tools/perf/.gitignore
> @@ -15,8 +15,9 @@ perf*.1
>  perf*.xml
>  perf*.html
>  common-cmds.h
> -perf.data
> -perf.data.old
> +perf*.data
> +perf*.data.old
> +stats-*.csv
>  output.svg
>  perf-archive
>  perf-with-kcore
> @@ -30,6 +31,7 @@ config.mak.autogen
>  *-flex.*
>  *.pyc
>  *.pyo
> +*.stdout
>  .config-detected
>  util/intel-pt-decoder/inat-tables.c
>  arch/*/include/generated/
> diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
> new file mode 100644
> index 000000000000..3a9e6c573c58
> --- /dev/null
> +++ b/tools/perf/Documentation/arm-coresight.txt
> @@ -0,0 +1,140 @@
> +Arm Coresight Support
> +=====================
> +
> +Coresight is a feature of some Arm based processors that allows for
> +debugging. One of the things it can do is trace every instruction
> +executed and remotely expose that information in a hardware compressed
> +stream.

Maybe here need to sync a bit for the terminology in
Documentation/trace/coresight/coresight.rst.

Something like:

"Coresight is a feature of some Arm based processors that allows for
debugging. One of the things it can do is trace instruction path and
expose that information in a hardware compressed stream for either
debugger or HW assisted tracing locally.

See Documentation/trace/coresight/coresight.rst for details."

> Perf is able to locally access that stream and store it to the
> +output perf data files. This stream can then be later decoded to give the
> +instructions that were traced for debugging or profiling purposes. You
> +can log such data with a perf record command like:
> +
> +    perf record -e cs_etm//u testbinary
> +
> +This would run some test binary (testbinary) until it exits and record
> +a perf.data trace file. That file would have AUX sections if coresight
> +is working correctly. You can dump the content of this file as
> +readable text with a command like:
> +
> +    perf report --stdio --dump -i perf.data
> +
> +You should find some sections of this file have AUX data blocks like:
> +
> +    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
> +
> +    . ... CoreSight ETM Trace data: size 73168 bytes
> +            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
> +              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
> +              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
> +              Idx:26; ID:10;  I_TRACE_ON : Trace On.
> +              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
> +              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> +              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> +              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> +              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
> +              ...
> +
> +If you see these above, then your system is tracing coresight data
> +correctly.
> +
> +To compile perf with coresight support in the perf directory do
> +
> +    make CORESIGHT=1

It is inaccurate that if we don't mention openCSD lib.

> +
> +This will compile the perf tool with coresight support as well as
> +build some small test binaries for perf test. This requires you also
> +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
> +perf coresight tracing are in tests/shell/tools/coresight.

For build perf tool, I think above paragraphs are duplicate with the
document Documentation/trace/coresight/coresight.rst.  Can we simply
say:

"The details for building perf tool with support Arm Coresight can be
found in the "HOWTO.md" file of the openCSD gitHub repository:
https://github.com/Linaro/opencsd.

And "HOWTO.md" file gives the information and examples for how to use
perf tool to record and report Coresight trace data.  It's the
prerequisite for this perf Coresight test."

> +You will also want coresight support enabled in your kernel config.
> +Ensure it is enabled with:
> +
> +    CONFIG_CORESIGHT=y
> +
> +There are various other coresight options you probably also want
> +enabled like:
> +
> +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
> +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
> +    CONFIG_CORESIGHT_CATU=y
> +    CONFIG_CORESIGHT_SINK_TPIU=y
> +    CONFIG_CORESIGHT_SINK_ETBV10=y
> +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
> +    CONFIG_CORESIGHT_STM=y
> +    CONFIG_CORESIGHT_CPU_DEBUG=y
> +    CONFIG_CORESIGHT_CTI=y
> +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
> +
> +Please refer to the kernel configuration help for more information.

I prefer to remove these kernel configuration since they are not
inconsistent on different platforms (e.g. ETBV10, ETM4X, etc), and
some configurations might not necessary (e.g. CPU_DEBUG).

> +Perf test - Verify kernel and userspace perf coresight work
> +===========================================================
> +
> +When you run perf test, it will do a lot of self tests. Some of those
> +tests will cover Coresight (only if enabled and on ARM64). You
> +generally would run perf test from the tools/perf directory in the
> +kernel tree. Some tests will check some internal perf support like:
> +
> +    Check Arm CoreSight trace data recording and synthesized samples
> +
> +Some others will actually use perf record and some test binaries that
> +are in tests/shell/tools/coresight and will collect traces to ensure a
> +minimum level of functionality is met. The scripts that launch these
> +tests are in tests/shell. These will all look like:
> +
> +    Coresight / Memcpy 1M 25 Threads
> +    Coresight / Unroll Loop Thread 2
> +    ...

Please update based on the latest test case names, at my side, I can
see the testing case like:

       Coresight / ASM Pure Loop
       Coresight / Memcpy 16k 10 Threads
       Coresight / Thread Loop 10 Threads - Check TID
       Coresight / Thread Loop 2 Threads - Check TID
       Coresight / Unroll Loop Thread 10

> +
> +These perf record tests will not run if the tool binaries do not exist
> +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
> +have coresight support in hardware then either do not build perf with
> +coresight support or remove these binaries in order to not have these
> +tests fail and have them skip instead.
> +
> +These tests will log historical results in the current working
> +directory (e.g. tools/perf) and will be named stats-*.csv like:
> +
> +    stats-asm_pure_loop-out.csv
> +    stats-bubble_sort-random.csv
> +    ...
> +
> +These statistic files log some aspects of the AUX data sections in
> +the perf data output counting some numbers of certain encodings (a
> +good way to know that it's working in a very simple way). One problem
> +with coresight is that given a large enough amount of data needing to
> +be logged, some of it can be lost due to the processor not waking up
> +in time to read out all the data from buffers etc.. You will notice
> +that the amount of data collected can vary a lot per run of perf test.
> +If you wish to see how this changes over time, simply run perf test
> +multiple times and all these csv files will have more and more data
> +appended to it that you can later examine, graph and otherwise use to
> +figure out if things have become worse or better.

I am confused by this narrative.  Does it try to remind that the final
testing result (pass or fail) is not stable?  Or should we run for
multiple times so have more chance to capture issues?

> +Be aware that amny of these tests take quite a while to run, specifically

s/amny/many

> +in processing the perf data file and dumping contents to then examine what
> +is inside.
> +
> +You can change where these csv logs are stored by setting the
> +PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
> +test like:
> +
> +    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
> +    perf test
> +
> +They will also store resulting perf output data in the current
> +directory for later inspection like:
> +
> +    perf-memcpy-1m.data
> +    perf-thread_loop-2th.data
> +    ...
> +
> +You can alter where the perf data files are stored by setting the
> +PERF_TEST_CORESIGHT_DATADIR environment variable such as:
> +
> +    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
> +    perf test
> +
> +You may wish to set these above environment variables if you which to
> +keep the output of tests outside of the current working directory for
> +longer term storage and examination.
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index ac861e42c8f7..b97db83992e0 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
>  $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
>  	$(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
>  
> -all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
> +TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
> +
> +tests-coresight-targets: FORCE
> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
> +
> +tests-coresight-targets-clean:
> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
> +
> +all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
>  
>  # Create python binding output directory if not already present
>  _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
> @@ -1020,6 +1028,7 @@ install-tests: all install-gtk
>  		$(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
>  		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
>  		$(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
> +	$(Q)$(MAKE) -C tests/shell/coresight install-tests
>  
>  install-bin: install-tools install-tests install-traceevent-plugins
>  
> @@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
>  bpf-skel-clean:
>  	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>  
> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
> +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
>  	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>  	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
>  	$(Q)$(RM) $(OUTPUT).config-detected
> @@ -1155,5 +1164,6 @@ FORCE:
>  .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
>  .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
>  .PHONY: libtraceevent_plugins archheaders
> +.PHONY: $(TESTS_CORESIGHT_TARGETS)

I don't find other places using TESTS_CORESIGHT_TARGETS.  Is this
redundant?

>  endif # force_fixdep
> diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
> new file mode 100644
> index 000000000000..dda99aeac158
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/Makefile
> @@ -0,0 +1,30 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../../../../../tools/scripts/Makefile.include
> +include ../../../../../tools/scripts/Makefile.arch
> +include ../../../../../tools/scripts/utilities.mak
> +
> +SUBDIRS = \
> +	asm_pure_loop \
> +	thread_loop \
> +	memcpy_thread \
> +	unroll_loop_thread
> +
> +all: $(SUBDIRS)
> +$(SUBDIRS):
> +	$(Q)$(MAKE) -C $@
> +
> +INSTALLDIRS = $(SUBDIRS:%=install-%)
> +
> +install-tests: $(INSTALLDIRS)
> +$(INSTALLDIRS):
> +	$(Q)$(MAKE) -C $(@:install-%=%) install-tests
> +
> +CLEANDIRS = $(SUBDIRS:%=clean-%)
> +
> +clean: $(CLEANDIRS)
> +$(CLEANDIRS):
> +	$(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
> +
> +.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
> +
> diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
> new file mode 100644
> index 000000000000..893c12685fed
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +ifndef DESTDIR
> +prefix ?= $(HOME)
> +endif
> +
> +DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
> +perfexecdir = libexec/perf-core
> +perfexec_instdir = $(perfexecdir)
> +
> +ifneq ($(filter /%,$(firstword $(perfexecdir))),)
> +perfexec_instdir = $(perfexecdir)
> +else
> +perfexec_instdir = $(prefix)/$(perfexecdir)
> +endif
> +
> +perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
> +INSTALL = install
> +
> +include ../../../../../scripts/Makefile.include
> +include ../../../../../scripts/Makefile.arch
> +include ../../../../../scripts/utilities.mak
> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> new file mode 100644
> index 000000000000..468673ac32e8
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> @@ -0,0 +1 @@
> +asm_pure_loop

Do we really need there '.gitignore' files under the folder
'tools/perf/tests/shell/coresight/'.

> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> new file mode 100644
> index 000000000000..10c5a60cb71c
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> @@ -0,0 +1,30 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +include ../Makefile.miniconfig
> +
> +BIN=asm_pure_loop
> +LIB=

Remove the unused variable 'LIB='.

> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).S
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests

There have four sub folders under tools/perf/tests/shell/coresight:

  asm_pure_loop
  memcpy_thread
  thread_loop
  unroll_loop_thread

And every folder has its own Makefile and every Makefile is quite
close to each other.  I am just wandering if it's possible to
remove the 4 Makefiles in these four sub folders, and simply use
tools/perf/tests/shell/coresight/Makefile as the central place to
build these assistant programs.

> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> new file mode 100644
> index 000000000000..75cf084a927d
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
> +
> +.globl _start
> +_start:
> +	mov	x0, 0x0000ffff
> +	mov	x1, xzr
> +loop:
> +	nop
> +	nop
> +	cbnz	x1, noskip
> +	nop
> +	nop
> +	adrp	x2, skip
> +	add 	x2, x2, :lo12:skip
> +	br	x2
> +	nop
> +	nop
> +noskip:
> +	nop
> +	nop
> +skip:
> +	sub	x0, x0, 1
> +	cbnz	x0, loop
> +
> +	mov	x0, #0
> +	mov	x8, #93 // __NR_exit syscall
> +	svc	#0

I tested the case "ASM Pure Loop" on my Juno board, and it complaints:

root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
 76: Coresight / ASM Pure Loop                                       :
--- start ---
test child forked, pid 9063
failed to mmap with 12 (Cannot allocate memory)
test child finished with -1
---- end ----
Coresight / ASM Pure Loop: FAILED!

Since I only setup the 1GB memory for the Linux kernel, it fails to
allocate AUX ring buffer with the size 256MB.  So I manully change
the buffer size to 8MB in tools/perf/tests/shell/lib/coresight.sh:

  PERFRECMEM="-m ,8M"

So finally I can see the test case is passed:

root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
 76: Coresight / ASM Pure Loop                                       :
--- start ---
test child forked, pid 9481
-m ,8M -e cs_etm//u
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.681 MB ./perf-asm_pure_loop-out.data ]
test child finished with 0
---- end ----
Coresight / ASM Pure Loop: Ok

Do you think we really need to use 256MiB as the AUX buffer size?
IIRC, it means we allocate 256MiB per CPU for this case, on the other
hand, you could see the final perf data file size is small (0.681
MiB).

Seems to me, it's not necessary to allocate so big buffer for
the test, and I tried to run below 4 cases with 8MiB, all of them can
pass the testing :)

> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> new file mode 100644
> index 000000000000..f8217e56091e
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> @@ -0,0 +1 @@
> +memcpy_thread
> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> new file mode 100644
> index 000000000000..e2604cfae74b
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../Makefile.miniconfig
> +
> +BIN=memcpy_thread
> +LIB=-pthread
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).c
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> new file mode 100644
> index 000000000000..a7e169d1bf64
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <pthread.h>
> +
> +struct args {
> +	unsigned long loops;
> +	unsigned long size;
> +	pthread_t th;
> +	void *ret;
> +};
> +
> +static void *thrfn(void *arg)
> +{
> +	struct args *a = arg;
> +	unsigned long i, len = a->loops;
> +	unsigned char *src, *dst;
> +
> +	src = malloc(a->size * 1024);
> +	dst = malloc(a->size * 1024);
> +	if ((!src) || (!dst)) {
> +		printf("ERR: Can't allocate memory\n");
> +		exit(1);
> +	}
> +	for (i = 0; i < len; i++)
> +		memcpy(dst, src, a->size * 1024);
> +}
> +
> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> +{
> +	pthread_t t;
> +	pthread_attr_t attr;
> +
> +	pthread_attr_init(&attr);
> +	pthread_create(&t, &attr, fn, arg);
> +	return t;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	unsigned long i, len, size, thr;
> +	pthread_t threads[256];
> +	struct args args[256];
> +	long long v;
> +
> +	if (argc < 4) {
> +		printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
> +		exit(1);
> +	}
> +
> +	v = atoll(argv[1]);
> +	if ((v < 1) || (v > (1024 * 1024))) {
> +		printf("ERR: max memory 1GB (1048576 KB)\n");
> +		exit(1);
> +	}
> +	size = v;
> +	thr = atol(argv[2]);
> +	if ((thr < 1) || (thr > 256)) {
> +		printf("ERR: threads 1-256\n");
> +		exit(1);
> +	}
> +	v = atoll(argv[3]);
> +	if ((v < 1) || (v > 40000000000ll)) {
> +		printf("ERR: loops 1-40000000000 (hundreds)\n");
> +		exit(1);
> +	}
> +	len = v * 100;
> +	for (i = 0; i < thr; i++) {
> +		args[i].loops = len;
> +		args[i].size = size;
> +		args[i].th = new_thr(thrfn, &(args[i]));
> +	}
> +	for (i = 0; i < thr; i++)
> +		pthread_join(args[i].th, &(args[i].ret));
> +	return 0;
> +}
> diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
> new file mode 100644
> index 000000000000..6d4c33eaa9e8
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
> @@ -0,0 +1 @@
> +thread_loop
> diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
> new file mode 100644
> index 000000000000..424df4e8b0e6
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../Makefile.miniconfig
> +
> +BIN=thread_loop
> +LIB=-pthread
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).c
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> new file mode 100644
> index 000000000000..c0158fac7d0b
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> @@ -0,0 +1,86 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +// define this for gettid()
> +#define _GNU_SOURCE
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <pthread.h>
> +#include <sys/syscall.h>
> +#ifndef SYS_gettid
> +// gettid is 178 on arm64
> +# define SYS_gettid 178
> +#endif
> +#define gettid() syscall(SYS_gettid)
> +
> +struct args {
> +	unsigned int loops;
> +	pthread_t th;
> +	void *ret;
> +};
> +
> +static void *thrfn(void *arg)
> +{
> +	struct args *a = arg;
> +	int i = 0, len = a->loops;
> +
> +	if (getenv("SHOW_TID")) {
> +		unsigned long long tid = gettid();
> +
> +		printf("%llu\n", tid);
> +	}
> +	asm volatile(
> +		"loop:\n"
> +		"add %[i], %[i], #1\n"
> +		"cmp %[i], %[len]\n"
> +		"blt loop\n"
> +		: /* out */
> +		: /* in */ [i] "r" (i), [len] "r" (len)
> +		: /* clobber */
> +	);
> +	return (void *)(long)i;
> +}
> +
> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> +{
> +	pthread_t t;
> +	pthread_attr_t attr;
> +
> +	pthread_attr_init(&attr);
> +	pthread_create(&t, &attr, fn, arg);
> +	return t;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	unsigned int i, len, thr;
> +	pthread_t threads[256];
> +	struct args args[256];
> +
> +	if (argc < 3) {
> +		printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
> +		exit(1);
> +	}
> +
> +	thr = atoi(argv[1]);
> +	if ((thr < 1) || (thr > 256)) {
> +		printf("ERR: threads 1-256\n");
> +		exit(1);
> +	}
> +	len = atoi(argv[2]);
> +	if ((len < 1) || (len > 4000)) {
> +		printf("ERR: max loops 4000 (millions)\n");
> +		exit(1);
> +	}
> +	len *= 1000000;
> +	for (i = 0; i < thr; i++) {
> +		args[i].loops = len;
> +		args[i].th = new_thr(thrfn, &(args[i]));
> +	}
> +	for (i = 0; i < thr; i++)
> +		pthread_join(args[i].th, &(args[i].ret));
> +	return 0;
> +}
> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> new file mode 100644
> index 000000000000..2cb4e996dbf3
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> @@ -0,0 +1 @@
> +unroll_loop_thread
> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> new file mode 100644
> index 000000000000..45ab2be8be92
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../Makefile.miniconfig
> +
> +BIN=unroll_loop_thread
> +LIB=-pthread
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).c
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> new file mode 100644
> index 000000000000..cb9d22c7dfb9
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> @@ -0,0 +1,74 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <pthread.h>
> +
> +struct args {
> +	pthread_t th;
> +	unsigned int in, out;
> +	void *ret;
> +};
> +
> +static void *thrfn(void *arg)
> +{
> +	struct args *a = arg;
> +	unsigned int i, in = a->in;
> +
> +	for (i = 0; i < 10000; i++) {
> +		asm volatile (
> +// force an unroll of thia add instruction so we can test long runs of code
> +#define SNIP1 "add %[in], %[in], #1\n"
> +// 10
> +#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
> +// 100
> +#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
> +// 1000
> +#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
> +// 10000
> +#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
> +// 100000
> +			SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
> +			: /* out */
> +			: /* in */ [in] "r" (in)
> +			: /* clobber */
> +		);
> +	}
> +}
> +
> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> +{
> +	pthread_t t;
> +	pthread_attr_t attr;
> +
> +	pthread_attr_init(&attr);
> +	pthread_create(&t, &attr, fn, arg);
> +	return t;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	unsigned int i, thr;
> +	pthread_t threads[256];
> +	struct args args[256];
> +
> +	if (argc < 2) {
> +		printf("ERR: %s [numthreads]\n", argv[0]);
> +		exit(1);
> +	}
> +
> +	thr = atoi(argv[1]);
> +	if ((thr > 256) || (thr < 1)) {
> +		printf("ERR: threads 1-256\n");
> +		exit(1);
> +	}
> +	for (i = 0; i < thr; i++) {
> +		args[i].in = rand();
> +		args[i].th = new_thr(thrfn, &(args[i]));
> +	}
> +	for (i = 0; i < thr; i++)
> +		pthread_join(args[i].th, &(args[i].ret));
> +	return 0;
> +}
> diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> new file mode 100755
> index 000000000000..3f0dbefcad50
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh -e
> +# Coresight / ASM Pure Loop
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="asm_pure_loop"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS=""
> +DATV="out"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +
> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> +
> +perf_dump_aux_verify "$DATA" 10 10 10
> +
> +err=$?
> +exit $err

Can we organize the shell scripts by moving them into the folder
tools/perf/tests/shell/coresight?

  coresight_asm_pure_loop.sh
  coresight_memcpy_thread_16k_10.sh
  coresight_thread_loop_check_tid_10.sh
  coresight_thread_loop_check_tid_2.sh
  coresight_unroll_loop_thread_10.sh

And we even can consider to move script test_arm_coresight.sh into
the folder tools/perf/tests/shell/coresight and change its
name as 'coresight_smoke_test.sh'.

Thanks,
Leo

> diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> new file mode 100755
> index 000000000000..8972af835016
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh -e
> +# Coresight / Memcpy 16k 10 Threads
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="memcpy_thread"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="16 10 1"
> +DATV="16k_10"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +
> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> +
> +perf_dump_aux_verify "$DATA" 10 10 10
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> new file mode 100755
> index 000000000000..5b468901f89b
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh -e
> +# Coresight / Thread Loop 10 Threads - Check TID
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="thread_loop"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="10 1"
> +DATV="check-tid-10th"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +STDO="$DATD/perf-$TEST-$DATV.stdout"
> +
> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
> +
> +perf_dump_aux_tid_verify "$DATA" "$STDO"
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> new file mode 100755
> index 000000000000..f8b7abd3aa03
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh -e
> +# Coresight / Thread Loop 2 Threads - Check TID
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="thread_loop"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="2 20"
> +DATV="check-tid-2th"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +STDO="$DATD/perf-$TEST-$DATV.stdout"
> +
> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
> +
> +perf_dump_aux_tid_verify "$DATA" "$STDO"
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> new file mode 100755
> index 000000000000..c985dfb025c2
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh -e
> +# Coresight / Unroll Loop Thread 10
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="unroll_loop_thread"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="10"
> +DATV="10"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +
> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> +
> +perf_dump_aux_verify "$DATA" 10 10 10
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
> new file mode 100644
> index 000000000000..6a611b073f02
> --- /dev/null
> +++ b/tools/perf/tests/shell/lib/coresight.sh
> @@ -0,0 +1,130 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +# This is sourced from a driver script so no need for #!/bin... etc. at the
> +# top - the assumption below is that it runs as part of sourcing after the
> +# test sets up some basic env vars to say what it is.
> +
> +# perf record options for the perf tests to use
> +PERFRECMEM="-m ,128M"
> +PERFRECOPT="$PERFRECMEM -e cs_etm//u"
> +
> +# These tests need to be run as root or coresight won't allow large buffers
> +# and will not collect proper data
> +UID=`id -u`
> +if test "$UID" -ne 0; then
> +	echo "Not running as root... skip"
> +	exit 2
> +fi
> +
> +TOOLS=$(dirname $0)
> +DIR="$TOOLS/coresight/$TEST"
> +BIN="$DIR/$TEST"
> +# If the test tool/binary does not exist and is executable then skip the test
> +if ! test -x "$BIN"; then exit 2; fi
> +DATD="."
> +# If the data dir env is set then make the data dir use that instead of ./
> +if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
> +	DATD="$PERF_TEST_CORESIGHT_DATADIR";
> +fi
> +# If the stat dir env is set then make the data dir use that instead of ./
> +STATD="."
> +if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
> +	STATD="$PERF_TEST_CORESIGHT_STATDIR";
> +fi
> +
> +# Called if the test fails - error code 2
> +err() {
> +	echo "$1"
> +	exit 1
> +}
> +
> +# Check that some statistics from our perf
> +check_val_min() {
> +	STATF="$4"
> +	if test "$2" -lt "$3"; then
> +		echo ", FAILED" >> "$STATF"
> +		err "Sanity check number of $1 is too low ($2 < $3)"
> +	fi
> +}
> +
> +perf_dump_aux_verify() {
> +	# Some basic checking that the AUX chunk contains some sensible data
> +	# to see that we are recording something and at least a minimum
> +	# amount of it. We should almost always see F3 atoms in just about
> +	# anything but certainly we will see some trace info and async atom
> +	# chunks.
> +	DUMP="$DATD/perf-tmp-aux-dump.txt"
> +	perf report --stdio --dump -i "$1" | \
> +		grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
> +	# Simply count how many of these atoms we find to see that we are
> +	# producing a reasonable amount of data - exact checks are not sane
> +	# as this is a lossy  process where we may lose some blocks and the
> +	# compiler may produce different code depending on the compiler and
> +	# optimization options, so this is rough  just to see if we're
> +	# either missing almost all the data or all of it
> +	ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
> +	ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
> +	ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
> +	rm -f "$DUMP"
> +
> +	# Arguments provide minimums for a pass
> +	CHECK_F3_MIN="$2"
> +	CHECK_ASYNC_MIN="$3"
> +	CHECK_TRACE_INFO_MIN="$4"
> +
> +	# Write out statistics, so over time you can track results to see if
> +	# there is a pattern - for example we have less "noisy" results that
> +	# produce more consistent amounts of data each run, to see if over
> +	# time any techinques to  minimize data loss are having an effect or
> +	# not
> +	STATF="$STATD/stats-$TEST-$DATV.csv"
> +	if ! test -f "$STATF"; then
> +		echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
> +	fi
> +	echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
> +
> +	# Actually check to see if we passed or failed.
> +	check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
> +	check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
> +	check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
> +	echo ", Ok" >> "$STATF"
> +}
> +
> +perf_dump_aux_tid_verify() {
> +	# Specifically crafted test will produce a list of Tread ID's to
> +	# stdout that need to be checked to  see that they have had trace
> +	# info collected in AUX blocks in the perf data. This will go
> +	# through all the TID's that are listed as CID=0xabcdef and see
> +	# that all the Thread IDs the test tool reports are  in the perf
> +	# data AUX chunks
> +
> +	# The TID test tools will print a TID per stdout line that are being
> +	# tested
> +	TIDS=`cat "$2"`
> +	# Scan the perf report to find the TIDs that are actually CID in hex
> +	# and build a list of the ones found
> +	FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
> +			grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
> +			uniq | sort | uniq`
> +
> +	# Iterate over the list of TIDs that the test says it has and find
> +	# them in the TIDs found in the perf report
> +	MISSING=""
> +	for TID2 in $TIDS; do
> +		FOUND=""
> +		for TIDHEX in $FOUND_TIDS; do
> +			TID=`printf "%i" $TIDHEX`
> +			if test "$TID" -eq "$TID2"; then
> +				FOUND="y"
> +				break
> +			fi
> +		done
> +		if test -z "$FOUND"; then
> +			MISSING="$MISSING $TID"
> +		fi
> +	done
> +	if test -n "$MISSING"; then
> +		err "Thread IDs $MISSING not found in perf AUX data"
> +	fi
> +}
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests
  2022-04-10  1:24 ` [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests Leo Yan
@ 2022-04-11 19:08   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-04-11 19:08 UTC (permalink / raw)
  To: Leo Yan
  Cc: carsten.haitzler, linux-kernel, coresight, suzuki.poulose,
	mathieu.poirier, mike.leach, linux-perf-users

Em Sun, Apr 10, 2022 at 09:24:10AM +0800, Leo Yan escreveu:
> On Wed, Mar 09, 2022 at 12:28:57PM +0000, carsten.haitzler@foss.arm.com wrote:
> > From: Carsten Haitzler <carsten.haitzler@arm.com>
> > 
> > Perf test's shell runner will just run everything in the tests
> > directory (as long as it's not another directory or does not begin
> > with a dot), but sometimes you find files in there that are not shell
> > scripts - perf.data output for example if you do some testing and then
> > the next time you run perf test it tries to run these. Check the files
> > are executable so they are actually intended to be test scripts and
> > not just some "random junk" files there.
> > 
> > Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
> 
> Reviewed-by: Leo Yan <leo.yan@linaro.org>

Thanks, applied.

- Arnaldo

 
> > ---
> >  tools/perf/tests/builtin-test.c |  4 +++-
> >  tools/perf/util/path.c          | 14 +++++++++++++-
> >  tools/perf/util/path.h          |  1 +
> >  3 files changed, 17 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> > index fac3717d9ba1..3c34cb766724 100644
> > --- a/tools/perf/tests/builtin-test.c
> > +++ b/tools/perf/tests/builtin-test.c
> > @@ -296,7 +296,9 @@ static const char *shell_test__description(char *description, size_t size,
> >  
> >  #define for_each_shell_test(entlist, nr, base, ent)	                \
> >  	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
> > -		if (!is_directory(base, ent) && ent->d_name[0] != '.')
> > +		if (!is_directory(base, ent) && \
> > +			is_executable_file(base, ent) && \
> > +			ent->d_name[0] != '.')
> >  
> >  static const char *shell_tests__dir(char *path, size_t size)
> >  {
> > diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
> > index caed0336429f..ce80b79be103 100644
> > --- a/tools/perf/util/path.c
> > +++ b/tools/perf/util/path.c
> > @@ -86,9 +86,21 @@ bool is_directory(const char *base_path, const struct dirent *dent)
> >  	char path[PATH_MAX];
> >  	struct stat st;
> >  
> > -	sprintf(path, "%s/%s", base_path, dent->d_name);
> > +	snprintf(path, sizeof(path), "%s/%s", base_path, dent->d_name);
> >  	if (stat(path, &st))
> >  		return false;
> >  
> >  	return S_ISDIR(st.st_mode);
> >  }
> > +
> > +bool is_executable_file(const char *base_path, const struct dirent *dent)
> > +{
> > +	char path[PATH_MAX];
> > +	struct stat st;
> > +
> > +	snprintf(path, sizeof(path), "%s/%s", base_path, dent->d_name);
> > +	if (stat(path, &st))
> > +		return false;
> > +
> > +	return !S_ISDIR(st.st_mode) && (st.st_mode & S_IXUSR);
> > +}
> > diff --git a/tools/perf/util/path.h b/tools/perf/util/path.h
> > index 083429b7efa3..d94902c22222 100644
> > --- a/tools/perf/util/path.h
> > +++ b/tools/perf/util/path.h
> > @@ -12,5 +12,6 @@ int path__join3(char *bf, size_t size, const char *path1, const char *path2, con
> >  
> >  bool is_regular_file(const char *file);
> >  bool is_directory(const char *base_path, const struct dirent *dent);
> > +bool is_executable_file(const char *base_path, const struct dirent *dent);
> >  
> >  #endif /* _PERF_PATH_H */
> > -- 
> > 2.32.0
> > 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files
  2022-04-10  2:28   ` Leo Yan
@ 2022-04-21 16:21     ` Carsten Haitzler
  2022-05-26 10:14       ` Leo Yan
  0 siblings, 1 reply; 19+ messages in thread
From: Carsten Haitzler @ 2022-04-21 16:21 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme



On 4/10/22 03:28, Leo Yan wrote:
> On Wed, Mar 09, 2022 at 12:28:58PM +0000, carsten.haitzler@foss.arm.com wrote:
>> From: Carsten Haitzler <carsten.haitzler@arm.com>
>>
>> You edit your scripts in the tests and end up with your usual shell
>> backup files with ~ or .bak or something else at the end, but then your
>> next perf test run wants to run the backups too. You might also have perf
>> .data files in the directory or something else undesireable as well. You end
>> up chasing which test is the one you edited and the backup and have to keep
>> removing all the backup files, so automatically skip any files that are
>> not plain *.sh scripts to limit the time wasted in chasing ghosts.
>>
>> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
>>
>> ---
>>   tools/perf/tests/builtin-test.c | 17 +++++++++++++++--
>>   1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
>> index 3c34cb766724..3a02ba7a7a89 100644
>> --- a/tools/perf/tests/builtin-test.c
>> +++ b/tools/perf/tests/builtin-test.c
>> @@ -296,9 +296,22 @@ static const char *shell_test__description(char *description, size_t size,
>>   
>>   #define for_each_shell_test(entlist, nr, base, ent)	                \
>>   	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
>> -		if (!is_directory(base, ent) && \
>> +		if (ent->d_name[0] != '.' && \
>> +			!is_directory(base, ent) && \
>>   			is_executable_file(base, ent) && \
>> -			ent->d_name[0] != '.')
>> +			is_shell_script(ent->d_name))
> 
> Just nitpick: since multiple conditions are added, seems to me it's good
> to use a single function is_executable_shell_script() to make decision
> if a file is an executable shell script.

I'd certainly make a function if this was being re-used, but as the 
"coding pattern" was to do all the tests already inside the if() in only 
one place, I kept with the style there and didn't change the code that 
didn't need changing. I can rewrite this code and basically make a 
function that is just an if ...:

bool is_exe_shell_script(const char *base, struct dirent *ent) {
    return ent->d_name[0] != '.'         && !is_directory(base, ent) &&
           is_executable_file(base, ent) && is_shell_script(ent->d_name);
}

And macro becomes:

#define for_each_shell_test(entlist, nr, base, ent) \
   for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++) \
     if (is_shell(base, ent))

But one catch... it really should be is_non_hidden_exe_shell_script() as 
it's checking that it's not a hidden file AND is a shell script. Or do I 
keep the hidden file test outside of the function in the if? If we're 
nit picking then I need to know exactly what you want here as your 
suggested name is actually incorrect.

> And the condition checking 'ent->d_name[0] != '.'' would be redundant
> after we have checked the file suffix '.sh'.

This isn't actually redundant. You can have .something.sh :) If the idea 
is we skip anything with a . at the start first always... then the if 
(to me) is obvious.

> Thanks,
> Leo
> 
>> +
>> +static bool is_shell_script(const char *file)
>> +{
>> +	const char *ext;
>> +
>> +	ext = strrchr(file, '.');
>> +	if (!ext)
>> +		return false;
>> +	if (!strcmp(ext, ".sh"))
>> +		return true;
>> +	return false;
>> +}
>>   
>>   static const char *shell_tests__dir(char *path, size_t size)
>>   {
>> -- 
>> 2.32.0
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-04-10  8:30   ` Leo Yan
@ 2022-04-21 17:38     ` Carsten Haitzler
  2022-05-26  8:20       ` Leo Yan
  0 siblings, 1 reply; 19+ messages in thread
From: Carsten Haitzler @ 2022-04-21 17:38 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme



On 4/10/22 09:30, Leo Yan wrote:
> Hi Carsten,
> 
> On Wed, Mar 09, 2022 at 12:28:59PM +0000, carsten.haitzler@foss.arm.com wrote:
>> From: Carsten Haitzler <carsten.haitzler@arm.com>
>>
>> This adds a test harness and tests to run perf record and examine the
>> resuling output when coresight is enabled on arm64 and check the
>> resulting quality of the output as part of perf test. These tests use
>> various tools to produce output from perf record then measure some key
>> specific aspects of that data to see if the data exists at all and
>> contains key aspects such as measuring some data for every thread of
>> a test or produces sufficient data for large exeuction runs of a large
>> executable. etc.
>>
>> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
>> ---
>>   MAINTAINERS                                   |   4 +
>>   tools/perf/.gitignore                         |   6 +-
>>   tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
>>   tools/perf/Makefile.perf                      |  14 +-
>>   tools/perf/tests/shell/coresight/Makefile     |  30 ++++
>>   .../tests/shell/coresight/Makefile.miniconfig |  23 +++
>>   .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
>>   .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
>>   .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
>>   .../shell/coresight/memcpy_thread/.gitignore  |   1 +
>>   .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
>>   .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
>>   .../shell/coresight/thread_loop/.gitignore    |   1 +
>>   .../shell/coresight/thread_loop/Makefile      |  29 ++++
>>   .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
>>   .../coresight/unroll_loop_thread/.gitignore   |   1 +
>>   .../coresight/unroll_loop_thread/Makefile     |  29 ++++
>>   .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
>>   .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
>>   .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
>>   .../coresight_thread_loop_check_tid_10.sh     |  19 +++
>>   .../coresight_thread_loop_check_tid_2.sh      |  19 +++
>>   .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
>>   tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++
> 
> Very big change...  Why squash all patches form previous verion to this
> single one big patch?  Usually the format with small patches is much
> better for reviewing.

I was asked to re-jig the tree and in doing so I also ended up cutting 
down the size a lot so this just makes more sense together as a "here 
are the tests" as adding infra without any tests makes no sense and the 
tests themelves are self-contained in their own directories and source 
files and "drivign scripts" thus it's essentially patch 1 appended to 
patch 2 to patch 3 etc. and still broken up in the patch file by file.

> And I cannot apply cleanly on perf core branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git branch: perf/core.

I generated this based on tip. I can re-do it based on the above.

>>   24 files changed, 823 insertions(+), 4 deletions(-)
>>   create mode 100644 tools/perf/Documentation/arm-coresight.txt
>>   create mode 100644 tools/perf/tests/shell/coresight/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>>   create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>>   create mode 100644 tools/perf/tests/shell/lib/coresight.sh
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 673c7124ca82..18cc20609f2e 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -1918,10 +1918,14 @@ F:	drivers/hwtracing/coresight/*
>>   F:	include/dt-bindings/arm/coresight-cti-dt.h
>>   F:	include/linux/coresight*
>>   F:	samples/coresight/*
>> +F:	tools/perf/Documentation/arm-coresight.txt
>>   F:	tools/perf/arch/arm/util/auxtrace.c
>>   F:	tools/perf/arch/arm/util/cs-etm.c
>>   F:	tools/perf/arch/arm/util/cs-etm.h
>>   F:	tools/perf/arch/arm/util/pmu.c
>> +F:	tools/perf/tests/shell/coresight_*
>> +F:	tools/perf/tests/shell/tools/Makefile
>> +F:	tools/perf/tests/shell/tools/coresight/*
>>   F:	tools/perf/util/cs-etm-decoder/*
>>   F:	tools/perf/util/cs-etm.*
>>   
>> diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
>> index 20b8ab984d5f..138c679ecacd 100644
>> --- a/tools/perf/.gitignore
>> +++ b/tools/perf/.gitignore
>> @@ -15,8 +15,9 @@ perf*.1
>>   perf*.xml
>>   perf*.html
>>   common-cmds.h
>> -perf.data
>> -perf.data.old
>> +perf*.data
>> +perf*.data.old
>> +stats-*.csv
>>   output.svg
>>   perf-archive
>>   perf-with-kcore
>> @@ -30,6 +31,7 @@ config.mak.autogen
>>   *-flex.*
>>   *.pyc
>>   *.pyo
>> +*.stdout
>>   .config-detected
>>   util/intel-pt-decoder/inat-tables.c
>>   arch/*/include/generated/
>> diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
>> new file mode 100644
>> index 000000000000..3a9e6c573c58
>> --- /dev/null
>> +++ b/tools/perf/Documentation/arm-coresight.txt
>> @@ -0,0 +1,140 @@
>> +Arm Coresight Support
>> +=====================
>> +
>> +Coresight is a feature of some Arm based processors that allows for
>> +debugging. One of the things it can do is trace every instruction
>> +executed and remotely expose that information in a hardware compressed
>> +stream.
> 
> Maybe here need to sync a bit for the terminology in
> Documentation/trace/coresight/coresight.rst.
> 
> Something like:
> 
> "Coresight is a feature of some Arm based processors that allows for
> debugging. One of the things it can do is trace instruction path and
> expose that information in a hardware compressed stream for either
> debugger or HW assisted tracing locally.
> 
> See Documentation/trace/coresight/coresight.rst for details."

Sure. Can do that.

>> Perf is able to locally access that stream and store it to the
>> +output perf data files. This stream can then be later decoded to give the
>> +instructions that were traced for debugging or profiling purposes. You
>> +can log such data with a perf record command like:
>> +
>> +    perf record -e cs_etm//u testbinary
>> +
>> +This would run some test binary (testbinary) until it exits and record
>> +a perf.data trace file. That file would have AUX sections if coresight
>> +is working correctly. You can dump the content of this file as
>> +readable text with a command like:
>> +
>> +    perf report --stdio --dump -i perf.data
>> +
>> +You should find some sections of this file have AUX data blocks like:
>> +
>> +    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
>> +
>> +    . ... CoreSight ETM Trace data: size 73168 bytes
>> +            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>> +              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>> +              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>> +              Idx:26; ID:10;  I_TRACE_ON : Trace On.
>> +              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
>> +              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>> +              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>> +              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>> +              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
>> +              ...
>> +
>> +If you see these above, then your system is tracing coresight data
>> +correctly.
>> +
>> +To compile perf with coresight support in the perf directory do
>> +
>> +    make CORESIGHT=1
> 
> It is inaccurate that if we don't mention openCSD lib.

Do you mean I need to mention that you need the opencsd library 
installed too?

>> +
>> +This will compile the perf tool with coresight support as well as
>> +build some small test binaries for perf test. This requires you also
>> +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
>> +perf coresight tracing are in tests/shell/tools/coresight.
> 
> For build perf tool, I think above paragraphs are duplicate with the
> document Documentation/trace/coresight/coresight.rst.  Can we simply
> say:
> 
> "The details for building perf tool with support Arm Coresight can be
> found in the "HOWTO.md" file of the openCSD gitHub repository:
> https://github.com/Linaro/opencsd.

I can. I put this here as I didn't go clone OpencCSD first but used my 
distro OpenCSD packages and thus of course didn't have the documentation 
in front of me. I spent some time wondering why it wasn't building with 
coresight support even though it detected OpenCSD when I compiled... I 
didn't expect to have to go to some separate project git repository and 
read docs there on how to build the perf tool here in the kernel. I 
wrote this because it was an actual problem I hit and it's a lot less 
frustrating to "end users" to give them the information they need in the 
relevant place they need it instead of sending them around to other 
project trees. Building perf with coresight support is handled by the 
perf tree int he kernel, not OpenCSD, thus IMHO that is where the 
documentation belongs - alongside the thing that determines how to build 
something.

> And "HOWTO.md" file gives the information and examples for how to use
> perf tool to record and report Coresight trace data.  It's the
> prerequisite for this perf Coresight test."
> 
>> +You will also want coresight support enabled in your kernel config.
>> +Ensure it is enabled with:
>> +
>> +    CONFIG_CORESIGHT=y
>> +
>> +There are various other coresight options you probably also want
>> +enabled like:
>> +
>> +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
>> +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
>> +    CONFIG_CORESIGHT_CATU=y
>> +    CONFIG_CORESIGHT_SINK_TPIU=y
>> +    CONFIG_CORESIGHT_SINK_ETBV10=y
>> +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
>> +    CONFIG_CORESIGHT_STM=y
>> +    CONFIG_CORESIGHT_CPU_DEBUG=y
>> +    CONFIG_CORESIGHT_CTI=y
>> +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
>> +
>> +Please refer to the kernel configuration help for more information.
> 
> I prefer to remove these kernel configuration since they are not
> inconsistent on different platforms (e.g. ETBV10, ETM4X, etc), and
> some configurations might not necessary (e.g. CPU_DEBUG).

Certainly there should be some documentation on which kernel configs you 
might want to turn on then? Imagine someone new comes along and doesn't 
have any idea what to possible enable at all and manages to build perf 
with coresight support (as above) then finds it doesn't work because 
they didn't enable enough config in the kernel? Sure - could probably 
trim these down a bit but the point here is to alert the user to there 
being a range of coresight config options that you need to turn on that 
you likely will find are not turned on. They certainly are not turned on 
on distro kernels and a lot of the time when you have a platform that 
already boots/works you start with your distro kernel config file 
because you want everything enabled so it actually boots. I've learned 
the hard way to do this as you manage to forget to turn on some MMC 
driver or some other feature and your boot hangs or doesn't find rootfs etc.

What would you recommend then as a "turn these on and coresight will 
almost certainly work for you on your given hardware " then?

>> +Perf test - Verify kernel and userspace perf coresight work
>> +===========================================================
>> +
>> +When you run perf test, it will do a lot of self tests. Some of those
>> +tests will cover Coresight (only if enabled and on ARM64). You
>> +generally would run perf test from the tools/perf directory in the
>> +kernel tree. Some tests will check some internal perf support like:
>> +
>> +    Check Arm CoreSight trace data recording and synthesized samples
>> +
>> +Some others will actually use perf record and some test binaries that
>> +are in tests/shell/tools/coresight and will collect traces to ensure a
>> +minimum level of functionality is met. The scripts that launch these
>> +tests are in tests/shell. These will all look like:
>> +
>> +    Coresight / Memcpy 1M 25 Threads
>> +    Coresight / Unroll Loop Thread 2
>> +    ...
> 
> Please update based on the latest test case names, at my side, I can
> see the testing case like:
> 
>         Coresight / ASM Pure Loop
>         Coresight / Memcpy 16k 10 Threads
>         Coresight / Thread Loop 10 Threads - Check TID
>         Coresight / Thread Loop 2 Threads - Check TID
>         Coresight / Unroll Loop Thread 10

Oh sorry - yeah. I wrote the docs based on the earlier tests. Will fix.

>> +
>> +These perf record tests will not run if the tool binaries do not exist
>> +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
>> +have coresight support in hardware then either do not build perf with
>> +coresight support or remove these binaries in order to not have these
>> +tests fail and have them skip instead.
>> +
>> +These tests will log historical results in the current working
>> +directory (e.g. tools/perf) and will be named stats-*.csv like:
>> +
>> +    stats-asm_pure_loop-out.csv
>> +    stats-bubble_sort-random.csv
>> +    ...
>> +
>> +These statistic files log some aspects of the AUX data sections in
>> +the perf data output counting some numbers of certain encodings (a
>> +good way to know that it's working in a very simple way). One problem
>> +with coresight is that given a large enough amount of data needing to
>> +be logged, some of it can be lost due to the processor not waking up
>> +in time to read out all the data from buffers etc.. You will notice
>> +that the amount of data collected can vary a lot per run of perf test.
>> +If you wish to see how this changes over time, simply run perf test
>> +multiple times and all these csv files will have more and more data
>> +appended to it that you can later examine, graph and otherwise use to
>> +figure out if things have become worse or better.
> 
> I am confused by this narrative.  Does it try to remind that the final
> testing result (pass or fail) is not stable?  Or should we run for
> multiple times so have more chance to capture issues?

That is correct. I thought I was clear that it's lossy. That is actually 
the case. I have tests here that actually fail because there is no data 
collected from some threads at all (missing CID blocks for some of the 
threads that run in the test). The point is to have tests that may be 
failing now but in future will improve. I lowered the minimum bar to 
pass for most tests to have "at least just a little data" but most tests 
show highly variable amount of captured data. the csv files are there to 
over-time give you a good idea of the stability of the captured data.

>> +Be aware that amny of these tests take quite a while to run, specifically
> 
> s/amny/many

Indeed. will fix.

>> +in processing the perf data file and dumping contents to then examine what
>> +is inside.
>> +
>> +You can change where these csv logs are stored by setting the
>> +PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
>> +test like:
>> +
>> +    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
>> +    perf test
>> +
>> +They will also store resulting perf output data in the current
>> +directory for later inspection like:
>> +
>> +    perf-memcpy-1m.data
>> +    perf-thread_loop-2th.data
>> +    ...
>> +
>> +You can alter where the perf data files are stored by setting the
>> +PERF_TEST_CORESIGHT_DATADIR environment variable such as:
>> +
>> +    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
>> +    perf test
>> +
>> +You may wish to set these above environment variables if you which to
>> +keep the output of tests outside of the current working directory for
>> +longer term storage and examination.
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index ac861e42c8f7..b97db83992e0 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
>>   $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
>>   	$(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
>>   
>> -all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
>> +TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
>> +
>> +tests-coresight-targets: FORCE
>> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
>> +
>> +tests-coresight-targets-clean:
>> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
>> +
>> +all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
>>   
>>   # Create python binding output directory if not already present
>>   _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
>> @@ -1020,6 +1028,7 @@ install-tests: all install-gtk
>>   		$(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
>>   		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
>>   		$(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
>> +	$(Q)$(MAKE) -C tests/shell/coresight install-tests
>>   
>>   install-bin: install-tools install-tests install-traceevent-plugins
>>   
>> @@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
>>   bpf-skel-clean:
>>   	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>>   
>> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
>> +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
>>   	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>   	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
>>   	$(Q)$(RM) $(OUTPUT).config-detected
>> @@ -1155,5 +1164,6 @@ FORCE:
>>   .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
>>   .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
>>   .PHONY: libtraceevent_plugins archheaders
>> +.PHONY: $(TESTS_CORESIGHT_TARGETS)
> 
> I don't find other places using TESTS_CORESIGHT_TARGETS.  Is this
> redundant?

I'll check - it may have been left over from my previous patch set.

>>   endif # force_fixdep
>> diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
>> new file mode 100644
>> index 000000000000..dda99aeac158
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/Makefile
>> @@ -0,0 +1,30 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../../../../../tools/scripts/Makefile.include
>> +include ../../../../../tools/scripts/Makefile.arch
>> +include ../../../../../tools/scripts/utilities.mak
>> +
>> +SUBDIRS = \
>> +	asm_pure_loop \
>> +	thread_loop \
>> +	memcpy_thread \
>> +	unroll_loop_thread
>> +
>> +all: $(SUBDIRS)
>> +$(SUBDIRS):
>> +	$(Q)$(MAKE) -C $@
>> +
>> +INSTALLDIRS = $(SUBDIRS:%=install-%)
>> +
>> +install-tests: $(INSTALLDIRS)
>> +$(INSTALLDIRS):
>> +	$(Q)$(MAKE) -C $(@:install-%=%) install-tests
>> +
>> +CLEANDIRS = $(SUBDIRS:%=clean-%)
>> +
>> +clean: $(CLEANDIRS)
>> +$(CLEANDIRS):
>> +	$(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
>> +
>> +.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
>> +
>> diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
>> new file mode 100644
>> index 000000000000..893c12685fed
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
>> @@ -0,0 +1,23 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +ifndef DESTDIR
>> +prefix ?= $(HOME)
>> +endif
>> +
>> +DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
>> +perfexecdir = libexec/perf-core
>> +perfexec_instdir = $(perfexecdir)
>> +
>> +ifneq ($(filter /%,$(firstword $(perfexecdir))),)
>> +perfexec_instdir = $(perfexecdir)
>> +else
>> +perfexec_instdir = $(prefix)/$(perfexecdir)
>> +endif
>> +
>> +perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
>> +INSTALL = install
>> +
>> +include ../../../../../scripts/Makefile.include
>> +include ../../../../../scripts/Makefile.arch
>> +include ../../../../../scripts/utilities.mak
>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>> new file mode 100644
>> index 000000000000..468673ac32e8
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>> @@ -0,0 +1 @@
>> +asm_pure_loop
> 
> Do we really need there '.gitignore' files under the folder
> 'tools/perf/tests/shell/coresight/'.

Where would you rather have them to ignore the generated binary tools?

>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>> new file mode 100644
>> index 000000000000..10c5a60cb71c
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>> @@ -0,0 +1,30 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +include ../Makefile.miniconfig
>> +
>> +BIN=asm_pure_loop
>> +LIB=
> 
> Remove the unused variable 'LIB='.

I have this because I wanted to have a simple template to be able to 
re-use for more tests over time. It's so much easier to maintain and 
extend if every makefile and tool follow a similar pattern and you can 
almost copy & paste between them as they don't have "exceptions". You 
really want me to remove this?

>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).S
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
> 
> There have four sub folders under tools/perf/tests/shell/coresight:
> 
>    asm_pure_loop
>    memcpy_thread
>    thread_loop
>    unroll_loop_thread
> 
> And every folder has its own Makefile and every Makefile is quite
> close to each other.  I am just wandering if it's possible to
> remove the 4 Makefiles in these four sub folders, and simply use
> tools/perf/tests/shell/coresight/Makefile as the central place to
> build these assistant programs.

I did this so it's easier to etxent over time. having a single parent 
makefile that over time accumulates little ugly "if's" and exceptions 
makes longer-term maintenance and extending harder. I did it this way to 
make this easy - make a copy of a dir - add that dir to a parent 
makefile then modify the makefile as needed (but only as needed).

>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>> new file mode 100644
>> index 000000000000..75cf084a927d
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>> @@ -0,0 +1,28 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
>> +
>> +.globl _start
>> +_start:
>> +	mov	x0, 0x0000ffff
>> +	mov	x1, xzr
>> +loop:
>> +	nop
>> +	nop
>> +	cbnz	x1, noskip
>> +	nop
>> +	nop
>> +	adrp	x2, skip
>> +	add 	x2, x2, :lo12:skip
>> +	br	x2
>> +	nop
>> +	nop
>> +noskip:
>> +	nop
>> +	nop
>> +skip:
>> +	sub	x0, x0, 1
>> +	cbnz	x0, loop
>> +
>> +	mov	x0, #0
>> +	mov	x8, #93 // __NR_exit syscall
>> +	svc	#0
> 
> I tested the case "ASM Pure Loop" on my Juno board, and it complaints:
> 
> root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
>   76: Coresight / ASM Pure Loop                                       :
> --- start ---
> test child forked, pid 9063
> failed to mmap with 12 (Cannot allocate memory)
> test child finished with -1
> ---- end ----
> Coresight / ASM Pure Loop: FAILED!
> 
> Since I only setup the 1GB memory for the Linux kernel, it fails to
> allocate AUX ring buffer with the size 256MB.  So I manully change
> the buffer size to 8MB in tools/perf/tests/shell/lib/coresight.sh:
> 
>    PERFRECMEM="-m ,8M"
> 
> So finally I can see the test case is passed:

This is artificial isn't it? limiting to 1GB. You certainly have far 
more memory than that available. My testse were on a system with 4GB and 
I had no issues.

> root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
>   76: Coresight / ASM Pure Loop                                       :
> --- start ---
> test child forked, pid 9481
> -m ,8M -e cs_etm//u
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.681 MB ./perf-asm_pure_loop-out.data ]
> test child finished with 0
> ---- end ----
> Coresight / ASM Pure Loop: Ok
> 
> Do you think we really need to use 256MiB as the AUX buffer size?
> IIRC, it means we allocate 256MiB per CPU for this case, on the other
> hand, you could see the final perf data file size is small (0.681
> MiB).
> 
> Seems to me, it's not necessary to allocate so big buffer for
> the test, and I tried to run below 4 cases with 8MiB, all of them can
> pass the testing :)

I didn't think anyone with a system with coresight support that would be 
running perf record locally would only have 1GB of ram... I knew junos 
had 8GB and my dragonboard has 4GB ... so I know I was on the smaller 
side. I thought a larger buffer == safer results (less chance of needing 
to write out the buffer during capture). Admittdly I used 256Mb when my 
tests ran for much longer and collected more data. I can try drop to 8 
or 16gb and see.

>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>> new file mode 100644
>> index 000000000000..f8217e56091e
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>> @@ -0,0 +1 @@
>> +memcpy_thread
>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>> new file mode 100644
>> index 000000000000..e2604cfae74b
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../Makefile.miniconfig
>> +
>> +BIN=memcpy_thread
>> +LIB=-pthread
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).c
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>> new file mode 100644
>> index 000000000000..a7e169d1bf64
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>> @@ -0,0 +1,79 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <pthread.h>
>> +
>> +struct args {
>> +	unsigned long loops;
>> +	unsigned long size;
>> +	pthread_t th;
>> +	void *ret;
>> +};
>> +
>> +static void *thrfn(void *arg)
>> +{
>> +	struct args *a = arg;
>> +	unsigned long i, len = a->loops;
>> +	unsigned char *src, *dst;
>> +
>> +	src = malloc(a->size * 1024);
>> +	dst = malloc(a->size * 1024);
>> +	if ((!src) || (!dst)) {
>> +		printf("ERR: Can't allocate memory\n");
>> +		exit(1);
>> +	}
>> +	for (i = 0; i < len; i++)
>> +		memcpy(dst, src, a->size * 1024);
>> +}
>> +
>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>> +{
>> +	pthread_t t;
>> +	pthread_attr_t attr;
>> +
>> +	pthread_attr_init(&attr);
>> +	pthread_create(&t, &attr, fn, arg);
>> +	return t;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	unsigned long i, len, size, thr;
>> +	pthread_t threads[256];
>> +	struct args args[256];
>> +	long long v;
>> +
>> +	if (argc < 4) {
>> +		printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
>> +		exit(1);
>> +	}
>> +
>> +	v = atoll(argv[1]);
>> +	if ((v < 1) || (v > (1024 * 1024))) {
>> +		printf("ERR: max memory 1GB (1048576 KB)\n");
>> +		exit(1);
>> +	}
>> +	size = v;
>> +	thr = atol(argv[2]);
>> +	if ((thr < 1) || (thr > 256)) {
>> +		printf("ERR: threads 1-256\n");
>> +		exit(1);
>> +	}
>> +	v = atoll(argv[3]);
>> +	if ((v < 1) || (v > 40000000000ll)) {
>> +		printf("ERR: loops 1-40000000000 (hundreds)\n");
>> +		exit(1);
>> +	}
>> +	len = v * 100;
>> +	for (i = 0; i < thr; i++) {
>> +		args[i].loops = len;
>> +		args[i].size = size;
>> +		args[i].th = new_thr(thrfn, &(args[i]));
>> +	}
>> +	for (i = 0; i < thr; i++)
>> +		pthread_join(args[i].th, &(args[i].ret));
>> +	return 0;
>> +}
>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
>> new file mode 100644
>> index 000000000000..6d4c33eaa9e8
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
>> @@ -0,0 +1 @@
>> +thread_loop
>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
>> new file mode 100644
>> index 000000000000..424df4e8b0e6
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../Makefile.miniconfig
>> +
>> +BIN=thread_loop
>> +LIB=-pthread
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).c
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>> new file mode 100644
>> index 000000000000..c0158fac7d0b
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>> @@ -0,0 +1,86 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +// define this for gettid()
>> +#define _GNU_SOURCE
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <pthread.h>
>> +#include <sys/syscall.h>
>> +#ifndef SYS_gettid
>> +// gettid is 178 on arm64
>> +# define SYS_gettid 178
>> +#endif
>> +#define gettid() syscall(SYS_gettid)
>> +
>> +struct args {
>> +	unsigned int loops;
>> +	pthread_t th;
>> +	void *ret;
>> +};
>> +
>> +static void *thrfn(void *arg)
>> +{
>> +	struct args *a = arg;
>> +	int i = 0, len = a->loops;
>> +
>> +	if (getenv("SHOW_TID")) {
>> +		unsigned long long tid = gettid();
>> +
>> +		printf("%llu\n", tid);
>> +	}
>> +	asm volatile(
>> +		"loop:\n"
>> +		"add %[i], %[i], #1\n"
>> +		"cmp %[i], %[len]\n"
>> +		"blt loop\n"
>> +		: /* out */
>> +		: /* in */ [i] "r" (i), [len] "r" (len)
>> +		: /* clobber */
>> +	);
>> +	return (void *)(long)i;
>> +}
>> +
>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>> +{
>> +	pthread_t t;
>> +	pthread_attr_t attr;
>> +
>> +	pthread_attr_init(&attr);
>> +	pthread_create(&t, &attr, fn, arg);
>> +	return t;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	unsigned int i, len, thr;
>> +	pthread_t threads[256];
>> +	struct args args[256];
>> +
>> +	if (argc < 3) {
>> +		printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
>> +		exit(1);
>> +	}
>> +
>> +	thr = atoi(argv[1]);
>> +	if ((thr < 1) || (thr > 256)) {
>> +		printf("ERR: threads 1-256\n");
>> +		exit(1);
>> +	}
>> +	len = atoi(argv[2]);
>> +	if ((len < 1) || (len > 4000)) {
>> +		printf("ERR: max loops 4000 (millions)\n");
>> +		exit(1);
>> +	}
>> +	len *= 1000000;
>> +	for (i = 0; i < thr; i++) {
>> +		args[i].loops = len;
>> +		args[i].th = new_thr(thrfn, &(args[i]));
>> +	}
>> +	for (i = 0; i < thr; i++)
>> +		pthread_join(args[i].th, &(args[i].ret));
>> +	return 0;
>> +}
>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>> new file mode 100644
>> index 000000000000..2cb4e996dbf3
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>> @@ -0,0 +1 @@
>> +unroll_loop_thread
>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>> new file mode 100644
>> index 000000000000..45ab2be8be92
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../Makefile.miniconfig
>> +
>> +BIN=unroll_loop_thread
>> +LIB=-pthread
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).c
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>> new file mode 100644
>> index 000000000000..cb9d22c7dfb9
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>> @@ -0,0 +1,74 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <pthread.h>
>> +
>> +struct args {
>> +	pthread_t th;
>> +	unsigned int in, out;
>> +	void *ret;
>> +};
>> +
>> +static void *thrfn(void *arg)
>> +{
>> +	struct args *a = arg;
>> +	unsigned int i, in = a->in;
>> +
>> +	for (i = 0; i < 10000; i++) {
>> +		asm volatile (
>> +// force an unroll of thia add instruction so we can test long runs of code
>> +#define SNIP1 "add %[in], %[in], #1\n"
>> +// 10
>> +#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
>> +// 100
>> +#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
>> +// 1000
>> +#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
>> +// 10000
>> +#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
>> +// 100000
>> +			SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
>> +			: /* out */
>> +			: /* in */ [in] "r" (in)
>> +			: /* clobber */
>> +		);
>> +	}
>> +}
>> +
>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>> +{
>> +	pthread_t t;
>> +	pthread_attr_t attr;
>> +
>> +	pthread_attr_init(&attr);
>> +	pthread_create(&t, &attr, fn, arg);
>> +	return t;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	unsigned int i, thr;
>> +	pthread_t threads[256];
>> +	struct args args[256];
>> +
>> +	if (argc < 2) {
>> +		printf("ERR: %s [numthreads]\n", argv[0]);
>> +		exit(1);
>> +	}
>> +
>> +	thr = atoi(argv[1]);
>> +	if ((thr > 256) || (thr < 1)) {
>> +		printf("ERR: threads 1-256\n");
>> +		exit(1);
>> +	}
>> +	for (i = 0; i < thr; i++) {
>> +		args[i].in = rand();
>> +		args[i].th = new_thr(thrfn, &(args[i]));
>> +	}
>> +	for (i = 0; i < thr; i++)
>> +		pthread_join(args[i].th, &(args[i].ret));
>> +	return 0;
>> +}
>> diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>> new file mode 100755
>> index 000000000000..3f0dbefcad50
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# Coresight / ASM Pure Loop
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="asm_pure_loop"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS=""
>> +DATV="out"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
> 
> Can we organize the shell scripts by moving them into the folder
> tools/perf/tests/shell/coresight?

We can - but it comes with a fair few more changes.

>    coresight_asm_pure_loop.sh
>    coresight_memcpy_thread_16k_10.sh
>    coresight_thread_loop_check_tid_10.sh
>    coresight_thread_loop_check_tid_2.sh
>    coresight_unroll_loop_thread_10.sh
> 
> And we even can consider to move script test_arm_coresight.sh into
> the folder tools/perf/tests/shell/coresight and change its
> name as 'coresight_smoke_test.sh'.

Indeed these other tests I left alone for now and had not thought about 
how to marry these together yet - leaving this for another day and 
another patch set rather than this patch set itself. That was my 
thoguht. I was trying to make an "Easier to extend by just dropping a 
test into a dir" setup here to make maintenance and expansion easier 
over time (and thus encourage testing by having a simple repeatable test 
infra to duplicate). I ended up with a dir per test tool you need to 
build and a driver script in the tests/shell dir. I think this is 
certainly worth considering but perhaps as a separate set of work to 
marry these?

I piggybacked on the existing shell test infra but added a fair few more 
scripts. To do what you suggest I'd need to modify the core shell test 
code to walk subdirs recursively then looking for child scripts. The 
problem is how does perf test's shell handling know about the coresight 
subdir vs the lib subdir? Both contain *.sh shell scripts - the 
difference is the ones in lib are not executable. Is this sufficiently 
different? I could also open them to check the have #!/bin/... as the 
first line. Hardcoding just a single coresight subdir just feels wrong 
and hacky to me, thus the generic recursion solution I suggest here.

I can definitely see how extending to subdirs would make supporting 
testing cleaner and divide things into their own domains (dirs).

> Thanks,
> Leo
> 
>> diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>> new file mode 100755
>> index 000000000000..8972af835016
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# Coresight / Memcpy 16k 10 Threads
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="memcpy_thread"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="16 10 1"
>> +DATV="16k_10"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>> new file mode 100755
>> index 000000000000..5b468901f89b
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>> @@ -0,0 +1,19 @@
>> +#!/bin/sh -e
>> +# Coresight / Thread Loop 10 Threads - Check TID
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="thread_loop"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="10 1"
>> +DATV="check-tid-10th"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +STDO="$DATD/perf-$TEST-$DATV.stdout"
>> +
>> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
>> +
>> +perf_dump_aux_tid_verify "$DATA" "$STDO"
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>> new file mode 100755
>> index 000000000000..f8b7abd3aa03
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>> @@ -0,0 +1,19 @@
>> +#!/bin/sh -e
>> +# Coresight / Thread Loop 2 Threads - Check TID
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="thread_loop"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="2 20"
>> +DATV="check-tid-2th"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +STDO="$DATD/perf-$TEST-$DATV.stdout"
>> +
>> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
>> +
>> +perf_dump_aux_tid_verify "$DATA" "$STDO"
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>> new file mode 100755
>> index 000000000000..c985dfb025c2
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# Coresight / Unroll Loop Thread 10
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="unroll_loop_thread"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="10"
>> +DATV="10"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
>> new file mode 100644
>> index 000000000000..6a611b073f02
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/lib/coresight.sh
>> @@ -0,0 +1,130 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +# This is sourced from a driver script so no need for #!/bin... etc. at the
>> +# top - the assumption below is that it runs as part of sourcing after the
>> +# test sets up some basic env vars to say what it is.
>> +
>> +# perf record options for the perf tests to use
>> +PERFRECMEM="-m ,128M"
>> +PERFRECOPT="$PERFRECMEM -e cs_etm//u"
>> +
>> +# These tests need to be run as root or coresight won't allow large buffers
>> +# and will not collect proper data
>> +UID=`id -u`
>> +if test "$UID" -ne 0; then
>> +	echo "Not running as root... skip"
>> +	exit 2
>> +fi
>> +
>> +TOOLS=$(dirname $0)
>> +DIR="$TOOLS/coresight/$TEST"
>> +BIN="$DIR/$TEST"
>> +# If the test tool/binary does not exist and is executable then skip the test
>> +if ! test -x "$BIN"; then exit 2; fi
>> +DATD="."
>> +# If the data dir env is set then make the data dir use that instead of ./
>> +if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
>> +	DATD="$PERF_TEST_CORESIGHT_DATADIR";
>> +fi
>> +# If the stat dir env is set then make the data dir use that instead of ./
>> +STATD="."
>> +if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
>> +	STATD="$PERF_TEST_CORESIGHT_STATDIR";
>> +fi
>> +
>> +# Called if the test fails - error code 2
>> +err() {
>> +	echo "$1"
>> +	exit 1
>> +}
>> +
>> +# Check that some statistics from our perf
>> +check_val_min() {
>> +	STATF="$4"
>> +	if test "$2" -lt "$3"; then
>> +		echo ", FAILED" >> "$STATF"
>> +		err "Sanity check number of $1 is too low ($2 < $3)"
>> +	fi
>> +}
>> +
>> +perf_dump_aux_verify() {
>> +	# Some basic checking that the AUX chunk contains some sensible data
>> +	# to see that we are recording something and at least a minimum
>> +	# amount of it. We should almost always see F3 atoms in just about
>> +	# anything but certainly we will see some trace info and async atom
>> +	# chunks.
>> +	DUMP="$DATD/perf-tmp-aux-dump.txt"
>> +	perf report --stdio --dump -i "$1" | \
>> +		grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
>> +	# Simply count how many of these atoms we find to see that we are
>> +	# producing a reasonable amount of data - exact checks are not sane
>> +	# as this is a lossy  process where we may lose some blocks and the
>> +	# compiler may produce different code depending on the compiler and
>> +	# optimization options, so this is rough  just to see if we're
>> +	# either missing almost all the data or all of it
>> +	ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
>> +	ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
>> +	ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
>> +	rm -f "$DUMP"
>> +
>> +	# Arguments provide minimums for a pass
>> +	CHECK_F3_MIN="$2"
>> +	CHECK_ASYNC_MIN="$3"
>> +	CHECK_TRACE_INFO_MIN="$4"
>> +
>> +	# Write out statistics, so over time you can track results to see if
>> +	# there is a pattern - for example we have less "noisy" results that
>> +	# produce more consistent amounts of data each run, to see if over
>> +	# time any techinques to  minimize data loss are having an effect or
>> +	# not
>> +	STATF="$STATD/stats-$TEST-$DATV.csv"
>> +	if ! test -f "$STATF"; then
>> +		echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
>> +	fi
>> +	echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
>> +
>> +	# Actually check to see if we passed or failed.
>> +	check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
>> +	check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
>> +	check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
>> +	echo ", Ok" >> "$STATF"
>> +}
>> +
>> +perf_dump_aux_tid_verify() {
>> +	# Specifically crafted test will produce a list of Tread ID's to
>> +	# stdout that need to be checked to  see that they have had trace
>> +	# info collected in AUX blocks in the perf data. This will go
>> +	# through all the TID's that are listed as CID=0xabcdef and see
>> +	# that all the Thread IDs the test tool reports are  in the perf
>> +	# data AUX chunks
>> +
>> +	# The TID test tools will print a TID per stdout line that are being
>> +	# tested
>> +	TIDS=`cat "$2"`
>> +	# Scan the perf report to find the TIDs that are actually CID in hex
>> +	# and build a list of the ones found
>> +	FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
>> +			grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
>> +			uniq | sort | uniq`
>> +
>> +	# Iterate over the list of TIDs that the test says it has and find
>> +	# them in the TIDs found in the perf report
>> +	MISSING=""
>> +	for TID2 in $TIDS; do
>> +		FOUND=""
>> +		for TIDHEX in $FOUND_TIDS; do
>> +			TID=`printf "%i" $TIDHEX`
>> +			if test "$TID" -eq "$TID2"; then
>> +				FOUND="y"
>> +				break
>> +			fi
>> +		done
>> +		if test -z "$FOUND"; then
>> +			MISSING="$MISSING $TID"
>> +		fi
>> +	done
>> +	if test -n "$MISSING"; then
>> +		err "Thread IDs $MISSING not found in perf AUX data"
>> +	fi
>> +}
>> -- 
>> 2.32.0
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-04-21 17:38     ` Carsten Haitzler
@ 2022-05-26  8:20       ` Leo Yan
  2022-05-26 16:08         ` Leo Yan
  2022-06-13 14:15         ` Carsten Haitzler
  0 siblings, 2 replies; 19+ messages in thread
From: Leo Yan @ 2022-05-26  8:20 UTC (permalink / raw)
  To: Carsten Haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

Hi Carsten,

Sorry for late response.

On Thu, Apr 21, 2022 at 06:38:33PM +0100, Carsten Haitzler wrote:

[...]

> > Very big change...  Why squash all patches form previous verion to this
> > single one big patch?  Usually the format with small patches is much
> > better for reviewing.
> 
> I was asked to re-jig the tree and in doing so I also ended up cutting down
> the size a lot so this just makes more sense together as a "here are the
> tests" as adding infra without any tests makes no sense and the tests
> themelves are self-contained in their own directories and source files and
> "drivign scripts" thus it's essentially patch 1 appended to patch 2 to patch
> 3 etc. and still broken up in the patch file by file.

I am not sure if I understand the meaning, seems to me you could
organize the patch series like:

- Patch for common files (e.g. script lib/coresight.sh or some
  Makefile changes);
- Patches for enabling test cases, E.g.:
  patch for asm_pure_loop;
  patch for thread loop (include unroll loop);
  patch for memcpy;
- Patch for documentation.

If this is not comfortable for you, at least we can use three patches:

- Patch for common file (e.g. script lib/coresight.sh);
- Patch for test cases;
- Patch for documentation.

[...]

> > > +If you see these above, then your system is tracing coresight data
> > > +correctly.
> > > +
> > > +To compile perf with coresight support in the perf directory do
> > > +
> > > +    make CORESIGHT=1
> > 
> > It is inaccurate that if we don't mention openCSD lib.
> 
> Do you mean I need to mention that you need the opencsd library installed
> too?

Yes, otherwise, users might directly build perf without opencsd lib,
then finally they cannot use perf with Arm CoreSight.

> > > +This will compile the perf tool with coresight support as well as
> > > +build some small test binaries for perf test. This requires you also
> > > +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
> > > +perf coresight tracing are in tests/shell/tools/coresight.
> > 
> > For build perf tool, I think above paragraphs are duplicate with the
> > document Documentation/trace/coresight/coresight.rst.  Can we simply
> > say:
> > 
> > "The details for building perf tool with support Arm Coresight can be
> > found in the "HOWTO.md" file of the openCSD gitHub repository:
> > https://github.com/Linaro/opencsd.
> 
> I can. I put this here as I didn't go clone OpencCSD first but used my
> distro OpenCSD packages and thus of course didn't have the documentation in
> front of me. I spent some time wondering why it wasn't building with
> coresight support even though it detected OpenCSD when I compiled... I
> didn't expect to have to go to some separate project git repository and read
> docs there on how to build the perf tool here in the kernel. I wrote this
> because it was an actual problem I hit and it's a lot less frustrating to
> "end users" to give them the information they need in the relevant place
> they need it instead of sending them around to other project trees. Building
> perf with coresight support is handled by the perf tree int he kernel, not
> OpenCSD, thus IMHO that is where the documentation belongs - alongside the
> thing that determines how to build something.

Understand.

> > And "HOWTO.md" file gives the information and examples for how to use
> > perf tool to record and report Coresight trace data.  It's the
> > prerequisite for this perf Coresight test."
> > 
> > > +You will also want coresight support enabled in your kernel config.
> > > +Ensure it is enabled with:
> > > +
> > > +    CONFIG_CORESIGHT=y
> > > +
> > > +There are various other coresight options you probably also want
> > > +enabled like:
> > > +
> > > +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
> > > +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
> > > +    CONFIG_CORESIGHT_CATU=y
> > > +    CONFIG_CORESIGHT_SINK_TPIU=y
> > > +    CONFIG_CORESIGHT_SINK_ETBV10=y
> > > +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
> > > +    CONFIG_CORESIGHT_STM=y
> > > +    CONFIG_CORESIGHT_CPU_DEBUG=y
> > > +    CONFIG_CORESIGHT_CTI=y
> > > +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
> > > +
> > > +Please refer to the kernel configuration help for more information.
> > 
> > I prefer to remove these kernel configuration since they are not
> > inconsistent on different platforms (e.g. ETBV10, ETM4X, etc), and
> > some configurations might not necessary (e.g. CPU_DEBUG).
> 
> Certainly there should be some documentation on which kernel configs you
> might want to turn on then? Imagine someone new comes along and doesn't have
> any idea what to possible enable at all and manages to build perf with
> coresight support (as above) then finds it doesn't work because they didn't
> enable enough config in the kernel? Sure - could probably trim these down a
> bit but the point here is to alert the user to there being a range of
> coresight config options that you need to turn on that you likely will find
> are not turned on. They certainly are not turned on on distro kernels and a
> lot of the time when you have a platform that already boots/works you start
> with your distro kernel config file because you want everything enabled so
> it actually boots. I've learned the hard way to do this as you manage to
> forget to turn on some MMC driver or some other feature and your boot hangs
> or doesn't find rootfs etc.

So far, we will have two documents in Linux kernel:

- Documentation/trace/coresight/coresight.rst;
- tools/perf/Documentation/arm-coresight.txt.

We need to avoid overlap between these two files.  I think we could use
the file Documentation/trace/coresight/coresight.rst to focus on
CoreSight driver module relates stuffs and
tools/perf/Documentation/arm-coresight.txt is more about the perf
usages.

But, the file Documentation/trace/coresight/coresight.rst doesn't give
any info for kernel configs, I think which would be a better place to
give information for building kernel modules.

> What would you recommend then as a "turn these on and coresight will almost
> certainly work for you on your given hardware " then?

This would be fine.  Alternatively, we could add a section in the file
Documentation/trace/coresight/coresight.rst to describe how to build
CoreSight modules.

How you think for this?   I also would like to get suggestions from
CoreSight maintainers Suzuki/Mathieu/Mike.

[...]

> > Please update based on the latest test case names, at my side, I can
> > see the testing case like:
> > 
> >         Coresight / ASM Pure Loop
> >         Coresight / Memcpy 16k 10 Threads
> >         Coresight / Thread Loop 10 Threads - Check TID
> >         Coresight / Thread Loop 2 Threads - Check TID
> >         Coresight / Unroll Loop Thread 10
> 
> Oh sorry - yeah. I wrote the docs based on the earlier tests. Will fix.

Thanks.

> > > +
> > > +These perf record tests will not run if the tool binaries do not exist
> > > +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
> > > +have coresight support in hardware then either do not build perf with
> > > +coresight support or remove these binaries in order to not have these
> > > +tests fail and have them skip instead.
> > > +
> > > +These tests will log historical results in the current working
> > > +directory (e.g. tools/perf) and will be named stats-*.csv like:
> > > +
> > > +    stats-asm_pure_loop-out.csv
> > > +    stats-bubble_sort-random.csv
> > > +    ...
> > > +
> > > +These statistic files log some aspects of the AUX data sections in
> > > +the perf data output counting some numbers of certain encodings (a
> > > +good way to know that it's working in a very simple way). One problem
> > > +with coresight is that given a large enough amount of data needing to
> > > +be logged, some of it can be lost due to the processor not waking up
> > > +in time to read out all the data from buffers etc.. You will notice
> > > +that the amount of data collected can vary a lot per run of perf test.
> > > +If you wish to see how this changes over time, simply run perf test
> > > +multiple times and all these csv files will have more and more data
> > > +appended to it that you can later examine, graph and otherwise use to
> > > +figure out if things have become worse or better.
> > 
> > I am confused by this narrative.  Does it try to remind that the final
> > testing result (pass or fail) is not stable?  Or should we run for
> > multiple times so have more chance to capture issues?
> 
> That is correct. I thought I was clear that it's lossy. That is actually the
> case. I have tests here that actually fail because there is no data
> collected from some threads at all (missing CID blocks for some of the
> threads that run in the test). The point is to have tests that may be
> failing now but in future will improve. I lowered the minimum bar to pass
> for most tests to have "at least just a little data" but most tests show
> highly variable amount of captured data. the csv files are there to
> over-time give you a good idea of the stability of the captured data.

Okay, this would be fine for me.  Though I am a bit worry that later if
users report a failure, then how we can tell them this is a bug or it's
just tracing quality issue?

[...]

> > > diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> > > new file mode 100644
> > > index 000000000000..468673ac32e8
> > > --- /dev/null
> > > +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> > > @@ -0,0 +1 @@
> > > +asm_pure_loop
> > 
> > Do we really need there '.gitignore' files under the folder
> > 'tools/perf/tests/shell/coresight/'.
> 
> Where would you rather have them to ignore the generated binary tools?

It's interesting that I wanted to find a case to object you, so I tried
to check the folder linux/samples/bpf, but it does use .gitignore file
to ignore built binaries :)

Adding .gitignore is in practice and this would be fine for me.

> > > diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> > > new file mode 100644
> > > index 000000000000..10c5a60cb71c
> > > --- /dev/null
> > > +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> > > @@ -0,0 +1,30 @@
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > > +
> > > +include ../Makefile.miniconfig
> > > +
> > > +BIN=asm_pure_loop
> > > +LIB=
> > 
> > Remove the unused variable 'LIB='.
> 
> I have this because I wanted to have a simple template to be able to re-use
> for more tests over time. It's so much easier to maintain and extend if
> every makefile and tool follow a similar pattern and you can almost copy &
> paste between them as they don't have "exceptions". You really want me to
> remove this?

It's fine to keep it.  Could you add a comment for this?

To be honest, I am not experienced for bash shell script, so I have no
idea why write like this way.  If you think this is very common usage
in shell, then you could keep it and don't need to add comment.

[...]


> > There have four sub folders under tools/perf/tests/shell/coresight:
> > 
> >    asm_pure_loop
> >    memcpy_thread
> >    thread_loop
> >    unroll_loop_thread
> > 
> > And every folder has its own Makefile and every Makefile is quite
> > close to each other.  I am just wandering if it's possible to
> > remove the 4 Makefiles in these four sub folders, and simply use
> > tools/perf/tests/shell/coresight/Makefile as the central place to
> > build these assistant programs.
> 
> I did this so it's easier to etxent over time. having a single parent
> makefile that over time accumulates little ugly "if's" and exceptions makes
> longer-term maintenance and extending harder. I did it this way to make this
> easy - make a copy of a dir - add that dir to a parent makefile then modify
> the makefile as needed (but only as needed).

Okay, let's keep the saperate makefiles.

> > > diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> > > new file mode 100644
> > > index 000000000000..75cf084a927d
> > > --- /dev/null
> > > +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> > > @@ -0,0 +1,28 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
> > > +
> > > +.globl _start
> > > +_start:
> > > +	mov	x0, 0x0000ffff
> > > +	mov	x1, xzr
> > > +loop:
> > > +	nop
> > > +	nop
> > > +	cbnz	x1, noskip
> > > +	nop
> > > +	nop
> > > +	adrp	x2, skip
> > > +	add 	x2, x2, :lo12:skip
> > > +	br	x2
> > > +	nop
> > > +	nop
> > > +noskip:
> > > +	nop
> > > +	nop
> > > +skip:
> > > +	sub	x0, x0, 1
> > > +	cbnz	x0, loop
> > > +
> > > +	mov	x0, #0
> > > +	mov	x8, #93 // __NR_exit syscall
> > > +	svc	#0
> > 
> > I tested the case "ASM Pure Loop" on my Juno board, and it complaints:
> > 
> > root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
> >   76: Coresight / ASM Pure Loop                                       :
> > --- start ---
> > test child forked, pid 9063
> > failed to mmap with 12 (Cannot allocate memory)
> > test child finished with -1
> > ---- end ----
> > Coresight / ASM Pure Loop: FAILED!
> > 
> > Since I only setup the 1GB memory for the Linux kernel, it fails to
> > allocate AUX ring buffer with the size 256MB.  So I manully change
> > the buffer size to 8MB in tools/perf/tests/shell/lib/coresight.sh:
> > 
> >    PERFRECMEM="-m ,8M"
> > 
> > So finally I can see the test case is passed:
> 
> This is artificial isn't it? limiting to 1GB. You certainly have far more
> memory than that available. My testse were on a system with 4GB and I had no
> issues.

Please see below comment.

> > root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
> >   76: Coresight / ASM Pure Loop                                       :
> > --- start ---
> > test child forked, pid 9481
> > -m ,8M -e cs_etm//u
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.681 MB ./perf-asm_pure_loop-out.data ]
> > test child finished with 0
> > ---- end ----
> > Coresight / ASM Pure Loop: Ok
> > 
> > Do you think we really need to use 256MiB as the AUX buffer size?
> > IIRC, it means we allocate 256MiB per CPU for this case, on the other
> > hand, you could see the final perf data file size is small (0.681
> > MiB).
> > 
> > Seems to me, it's not necessary to allocate so big buffer for
> > the test, and I tried to run below 4 cases with 8MiB, all of them can
> > pass the testing :)
> 
> I didn't think anyone with a system with coresight support that would be
> running perf record locally would only have 1GB of ram... I knew junos had
> 8GB and my dragonboard has 4GB ... so I know I was on the smaller side. I
> thought a larger buffer == safer results (less chance of needing to write
> out the buffer during capture). Admittdly I used 256Mb when my tests ran for
> much longer and collected more data. I can try drop to 8 or 16gb and see.

Yes, my Juno board has 8GB but I also have DB410c with 1GB with quad
coes [1].  I am still concern for 256MB buffer size, it's not friendly for
embedded system, and even not good for server.  For example, if we run
this testing on Arm server with 96 cores (like Hisilicon D06 board),
then we need the buffer size is:

  256MiB * 96 = 16GiB

I agree usually 16GiB is not a problem for server, but seems to me
it doesn't make much sense to consume huge memory resource for the
testing.

In other words, if set 8MiB (or 16MiB, 32MiB) buffer size and doesn't
see testing result regression, I think this would be good to decrease
the buffer size.

[1] https://www.96boards.org/product/dragonboard410c/

[...]

> > > diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> > > new file mode 100755
> > > index 000000000000..3f0dbefcad50
> > > --- /dev/null
> > > +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> > > @@ -0,0 +1,18 @@
> > > +#!/bin/sh -e
> > > +# Coresight / ASM Pure Loop
> > > +
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > > +
> > > +TEST="asm_pure_loop"
> > > +. $(dirname $0)/lib/coresight.sh
> > > +ARGS=""
> > > +DATV="out"
> > > +DATA="$DATD/perf-$TEST-$DATV.data"
> > > +
> > > +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> > > +
> > > +perf_dump_aux_verify "$DATA" 10 10 10
> > > +
> > > +err=$?
> > > +exit $err
> > 
> > Can we organize the shell scripts by moving them into the folder
> > tools/perf/tests/shell/coresight?
> 
> We can - but it comes with a fair few more changes.
> 
> >    coresight_asm_pure_loop.sh
> >    coresight_memcpy_thread_16k_10.sh
> >    coresight_thread_loop_check_tid_10.sh
> >    coresight_thread_loop_check_tid_2.sh
> >    coresight_unroll_loop_thread_10.sh
> > 
> > And we even can consider to move script test_arm_coresight.sh into
> > the folder tools/perf/tests/shell/coresight and change its
> > name as 'coresight_smoke_test.sh'.
> 
> Indeed these other tests I left alone for now and had not thought about how
> to marry these together yet - leaving this for another day and another patch
> set rather than this patch set itself. That was my thoguht. I was trying to
> make an "Easier to extend by just dropping a test into a dir" setup here to
> make maintenance and expansion easier over time (and thus encourage testing
> by having a simple repeatable test infra to duplicate). I ended up with a
> dir per test tool you need to build and a driver script in the tests/shell
> dir. I think this is certainly worth considering but perhaps as a separate
> set of work to marry these?

Okay, it would be fine to use separate set for moving the script
test_arm_coresight.sh, which is a simple case.

> I piggybacked on the existing shell test infra but added a fair few more
> scripts. To do what you suggest I'd need to modify the core shell test code
> to walk subdirs recursively then looking for child scripts. The problem is
> how does perf test's shell handling know about the coresight subdir vs the
> lib subdir?

Yeah, now I understand your point.  How about file layout like below?

  tools/perf/tests/shell/coresight_test_hub.sh
  tools/perf/tests/shell/coresight/coresight_asm_pure_loop.sh
  tools/perf/tests/shell/coresight/coresight_memcpy_thread_16k_10.sh
  tools/perf/tests/shell/coresight/coresight_thread_loop_check_tid_10.sh
  tools/perf/tests/shell/coresight/coresight_thread_loop_check_tid_2.sh
  tools/perf/tests/shell/coresight/coresight_unroll_loop_thread_10.sh

So we use tools/perf/tests/shell/coresight_test_hub.sh as an interface
to hook with Perf test infrastructure, and then hub.sh file calls
testing scripts under the sub folder.  Seems to me, this is also
friendly for later's extension.

> Both contain *.sh shell scripts - the difference is the ones in
> lib are not executable. Is this sufficiently different? I could also open
> them to check the have #!/bin/... as the first line. Hardcoding just a
> single coresight subdir just feels wrong and hacky to me, thus the generic
> recursion solution I suggest here.

Agreed.  I also don't prefer this way.

> I can definitely see how extending to subdirs would make supporting testing
> cleaner and divide things into their own domains (dirs).

Thanks a lot for the work!  The test cases are good for me (but I would
say Mike is the best person for reviewing testing trace data quaility),
I just want to make sure it's not hard for later maintenance.

Leo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files
  2022-04-21 16:21     ` Carsten Haitzler
@ 2022-05-26 10:14       ` Leo Yan
  2022-06-13 13:08         ` Carsten Haitzler
  0 siblings, 1 reply; 19+ messages in thread
From: Leo Yan @ 2022-05-26 10:14 UTC (permalink / raw)
  To: Carsten Haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

On Thu, Apr 21, 2022 at 05:21:27PM +0100, Carsten Haitzler wrote:
> On 4/10/22 03:28, Leo Yan wrote:
> > On Wed, Mar 09, 2022 at 12:28:58PM +0000, carsten.haitzler@foss.arm.com wrote:
> > > From: Carsten Haitzler <carsten.haitzler@arm.com>
> > > 
> > > You edit your scripts in the tests and end up with your usual shell
> > > backup files with ~ or .bak or something else at the end, but then your
> > > next perf test run wants to run the backups too. You might also have perf
> > > .data files in the directory or something else undesireable as well. You end
> > > up chasing which test is the one you edited and the backup and have to keep
> > > removing all the backup files, so automatically skip any files that are
> > > not plain *.sh scripts to limit the time wasted in chasing ghosts.
> > > 
> > > Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
> > > 
> > > ---
> > >   tools/perf/tests/builtin-test.c | 17 +++++++++++++++--
> > >   1 file changed, 15 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> > > index 3c34cb766724..3a02ba7a7a89 100644
> > > --- a/tools/perf/tests/builtin-test.c
> > > +++ b/tools/perf/tests/builtin-test.c
> > > @@ -296,9 +296,22 @@ static const char *shell_test__description(char *description, size_t size,
> > >   #define for_each_shell_test(entlist, nr, base, ent)	                \
> > >   	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
> > > -		if (!is_directory(base, ent) && \
> > > +		if (ent->d_name[0] != '.' && \
> > > +			!is_directory(base, ent) && \
> > >   			is_executable_file(base, ent) && \
> > > -			ent->d_name[0] != '.')
> > > +			is_shell_script(ent->d_name))
> > 
> > Just nitpick: since multiple conditions are added, seems to me it's good
> > to use a single function is_executable_shell_script() to make decision
> > if a file is an executable shell script.
> 
> I'd certainly make a function if this was being re-used, but as the "coding
> pattern" was to do all the tests already inside the if() in only one place,
> I kept with the style there and didn't change the code that didn't need
> changing. I can rewrite this code and basically make a function that is just
> an if ...:
> 
> bool is_exe_shell_script(const char *base, struct dirent *ent) {
>    return ent->d_name[0] != '.'         && !is_directory(base, ent) &&
>           is_executable_file(base, ent) && is_shell_script(ent->d_name);
> }
> 
> And macro becomes:
> 
> #define for_each_shell_test(entlist, nr, base, ent) \
>   for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++) \
>     if (is_shell(base, ent))

Sorry for long latency.

If the condition checking gets complex, seems to me it is reasonable to
use a static function (or a macro?) to encapsulate the logics.

> But one catch... it really should be is_non_hidden_exe_shell_script() as
> it's checking that it's not a hidden file AND is a shell script. Or do I
> keep the hidden file test outside of the function in the if? If we're nit
> picking then I need to know exactly what you want here as your suggested
> name is actually incorrect.

I personally prefer to use the condition:

  if (is_exe_shell_script() && ent->d_name[0] != '.')
      do_something...

The reason is the function is_exe_shell_script() is more common and we
use it easily in wider scope.

> > And the condition checking 'ent->d_name[0] != '.'' would be redundant
> > after we have checked the file suffix '.sh'.
> 
> This isn't actually redundant. You can have .something.sh :) If the idea is
> we skip anything with a . at the start first always... then the if (to me)
> is obvious.

Yeah, I agree the checking the start char '.' is the right thing
to do.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-05-26  8:20       ` Leo Yan
@ 2022-05-26 16:08         ` Leo Yan
  2022-06-13 14:15         ` Carsten Haitzler
  1 sibling, 0 replies; 19+ messages in thread
From: Leo Yan @ 2022-05-26 16:08 UTC (permalink / raw)
  To: Carsten Haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

On Thu, May 26, 2022 at 04:20:39PM +0800, Leo Yan wrote:

[...]

> Yes, my Juno board has 8GB but I also have DB410c with 1GB with quad
> coes [1].  I am still concern for 256MB buffer size, it's not friendly for
> embedded system, and even not good for server.  For example, if we run
> this testing on Arm server with 96 cores (like Hisilicon D06 board),
> then we need the buffer size is:
> 
>   256MiB * 96 = 16GiB

Correct for my bad math... here should be 256MiB * 96 = 24GiB.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-03-09 12:28 ` [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated carsten.haitzler
  2022-04-10  8:30   ` Leo Yan
@ 2022-05-30 16:27   ` Mathieu Poirier
  2022-05-30 16:47     ` Mathieu Poirier
  2022-06-13 13:00     ` Carsten Haitzler
  1 sibling, 2 replies; 19+ messages in thread
From: Mathieu Poirier @ 2022-05-30 16:27 UTC (permalink / raw)
  To: carsten.haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mike.leach, leo.yan,
	linux-perf-users, acme

On Wed, Mar 09, 2022 at 12:28:59PM +0000, carsten.haitzler@foss.arm.com wrote:
> From: Carsten Haitzler <carsten.haitzler@arm.com>
> 
> This adds a test harness and tests to run perf record and examine the
> resuling output when coresight is enabled on arm64 and check the
> resulting quality of the output as part of perf test. These tests use
> various tools to produce output from perf record then measure some key
> specific aspects of that data to see if the data exists at all and
> contains key aspects such as measuring some data for every thread of
> a test or produces sufficient data for large exeuction runs of a large
> executable. etc.
> 
> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
> ---
>  MAINTAINERS                                   |   4 +
>  tools/perf/.gitignore                         |   6 +-
>  tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
>  tools/perf/Makefile.perf                      |  14 +-
>  tools/perf/tests/shell/coresight/Makefile     |  30 ++++
>  .../tests/shell/coresight/Makefile.miniconfig |  23 +++
>  .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
>  .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
>  .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
>  .../shell/coresight/memcpy_thread/.gitignore  |   1 +
>  .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
>  .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
>  .../shell/coresight/thread_loop/.gitignore    |   1 +
>  .../shell/coresight/thread_loop/Makefile      |  29 ++++
>  .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
>  .../coresight/unroll_loop_thread/.gitignore   |   1 +
>  .../coresight/unroll_loop_thread/Makefile     |  29 ++++
>  .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
>  .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
>  .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
>  .../coresight_thread_loop_check_tid_10.sh     |  19 +++
>  .../coresight_thread_loop_check_tid_2.sh      |  19 +++
>  .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
>  tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++
>  24 files changed, 823 insertions(+), 4 deletions(-)

As Leo pointed out this is a big patch and hard to digest intellectually.

>  create mode 100644 tools/perf/Documentation/arm-coresight.txt
>  create mode 100644 tools/perf/tests/shell/coresight/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
>  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>  create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
>  create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>  create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>  create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>  create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>  create mode 100644 tools/perf/tests/shell/lib/coresight.sh
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 673c7124ca82..18cc20609f2e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1918,10 +1918,14 @@ F:	drivers/hwtracing/coresight/*
>  F:	include/dt-bindings/arm/coresight-cti-dt.h
>  F:	include/linux/coresight*
>  F:	samples/coresight/*
> +F:	tools/perf/Documentation/arm-coresight.txt
>  F:	tools/perf/arch/arm/util/auxtrace.c
>  F:	tools/perf/arch/arm/util/cs-etm.c
>  F:	tools/perf/arch/arm/util/cs-etm.h
>  F:	tools/perf/arch/arm/util/pmu.c
> +F:	tools/perf/tests/shell/coresight_*
> +F:	tools/perf/tests/shell/tools/Makefile
> +F:	tools/perf/tests/shell/tools/coresight/*
>  F:	tools/perf/util/cs-etm-decoder/*
>  F:	tools/perf/util/cs-etm.*
>  
> diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
> index 20b8ab984d5f..138c679ecacd 100644
> --- a/tools/perf/.gitignore
> +++ b/tools/perf/.gitignore
> @@ -15,8 +15,9 @@ perf*.1
>  perf*.xml
>  perf*.html
>  common-cmds.h
> -perf.data
> -perf.data.old
> +perf*.data
> +perf*.data.old
> +stats-*.csv
>  output.svg
>  perf-archive
>  perf-with-kcore
> @@ -30,6 +31,7 @@ config.mak.autogen
>  *-flex.*
>  *.pyc
>  *.pyo
> +*.stdout
>  .config-detected
>  util/intel-pt-decoder/inat-tables.c
>  arch/*/include/generated/
> diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
> new file mode 100644
> index 000000000000..3a9e6c573c58
> --- /dev/null
> +++ b/tools/perf/Documentation/arm-coresight.txt

I think it would be best to keep all the coresight documentation under the
current coresight documentation repository[1].  That way all the information on
coresight can be found in a central place.

Some part of what is added by this patch is redundant with what is currently
available in [1].  Other parts are tests specific and should be added under
something like "coresight-perf-test.rst".

Thanks,
Mathieu

[1]. Documentation/trace/coresight/


> @@ -0,0 +1,140 @@
> +Arm Coresight Support
> +=====================
> +
> +Coresight is a feature of some Arm based processors that allows for
> +debugging. One of the things it can do is trace every instruction
> +executed and remotely expose that information in a hardware compressed
> +stream. Perf is able to locally access that stream and store it to the
> +output perf data files. This stream can then be later decoded to give the
> +instructions that were traced for debugging or profiling purposes. You
> +can log such data with a perf record command like:
> +
> +    perf record -e cs_etm//u testbinary
> +
> +This would run some test binary (testbinary) until it exits and record
> +a perf.data trace file. That file would have AUX sections if coresight
> +is working correctly. You can dump the content of this file as
> +readable text with a command like:
> +
> +    perf report --stdio --dump -i perf.data
> +
> +You should find some sections of this file have AUX data blocks like:
> +
> +    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
> +
> +    . ... CoreSight ETM Trace data: size 73168 bytes
> +            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
> +              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
> +              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
> +              Idx:26; ID:10;  I_TRACE_ON : Trace On.
> +              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
> +              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> +              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> +              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> +              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
> +              ...
> +
> +If you see these above, then your system is tracing coresight data
> +correctly.
> +
> +To compile perf with coresight support in the perf directory do
> +
> +    make CORESIGHT=1
> +
> +This will compile the perf tool with coresight support as well as
> +build some small test binaries for perf test. This requires you also
> +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
> +perf coresight tracing are in tests/shell/tools/coresight.
> +
> +You will also want coresight support enabled in your kernel config.
> +Ensure it is enabled with:
> +
> +    CONFIG_CORESIGHT=y
> +
> +There are various other coresight options you probably also want
> +enabled like:
> +
> +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
> +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
> +    CONFIG_CORESIGHT_CATU=y
> +    CONFIG_CORESIGHT_SINK_TPIU=y
> +    CONFIG_CORESIGHT_SINK_ETBV10=y
> +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
> +    CONFIG_CORESIGHT_STM=y
> +    CONFIG_CORESIGHT_CPU_DEBUG=y
> +    CONFIG_CORESIGHT_CTI=y
> +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
> +
> +Please refer to the kernel configuration help for more information.
> +
> +Perf test - Verify kernel and userspace perf coresight work
> +===========================================================
> +
> +When you run perf test, it will do a lot of self tests. Some of those
> +tests will cover Coresight (only if enabled and on ARM64). You
> +generally would run perf test from the tools/perf directory in the
> +kernel tree. Some tests will check some internal perf support like:
> +
> +    Check Arm CoreSight trace data recording and synthesized samples
> +
> +Some others will actually use perf record and some test binaries that
> +are in tests/shell/tools/coresight and will collect traces to ensure a
> +minimum level of functionality is met. The scripts that launch these
> +tests are in tests/shell. These will all look like:
> +
> +    Coresight / Memcpy 1M 25 Threads
> +    Coresight / Unroll Loop Thread 2
> +    ...
> +
> +These perf record tests will not run if the tool binaries do not exist
> +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
> +have coresight support in hardware then either do not build perf with
> +coresight support or remove these binaries in order to not have these
> +tests fail and have them skip instead.
> +
> +These tests will log historical results in the current working
> +directory (e.g. tools/perf) and will be named stats-*.csv like:
> +
> +    stats-asm_pure_loop-out.csv
> +    stats-bubble_sort-random.csv
> +    ...
> +
> +These statistic files log some aspects of the AUX data sections in
> +the perf data output counting some numbers of certain encodings (a
> +good way to know that it's working in a very simple way). One problem
> +with coresight is that given a large enough amount of data needing to
> +be logged, some of it can be lost due to the processor not waking up
> +in time to read out all the data from buffers etc.. You will notice
> +that the amount of data collected can vary a lot per run of perf test.
> +If you wish to see how this changes over time, simply run perf test
> +multiple times and all these csv files will have more and more data
> +appended to it that you can later examine, graph and otherwise use to
> +figure out if things have become worse or better.
> +
> +Be aware that amny of these tests take quite a while to run, specifically
> +in processing the perf data file and dumping contents to then examine what
> +is inside.
> +
> +You can change where these csv logs are stored by setting the
> +PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
> +test like:
> +
> +    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
> +    perf test
> +
> +They will also store resulting perf output data in the current
> +directory for later inspection like:
> +
> +    perf-memcpy-1m.data
> +    perf-thread_loop-2th.data
> +    ...
> +
> +You can alter where the perf data files are stored by setting the
> +PERF_TEST_CORESIGHT_DATADIR environment variable such as:
> +
> +    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
> +    perf test
> +
> +You may wish to set these above environment variables if you which to
> +keep the output of tests outside of the current working directory for
> +longer term storage and examination.
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index ac861e42c8f7..b97db83992e0 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
>  $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
>  	$(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
>  
> -all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
> +TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
> +
> +tests-coresight-targets: FORCE
> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
> +
> +tests-coresight-targets-clean:
> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
> +
> +all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
>  
>  # Create python binding output directory if not already present
>  _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
> @@ -1020,6 +1028,7 @@ install-tests: all install-gtk
>  		$(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
>  		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
>  		$(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
> +	$(Q)$(MAKE) -C tests/shell/coresight install-tests
>  
>  install-bin: install-tools install-tests install-traceevent-plugins
>  
> @@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
>  bpf-skel-clean:
>  	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>  
> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
> +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
>  	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>  	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
>  	$(Q)$(RM) $(OUTPUT).config-detected
> @@ -1155,5 +1164,6 @@ FORCE:
>  .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
>  .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
>  .PHONY: libtraceevent_plugins archheaders
> +.PHONY: $(TESTS_CORESIGHT_TARGETS)
>  
>  endif # force_fixdep
> diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
> new file mode 100644
> index 000000000000..dda99aeac158
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/Makefile
> @@ -0,0 +1,30 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../../../../../tools/scripts/Makefile.include
> +include ../../../../../tools/scripts/Makefile.arch
> +include ../../../../../tools/scripts/utilities.mak
> +
> +SUBDIRS = \
> +	asm_pure_loop \
> +	thread_loop \
> +	memcpy_thread \
> +	unroll_loop_thread
> +
> +all: $(SUBDIRS)
> +$(SUBDIRS):
> +	$(Q)$(MAKE) -C $@
> +
> +INSTALLDIRS = $(SUBDIRS:%=install-%)
> +
> +install-tests: $(INSTALLDIRS)
> +$(INSTALLDIRS):
> +	$(Q)$(MAKE) -C $(@:install-%=%) install-tests
> +
> +CLEANDIRS = $(SUBDIRS:%=clean-%)
> +
> +clean: $(CLEANDIRS)
> +$(CLEANDIRS):
> +	$(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
> +
> +.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
> +
> diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
> new file mode 100644
> index 000000000000..893c12685fed
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +ifndef DESTDIR
> +prefix ?= $(HOME)
> +endif
> +
> +DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
> +perfexecdir = libexec/perf-core
> +perfexec_instdir = $(perfexecdir)
> +
> +ifneq ($(filter /%,$(firstword $(perfexecdir))),)
> +perfexec_instdir = $(perfexecdir)
> +else
> +perfexec_instdir = $(prefix)/$(perfexecdir)
> +endif
> +
> +perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
> +INSTALL = install
> +
> +include ../../../../../scripts/Makefile.include
> +include ../../../../../scripts/Makefile.arch
> +include ../../../../../scripts/utilities.mak
> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> new file mode 100644
> index 000000000000..468673ac32e8
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> @@ -0,0 +1 @@
> +asm_pure_loop
> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> new file mode 100644
> index 000000000000..10c5a60cb71c
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> @@ -0,0 +1,30 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +include ../Makefile.miniconfig
> +
> +BIN=asm_pure_loop
> +LIB=
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).S
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> new file mode 100644
> index 000000000000..75cf084a927d
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
> +
> +.globl _start
> +_start:
> +	mov	x0, 0x0000ffff
> +	mov	x1, xzr
> +loop:
> +	nop
> +	nop
> +	cbnz	x1, noskip
> +	nop
> +	nop
> +	adrp	x2, skip
> +	add 	x2, x2, :lo12:skip
> +	br	x2
> +	nop
> +	nop
> +noskip:
> +	nop
> +	nop
> +skip:
> +	sub	x0, x0, 1
> +	cbnz	x0, loop
> +
> +	mov	x0, #0
> +	mov	x8, #93 // __NR_exit syscall
> +	svc	#0
> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> new file mode 100644
> index 000000000000..f8217e56091e
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> @@ -0,0 +1 @@
> +memcpy_thread
> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> new file mode 100644
> index 000000000000..e2604cfae74b
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../Makefile.miniconfig
> +
> +BIN=memcpy_thread
> +LIB=-pthread
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).c
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> new file mode 100644
> index 000000000000..a7e169d1bf64
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <pthread.h>
> +
> +struct args {
> +	unsigned long loops;
> +	unsigned long size;
> +	pthread_t th;
> +	void *ret;
> +};
> +
> +static void *thrfn(void *arg)
> +{
> +	struct args *a = arg;
> +	unsigned long i, len = a->loops;
> +	unsigned char *src, *dst;
> +
> +	src = malloc(a->size * 1024);
> +	dst = malloc(a->size * 1024);
> +	if ((!src) || (!dst)) {
> +		printf("ERR: Can't allocate memory\n");
> +		exit(1);
> +	}
> +	for (i = 0; i < len; i++)
> +		memcpy(dst, src, a->size * 1024);
> +}
> +
> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> +{
> +	pthread_t t;
> +	pthread_attr_t attr;
> +
> +	pthread_attr_init(&attr);
> +	pthread_create(&t, &attr, fn, arg);
> +	return t;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	unsigned long i, len, size, thr;
> +	pthread_t threads[256];
> +	struct args args[256];
> +	long long v;
> +
> +	if (argc < 4) {
> +		printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
> +		exit(1);
> +	}
> +
> +	v = atoll(argv[1]);
> +	if ((v < 1) || (v > (1024 * 1024))) {
> +		printf("ERR: max memory 1GB (1048576 KB)\n");
> +		exit(1);
> +	}
> +	size = v;
> +	thr = atol(argv[2]);
> +	if ((thr < 1) || (thr > 256)) {
> +		printf("ERR: threads 1-256\n");
> +		exit(1);
> +	}
> +	v = atoll(argv[3]);
> +	if ((v < 1) || (v > 40000000000ll)) {
> +		printf("ERR: loops 1-40000000000 (hundreds)\n");
> +		exit(1);
> +	}
> +	len = v * 100;
> +	for (i = 0; i < thr; i++) {
> +		args[i].loops = len;
> +		args[i].size = size;
> +		args[i].th = new_thr(thrfn, &(args[i]));
> +	}
> +	for (i = 0; i < thr; i++)
> +		pthread_join(args[i].th, &(args[i].ret));
> +	return 0;
> +}
> diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
> new file mode 100644
> index 000000000000..6d4c33eaa9e8
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
> @@ -0,0 +1 @@
> +thread_loop
> diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
> new file mode 100644
> index 000000000000..424df4e8b0e6
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../Makefile.miniconfig
> +
> +BIN=thread_loop
> +LIB=-pthread
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).c
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> new file mode 100644
> index 000000000000..c0158fac7d0b
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> @@ -0,0 +1,86 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +// define this for gettid()
> +#define _GNU_SOURCE
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <pthread.h>
> +#include <sys/syscall.h>
> +#ifndef SYS_gettid
> +// gettid is 178 on arm64
> +# define SYS_gettid 178
> +#endif
> +#define gettid() syscall(SYS_gettid)
> +
> +struct args {
> +	unsigned int loops;
> +	pthread_t th;
> +	void *ret;
> +};
> +
> +static void *thrfn(void *arg)
> +{
> +	struct args *a = arg;
> +	int i = 0, len = a->loops;
> +
> +	if (getenv("SHOW_TID")) {
> +		unsigned long long tid = gettid();
> +
> +		printf("%llu\n", tid);
> +	}
> +	asm volatile(
> +		"loop:\n"
> +		"add %[i], %[i], #1\n"
> +		"cmp %[i], %[len]\n"
> +		"blt loop\n"
> +		: /* out */
> +		: /* in */ [i] "r" (i), [len] "r" (len)
> +		: /* clobber */
> +	);
> +	return (void *)(long)i;
> +}
> +
> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> +{
> +	pthread_t t;
> +	pthread_attr_t attr;
> +
> +	pthread_attr_init(&attr);
> +	pthread_create(&t, &attr, fn, arg);
> +	return t;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	unsigned int i, len, thr;
> +	pthread_t threads[256];
> +	struct args args[256];
> +
> +	if (argc < 3) {
> +		printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
> +		exit(1);
> +	}
> +
> +	thr = atoi(argv[1]);
> +	if ((thr < 1) || (thr > 256)) {
> +		printf("ERR: threads 1-256\n");
> +		exit(1);
> +	}
> +	len = atoi(argv[2]);
> +	if ((len < 1) || (len > 4000)) {
> +		printf("ERR: max loops 4000 (millions)\n");
> +		exit(1);
> +	}
> +	len *= 1000000;
> +	for (i = 0; i < thr; i++) {
> +		args[i].loops = len;
> +		args[i].th = new_thr(thrfn, &(args[i]));
> +	}
> +	for (i = 0; i < thr; i++)
> +		pthread_join(args[i].th, &(args[i].ret));
> +	return 0;
> +}
> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> new file mode 100644
> index 000000000000..2cb4e996dbf3
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> @@ -0,0 +1 @@
> +unroll_loop_thread
> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> new file mode 100644
> index 000000000000..45ab2be8be92
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +include ../Makefile.miniconfig
> +
> +BIN=unroll_loop_thread
> +LIB=-pthread
> +
> +all: $(BIN)
> +
> +$(BIN): $(BIN).c
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> +endif
> +endif
> +
> +install-tests: all
> +ifdef CORESIGHT
> +ifeq ($(ARCH),arm64)
> +	$(call QUIET_INSTALL, tests) \
> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> +endif
> +endif
> +
> +clean:
> +	$(Q)$(RM) -f $(BIN)
> +
> +.PHONY: all clean install-tests
> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> new file mode 100644
> index 000000000000..cb9d22c7dfb9
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> @@ -0,0 +1,74 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <pthread.h>
> +
> +struct args {
> +	pthread_t th;
> +	unsigned int in, out;
> +	void *ret;
> +};
> +
> +static void *thrfn(void *arg)
> +{
> +	struct args *a = arg;
> +	unsigned int i, in = a->in;
> +
> +	for (i = 0; i < 10000; i++) {
> +		asm volatile (
> +// force an unroll of thia add instruction so we can test long runs of code
> +#define SNIP1 "add %[in], %[in], #1\n"
> +// 10
> +#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
> +// 100
> +#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
> +// 1000
> +#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
> +// 10000
> +#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
> +// 100000
> +			SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
> +			: /* out */
> +			: /* in */ [in] "r" (in)
> +			: /* clobber */
> +		);
> +	}
> +}
> +
> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> +{
> +	pthread_t t;
> +	pthread_attr_t attr;
> +
> +	pthread_attr_init(&attr);
> +	pthread_create(&t, &attr, fn, arg);
> +	return t;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	unsigned int i, thr;
> +	pthread_t threads[256];
> +	struct args args[256];
> +
> +	if (argc < 2) {
> +		printf("ERR: %s [numthreads]\n", argv[0]);
> +		exit(1);
> +	}
> +
> +	thr = atoi(argv[1]);
> +	if ((thr > 256) || (thr < 1)) {
> +		printf("ERR: threads 1-256\n");
> +		exit(1);
> +	}
> +	for (i = 0; i < thr; i++) {
> +		args[i].in = rand();
> +		args[i].th = new_thr(thrfn, &(args[i]));
> +	}
> +	for (i = 0; i < thr; i++)
> +		pthread_join(args[i].th, &(args[i].ret));
> +	return 0;
> +}
> diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> new file mode 100755
> index 000000000000..3f0dbefcad50
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh -e
> +# Coresight / ASM Pure Loop
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="asm_pure_loop"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS=""
> +DATV="out"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +
> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> +
> +perf_dump_aux_verify "$DATA" 10 10 10
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> new file mode 100755
> index 000000000000..8972af835016
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh -e
> +# Coresight / Memcpy 16k 10 Threads
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="memcpy_thread"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="16 10 1"
> +DATV="16k_10"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +
> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> +
> +perf_dump_aux_verify "$DATA" 10 10 10
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> new file mode 100755
> index 000000000000..5b468901f89b
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh -e
> +# Coresight / Thread Loop 10 Threads - Check TID
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="thread_loop"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="10 1"
> +DATV="check-tid-10th"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +STDO="$DATD/perf-$TEST-$DATV.stdout"
> +
> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
> +
> +perf_dump_aux_tid_verify "$DATA" "$STDO"
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> new file mode 100755
> index 000000000000..f8b7abd3aa03
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh -e
> +# Coresight / Thread Loop 2 Threads - Check TID
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="thread_loop"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="2 20"
> +DATV="check-tid-2th"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +STDO="$DATD/perf-$TEST-$DATV.stdout"
> +
> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
> +
> +perf_dump_aux_tid_verify "$DATA" "$STDO"
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> new file mode 100755
> index 000000000000..c985dfb025c2
> --- /dev/null
> +++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh -e
> +# Coresight / Unroll Loop Thread 10
> +
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +TEST="unroll_loop_thread"
> +. $(dirname $0)/lib/coresight.sh
> +ARGS="10"
> +DATV="10"
> +DATA="$DATD/perf-$TEST-$DATV.data"
> +
> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> +
> +perf_dump_aux_verify "$DATA" 10 10 10
> +
> +err=$?
> +exit $err
> diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
> new file mode 100644
> index 000000000000..6a611b073f02
> --- /dev/null
> +++ b/tools/perf/tests/shell/lib/coresight.sh
> @@ -0,0 +1,130 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> +
> +# This is sourced from a driver script so no need for #!/bin... etc. at the
> +# top - the assumption below is that it runs as part of sourcing after the
> +# test sets up some basic env vars to say what it is.
> +
> +# perf record options for the perf tests to use
> +PERFRECMEM="-m ,128M"
> +PERFRECOPT="$PERFRECMEM -e cs_etm//u"
> +
> +# These tests need to be run as root or coresight won't allow large buffers
> +# and will not collect proper data
> +UID=`id -u`
> +if test "$UID" -ne 0; then
> +	echo "Not running as root... skip"
> +	exit 2
> +fi
> +
> +TOOLS=$(dirname $0)
> +DIR="$TOOLS/coresight/$TEST"
> +BIN="$DIR/$TEST"
> +# If the test tool/binary does not exist and is executable then skip the test
> +if ! test -x "$BIN"; then exit 2; fi
> +DATD="."
> +# If the data dir env is set then make the data dir use that instead of ./
> +if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
> +	DATD="$PERF_TEST_CORESIGHT_DATADIR";
> +fi
> +# If the stat dir env is set then make the data dir use that instead of ./
> +STATD="."
> +if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
> +	STATD="$PERF_TEST_CORESIGHT_STATDIR";
> +fi
> +
> +# Called if the test fails - error code 2
> +err() {
> +	echo "$1"
> +	exit 1
> +}
> +
> +# Check that some statistics from our perf
> +check_val_min() {
> +	STATF="$4"
> +	if test "$2" -lt "$3"; then
> +		echo ", FAILED" >> "$STATF"
> +		err "Sanity check number of $1 is too low ($2 < $3)"
> +	fi
> +}
> +
> +perf_dump_aux_verify() {
> +	# Some basic checking that the AUX chunk contains some sensible data
> +	# to see that we are recording something and at least a minimum
> +	# amount of it. We should almost always see F3 atoms in just about
> +	# anything but certainly we will see some trace info and async atom
> +	# chunks.
> +	DUMP="$DATD/perf-tmp-aux-dump.txt"
> +	perf report --stdio --dump -i "$1" | \
> +		grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
> +	# Simply count how many of these atoms we find to see that we are
> +	# producing a reasonable amount of data - exact checks are not sane
> +	# as this is a lossy  process where we may lose some blocks and the
> +	# compiler may produce different code depending on the compiler and
> +	# optimization options, so this is rough  just to see if we're
> +	# either missing almost all the data or all of it
> +	ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
> +	ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
> +	ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
> +	rm -f "$DUMP"
> +
> +	# Arguments provide minimums for a pass
> +	CHECK_F3_MIN="$2"
> +	CHECK_ASYNC_MIN="$3"
> +	CHECK_TRACE_INFO_MIN="$4"
> +
> +	# Write out statistics, so over time you can track results to see if
> +	# there is a pattern - for example we have less "noisy" results that
> +	# produce more consistent amounts of data each run, to see if over
> +	# time any techinques to  minimize data loss are having an effect or
> +	# not
> +	STATF="$STATD/stats-$TEST-$DATV.csv"
> +	if ! test -f "$STATF"; then
> +		echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
> +	fi
> +	echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
> +
> +	# Actually check to see if we passed or failed.
> +	check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
> +	check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
> +	check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
> +	echo ", Ok" >> "$STATF"
> +}
> +
> +perf_dump_aux_tid_verify() {
> +	# Specifically crafted test will produce a list of Tread ID's to
> +	# stdout that need to be checked to  see that they have had trace
> +	# info collected in AUX blocks in the perf data. This will go
> +	# through all the TID's that are listed as CID=0xabcdef and see
> +	# that all the Thread IDs the test tool reports are  in the perf
> +	# data AUX chunks
> +
> +	# The TID test tools will print a TID per stdout line that are being
> +	# tested
> +	TIDS=`cat "$2"`
> +	# Scan the perf report to find the TIDs that are actually CID in hex
> +	# and build a list of the ones found
> +	FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
> +			grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
> +			uniq | sort | uniq`
> +
> +	# Iterate over the list of TIDs that the test says it has and find
> +	# them in the TIDs found in the perf report
> +	MISSING=""
> +	for TID2 in $TIDS; do
> +		FOUND=""
> +		for TIDHEX in $FOUND_TIDS; do
> +			TID=`printf "%i" $TIDHEX`
> +			if test "$TID" -eq "$TID2"; then
> +				FOUND="y"
> +				break
> +			fi
> +		done
> +		if test -z "$FOUND"; then
> +			MISSING="$MISSING $TID"
> +		fi
> +	done
> +	if test -n "$MISSING"; then
> +		err "Thread IDs $MISSING not found in perf AUX data"
> +	fi
> +}
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-05-30 16:27   ` Mathieu Poirier
@ 2022-05-30 16:47     ` Mathieu Poirier
  2022-06-13 12:53       ` Carsten Haitzler
  2022-06-13 13:00     ` Carsten Haitzler
  1 sibling, 1 reply; 19+ messages in thread
From: Mathieu Poirier @ 2022-05-30 16:47 UTC (permalink / raw)
  To: carsten.haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mike.leach, leo.yan,
	linux-perf-users, acme

On Mon, 30 May 2022 at 10:27, Mathieu Poirier
<mathieu.poirier@linaro.org> wrote:
>
> On Wed, Mar 09, 2022 at 12:28:59PM +0000, carsten.haitzler@foss.arm.com wrote:
> > From: Carsten Haitzler <carsten.haitzler@arm.com>
> >
> > This adds a test harness and tests to run perf record and examine the
> > resuling output when coresight is enabled on arm64 and check the
> > resulting quality of the output as part of perf test. These tests use
> > various tools to produce output from perf record then measure some key
> > specific aspects of that data to see if the data exists at all and
> > contains key aspects such as measuring some data for every thread of
> > a test or produces sufficient data for large exeuction runs of a large
> > executable. etc.
> >
> > Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
> > ---
> >  MAINTAINERS                                   |   4 +
> >  tools/perf/.gitignore                         |   6 +-
> >  tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
> >  tools/perf/Makefile.perf                      |  14 +-
> >  tools/perf/tests/shell/coresight/Makefile     |  30 ++++
> >  .../tests/shell/coresight/Makefile.miniconfig |  23 +++
> >  .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
> >  .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
> >  .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
> >  .../shell/coresight/memcpy_thread/.gitignore  |   1 +
> >  .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
> >  .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
> >  .../shell/coresight/thread_loop/.gitignore    |   1 +
> >  .../shell/coresight/thread_loop/Makefile      |  29 ++++
> >  .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
> >  .../coresight/unroll_loop_thread/.gitignore   |   1 +
> >  .../coresight/unroll_loop_thread/Makefile     |  29 ++++
> >  .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
> >  .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
> >  .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
> >  .../coresight_thread_loop_check_tid_10.sh     |  19 +++
> >  .../coresight_thread_loop_check_tid_2.sh      |  19 +++
> >  .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
> >  tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++
> >  24 files changed, 823 insertions(+), 4 deletions(-)
>
> As Leo pointed out this is a big patch and hard to digest intellectually.
>
> >  create mode 100644 tools/perf/Documentation/arm-coresight.txt
> >  create mode 100644 tools/perf/tests/shell/coresight/Makefile
> >  create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
> >  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> >  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> >  create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> >  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> >  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> >  create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> >  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
> >  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
> >  create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> >  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> >  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> >  create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> >  create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
> >  create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> >  create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> >  create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> >  create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> >  create mode 100644 tools/perf/tests/shell/lib/coresight.sh
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 673c7124ca82..18cc20609f2e 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1918,10 +1918,14 @@ F:    drivers/hwtracing/coresight/*
> >  F:   include/dt-bindings/arm/coresight-cti-dt.h
> >  F:   include/linux/coresight*
> >  F:   samples/coresight/*
> > +F:   tools/perf/Documentation/arm-coresight.txt
> >  F:   tools/perf/arch/arm/util/auxtrace.c
> >  F:   tools/perf/arch/arm/util/cs-etm.c
> >  F:   tools/perf/arch/arm/util/cs-etm.h
> >  F:   tools/perf/arch/arm/util/pmu.c
> > +F:   tools/perf/tests/shell/coresight_*
> > +F:   tools/perf/tests/shell/tools/Makefile
> > +F:   tools/perf/tests/shell/tools/coresight/*
> >  F:   tools/perf/util/cs-etm-decoder/*
> >  F:   tools/perf/util/cs-etm.*
> >
> > diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
> > index 20b8ab984d5f..138c679ecacd 100644
> > --- a/tools/perf/.gitignore
> > +++ b/tools/perf/.gitignore
> > @@ -15,8 +15,9 @@ perf*.1
> >  perf*.xml
> >  perf*.html
> >  common-cmds.h
> > -perf.data
> > -perf.data.old
> > +perf*.data
> > +perf*.data.old
> > +stats-*.csv
> >  output.svg
> >  perf-archive
> >  perf-with-kcore
> > @@ -30,6 +31,7 @@ config.mak.autogen
> >  *-flex.*
> >  *.pyc
> >  *.pyo
> > +*.stdout
> >  .config-detected
> >  util/intel-pt-decoder/inat-tables.c
> >  arch/*/include/generated/
> > diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
> > new file mode 100644
> > index 000000000000..3a9e6c573c58
> > --- /dev/null
> > +++ b/tools/perf/Documentation/arm-coresight.txt
>
> I think it would be best to keep all the coresight documentation under the
> current coresight documentation repository[1].  That way all the information on
> coresight can be found in a central place.
>
> Some part of what is added by this patch is redundant with what is currently
> available in [1].  Other parts are tests specific and should be added under
> something like "coresight-perf-test.rst".
>
> Thanks,
> Mathieu
>
> [1]. Documentation/trace/coresight/
>

I forgot... Please add a proper cover letter for this patchset.

>
> > @@ -0,0 +1,140 @@
> > +Arm Coresight Support
> > +=====================
> > +
> > +Coresight is a feature of some Arm based processors that allows for
> > +debugging. One of the things it can do is trace every instruction
> > +executed and remotely expose that information in a hardware compressed
> > +stream. Perf is able to locally access that stream and store it to the
> > +output perf data files. This stream can then be later decoded to give the
> > +instructions that were traced for debugging or profiling purposes. You
> > +can log such data with a perf record command like:
> > +
> > +    perf record -e cs_etm//u testbinary
> > +
> > +This would run some test binary (testbinary) until it exits and record
> > +a perf.data trace file. That file would have AUX sections if coresight
> > +is working correctly. You can dump the content of this file as
> > +readable text with a command like:
> > +
> > +    perf report --stdio --dump -i perf.data
> > +
> > +You should find some sections of this file have AUX data blocks like:
> > +
> > +    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
> > +
> > +    . ... CoreSight ETM Trace data: size 73168 bytes
> > +            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
> > +              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
> > +              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
> > +              Idx:26; ID:10;  I_TRACE_ON : Trace On.
> > +              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
> > +              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> > +              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> > +              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
> > +              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
> > +              ...
> > +
> > +If you see these above, then your system is tracing coresight data
> > +correctly.
> > +
> > +To compile perf with coresight support in the perf directory do
> > +
> > +    make CORESIGHT=1
> > +
> > +This will compile the perf tool with coresight support as well as
> > +build some small test binaries for perf test. This requires you also
> > +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
> > +perf coresight tracing are in tests/shell/tools/coresight.
> > +
> > +You will also want coresight support enabled in your kernel config.
> > +Ensure it is enabled with:
> > +
> > +    CONFIG_CORESIGHT=y
> > +
> > +There are various other coresight options you probably also want
> > +enabled like:
> > +
> > +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
> > +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
> > +    CONFIG_CORESIGHT_CATU=y
> > +    CONFIG_CORESIGHT_SINK_TPIU=y
> > +    CONFIG_CORESIGHT_SINK_ETBV10=y
> > +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
> > +    CONFIG_CORESIGHT_STM=y
> > +    CONFIG_CORESIGHT_CPU_DEBUG=y
> > +    CONFIG_CORESIGHT_CTI=y
> > +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
> > +
> > +Please refer to the kernel configuration help for more information.
> > +
> > +Perf test - Verify kernel and userspace perf coresight work
> > +===========================================================
> > +
> > +When you run perf test, it will do a lot of self tests. Some of those
> > +tests will cover Coresight (only if enabled and on ARM64). You
> > +generally would run perf test from the tools/perf directory in the
> > +kernel tree. Some tests will check some internal perf support like:
> > +
> > +    Check Arm CoreSight trace data recording and synthesized samples
> > +
> > +Some others will actually use perf record and some test binaries that
> > +are in tests/shell/tools/coresight and will collect traces to ensure a
> > +minimum level of functionality is met. The scripts that launch these
> > +tests are in tests/shell. These will all look like:
> > +
> > +    Coresight / Memcpy 1M 25 Threads
> > +    Coresight / Unroll Loop Thread 2
> > +    ...
> > +
> > +These perf record tests will not run if the tool binaries do not exist
> > +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
> > +have coresight support in hardware then either do not build perf with
> > +coresight support or remove these binaries in order to not have these
> > +tests fail and have them skip instead.
> > +
> > +These tests will log historical results in the current working
> > +directory (e.g. tools/perf) and will be named stats-*.csv like:
> > +
> > +    stats-asm_pure_loop-out.csv
> > +    stats-bubble_sort-random.csv
> > +    ...
> > +
> > +These statistic files log some aspects of the AUX data sections in
> > +the perf data output counting some numbers of certain encodings (a
> > +good way to know that it's working in a very simple way). One problem
> > +with coresight is that given a large enough amount of data needing to
> > +be logged, some of it can be lost due to the processor not waking up
> > +in time to read out all the data from buffers etc.. You will notice
> > +that the amount of data collected can vary a lot per run of perf test.
> > +If you wish to see how this changes over time, simply run perf test
> > +multiple times and all these csv files will have more and more data
> > +appended to it that you can later examine, graph and otherwise use to
> > +figure out if things have become worse or better.
> > +
> > +Be aware that amny of these tests take quite a while to run, specifically
> > +in processing the perf data file and dumping contents to then examine what
> > +is inside.
> > +
> > +You can change where these csv logs are stored by setting the
> > +PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
> > +test like:
> > +
> > +    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
> > +    perf test
> > +
> > +They will also store resulting perf output data in the current
> > +directory for later inspection like:
> > +
> > +    perf-memcpy-1m.data
> > +    perf-thread_loop-2th.data
> > +    ...
> > +
> > +You can alter where the perf data files are stored by setting the
> > +PERF_TEST_CORESIGHT_DATADIR environment variable such as:
> > +
> > +    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
> > +    perf test
> > +
> > +You may wish to set these above environment variables if you which to
> > +keep the output of tests outside of the current working directory for
> > +longer term storage and examination.
> > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> > index ac861e42c8f7..b97db83992e0 100644
> > --- a/tools/perf/Makefile.perf
> > +++ b/tools/perf/Makefile.perf
> > @@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
> >  $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
> >       $(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
> >
> > -all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
> > +TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
> > +
> > +tests-coresight-targets: FORCE
> > +     $(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
> > +
> > +tests-coresight-targets-clean:
> > +     $(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
> > +
> > +all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
> >
> >  # Create python binding output directory if not already present
> >  _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
> > @@ -1020,6 +1028,7 @@ install-tests: all install-gtk
> >               $(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
> >               $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
> >               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
> > +     $(Q)$(MAKE) -C tests/shell/coresight install-tests
> >
> >  install-bin: install-tools install-tests install-traceevent-plugins
> >
> > @@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
> >  bpf-skel-clean:
> >       $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
> >
> > -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
> > +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
> >       $(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
> >       $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
> >       $(Q)$(RM) $(OUTPUT).config-detected
> > @@ -1155,5 +1164,6 @@ FORCE:
> >  .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
> >  .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
> >  .PHONY: libtraceevent_plugins archheaders
> > +.PHONY: $(TESTS_CORESIGHT_TARGETS)
> >
> >  endif # force_fixdep
> > diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
> > new file mode 100644
> > index 000000000000..dda99aeac158
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/Makefile
> > @@ -0,0 +1,30 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +include ../../../../../tools/scripts/Makefile.include
> > +include ../../../../../tools/scripts/Makefile.arch
> > +include ../../../../../tools/scripts/utilities.mak
> > +
> > +SUBDIRS = \
> > +     asm_pure_loop \
> > +     thread_loop \
> > +     memcpy_thread \
> > +     unroll_loop_thread
> > +
> > +all: $(SUBDIRS)
> > +$(SUBDIRS):
> > +     $(Q)$(MAKE) -C $@
> > +
> > +INSTALLDIRS = $(SUBDIRS:%=install-%)
> > +
> > +install-tests: $(INSTALLDIRS)
> > +$(INSTALLDIRS):
> > +     $(Q)$(MAKE) -C $(@:install-%=%) install-tests
> > +
> > +CLEANDIRS = $(SUBDIRS:%=clean-%)
> > +
> > +clean: $(CLEANDIRS)
> > +$(CLEANDIRS):
> > +     $(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
> > +
> > +.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
> > +
> > diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
> > new file mode 100644
> > index 000000000000..893c12685fed
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
> > @@ -0,0 +1,23 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +ifndef DESTDIR
> > +prefix ?= $(HOME)
> > +endif
> > +
> > +DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
> > +perfexecdir = libexec/perf-core
> > +perfexec_instdir = $(perfexecdir)
> > +
> > +ifneq ($(filter /%,$(firstword $(perfexecdir))),)
> > +perfexec_instdir = $(perfexecdir)
> > +else
> > +perfexec_instdir = $(prefix)/$(perfexecdir)
> > +endif
> > +
> > +perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
> > +INSTALL = install
> > +
> > +include ../../../../../scripts/Makefile.include
> > +include ../../../../../scripts/Makefile.arch
> > +include ../../../../../scripts/utilities.mak
> > diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> > new file mode 100644
> > index 000000000000..468673ac32e8
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
> > @@ -0,0 +1 @@
> > +asm_pure_loop
> > diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> > new file mode 100644
> > index 000000000000..10c5a60cb71c
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
> > @@ -0,0 +1,30 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +include ../Makefile.miniconfig
> > +
> > +BIN=asm_pure_loop
> > +LIB=
> > +
> > +all: $(BIN)
> > +
> > +$(BIN): $(BIN).S
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
> > +endif
> > +endif
> > +
> > +install-tests: all
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(call QUIET_INSTALL, tests) \
> > +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> > +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> > +endif
> > +endif
> > +
> > +clean:
> > +     $(Q)$(RM) -f $(BIN)
> > +
> > +.PHONY: all clean install-tests
> > diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> > new file mode 100644
> > index 000000000000..75cf084a927d
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
> > +
> > +.globl _start
> > +_start:
> > +     mov     x0, 0x0000ffff
> > +     mov     x1, xzr
> > +loop:
> > +     nop
> > +     nop
> > +     cbnz    x1, noskip
> > +     nop
> > +     nop
> > +     adrp    x2, skip
> > +     add     x2, x2, :lo12:skip
> > +     br      x2
> > +     nop
> > +     nop
> > +noskip:
> > +     nop
> > +     nop
> > +skip:
> > +     sub     x0, x0, 1
> > +     cbnz    x0, loop
> > +
> > +     mov     x0, #0
> > +     mov     x8, #93 // __NR_exit syscall
> > +     svc     #0
> > diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> > new file mode 100644
> > index 000000000000..f8217e56091e
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
> > @@ -0,0 +1 @@
> > +memcpy_thread
> > diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> > new file mode 100644
> > index 000000000000..e2604cfae74b
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
> > @@ -0,0 +1,29 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +include ../Makefile.miniconfig
> > +
> > +BIN=memcpy_thread
> > +LIB=-pthread
> > +
> > +all: $(BIN)
> > +
> > +$(BIN): $(BIN).c
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> > +endif
> > +endif
> > +
> > +install-tests: all
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(call QUIET_INSTALL, tests) \
> > +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> > +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> > +endif
> > +endif
> > +
> > +clean:
> > +     $(Q)$(RM) -f $(BIN)
> > +
> > +.PHONY: all clean install-tests
> > diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> > new file mode 100644
> > index 000000000000..a7e169d1bf64
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
> > @@ -0,0 +1,79 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <unistd.h>
> > +#include <string.h>
> > +#include <pthread.h>
> > +
> > +struct args {
> > +     unsigned long loops;
> > +     unsigned long size;
> > +     pthread_t th;
> > +     void *ret;
> > +};
> > +
> > +static void *thrfn(void *arg)
> > +{
> > +     struct args *a = arg;
> > +     unsigned long i, len = a->loops;
> > +     unsigned char *src, *dst;
> > +
> > +     src = malloc(a->size * 1024);
> > +     dst = malloc(a->size * 1024);
> > +     if ((!src) || (!dst)) {
> > +             printf("ERR: Can't allocate memory\n");
> > +             exit(1);
> > +     }
> > +     for (i = 0; i < len; i++)
> > +             memcpy(dst, src, a->size * 1024);
> > +}
> > +
> > +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> > +{
> > +     pthread_t t;
> > +     pthread_attr_t attr;
> > +
> > +     pthread_attr_init(&attr);
> > +     pthread_create(&t, &attr, fn, arg);
> > +     return t;
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > +     unsigned long i, len, size, thr;
> > +     pthread_t threads[256];
> > +     struct args args[256];
> > +     long long v;
> > +
> > +     if (argc < 4) {
> > +             printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
> > +             exit(1);
> > +     }
> > +
> > +     v = atoll(argv[1]);
> > +     if ((v < 1) || (v > (1024 * 1024))) {
> > +             printf("ERR: max memory 1GB (1048576 KB)\n");
> > +             exit(1);
> > +     }
> > +     size = v;
> > +     thr = atol(argv[2]);
> > +     if ((thr < 1) || (thr > 256)) {
> > +             printf("ERR: threads 1-256\n");
> > +             exit(1);
> > +     }
> > +     v = atoll(argv[3]);
> > +     if ((v < 1) || (v > 40000000000ll)) {
> > +             printf("ERR: loops 1-40000000000 (hundreds)\n");
> > +             exit(1);
> > +     }
> > +     len = v * 100;
> > +     for (i = 0; i < thr; i++) {
> > +             args[i].loops = len;
> > +             args[i].size = size;
> > +             args[i].th = new_thr(thrfn, &(args[i]));
> > +     }
> > +     for (i = 0; i < thr; i++)
> > +             pthread_join(args[i].th, &(args[i].ret));
> > +     return 0;
> > +}
> > diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
> > new file mode 100644
> > index 000000000000..6d4c33eaa9e8
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
> > @@ -0,0 +1 @@
> > +thread_loop
> > diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
> > new file mode 100644
> > index 000000000000..424df4e8b0e6
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
> > @@ -0,0 +1,29 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +include ../Makefile.miniconfig
> > +
> > +BIN=thread_loop
> > +LIB=-pthread
> > +
> > +all: $(BIN)
> > +
> > +$(BIN): $(BIN).c
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> > +endif
> > +endif
> > +
> > +install-tests: all
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(call QUIET_INSTALL, tests) \
> > +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> > +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> > +endif
> > +endif
> > +
> > +clean:
> > +     $(Q)$(RM) -f $(BIN)
> > +
> > +.PHONY: all clean install-tests
> > diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> > new file mode 100644
> > index 000000000000..c0158fac7d0b
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
> > @@ -0,0 +1,86 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +// define this for gettid()
> > +#define _GNU_SOURCE
> > +
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <unistd.h>
> > +#include <string.h>
> > +#include <pthread.h>
> > +#include <sys/syscall.h>
> > +#ifndef SYS_gettid
> > +// gettid is 178 on arm64
> > +# define SYS_gettid 178
> > +#endif
> > +#define gettid() syscall(SYS_gettid)
> > +
> > +struct args {
> > +     unsigned int loops;
> > +     pthread_t th;
> > +     void *ret;
> > +};
> > +
> > +static void *thrfn(void *arg)
> > +{
> > +     struct args *a = arg;
> > +     int i = 0, len = a->loops;
> > +
> > +     if (getenv("SHOW_TID")) {
> > +             unsigned long long tid = gettid();
> > +
> > +             printf("%llu\n", tid);
> > +     }
> > +     asm volatile(
> > +             "loop:\n"
> > +             "add %[i], %[i], #1\n"
> > +             "cmp %[i], %[len]\n"
> > +             "blt loop\n"
> > +             : /* out */
> > +             : /* in */ [i] "r" (i), [len] "r" (len)
> > +             : /* clobber */
> > +     );
> > +     return (void *)(long)i;
> > +}
> > +
> > +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> > +{
> > +     pthread_t t;
> > +     pthread_attr_t attr;
> > +
> > +     pthread_attr_init(&attr);
> > +     pthread_create(&t, &attr, fn, arg);
> > +     return t;
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > +     unsigned int i, len, thr;
> > +     pthread_t threads[256];
> > +     struct args args[256];
> > +
> > +     if (argc < 3) {
> > +             printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
> > +             exit(1);
> > +     }
> > +
> > +     thr = atoi(argv[1]);
> > +     if ((thr < 1) || (thr > 256)) {
> > +             printf("ERR: threads 1-256\n");
> > +             exit(1);
> > +     }
> > +     len = atoi(argv[2]);
> > +     if ((len < 1) || (len > 4000)) {
> > +             printf("ERR: max loops 4000 (millions)\n");
> > +             exit(1);
> > +     }
> > +     len *= 1000000;
> > +     for (i = 0; i < thr; i++) {
> > +             args[i].loops = len;
> > +             args[i].th = new_thr(thrfn, &(args[i]));
> > +     }
> > +     for (i = 0; i < thr; i++)
> > +             pthread_join(args[i].th, &(args[i].ret));
> > +     return 0;
> > +}
> > diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> > new file mode 100644
> > index 000000000000..2cb4e996dbf3
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
> > @@ -0,0 +1 @@
> > +unroll_loop_thread
> > diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> > new file mode 100644
> > index 000000000000..45ab2be8be92
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
> > @@ -0,0 +1,29 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +include ../Makefile.miniconfig
> > +
> > +BIN=unroll_loop_thread
> > +LIB=-pthread
> > +
> > +all: $(BIN)
> > +
> > +$(BIN): $(BIN).c
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
> > +endif
> > +endif
> > +
> > +install-tests: all
> > +ifdef CORESIGHT
> > +ifeq ($(ARCH),arm64)
> > +     $(call QUIET_INSTALL, tests) \
> > +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
> > +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
> > +endif
> > +endif
> > +
> > +clean:
> > +     $(Q)$(RM) -f $(BIN)
> > +
> > +.PHONY: all clean install-tests
> > diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> > new file mode 100644
> > index 000000000000..cb9d22c7dfb9
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
> > @@ -0,0 +1,74 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <unistd.h>
> > +#include <string.h>
> > +#include <pthread.h>
> > +
> > +struct args {
> > +     pthread_t th;
> > +     unsigned int in, out;
> > +     void *ret;
> > +};
> > +
> > +static void *thrfn(void *arg)
> > +{
> > +     struct args *a = arg;
> > +     unsigned int i, in = a->in;
> > +
> > +     for (i = 0; i < 10000; i++) {
> > +             asm volatile (
> > +// force an unroll of thia add instruction so we can test long runs of code
> > +#define SNIP1 "add %[in], %[in], #1\n"
> > +// 10
> > +#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
> > +// 100
> > +#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
> > +// 1000
> > +#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
> > +// 10000
> > +#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
> > +// 100000
> > +                     SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
> > +                     : /* out */
> > +                     : /* in */ [in] "r" (in)
> > +                     : /* clobber */
> > +             );
> > +     }
> > +}
> > +
> > +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
> > +{
> > +     pthread_t t;
> > +     pthread_attr_t attr;
> > +
> > +     pthread_attr_init(&attr);
> > +     pthread_create(&t, &attr, fn, arg);
> > +     return t;
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > +     unsigned int i, thr;
> > +     pthread_t threads[256];
> > +     struct args args[256];
> > +
> > +     if (argc < 2) {
> > +             printf("ERR: %s [numthreads]\n", argv[0]);
> > +             exit(1);
> > +     }
> > +
> > +     thr = atoi(argv[1]);
> > +     if ((thr > 256) || (thr < 1)) {
> > +             printf("ERR: threads 1-256\n");
> > +             exit(1);
> > +     }
> > +     for (i = 0; i < thr; i++) {
> > +             args[i].in = rand();
> > +             args[i].th = new_thr(thrfn, &(args[i]));
> > +     }
> > +     for (i = 0; i < thr; i++)
> > +             pthread_join(args[i].th, &(args[i].ret));
> > +     return 0;
> > +}
> > diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> > new file mode 100755
> > index 000000000000..3f0dbefcad50
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
> > @@ -0,0 +1,18 @@
> > +#!/bin/sh -e
> > +# Coresight / ASM Pure Loop
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +TEST="asm_pure_loop"
> > +. $(dirname $0)/lib/coresight.sh
> > +ARGS=""
> > +DATV="out"
> > +DATA="$DATD/perf-$TEST-$DATV.data"
> > +
> > +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> > +
> > +perf_dump_aux_verify "$DATA" 10 10 10
> > +
> > +err=$?
> > +exit $err
> > diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> > new file mode 100755
> > index 000000000000..8972af835016
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
> > @@ -0,0 +1,18 @@
> > +#!/bin/sh -e
> > +# Coresight / Memcpy 16k 10 Threads
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +TEST="memcpy_thread"
> > +. $(dirname $0)/lib/coresight.sh
> > +ARGS="16 10 1"
> > +DATV="16k_10"
> > +DATA="$DATD/perf-$TEST-$DATV.data"
> > +
> > +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> > +
> > +perf_dump_aux_verify "$DATA" 10 10 10
> > +
> > +err=$?
> > +exit $err
> > diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> > new file mode 100755
> > index 000000000000..5b468901f89b
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
> > @@ -0,0 +1,19 @@
> > +#!/bin/sh -e
> > +# Coresight / Thread Loop 10 Threads - Check TID
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +TEST="thread_loop"
> > +. $(dirname $0)/lib/coresight.sh
> > +ARGS="10 1"
> > +DATV="check-tid-10th"
> > +DATA="$DATD/perf-$TEST-$DATV.data"
> > +STDO="$DATD/perf-$TEST-$DATV.stdout"
> > +
> > +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
> > +
> > +perf_dump_aux_tid_verify "$DATA" "$STDO"
> > +
> > +err=$?
> > +exit $err
> > diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> > new file mode 100755
> > index 000000000000..f8b7abd3aa03
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
> > @@ -0,0 +1,19 @@
> > +#!/bin/sh -e
> > +# Coresight / Thread Loop 2 Threads - Check TID
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +TEST="thread_loop"
> > +. $(dirname $0)/lib/coresight.sh
> > +ARGS="2 20"
> > +DATV="check-tid-2th"
> > +DATA="$DATD/perf-$TEST-$DATV.data"
> > +STDO="$DATD/perf-$TEST-$DATV.stdout"
> > +
> > +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
> > +
> > +perf_dump_aux_tid_verify "$DATA" "$STDO"
> > +
> > +err=$?
> > +exit $err
> > diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> > new file mode 100755
> > index 000000000000..c985dfb025c2
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
> > @@ -0,0 +1,18 @@
> > +#!/bin/sh -e
> > +# Coresight / Unroll Loop Thread 10
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +TEST="unroll_loop_thread"
> > +. $(dirname $0)/lib/coresight.sh
> > +ARGS="10"
> > +DATV="10"
> > +DATA="$DATD/perf-$TEST-$DATV.data"
> > +
> > +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
> > +
> > +perf_dump_aux_verify "$DATA" 10 10 10
> > +
> > +err=$?
> > +exit $err
> > diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
> > new file mode 100644
> > index 000000000000..6a611b073f02
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/lib/coresight.sh
> > @@ -0,0 +1,130 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
> > +
> > +# This is sourced from a driver script so no need for #!/bin... etc. at the
> > +# top - the assumption below is that it runs as part of sourcing after the
> > +# test sets up some basic env vars to say what it is.
> > +
> > +# perf record options for the perf tests to use
> > +PERFRECMEM="-m ,128M"
> > +PERFRECOPT="$PERFRECMEM -e cs_etm//u"
> > +
> > +# These tests need to be run as root or coresight won't allow large buffers
> > +# and will not collect proper data
> > +UID=`id -u`
> > +if test "$UID" -ne 0; then
> > +     echo "Not running as root... skip"
> > +     exit 2
> > +fi
> > +
> > +TOOLS=$(dirname $0)
> > +DIR="$TOOLS/coresight/$TEST"
> > +BIN="$DIR/$TEST"
> > +# If the test tool/binary does not exist and is executable then skip the test
> > +if ! test -x "$BIN"; then exit 2; fi
> > +DATD="."
> > +# If the data dir env is set then make the data dir use that instead of ./
> > +if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
> > +     DATD="$PERF_TEST_CORESIGHT_DATADIR";
> > +fi
> > +# If the stat dir env is set then make the data dir use that instead of ./
> > +STATD="."
> > +if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
> > +     STATD="$PERF_TEST_CORESIGHT_STATDIR";
> > +fi
> > +
> > +# Called if the test fails - error code 2
> > +err() {
> > +     echo "$1"
> > +     exit 1
> > +}
> > +
> > +# Check that some statistics from our perf
> > +check_val_min() {
> > +     STATF="$4"
> > +     if test "$2" -lt "$3"; then
> > +             echo ", FAILED" >> "$STATF"
> > +             err "Sanity check number of $1 is too low ($2 < $3)"
> > +     fi
> > +}
> > +
> > +perf_dump_aux_verify() {
> > +     # Some basic checking that the AUX chunk contains some sensible data
> > +     # to see that we are recording something and at least a minimum
> > +     # amount of it. We should almost always see F3 atoms in just about
> > +     # anything but certainly we will see some trace info and async atom
> > +     # chunks.
> > +     DUMP="$DATD/perf-tmp-aux-dump.txt"
> > +     perf report --stdio --dump -i "$1" | \
> > +             grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
> > +     # Simply count how many of these atoms we find to see that we are
> > +     # producing a reasonable amount of data - exact checks are not sane
> > +     # as this is a lossy  process where we may lose some blocks and the
> > +     # compiler may produce different code depending on the compiler and
> > +     # optimization options, so this is rough  just to see if we're
> > +     # either missing almost all the data or all of it
> > +     ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
> > +     ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
> > +     ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
> > +     rm -f "$DUMP"
> > +
> > +     # Arguments provide minimums for a pass
> > +     CHECK_F3_MIN="$2"
> > +     CHECK_ASYNC_MIN="$3"
> > +     CHECK_TRACE_INFO_MIN="$4"
> > +
> > +     # Write out statistics, so over time you can track results to see if
> > +     # there is a pattern - for example we have less "noisy" results that
> > +     # produce more consistent amounts of data each run, to see if over
> > +     # time any techinques to  minimize data loss are having an effect or
> > +     # not
> > +     STATF="$STATD/stats-$TEST-$DATV.csv"
> > +     if ! test -f "$STATF"; then
> > +             echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
> > +     fi
> > +     echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
> > +
> > +     # Actually check to see if we passed or failed.
> > +     check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
> > +     check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
> > +     check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
> > +     echo ", Ok" >> "$STATF"
> > +}
> > +
> > +perf_dump_aux_tid_verify() {
> > +     # Specifically crafted test will produce a list of Tread ID's to
> > +     # stdout that need to be checked to  see that they have had trace
> > +     # info collected in AUX blocks in the perf data. This will go
> > +     # through all the TID's that are listed as CID=0xabcdef and see
> > +     # that all the Thread IDs the test tool reports are  in the perf
> > +     # data AUX chunks
> > +
> > +     # The TID test tools will print a TID per stdout line that are being
> > +     # tested
> > +     TIDS=`cat "$2"`
> > +     # Scan the perf report to find the TIDs that are actually CID in hex
> > +     # and build a list of the ones found
> > +     FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
> > +                     grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
> > +                     uniq | sort | uniq`
> > +
> > +     # Iterate over the list of TIDs that the test says it has and find
> > +     # them in the TIDs found in the perf report
> > +     MISSING=""
> > +     for TID2 in $TIDS; do
> > +             FOUND=""
> > +             for TIDHEX in $FOUND_TIDS; do
> > +                     TID=`printf "%i" $TIDHEX`
> > +                     if test "$TID" -eq "$TID2"; then
> > +                             FOUND="y"
> > +                             break
> > +                     fi
> > +             done
> > +             if test -z "$FOUND"; then
> > +                     MISSING="$MISSING $TID"
> > +             fi
> > +     done
> > +     if test -n "$MISSING"; then
> > +             err "Thread IDs $MISSING not found in perf AUX data"
> > +     fi
> > +}
> > --
> > 2.32.0
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-05-30 16:47     ` Mathieu Poirier
@ 2022-06-13 12:53       ` Carsten Haitzler
  0 siblings, 0 replies; 19+ messages in thread
From: Carsten Haitzler @ 2022-06-13 12:53 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-kernel, coresight, suzuki.poulose, mike.leach, leo.yan,
	linux-perf-users, acme



On 5/30/22 17:47, Mathieu Poirier wrote:
> On Mon, 30 May 2022 at 10:27, Mathieu Poirier
> <mathieu.poirier@linaro.org> wrote:
>>
>> On Wed, Mar 09, 2022 at 12:28:59PM +0000, carsten.haitzler@foss.arm.com wrote:
>>> From: Carsten Haitzler <carsten.haitzler@arm.com>
>>>
>>> This adds a test harness and tests to run perf record and examine the
>>> resuling output when coresight is enabled on arm64 and check the
>>> resulting quality of the output as part of perf test. These tests use
>>> various tools to produce output from perf record then measure some key
>>> specific aspects of that data to see if the data exists at all and
>>> contains key aspects such as measuring some data for every thread of
>>> a test or produces sufficient data for large exeuction runs of a large
>>> executable. etc.
>>>
>>> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
>>> ---
>>>   MAINTAINERS                                   |   4 +
>>>   tools/perf/.gitignore                         |   6 +-
>>>   tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
>>>   tools/perf/Makefile.perf                      |  14 +-
>>>   tools/perf/tests/shell/coresight/Makefile     |  30 ++++
>>>   .../tests/shell/coresight/Makefile.miniconfig |  23 +++
>>>   .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
>>>   .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
>>>   .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
>>>   .../shell/coresight/memcpy_thread/.gitignore  |   1 +
>>>   .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
>>>   .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
>>>   .../shell/coresight/thread_loop/.gitignore    |   1 +
>>>   .../shell/coresight/thread_loop/Makefile      |  29 ++++
>>>   .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
>>>   .../coresight/unroll_loop_thread/.gitignore   |   1 +
>>>   .../coresight/unroll_loop_thread/Makefile     |  29 ++++
>>>   .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
>>>   .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
>>>   .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
>>>   .../coresight_thread_loop_check_tid_10.sh     |  19 +++
>>>   .../coresight_thread_loop_check_tid_2.sh      |  19 +++
>>>   .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
>>>   tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++
>>>   24 files changed, 823 insertions(+), 4 deletions(-)
>>
>> As Leo pointed out this is a big patch and hard to digest intellectually.
>>
>>>   create mode 100644 tools/perf/Documentation/arm-coresight.txt
>>>   create mode 100644 tools/perf/tests/shell/coresight/Makefile
>>>   create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
>>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
>>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
>>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>>>   create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>>   create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>>>   create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>>>   create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>>>   create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>>>   create mode 100644 tools/perf/tests/shell/lib/coresight.sh
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 673c7124ca82..18cc20609f2e 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -1918,10 +1918,14 @@ F:    drivers/hwtracing/coresight/*
>>>   F:   include/dt-bindings/arm/coresight-cti-dt.h
>>>   F:   include/linux/coresight*
>>>   F:   samples/coresight/*
>>> +F:   tools/perf/Documentation/arm-coresight.txt
>>>   F:   tools/perf/arch/arm/util/auxtrace.c
>>>   F:   tools/perf/arch/arm/util/cs-etm.c
>>>   F:   tools/perf/arch/arm/util/cs-etm.h
>>>   F:   tools/perf/arch/arm/util/pmu.c
>>> +F:   tools/perf/tests/shell/coresight_*
>>> +F:   tools/perf/tests/shell/tools/Makefile
>>> +F:   tools/perf/tests/shell/tools/coresight/*
>>>   F:   tools/perf/util/cs-etm-decoder/*
>>>   F:   tools/perf/util/cs-etm.*
>>>
>>> diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
>>> index 20b8ab984d5f..138c679ecacd 100644
>>> --- a/tools/perf/.gitignore
>>> +++ b/tools/perf/.gitignore
>>> @@ -15,8 +15,9 @@ perf*.1
>>>   perf*.xml
>>>   perf*.html
>>>   common-cmds.h
>>> -perf.data
>>> -perf.data.old
>>> +perf*.data
>>> +perf*.data.old
>>> +stats-*.csv
>>>   output.svg
>>>   perf-archive
>>>   perf-with-kcore
>>> @@ -30,6 +31,7 @@ config.mak.autogen
>>>   *-flex.*
>>>   *.pyc
>>>   *.pyo
>>> +*.stdout
>>>   .config-detected
>>>   util/intel-pt-decoder/inat-tables.c
>>>   arch/*/include/generated/
>>> diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
>>> new file mode 100644
>>> index 000000000000..3a9e6c573c58
>>> --- /dev/null
>>> +++ b/tools/perf/Documentation/arm-coresight.txt
>>
>> I think it would be best to keep all the coresight documentation under the
>> current coresight documentation repository[1].  That way all the information on
>> coresight can be found in a central place.
>>
>> Some part of what is added by this patch is redundant with what is currently
>> available in [1].  Other parts are tests specific and should be added under
>> something like "coresight-perf-test.rst".
>>
>> Thanks,
>> Mathieu
>>
>> [1]. Documentation/trace/coresight/
>>
> 
> I forgot... Please add a proper cover letter for this patchset.

ok - sure. next round.

>>
>>> @@ -0,0 +1,140 @@
>>> +Arm Coresight Support
>>> +=====================
>>> +
>>> +Coresight is a feature of some Arm based processors that allows for
>>> +debugging. One of the things it can do is trace every instruction
>>> +executed and remotely expose that information in a hardware compressed
>>> +stream. Perf is able to locally access that stream and store it to the
>>> +output perf data files. This stream can then be later decoded to give the
>>> +instructions that were traced for debugging or profiling purposes. You
>>> +can log such data with a perf record command like:
>>> +
>>> +    perf record -e cs_etm//u testbinary
>>> +
>>> +This would run some test binary (testbinary) until it exits and record
>>> +a perf.data trace file. That file would have AUX sections if coresight
>>> +is working correctly. You can dump the content of this file as
>>> +readable text with a command like:
>>> +
>>> +    perf report --stdio --dump -i perf.data
>>> +
>>> +You should find some sections of this file have AUX data blocks like:
>>> +
>>> +    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
>>> +
>>> +    . ... CoreSight ETM Trace data: size 73168 bytes
>>> +            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>>> +              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>>> +              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>>> +              Idx:26; ID:10;  I_TRACE_ON : Trace On.
>>> +              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
>>> +              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>>> +              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>>> +              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>>> +              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
>>> +              ...
>>> +
>>> +If you see these above, then your system is tracing coresight data
>>> +correctly.
>>> +
>>> +To compile perf with coresight support in the perf directory do
>>> +
>>> +    make CORESIGHT=1
>>> +
>>> +This will compile the perf tool with coresight support as well as
>>> +build some small test binaries for perf test. This requires you also
>>> +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
>>> +perf coresight tracing are in tests/shell/tools/coresight.
>>> +
>>> +You will also want coresight support enabled in your kernel config.
>>> +Ensure it is enabled with:
>>> +
>>> +    CONFIG_CORESIGHT=y
>>> +
>>> +There are various other coresight options you probably also want
>>> +enabled like:
>>> +
>>> +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
>>> +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
>>> +    CONFIG_CORESIGHT_CATU=y
>>> +    CONFIG_CORESIGHT_SINK_TPIU=y
>>> +    CONFIG_CORESIGHT_SINK_ETBV10=y
>>> +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
>>> +    CONFIG_CORESIGHT_STM=y
>>> +    CONFIG_CORESIGHT_CPU_DEBUG=y
>>> +    CONFIG_CORESIGHT_CTI=y
>>> +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
>>> +
>>> +Please refer to the kernel configuration help for more information.
>>> +
>>> +Perf test - Verify kernel and userspace perf coresight work
>>> +===========================================================
>>> +
>>> +When you run perf test, it will do a lot of self tests. Some of those
>>> +tests will cover Coresight (only if enabled and on ARM64). You
>>> +generally would run perf test from the tools/perf directory in the
>>> +kernel tree. Some tests will check some internal perf support like:
>>> +
>>> +    Check Arm CoreSight trace data recording and synthesized samples
>>> +
>>> +Some others will actually use perf record and some test binaries that
>>> +are in tests/shell/tools/coresight and will collect traces to ensure a
>>> +minimum level of functionality is met. The scripts that launch these
>>> +tests are in tests/shell. These will all look like:
>>> +
>>> +    Coresight / Memcpy 1M 25 Threads
>>> +    Coresight / Unroll Loop Thread 2
>>> +    ...
>>> +
>>> +These perf record tests will not run if the tool binaries do not exist
>>> +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
>>> +have coresight support in hardware then either do not build perf with
>>> +coresight support or remove these binaries in order to not have these
>>> +tests fail and have them skip instead.
>>> +
>>> +These tests will log historical results in the current working
>>> +directory (e.g. tools/perf) and will be named stats-*.csv like:
>>> +
>>> +    stats-asm_pure_loop-out.csv
>>> +    stats-bubble_sort-random.csv
>>> +    ...
>>> +
>>> +These statistic files log some aspects of the AUX data sections in
>>> +the perf data output counting some numbers of certain encodings (a
>>> +good way to know that it's working in a very simple way). One problem
>>> +with coresight is that given a large enough amount of data needing to
>>> +be logged, some of it can be lost due to the processor not waking up
>>> +in time to read out all the data from buffers etc.. You will notice
>>> +that the amount of data collected can vary a lot per run of perf test.
>>> +If you wish to see how this changes over time, simply run perf test
>>> +multiple times and all these csv files will have more and more data
>>> +appended to it that you can later examine, graph and otherwise use to
>>> +figure out if things have become worse or better.
>>> +
>>> +Be aware that amny of these tests take quite a while to run, specifically
>>> +in processing the perf data file and dumping contents to then examine what
>>> +is inside.
>>> +
>>> +You can change where these csv logs are stored by setting the
>>> +PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
>>> +test like:
>>> +
>>> +    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
>>> +    perf test
>>> +
>>> +They will also store resulting perf output data in the current
>>> +directory for later inspection like:
>>> +
>>> +    perf-memcpy-1m.data
>>> +    perf-thread_loop-2th.data
>>> +    ...
>>> +
>>> +You can alter where the perf data files are stored by setting the
>>> +PERF_TEST_CORESIGHT_DATADIR environment variable such as:
>>> +
>>> +    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
>>> +    perf test
>>> +
>>> +You may wish to set these above environment variables if you which to
>>> +keep the output of tests outside of the current working directory for
>>> +longer term storage and examination.
>>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>>> index ac861e42c8f7..b97db83992e0 100644
>>> --- a/tools/perf/Makefile.perf
>>> +++ b/tools/perf/Makefile.perf
>>> @@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
>>>   $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
>>>        $(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
>>>
>>> -all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
>>> +TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
>>> +
>>> +tests-coresight-targets: FORCE
>>> +     $(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
>>> +
>>> +tests-coresight-targets-clean:
>>> +     $(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
>>> +
>>> +all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
>>>
>>>   # Create python binding output directory if not already present
>>>   _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
>>> @@ -1020,6 +1028,7 @@ install-tests: all install-gtk
>>>                $(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
>>>                $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
>>>                $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
>>> +     $(Q)$(MAKE) -C tests/shell/coresight install-tests
>>>
>>>   install-bin: install-tools install-tests install-traceevent-plugins
>>>
>>> @@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
>>>   bpf-skel-clean:
>>>        $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>>>
>>> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
>>> +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
>>>        $(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>>        $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
>>>        $(Q)$(RM) $(OUTPUT).config-detected
>>> @@ -1155,5 +1164,6 @@ FORCE:
>>>   .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
>>>   .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
>>>   .PHONY: libtraceevent_plugins archheaders
>>> +.PHONY: $(TESTS_CORESIGHT_TARGETS)
>>>
>>>   endif # force_fixdep
>>> diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
>>> new file mode 100644
>>> index 000000000000..dda99aeac158
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/Makefile
>>> @@ -0,0 +1,30 @@
>>> +# SPDX-License-Identifier: GPL-2.0-only
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +include ../../../../../tools/scripts/Makefile.include
>>> +include ../../../../../tools/scripts/Makefile.arch
>>> +include ../../../../../tools/scripts/utilities.mak
>>> +
>>> +SUBDIRS = \
>>> +     asm_pure_loop \
>>> +     thread_loop \
>>> +     memcpy_thread \
>>> +     unroll_loop_thread
>>> +
>>> +all: $(SUBDIRS)
>>> +$(SUBDIRS):
>>> +     $(Q)$(MAKE) -C $@
>>> +
>>> +INSTALLDIRS = $(SUBDIRS:%=install-%)
>>> +
>>> +install-tests: $(INSTALLDIRS)
>>> +$(INSTALLDIRS):
>>> +     $(Q)$(MAKE) -C $(@:install-%=%) install-tests
>>> +
>>> +CLEANDIRS = $(SUBDIRS:%=clean-%)
>>> +
>>> +clean: $(CLEANDIRS)
>>> +$(CLEANDIRS):
>>> +     $(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
>>> +
>>> +.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
>>> +
>>> diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
>>> new file mode 100644
>>> index 000000000000..893c12685fed
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
>>> @@ -0,0 +1,23 @@
>>> +# SPDX-License-Identifier: GPL-2.0-only
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +ifndef DESTDIR
>>> +prefix ?= $(HOME)
>>> +endif
>>> +
>>> +DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
>>> +perfexecdir = libexec/perf-core
>>> +perfexec_instdir = $(perfexecdir)
>>> +
>>> +ifneq ($(filter /%,$(firstword $(perfexecdir))),)
>>> +perfexec_instdir = $(perfexecdir)
>>> +else
>>> +perfexec_instdir = $(prefix)/$(perfexecdir)
>>> +endif
>>> +
>>> +perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
>>> +INSTALL = install
>>> +
>>> +include ../../../../../scripts/Makefile.include
>>> +include ../../../../../scripts/Makefile.arch
>>> +include ../../../../../scripts/utilities.mak
>>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>> new file mode 100644
>>> index 000000000000..468673ac32e8
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>> @@ -0,0 +1 @@
>>> +asm_pure_loop
>>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>> new file mode 100644
>>> index 000000000000..10c5a60cb71c
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>> @@ -0,0 +1,30 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +include ../Makefile.miniconfig
>>> +
>>> +BIN=asm_pure_loop
>>> +LIB=
>>> +
>>> +all: $(BIN)
>>> +
>>> +$(BIN): $(BIN).S
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
>>> +endif
>>> +endif
>>> +
>>> +install-tests: all
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(call QUIET_INSTALL, tests) \
>>> +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>>> +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>>> +endif
>>> +endif
>>> +
>>> +clean:
>>> +     $(Q)$(RM) -f $(BIN)
>>> +
>>> +.PHONY: all clean install-tests
>>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>> new file mode 100644
>>> index 000000000000..75cf084a927d
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>> @@ -0,0 +1,28 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
>>> +
>>> +.globl _start
>>> +_start:
>>> +     mov     x0, 0x0000ffff
>>> +     mov     x1, xzr
>>> +loop:
>>> +     nop
>>> +     nop
>>> +     cbnz    x1, noskip
>>> +     nop
>>> +     nop
>>> +     adrp    x2, skip
>>> +     add     x2, x2, :lo12:skip
>>> +     br      x2
>>> +     nop
>>> +     nop
>>> +noskip:
>>> +     nop
>>> +     nop
>>> +skip:
>>> +     sub     x0, x0, 1
>>> +     cbnz    x0, loop
>>> +
>>> +     mov     x0, #0
>>> +     mov     x8, #93 // __NR_exit syscall
>>> +     svc     #0
>>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>>> new file mode 100644
>>> index 000000000000..f8217e56091e
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>>> @@ -0,0 +1 @@
>>> +memcpy_thread
>>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>>> new file mode 100644
>>> index 000000000000..e2604cfae74b
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>>> @@ -0,0 +1,29 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +include ../Makefile.miniconfig
>>> +
>>> +BIN=memcpy_thread
>>> +LIB=-pthread
>>> +
>>> +all: $(BIN)
>>> +
>>> +$(BIN): $(BIN).c
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>>> +endif
>>> +endif
>>> +
>>> +install-tests: all
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(call QUIET_INSTALL, tests) \
>>> +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>>> +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>>> +endif
>>> +endif
>>> +
>>> +clean:
>>> +     $(Q)$(RM) -f $(BIN)
>>> +
>>> +.PHONY: all clean install-tests
>>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>>> new file mode 100644
>>> index 000000000000..a7e169d1bf64
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>>> @@ -0,0 +1,79 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <unistd.h>
>>> +#include <string.h>
>>> +#include <pthread.h>
>>> +
>>> +struct args {
>>> +     unsigned long loops;
>>> +     unsigned long size;
>>> +     pthread_t th;
>>> +     void *ret;
>>> +};
>>> +
>>> +static void *thrfn(void *arg)
>>> +{
>>> +     struct args *a = arg;
>>> +     unsigned long i, len = a->loops;
>>> +     unsigned char *src, *dst;
>>> +
>>> +     src = malloc(a->size * 1024);
>>> +     dst = malloc(a->size * 1024);
>>> +     if ((!src) || (!dst)) {
>>> +             printf("ERR: Can't allocate memory\n");
>>> +             exit(1);
>>> +     }
>>> +     for (i = 0; i < len; i++)
>>> +             memcpy(dst, src, a->size * 1024);
>>> +}
>>> +
>>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>>> +{
>>> +     pthread_t t;
>>> +     pthread_attr_t attr;
>>> +
>>> +     pthread_attr_init(&attr);
>>> +     pthread_create(&t, &attr, fn, arg);
>>> +     return t;
>>> +}
>>> +
>>> +int main(int argc, char **argv)
>>> +{
>>> +     unsigned long i, len, size, thr;
>>> +     pthread_t threads[256];
>>> +     struct args args[256];
>>> +     long long v;
>>> +
>>> +     if (argc < 4) {
>>> +             printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
>>> +             exit(1);
>>> +     }
>>> +
>>> +     v = atoll(argv[1]);
>>> +     if ((v < 1) || (v > (1024 * 1024))) {
>>> +             printf("ERR: max memory 1GB (1048576 KB)\n");
>>> +             exit(1);
>>> +     }
>>> +     size = v;
>>> +     thr = atol(argv[2]);
>>> +     if ((thr < 1) || (thr > 256)) {
>>> +             printf("ERR: threads 1-256\n");
>>> +             exit(1);
>>> +     }
>>> +     v = atoll(argv[3]);
>>> +     if ((v < 1) || (v > 40000000000ll)) {
>>> +             printf("ERR: loops 1-40000000000 (hundreds)\n");
>>> +             exit(1);
>>> +     }
>>> +     len = v * 100;
>>> +     for (i = 0; i < thr; i++) {
>>> +             args[i].loops = len;
>>> +             args[i].size = size;
>>> +             args[i].th = new_thr(thrfn, &(args[i]));
>>> +     }
>>> +     for (i = 0; i < thr; i++)
>>> +             pthread_join(args[i].th, &(args[i].ret));
>>> +     return 0;
>>> +}
>>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
>>> new file mode 100644
>>> index 000000000000..6d4c33eaa9e8
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
>>> @@ -0,0 +1 @@
>>> +thread_loop
>>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
>>> new file mode 100644
>>> index 000000000000..424df4e8b0e6
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
>>> @@ -0,0 +1,29 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +include ../Makefile.miniconfig
>>> +
>>> +BIN=thread_loop
>>> +LIB=-pthread
>>> +
>>> +all: $(BIN)
>>> +
>>> +$(BIN): $(BIN).c
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>>> +endif
>>> +endif
>>> +
>>> +install-tests: all
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(call QUIET_INSTALL, tests) \
>>> +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>>> +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>>> +endif
>>> +endif
>>> +
>>> +clean:
>>> +     $(Q)$(RM) -f $(BIN)
>>> +
>>> +.PHONY: all clean install-tests
>>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>>> new file mode 100644
>>> index 000000000000..c0158fac7d0b
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>>> @@ -0,0 +1,86 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +// define this for gettid()
>>> +#define _GNU_SOURCE
>>> +
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <unistd.h>
>>> +#include <string.h>
>>> +#include <pthread.h>
>>> +#include <sys/syscall.h>
>>> +#ifndef SYS_gettid
>>> +// gettid is 178 on arm64
>>> +# define SYS_gettid 178
>>> +#endif
>>> +#define gettid() syscall(SYS_gettid)
>>> +
>>> +struct args {
>>> +     unsigned int loops;
>>> +     pthread_t th;
>>> +     void *ret;
>>> +};
>>> +
>>> +static void *thrfn(void *arg)
>>> +{
>>> +     struct args *a = arg;
>>> +     int i = 0, len = a->loops;
>>> +
>>> +     if (getenv("SHOW_TID")) {
>>> +             unsigned long long tid = gettid();
>>> +
>>> +             printf("%llu\n", tid);
>>> +     }
>>> +     asm volatile(
>>> +             "loop:\n"
>>> +             "add %[i], %[i], #1\n"
>>> +             "cmp %[i], %[len]\n"
>>> +             "blt loop\n"
>>> +             : /* out */
>>> +             : /* in */ [i] "r" (i), [len] "r" (len)
>>> +             : /* clobber */
>>> +     );
>>> +     return (void *)(long)i;
>>> +}
>>> +
>>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>>> +{
>>> +     pthread_t t;
>>> +     pthread_attr_t attr;
>>> +
>>> +     pthread_attr_init(&attr);
>>> +     pthread_create(&t, &attr, fn, arg);
>>> +     return t;
>>> +}
>>> +
>>> +int main(int argc, char **argv)
>>> +{
>>> +     unsigned int i, len, thr;
>>> +     pthread_t threads[256];
>>> +     struct args args[256];
>>> +
>>> +     if (argc < 3) {
>>> +             printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
>>> +             exit(1);
>>> +     }
>>> +
>>> +     thr = atoi(argv[1]);
>>> +     if ((thr < 1) || (thr > 256)) {
>>> +             printf("ERR: threads 1-256\n");
>>> +             exit(1);
>>> +     }
>>> +     len = atoi(argv[2]);
>>> +     if ((len < 1) || (len > 4000)) {
>>> +             printf("ERR: max loops 4000 (millions)\n");
>>> +             exit(1);
>>> +     }
>>> +     len *= 1000000;
>>> +     for (i = 0; i < thr; i++) {
>>> +             args[i].loops = len;
>>> +             args[i].th = new_thr(thrfn, &(args[i]));
>>> +     }
>>> +     for (i = 0; i < thr; i++)
>>> +             pthread_join(args[i].th, &(args[i].ret));
>>> +     return 0;
>>> +}
>>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>>> new file mode 100644
>>> index 000000000000..2cb4e996dbf3
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>>> @@ -0,0 +1 @@
>>> +unroll_loop_thread
>>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>>> new file mode 100644
>>> index 000000000000..45ab2be8be92
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>>> @@ -0,0 +1,29 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +include ../Makefile.miniconfig
>>> +
>>> +BIN=unroll_loop_thread
>>> +LIB=-pthread
>>> +
>>> +all: $(BIN)
>>> +
>>> +$(BIN): $(BIN).c
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>>> +endif
>>> +endif
>>> +
>>> +install-tests: all
>>> +ifdef CORESIGHT
>>> +ifeq ($(ARCH),arm64)
>>> +     $(call QUIET_INSTALL, tests) \
>>> +             $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>>> +             $(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>>> +endif
>>> +endif
>>> +
>>> +clean:
>>> +     $(Q)$(RM) -f $(BIN)
>>> +
>>> +.PHONY: all clean install-tests
>>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>>> new file mode 100644
>>> index 000000000000..cb9d22c7dfb9
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>>> @@ -0,0 +1,74 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <unistd.h>
>>> +#include <string.h>
>>> +#include <pthread.h>
>>> +
>>> +struct args {
>>> +     pthread_t th;
>>> +     unsigned int in, out;
>>> +     void *ret;
>>> +};
>>> +
>>> +static void *thrfn(void *arg)
>>> +{
>>> +     struct args *a = arg;
>>> +     unsigned int i, in = a->in;
>>> +
>>> +     for (i = 0; i < 10000; i++) {
>>> +             asm volatile (
>>> +// force an unroll of thia add instruction so we can test long runs of code
>>> +#define SNIP1 "add %[in], %[in], #1\n"
>>> +// 10
>>> +#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
>>> +// 100
>>> +#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
>>> +// 1000
>>> +#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
>>> +// 10000
>>> +#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
>>> +// 100000
>>> +                     SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
>>> +                     : /* out */
>>> +                     : /* in */ [in] "r" (in)
>>> +                     : /* clobber */
>>> +             );
>>> +     }
>>> +}
>>> +
>>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>>> +{
>>> +     pthread_t t;
>>> +     pthread_attr_t attr;
>>> +
>>> +     pthread_attr_init(&attr);
>>> +     pthread_create(&t, &attr, fn, arg);
>>> +     return t;
>>> +}
>>> +
>>> +int main(int argc, char **argv)
>>> +{
>>> +     unsigned int i, thr;
>>> +     pthread_t threads[256];
>>> +     struct args args[256];
>>> +
>>> +     if (argc < 2) {
>>> +             printf("ERR: %s [numthreads]\n", argv[0]);
>>> +             exit(1);
>>> +     }
>>> +
>>> +     thr = atoi(argv[1]);
>>> +     if ((thr > 256) || (thr < 1)) {
>>> +             printf("ERR: threads 1-256\n");
>>> +             exit(1);
>>> +     }
>>> +     for (i = 0; i < thr; i++) {
>>> +             args[i].in = rand();
>>> +             args[i].th = new_thr(thrfn, &(args[i]));
>>> +     }
>>> +     for (i = 0; i < thr; i++)
>>> +             pthread_join(args[i].th, &(args[i].ret));
>>> +     return 0;
>>> +}
>>> diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>> new file mode 100755
>>> index 000000000000..3f0dbefcad50
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>> @@ -0,0 +1,18 @@
>>> +#!/bin/sh -e
>>> +# Coresight / ASM Pure Loop
>>> +
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +TEST="asm_pure_loop"
>>> +. $(dirname $0)/lib/coresight.sh
>>> +ARGS=""
>>> +DATV="out"
>>> +DATA="$DATD/perf-$TEST-$DATV.data"
>>> +
>>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>>> +
>>> +perf_dump_aux_verify "$DATA" 10 10 10
>>> +
>>> +err=$?
>>> +exit $err
>>> diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>>> new file mode 100755
>>> index 000000000000..8972af835016
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>>> @@ -0,0 +1,18 @@
>>> +#!/bin/sh -e
>>> +# Coresight / Memcpy 16k 10 Threads
>>> +
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +TEST="memcpy_thread"
>>> +. $(dirname $0)/lib/coresight.sh
>>> +ARGS="16 10 1"
>>> +DATV="16k_10"
>>> +DATA="$DATD/perf-$TEST-$DATV.data"
>>> +
>>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>>> +
>>> +perf_dump_aux_verify "$DATA" 10 10 10
>>> +
>>> +err=$?
>>> +exit $err
>>> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>>> new file mode 100755
>>> index 000000000000..5b468901f89b
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>>> @@ -0,0 +1,19 @@
>>> +#!/bin/sh -e
>>> +# Coresight / Thread Loop 10 Threads - Check TID
>>> +
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +TEST="thread_loop"
>>> +. $(dirname $0)/lib/coresight.sh
>>> +ARGS="10 1"
>>> +DATV="check-tid-10th"
>>> +DATA="$DATD/perf-$TEST-$DATV.data"
>>> +STDO="$DATD/perf-$TEST-$DATV.stdout"
>>> +
>>> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
>>> +
>>> +perf_dump_aux_tid_verify "$DATA" "$STDO"
>>> +
>>> +err=$?
>>> +exit $err
>>> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>>> new file mode 100755
>>> index 000000000000..f8b7abd3aa03
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>>> @@ -0,0 +1,19 @@
>>> +#!/bin/sh -e
>>> +# Coresight / Thread Loop 2 Threads - Check TID
>>> +
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +TEST="thread_loop"
>>> +. $(dirname $0)/lib/coresight.sh
>>> +ARGS="2 20"
>>> +DATV="check-tid-2th"
>>> +DATA="$DATD/perf-$TEST-$DATV.data"
>>> +STDO="$DATD/perf-$TEST-$DATV.stdout"
>>> +
>>> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
>>> +
>>> +perf_dump_aux_tid_verify "$DATA" "$STDO"
>>> +
>>> +err=$?
>>> +exit $err
>>> diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>>> new file mode 100755
>>> index 000000000000..c985dfb025c2
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>>> @@ -0,0 +1,18 @@
>>> +#!/bin/sh -e
>>> +# Coresight / Unroll Loop Thread 10
>>> +
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +TEST="unroll_loop_thread"
>>> +. $(dirname $0)/lib/coresight.sh
>>> +ARGS="10"
>>> +DATV="10"
>>> +DATA="$DATD/perf-$TEST-$DATV.data"
>>> +
>>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>>> +
>>> +perf_dump_aux_verify "$DATA" 10 10 10
>>> +
>>> +err=$?
>>> +exit $err
>>> diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
>>> new file mode 100644
>>> index 000000000000..6a611b073f02
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/lib/coresight.sh
>>> @@ -0,0 +1,130 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>> +
>>> +# This is sourced from a driver script so no need for #!/bin... etc. at the
>>> +# top - the assumption below is that it runs as part of sourcing after the
>>> +# test sets up some basic env vars to say what it is.
>>> +
>>> +# perf record options for the perf tests to use
>>> +PERFRECMEM="-m ,128M"
>>> +PERFRECOPT="$PERFRECMEM -e cs_etm//u"
>>> +
>>> +# These tests need to be run as root or coresight won't allow large buffers
>>> +# and will not collect proper data
>>> +UID=`id -u`
>>> +if test "$UID" -ne 0; then
>>> +     echo "Not running as root... skip"
>>> +     exit 2
>>> +fi
>>> +
>>> +TOOLS=$(dirname $0)
>>> +DIR="$TOOLS/coresight/$TEST"
>>> +BIN="$DIR/$TEST"
>>> +# If the test tool/binary does not exist and is executable then skip the test
>>> +if ! test -x "$BIN"; then exit 2; fi
>>> +DATD="."
>>> +# If the data dir env is set then make the data dir use that instead of ./
>>> +if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
>>> +     DATD="$PERF_TEST_CORESIGHT_DATADIR";
>>> +fi
>>> +# If the stat dir env is set then make the data dir use that instead of ./
>>> +STATD="."
>>> +if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
>>> +     STATD="$PERF_TEST_CORESIGHT_STATDIR";
>>> +fi
>>> +
>>> +# Called if the test fails - error code 2
>>> +err() {
>>> +     echo "$1"
>>> +     exit 1
>>> +}
>>> +
>>> +# Check that some statistics from our perf
>>> +check_val_min() {
>>> +     STATF="$4"
>>> +     if test "$2" -lt "$3"; then
>>> +             echo ", FAILED" >> "$STATF"
>>> +             err "Sanity check number of $1 is too low ($2 < $3)"
>>> +     fi
>>> +}
>>> +
>>> +perf_dump_aux_verify() {
>>> +     # Some basic checking that the AUX chunk contains some sensible data
>>> +     # to see that we are recording something and at least a minimum
>>> +     # amount of it. We should almost always see F3 atoms in just about
>>> +     # anything but certainly we will see some trace info and async atom
>>> +     # chunks.
>>> +     DUMP="$DATD/perf-tmp-aux-dump.txt"
>>> +     perf report --stdio --dump -i "$1" | \
>>> +             grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
>>> +     # Simply count how many of these atoms we find to see that we are
>>> +     # producing a reasonable amount of data - exact checks are not sane
>>> +     # as this is a lossy  process where we may lose some blocks and the
>>> +     # compiler may produce different code depending on the compiler and
>>> +     # optimization options, so this is rough  just to see if we're
>>> +     # either missing almost all the data or all of it
>>> +     ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
>>> +     ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
>>> +     ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
>>> +     rm -f "$DUMP"
>>> +
>>> +     # Arguments provide minimums for a pass
>>> +     CHECK_F3_MIN="$2"
>>> +     CHECK_ASYNC_MIN="$3"
>>> +     CHECK_TRACE_INFO_MIN="$4"
>>> +
>>> +     # Write out statistics, so over time you can track results to see if
>>> +     # there is a pattern - for example we have less "noisy" results that
>>> +     # produce more consistent amounts of data each run, to see if over
>>> +     # time any techinques to  minimize data loss are having an effect or
>>> +     # not
>>> +     STATF="$STATD/stats-$TEST-$DATV.csv"
>>> +     if ! test -f "$STATF"; then
>>> +             echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
>>> +     fi
>>> +     echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
>>> +
>>> +     # Actually check to see if we passed or failed.
>>> +     check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
>>> +     check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
>>> +     check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
>>> +     echo ", Ok" >> "$STATF"
>>> +}
>>> +
>>> +perf_dump_aux_tid_verify() {
>>> +     # Specifically crafted test will produce a list of Tread ID's to
>>> +     # stdout that need to be checked to  see that they have had trace
>>> +     # info collected in AUX blocks in the perf data. This will go
>>> +     # through all the TID's that are listed as CID=0xabcdef and see
>>> +     # that all the Thread IDs the test tool reports are  in the perf
>>> +     # data AUX chunks
>>> +
>>> +     # The TID test tools will print a TID per stdout line that are being
>>> +     # tested
>>> +     TIDS=`cat "$2"`
>>> +     # Scan the perf report to find the TIDs that are actually CID in hex
>>> +     # and build a list of the ones found
>>> +     FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
>>> +                     grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
>>> +                     uniq | sort | uniq`
>>> +
>>> +     # Iterate over the list of TIDs that the test says it has and find
>>> +     # them in the TIDs found in the perf report
>>> +     MISSING=""
>>> +     for TID2 in $TIDS; do
>>> +             FOUND=""
>>> +             for TIDHEX in $FOUND_TIDS; do
>>> +                     TID=`printf "%i" $TIDHEX`
>>> +                     if test "$TID" -eq "$TID2"; then
>>> +                             FOUND="y"
>>> +                             break
>>> +                     fi
>>> +             done
>>> +             if test -z "$FOUND"; then
>>> +                     MISSING="$MISSING $TID"
>>> +             fi
>>> +     done
>>> +     if test -n "$MISSING"; then
>>> +             err "Thread IDs $MISSING not found in perf AUX data"
>>> +     fi
>>> +}
>>> --
>>> 2.32.0
>>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-05-30 16:27   ` Mathieu Poirier
  2022-05-30 16:47     ` Mathieu Poirier
@ 2022-06-13 13:00     ` Carsten Haitzler
  1 sibling, 0 replies; 19+ messages in thread
From: Carsten Haitzler @ 2022-06-13 13:00 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-kernel, coresight, suzuki.poulose, mike.leach, leo.yan,
	linux-perf-users, acme



On 5/30/22 17:27, Mathieu Poirier wrote:
> On Wed, Mar 09, 2022 at 12:28:59PM +0000, carsten.haitzler@foss.arm.com wrote:
>> From: Carsten Haitzler <carsten.haitzler@arm.com>
>>
>> This adds a test harness and tests to run perf record and examine the
>> resuling output when coresight is enabled on arm64 and check the
>> resulting quality of the output as part of perf test. These tests use
>> various tools to produce output from perf record then measure some key
>> specific aspects of that data to see if the data exists at all and
>> contains key aspects such as measuring some data for every thread of
>> a test or produces sufficient data for large exeuction runs of a large
>> executable. etc.
>>
>> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
>> ---
>>   MAINTAINERS                                   |   4 +
>>   tools/perf/.gitignore                         |   6 +-
>>   tools/perf/Documentation/arm-coresight.txt    | 140 ++++++++++++++++++
>>   tools/perf/Makefile.perf                      |  14 +-
>>   tools/perf/tests/shell/coresight/Makefile     |  30 ++++
>>   .../tests/shell/coresight/Makefile.miniconfig |  23 +++
>>   .../shell/coresight/asm_pure_loop/.gitignore  |   1 +
>>   .../shell/coresight/asm_pure_loop/Makefile    |  30 ++++
>>   .../coresight/asm_pure_loop/asm_pure_loop.S   |  28 ++++
>>   .../shell/coresight/memcpy_thread/.gitignore  |   1 +
>>   .../shell/coresight/memcpy_thread/Makefile    |  29 ++++
>>   .../coresight/memcpy_thread/memcpy_thread.c   |  79 ++++++++++
>>   .../shell/coresight/thread_loop/.gitignore    |   1 +
>>   .../shell/coresight/thread_loop/Makefile      |  29 ++++
>>   .../shell/coresight/thread_loop/thread_loop.c |  86 +++++++++++
>>   .../coresight/unroll_loop_thread/.gitignore   |   1 +
>>   .../coresight/unroll_loop_thread/Makefile     |  29 ++++
>>   .../unroll_loop_thread/unroll_loop_thread.c   |  74 +++++++++
>>   .../tests/shell/coresight_asm_pure_loop.sh    |  18 +++
>>   .../shell/coresight_memcpy_thread_16k_10.sh   |  18 +++
>>   .../coresight_thread_loop_check_tid_10.sh     |  19 +++
>>   .../coresight_thread_loop_check_tid_2.sh      |  19 +++
>>   .../shell/coresight_unroll_loop_thread_10.sh  |  18 +++
>>   tools/perf/tests/shell/lib/coresight.sh       | 130 ++++++++++++++++
>>   24 files changed, 823 insertions(+), 4 deletions(-)
> 
> As Leo pointed out this is a big patch and hard to digest intellectually.
> 
>>   create mode 100644 tools/perf/Documentation/arm-coresight.txt
>>   create mode 100644 tools/perf/tests/shell/coresight/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/Makefile.miniconfig
>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>>   create mode 100644 tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>>   create mode 100755 tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>>   create mode 100755 tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>>   create mode 100644 tools/perf/tests/shell/lib/coresight.sh
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 673c7124ca82..18cc20609f2e 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -1918,10 +1918,14 @@ F:	drivers/hwtracing/coresight/*
>>   F:	include/dt-bindings/arm/coresight-cti-dt.h
>>   F:	include/linux/coresight*
>>   F:	samples/coresight/*
>> +F:	tools/perf/Documentation/arm-coresight.txt
>>   F:	tools/perf/arch/arm/util/auxtrace.c
>>   F:	tools/perf/arch/arm/util/cs-etm.c
>>   F:	tools/perf/arch/arm/util/cs-etm.h
>>   F:	tools/perf/arch/arm/util/pmu.c
>> +F:	tools/perf/tests/shell/coresight_*
>> +F:	tools/perf/tests/shell/tools/Makefile
>> +F:	tools/perf/tests/shell/tools/coresight/*
>>   F:	tools/perf/util/cs-etm-decoder/*
>>   F:	tools/perf/util/cs-etm.*
>>   
>> diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
>> index 20b8ab984d5f..138c679ecacd 100644
>> --- a/tools/perf/.gitignore
>> +++ b/tools/perf/.gitignore
>> @@ -15,8 +15,9 @@ perf*.1
>>   perf*.xml
>>   perf*.html
>>   common-cmds.h
>> -perf.data
>> -perf.data.old
>> +perf*.data
>> +perf*.data.old
>> +stats-*.csv
>>   output.svg
>>   perf-archive
>>   perf-with-kcore
>> @@ -30,6 +31,7 @@ config.mak.autogen
>>   *-flex.*
>>   *.pyc
>>   *.pyo
>> +*.stdout
>>   .config-detected
>>   util/intel-pt-decoder/inat-tables.c
>>   arch/*/include/generated/
>> diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt
>> new file mode 100644
>> index 000000000000..3a9e6c573c58
>> --- /dev/null
>> +++ b/tools/perf/Documentation/arm-coresight.txt
> 
> I think it would be best to keep all the coresight documentation under the
> current coresight documentation repository[1].  That way all the information on
> coresight can be found in a central place.

OK. I just added this here because this was relevant "get you going" 
documentation for perf + coresight and testing and other relevant docs 
were here including architecture specific ones like the intel docs. I'll 
find a way of merging it with the above mentioned docs.

> Some part of what is added by this patch is redundant with what is currently
> available in [1].  Other parts are tests specific and should be added under
> something like "coresight-perf-test.rst".

Sure - some is, but most is not mentioned there (the vast majority). I'd 
argue essentially all of it is relevant to someone getting started and 
the documentation in the above location is not very friendly or useful 
in that regard (which is why I wrote the documentation here to begin 
with). As I can't really find anything it duplicates other than the 
initial paragraph which I can remove, I'll merge the rest in and link it 
in as above, unless you can be specific on what it really duplicates?

> Thanks,
> Mathieu
> 
> [1]. Documentation/trace/coresight/
> 
> 
>> @@ -0,0 +1,140 @@
>> +Arm Coresight Support
>> +=====================
>> +
>> +Coresight is a feature of some Arm based processors that allows for
>> +debugging. One of the things it can do is trace every instruction
>> +executed and remotely expose that information in a hardware compressed
>> +stream. Perf is able to locally access that stream and store it to the
>> +output perf data files. This stream can then be later decoded to give the
>> +instructions that were traced for debugging or profiling purposes. You
>> +can log such data with a perf record command like:
>> +
>> +    perf record -e cs_etm//u testbinary
>> +
>> +This would run some test binary (testbinary) until it exits and record
>> +a perf.data trace file. That file would have AUX sections if coresight
>> +is working correctly. You can dump the content of this file as
>> +readable text with a command like:
>> +
>> +    perf report --stdio --dump -i perf.data
>> +
>> +You should find some sections of this file have AUX data blocks like:
>> +
>> +    0x1e78 [0x30]: PERF_RECORD_AUXTRACE size: 0x11dd0  offset: 0  ref: 0x1b614fc1061b0ad1  idx: 0  tid: 531230  cpu: -1
>> +
>> +    . ... CoreSight ETM Trace data: size 73168 bytes
>> +            Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>> +              Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>> +              Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>> +              Idx:26; ID:10;  I_TRACE_ON : Trace On.
>> +              Idx:27; ID:10;  I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000FFFFB6069140; Ctxt: AArch64,EL0, NS;
>> +              Idx:38; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>> +              Idx:39; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>> +              Idx:40; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
>> +              Idx:41; ID:10;  I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
>> +              ...
>> +
>> +If you see these above, then your system is tracing coresight data
>> +correctly.
>> +
>> +To compile perf with coresight support in the perf directory do
>> +
>> +    make CORESIGHT=1
>> +
>> +This will compile the perf tool with coresight support as well as
>> +build some small test binaries for perf test. This requires you also
>> +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
>> +perf coresight tracing are in tests/shell/tools/coresight.
>> +
>> +You will also want coresight support enabled in your kernel config.
>> +Ensure it is enabled with:
>> +
>> +    CONFIG_CORESIGHT=y
>> +
>> +There are various other coresight options you probably also want
>> +enabled like:
>> +
>> +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
>> +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
>> +    CONFIG_CORESIGHT_CATU=y
>> +    CONFIG_CORESIGHT_SINK_TPIU=y
>> +    CONFIG_CORESIGHT_SINK_ETBV10=y
>> +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
>> +    CONFIG_CORESIGHT_STM=y
>> +    CONFIG_CORESIGHT_CPU_DEBUG=y
>> +    CONFIG_CORESIGHT_CTI=y
>> +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
>> +
>> +Please refer to the kernel configuration help for more information.
>> +
>> +Perf test - Verify kernel and userspace perf coresight work
>> +===========================================================
>> +
>> +When you run perf test, it will do a lot of self tests. Some of those
>> +tests will cover Coresight (only if enabled and on ARM64). You
>> +generally would run perf test from the tools/perf directory in the
>> +kernel tree. Some tests will check some internal perf support like:
>> +
>> +    Check Arm CoreSight trace data recording and synthesized samples
>> +
>> +Some others will actually use perf record and some test binaries that
>> +are in tests/shell/tools/coresight and will collect traces to ensure a
>> +minimum level of functionality is met. The scripts that launch these
>> +tests are in tests/shell. These will all look like:
>> +
>> +    Coresight / Memcpy 1M 25 Threads
>> +    Coresight / Unroll Loop Thread 2
>> +    ...
>> +
>> +These perf record tests will not run if the tool binaries do not exist
>> +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
>> +have coresight support in hardware then either do not build perf with
>> +coresight support or remove these binaries in order to not have these
>> +tests fail and have them skip instead.
>> +
>> +These tests will log historical results in the current working
>> +directory (e.g. tools/perf) and will be named stats-*.csv like:
>> +
>> +    stats-asm_pure_loop-out.csv
>> +    stats-bubble_sort-random.csv
>> +    ...
>> +
>> +These statistic files log some aspects of the AUX data sections in
>> +the perf data output counting some numbers of certain encodings (a
>> +good way to know that it's working in a very simple way). One problem
>> +with coresight is that given a large enough amount of data needing to
>> +be logged, some of it can be lost due to the processor not waking up
>> +in time to read out all the data from buffers etc.. You will notice
>> +that the amount of data collected can vary a lot per run of perf test.
>> +If you wish to see how this changes over time, simply run perf test
>> +multiple times and all these csv files will have more and more data
>> +appended to it that you can later examine, graph and otherwise use to
>> +figure out if things have become worse or better.
>> +
>> +Be aware that amny of these tests take quite a while to run, specifically
>> +in processing the perf data file and dumping contents to then examine what
>> +is inside.
>> +
>> +You can change where these csv logs are stored by setting the
>> +PERF_TEST_CORESIGHT_STATDIR environment variable before running perf
>> +test like:
>> +
>> +    export PERF_TEST_CORESIGHT_STATDIR=/var/tmp
>> +    perf test
>> +
>> +They will also store resulting perf output data in the current
>> +directory for later inspection like:
>> +
>> +    perf-memcpy-1m.data
>> +    perf-thread_loop-2th.data
>> +    ...
>> +
>> +You can alter where the perf data files are stored by setting the
>> +PERF_TEST_CORESIGHT_DATADIR environment variable such as:
>> +
>> +    PERF_TEST_CORESIGHT_DATADIR=/var/tmp
>> +    perf test
>> +
>> +You may wish to set these above environment variables if you which to
>> +keep the output of tests outside of the current working directory for
>> +longer term storage and examination.
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index ac861e42c8f7..b97db83992e0 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -630,7 +630,15 @@ sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
>>   $(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
>>   	$(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
>>   
>> -all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
>> +TESTS_CORESIGHT_DIR := $(srctree)/tools/perf/tests/shell/coresight
>> +
>> +tests-coresight-targets: FORCE
>> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR)
>> +
>> +tests-coresight-targets-clean:
>> +	$(Q)$(MAKE) -C $(TESTS_CORESIGHT_DIR) clean
>> +
>> +all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS) tests-coresight-targets
>>   
>>   # Create python binding output directory if not already present
>>   _dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
>> @@ -1020,6 +1028,7 @@ install-tests: all install-gtk
>>   		$(INSTALL) tests/shell/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell'; \
>>   		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
>>   		$(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'
>> +	$(Q)$(MAKE) -C tests/shell/coresight install-tests
>>   
>>   install-bin: install-tools install-tests install-traceevent-plugins
>>   
>> @@ -1088,7 +1097,7 @@ endif # BUILD_BPF_SKEL
>>   bpf-skel-clean:
>>   	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>>   
>> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean
>> +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
>>   	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>   	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
>>   	$(Q)$(RM) $(OUTPUT).config-detected
>> @@ -1155,5 +1164,6 @@ FORCE:
>>   .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
>>   .PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
>>   .PHONY: libtraceevent_plugins archheaders
>> +.PHONY: $(TESTS_CORESIGHT_TARGETS)
>>   
>>   endif # force_fixdep
>> diff --git a/tools/perf/tests/shell/coresight/Makefile b/tools/perf/tests/shell/coresight/Makefile
>> new file mode 100644
>> index 000000000000..dda99aeac158
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/Makefile
>> @@ -0,0 +1,30 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../../../../../tools/scripts/Makefile.include
>> +include ../../../../../tools/scripts/Makefile.arch
>> +include ../../../../../tools/scripts/utilities.mak
>> +
>> +SUBDIRS = \
>> +	asm_pure_loop \
>> +	thread_loop \
>> +	memcpy_thread \
>> +	unroll_loop_thread
>> +
>> +all: $(SUBDIRS)
>> +$(SUBDIRS):
>> +	$(Q)$(MAKE) -C $@
>> +
>> +INSTALLDIRS = $(SUBDIRS:%=install-%)
>> +
>> +install-tests: $(INSTALLDIRS)
>> +$(INSTALLDIRS):
>> +	$(Q)$(MAKE) -C $(@:install-%=%) install-tests
>> +
>> +CLEANDIRS = $(SUBDIRS:%=clean-%)
>> +
>> +clean: $(CLEANDIRS)
>> +$(CLEANDIRS):
>> +	$(Q)$(MAKE) -C $(@:clean-%=%) clean >/dev/null
>> +
>> +.PHONY: all clean $(SUBDIRS) $(CLEANDIRS) $(INSTALLDIRS)
>> +
>> diff --git a/tools/perf/tests/shell/coresight/Makefile.miniconfig b/tools/perf/tests/shell/coresight/Makefile.miniconfig
>> new file mode 100644
>> index 000000000000..893c12685fed
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/Makefile.miniconfig
>> @@ -0,0 +1,23 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +ifndef DESTDIR
>> +prefix ?= $(HOME)
>> +endif
>> +
>> +DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
>> +perfexecdir = libexec/perf-core
>> +perfexec_instdir = $(perfexecdir)
>> +
>> +ifneq ($(filter /%,$(firstword $(perfexecdir))),)
>> +perfexec_instdir = $(perfexecdir)
>> +else
>> +perfexec_instdir = $(prefix)/$(perfexecdir)
>> +endif
>> +
>> +perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
>> +INSTALL = install
>> +
>> +include ../../../../../scripts/Makefile.include
>> +include ../../../../../scripts/Makefile.arch
>> +include ../../../../../scripts/utilities.mak
>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>> new file mode 100644
>> index 000000000000..468673ac32e8
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>> @@ -0,0 +1 @@
>> +asm_pure_loop
>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>> new file mode 100644
>> index 000000000000..10c5a60cb71c
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>> @@ -0,0 +1,30 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +include ../Makefile.miniconfig
>> +
>> +BIN=asm_pure_loop
>> +LIB=
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).S
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).S -nostdlib -static -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>> new file mode 100644
>> index 000000000000..75cf084a927d
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>> @@ -0,0 +1,28 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
>> +
>> +.globl _start
>> +_start:
>> +	mov	x0, 0x0000ffff
>> +	mov	x1, xzr
>> +loop:
>> +	nop
>> +	nop
>> +	cbnz	x1, noskip
>> +	nop
>> +	nop
>> +	adrp	x2, skip
>> +	add 	x2, x2, :lo12:skip
>> +	br	x2
>> +	nop
>> +	nop
>> +noskip:
>> +	nop
>> +	nop
>> +skip:
>> +	sub	x0, x0, 1
>> +	cbnz	x0, loop
>> +
>> +	mov	x0, #0
>> +	mov	x8, #93 // __NR_exit syscall
>> +	svc	#0
>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>> new file mode 100644
>> index 000000000000..f8217e56091e
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/.gitignore
>> @@ -0,0 +1 @@
>> +memcpy_thread
>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/Makefile b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>> new file mode 100644
>> index 000000000000..e2604cfae74b
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/Makefile
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../Makefile.miniconfig
>> +
>> +BIN=memcpy_thread
>> +LIB=-pthread
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).c
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>> new file mode 100644
>> index 000000000000..a7e169d1bf64
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread/memcpy_thread.c
>> @@ -0,0 +1,79 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <pthread.h>
>> +
>> +struct args {
>> +	unsigned long loops;
>> +	unsigned long size;
>> +	pthread_t th;
>> +	void *ret;
>> +};
>> +
>> +static void *thrfn(void *arg)
>> +{
>> +	struct args *a = arg;
>> +	unsigned long i, len = a->loops;
>> +	unsigned char *src, *dst;
>> +
>> +	src = malloc(a->size * 1024);
>> +	dst = malloc(a->size * 1024);
>> +	if ((!src) || (!dst)) {
>> +		printf("ERR: Can't allocate memory\n");
>> +		exit(1);
>> +	}
>> +	for (i = 0; i < len; i++)
>> +		memcpy(dst, src, a->size * 1024);
>> +}
>> +
>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>> +{
>> +	pthread_t t;
>> +	pthread_attr_t attr;
>> +
>> +	pthread_attr_init(&attr);
>> +	pthread_create(&t, &attr, fn, arg);
>> +	return t;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	unsigned long i, len, size, thr;
>> +	pthread_t threads[256];
>> +	struct args args[256];
>> +	long long v;
>> +
>> +	if (argc < 4) {
>> +		printf("ERR: %s [copysize Kb] [numthreads] [numloops (hundreds)]\n", argv[0]);
>> +		exit(1);
>> +	}
>> +
>> +	v = atoll(argv[1]);
>> +	if ((v < 1) || (v > (1024 * 1024))) {
>> +		printf("ERR: max memory 1GB (1048576 KB)\n");
>> +		exit(1);
>> +	}
>> +	size = v;
>> +	thr = atol(argv[2]);
>> +	if ((thr < 1) || (thr > 256)) {
>> +		printf("ERR: threads 1-256\n");
>> +		exit(1);
>> +	}
>> +	v = atoll(argv[3]);
>> +	if ((v < 1) || (v > 40000000000ll)) {
>> +		printf("ERR: loops 1-40000000000 (hundreds)\n");
>> +		exit(1);
>> +	}
>> +	len = v * 100;
>> +	for (i = 0; i < thr; i++) {
>> +		args[i].loops = len;
>> +		args[i].size = size;
>> +		args[i].th = new_thr(thrfn, &(args[i]));
>> +	}
>> +	for (i = 0; i < thr; i++)
>> +		pthread_join(args[i].th, &(args[i].ret));
>> +	return 0;
>> +}
>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/.gitignore b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
>> new file mode 100644
>> index 000000000000..6d4c33eaa9e8
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/thread_loop/.gitignore
>> @@ -0,0 +1 @@
>> +thread_loop
>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/Makefile b/tools/perf/tests/shell/coresight/thread_loop/Makefile
>> new file mode 100644
>> index 000000000000..424df4e8b0e6
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/thread_loop/Makefile
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../Makefile.miniconfig
>> +
>> +BIN=thread_loop
>> +LIB=-pthread
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).c
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>> new file mode 100644
>> index 000000000000..c0158fac7d0b
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/thread_loop/thread_loop.c
>> @@ -0,0 +1,86 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +// define this for gettid()
>> +#define _GNU_SOURCE
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <pthread.h>
>> +#include <sys/syscall.h>
>> +#ifndef SYS_gettid
>> +// gettid is 178 on arm64
>> +# define SYS_gettid 178
>> +#endif
>> +#define gettid() syscall(SYS_gettid)
>> +
>> +struct args {
>> +	unsigned int loops;
>> +	pthread_t th;
>> +	void *ret;
>> +};
>> +
>> +static void *thrfn(void *arg)
>> +{
>> +	struct args *a = arg;
>> +	int i = 0, len = a->loops;
>> +
>> +	if (getenv("SHOW_TID")) {
>> +		unsigned long long tid = gettid();
>> +
>> +		printf("%llu\n", tid);
>> +	}
>> +	asm volatile(
>> +		"loop:\n"
>> +		"add %[i], %[i], #1\n"
>> +		"cmp %[i], %[len]\n"
>> +		"blt loop\n"
>> +		: /* out */
>> +		: /* in */ [i] "r" (i), [len] "r" (len)
>> +		: /* clobber */
>> +	);
>> +	return (void *)(long)i;
>> +}
>> +
>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>> +{
>> +	pthread_t t;
>> +	pthread_attr_t attr;
>> +
>> +	pthread_attr_init(&attr);
>> +	pthread_create(&t, &attr, fn, arg);
>> +	return t;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	unsigned int i, len, thr;
>> +	pthread_t threads[256];
>> +	struct args args[256];
>> +
>> +	if (argc < 3) {
>> +		printf("ERR: %s [numthreads] [numloops (millions)]\n", argv[0]);
>> +		exit(1);
>> +	}
>> +
>> +	thr = atoi(argv[1]);
>> +	if ((thr < 1) || (thr > 256)) {
>> +		printf("ERR: threads 1-256\n");
>> +		exit(1);
>> +	}
>> +	len = atoi(argv[2]);
>> +	if ((len < 1) || (len > 4000)) {
>> +		printf("ERR: max loops 4000 (millions)\n");
>> +		exit(1);
>> +	}
>> +	len *= 1000000;
>> +	for (i = 0; i < thr; i++) {
>> +		args[i].loops = len;
>> +		args[i].th = new_thr(thrfn, &(args[i]));
>> +	}
>> +	for (i = 0; i < thr; i++)
>> +		pthread_join(args[i].th, &(args[i].ret));
>> +	return 0;
>> +}
>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>> new file mode 100644
>> index 000000000000..2cb4e996dbf3
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/.gitignore
>> @@ -0,0 +1 @@
>> +unroll_loop_thread
>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>> new file mode 100644
>> index 000000000000..45ab2be8be92
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/Makefile
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +include ../Makefile.miniconfig
>> +
>> +BIN=unroll_loop_thread
>> +LIB=-pthread
>> +
>> +all: $(BIN)
>> +
>> +$(BIN): $(BIN).c
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(Q)$(CC) $(BIN).c -o $(BIN) $(LIB)
>> +endif
>> +endif
>> +
>> +install-tests: all
>> +ifdef CORESIGHT
>> +ifeq ($(ARCH),arm64)
>> +	$(call QUIET_INSTALL, tests) \
>> +		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)'; \
>> +		$(INSTALL) $(BIN) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/tools/$(BIN)/$(BIN)'
>> +endif
>> +endif
>> +
>> +clean:
>> +	$(Q)$(RM) -f $(BIN)
>> +
>> +.PHONY: all clean install-tests
>> diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>> new file mode 100644
>> index 000000000000..cb9d22c7dfb9
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/unroll_loop_thread/unroll_loop_thread.c
>> @@ -0,0 +1,74 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <pthread.h>
>> +
>> +struct args {
>> +	pthread_t th;
>> +	unsigned int in, out;
>> +	void *ret;
>> +};
>> +
>> +static void *thrfn(void *arg)
>> +{
>> +	struct args *a = arg;
>> +	unsigned int i, in = a->in;
>> +
>> +	for (i = 0; i < 10000; i++) {
>> +		asm volatile (
>> +// force an unroll of thia add instruction so we can test long runs of code
>> +#define SNIP1 "add %[in], %[in], #1\n"
>> +// 10
>> +#define SNIP2 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1 SNIP1
>> +// 100
>> +#define SNIP3 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2 SNIP2
>> +// 1000
>> +#define SNIP4 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3 SNIP3
>> +// 10000
>> +#define SNIP5 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4 SNIP4
>> +// 100000
>> +			SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5 SNIP5
>> +			: /* out */
>> +			: /* in */ [in] "r" (in)
>> +			: /* clobber */
>> +		);
>> +	}
>> +}
>> +
>> +static pthread_t new_thr(void *(*fn) (void *arg), void *arg)
>> +{
>> +	pthread_t t;
>> +	pthread_attr_t attr;
>> +
>> +	pthread_attr_init(&attr);
>> +	pthread_create(&t, &attr, fn, arg);
>> +	return t;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	unsigned int i, thr;
>> +	pthread_t threads[256];
>> +	struct args args[256];
>> +
>> +	if (argc < 2) {
>> +		printf("ERR: %s [numthreads]\n", argv[0]);
>> +		exit(1);
>> +	}
>> +
>> +	thr = atoi(argv[1]);
>> +	if ((thr > 256) || (thr < 1)) {
>> +		printf("ERR: threads 1-256\n");
>> +		exit(1);
>> +	}
>> +	for (i = 0; i < thr; i++) {
>> +		args[i].in = rand();
>> +		args[i].th = new_thr(thrfn, &(args[i]));
>> +	}
>> +	for (i = 0; i < thr; i++)
>> +		pthread_join(args[i].th, &(args[i].ret));
>> +	return 0;
>> +}
>> diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>> new file mode 100755
>> index 000000000000..3f0dbefcad50
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# Coresight / ASM Pure Loop
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="asm_pure_loop"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS=""
>> +DATV="out"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>> new file mode 100755
>> index 000000000000..8972af835016
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_memcpy_thread_16k_10.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# Coresight / Memcpy 16k 10 Threads
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="memcpy_thread"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="16 10 1"
>> +DATV="16k_10"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>> new file mode 100755
>> index 000000000000..5b468901f89b
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_10.sh
>> @@ -0,0 +1,19 @@
>> +#!/bin/sh -e
>> +# Coresight / Thread Loop 10 Threads - Check TID
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="thread_loop"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="10 1"
>> +DATV="check-tid-10th"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +STDO="$DATD/perf-$TEST-$DATV.stdout"
>> +
>> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
>> +
>> +perf_dump_aux_tid_verify "$DATA" "$STDO"
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>> new file mode 100755
>> index 000000000000..f8b7abd3aa03
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_thread_loop_check_tid_2.sh
>> @@ -0,0 +1,19 @@
>> +#!/bin/sh -e
>> +# Coresight / Thread Loop 2 Threads - Check TID
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="thread_loop"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="2 20"
>> +DATV="check-tid-2th"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +STDO="$DATD/perf-$TEST-$DATV.stdout"
>> +
>> +SHOW_TID=1 perf record -s $PERFRECOPT -o "$DATA" "$BIN" $ARGS > $STDO
>> +
>> +perf_dump_aux_tid_verify "$DATA" "$STDO"
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>> new file mode 100755
>> index 000000000000..c985dfb025c2
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight_unroll_loop_thread_10.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# Coresight / Unroll Loop Thread 10
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +TEST="unroll_loop_thread"
>> +. $(dirname $0)/lib/coresight.sh
>> +ARGS="10"
>> +DATV="10"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
>> diff --git a/tools/perf/tests/shell/lib/coresight.sh b/tools/perf/tests/shell/lib/coresight.sh
>> new file mode 100644
>> index 000000000000..6a611b073f02
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/lib/coresight.sh
>> @@ -0,0 +1,130 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>> +
>> +# This is sourced from a driver script so no need for #!/bin... etc. at the
>> +# top - the assumption below is that it runs as part of sourcing after the
>> +# test sets up some basic env vars to say what it is.
>> +
>> +# perf record options for the perf tests to use
>> +PERFRECMEM="-m ,128M"
>> +PERFRECOPT="$PERFRECMEM -e cs_etm//u"
>> +
>> +# These tests need to be run as root or coresight won't allow large buffers
>> +# and will not collect proper data
>> +UID=`id -u`
>> +if test "$UID" -ne 0; then
>> +	echo "Not running as root... skip"
>> +	exit 2
>> +fi
>> +
>> +TOOLS=$(dirname $0)
>> +DIR="$TOOLS/coresight/$TEST"
>> +BIN="$DIR/$TEST"
>> +# If the test tool/binary does not exist and is executable then skip the test
>> +if ! test -x "$BIN"; then exit 2; fi
>> +DATD="."
>> +# If the data dir env is set then make the data dir use that instead of ./
>> +if test -n "$PERF_TEST_CORESIGHT_DATADIR"; then
>> +	DATD="$PERF_TEST_CORESIGHT_DATADIR";
>> +fi
>> +# If the stat dir env is set then make the data dir use that instead of ./
>> +STATD="."
>> +if test -n "$PERF_TEST_CORESIGHT_STATDIR"; then
>> +	STATD="$PERF_TEST_CORESIGHT_STATDIR";
>> +fi
>> +
>> +# Called if the test fails - error code 2
>> +err() {
>> +	echo "$1"
>> +	exit 1
>> +}
>> +
>> +# Check that some statistics from our perf
>> +check_val_min() {
>> +	STATF="$4"
>> +	if test "$2" -lt "$3"; then
>> +		echo ", FAILED" >> "$STATF"
>> +		err "Sanity check number of $1 is too low ($2 < $3)"
>> +	fi
>> +}
>> +
>> +perf_dump_aux_verify() {
>> +	# Some basic checking that the AUX chunk contains some sensible data
>> +	# to see that we are recording something and at least a minimum
>> +	# amount of it. We should almost always see F3 atoms in just about
>> +	# anything but certainly we will see some trace info and async atom
>> +	# chunks.
>> +	DUMP="$DATD/perf-tmp-aux-dump.txt"
>> +	perf report --stdio --dump -i "$1" | \
>> +		grep -o -e I_ATOM_F3 -e I_ASYNC -e I_TRACE_INFO > "$DUMP"
>> +	# Simply count how many of these atoms we find to see that we are
>> +	# producing a reasonable amount of data - exact checks are not sane
>> +	# as this is a lossy  process where we may lose some blocks and the
>> +	# compiler may produce different code depending on the compiler and
>> +	# optimization options, so this is rough  just to see if we're
>> +	# either missing almost all the data or all of it
>> +	ATOM_F3_NUM=`grep I_ATOM_F3 "$DUMP" | wc -l`
>> +	ATOM_ASYNC_NUM=`grep I_ASYNC "$DUMP" | wc -l`
>> +	ATOM_TRACE_INFO_NUM=`grep I_TRACE_INFO "$DUMP" | wc -l`
>> +	rm -f "$DUMP"
>> +
>> +	# Arguments provide minimums for a pass
>> +	CHECK_F3_MIN="$2"
>> +	CHECK_ASYNC_MIN="$3"
>> +	CHECK_TRACE_INFO_MIN="$4"
>> +
>> +	# Write out statistics, so over time you can track results to see if
>> +	# there is a pattern - for example we have less "noisy" results that
>> +	# produce more consistent amounts of data each run, to see if over
>> +	# time any techinques to  minimize data loss are having an effect or
>> +	# not
>> +	STATF="$STATD/stats-$TEST-$DATV.csv"
>> +	if ! test -f "$STATF"; then
>> +		echo "ATOM F3 Count, Minimum, ATOM ASYNC Count, Minimum, TRACE INFO Count, Minimum" > "$STATF"
>> +	fi
>> +	echo -n "$ATOM_F3_NUM, $CHECK_F3_MIN, $ATOM_ASYNC_NUM, $CHECK_ASYNC_MIN, $ATOM_TRACE_INFO_NUM, $CHECK_TRACE_INFO_MIN" >> "$STATF"
>> +
>> +	# Actually check to see if we passed or failed.
>> +	check_val_min "ATOM_F3" "$ATOM_F3_NUM" "$CHECK_F3_MIN" "$STATF"
>> +	check_val_min "ASYNC" "$ATOM_ASYNC_NUM" "$CHECK_ASYNC_MIN" "$STATF"
>> +	check_val_min "TRACE_INFO" "$ATOM_TRACE_INFO_NUM" "$CHECK_TRACE_INFO_MIN" "$STATF"
>> +	echo ", Ok" >> "$STATF"
>> +}
>> +
>> +perf_dump_aux_tid_verify() {
>> +	# Specifically crafted test will produce a list of Tread ID's to
>> +	# stdout that need to be checked to  see that they have had trace
>> +	# info collected in AUX blocks in the perf data. This will go
>> +	# through all the TID's that are listed as CID=0xabcdef and see
>> +	# that all the Thread IDs the test tool reports are  in the perf
>> +	# data AUX chunks
>> +
>> +	# The TID test tools will print a TID per stdout line that are being
>> +	# tested
>> +	TIDS=`cat "$2"`
>> +	# Scan the perf report to find the TIDs that are actually CID in hex
>> +	# and build a list of the ones found
>> +	FOUND_TIDS=`perf report --stdio --dump -i "$1" | \
>> +			grep -o "CID=0x[0-9a-z]\+" | sed 's/CID=//g' | \
>> +			uniq | sort | uniq`
>> +
>> +	# Iterate over the list of TIDs that the test says it has and find
>> +	# them in the TIDs found in the perf report
>> +	MISSING=""
>> +	for TID2 in $TIDS; do
>> +		FOUND=""
>> +		for TIDHEX in $FOUND_TIDS; do
>> +			TID=`printf "%i" $TIDHEX`
>> +			if test "$TID" -eq "$TID2"; then
>> +				FOUND="y"
>> +				break
>> +			fi
>> +		done
>> +		if test -z "$FOUND"; then
>> +			MISSING="$MISSING $TID"
>> +		fi
>> +	done
>> +	if test -n "$MISSING"; then
>> +		err "Thread IDs $MISSING not found in perf AUX data"
>> +	fi
>> +}
>> -- 
>> 2.32.0
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files
  2022-05-26 10:14       ` Leo Yan
@ 2022-06-13 13:08         ` Carsten Haitzler
  2022-06-15  9:14           ` Leo Yan
  0 siblings, 1 reply; 19+ messages in thread
From: Carsten Haitzler @ 2022-06-13 13:08 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme



On 5/26/22 11:14, Leo Yan wrote:
> On Thu, Apr 21, 2022 at 05:21:27PM +0100, Carsten Haitzler wrote:
>> On 4/10/22 03:28, Leo Yan wrote:
>>> On Wed, Mar 09, 2022 at 12:28:58PM +0000, carsten.haitzler@foss.arm.com wrote:
>>>> From: Carsten Haitzler <carsten.haitzler@arm.com>
>>>>
>>>> You edit your scripts in the tests and end up with your usual shell
>>>> backup files with ~ or .bak or something else at the end, but then your
>>>> next perf test run wants to run the backups too. You might also have perf
>>>> .data files in the directory or something else undesireable as well. You end
>>>> up chasing which test is the one you edited and the backup and have to keep
>>>> removing all the backup files, so automatically skip any files that are
>>>> not plain *.sh scripts to limit the time wasted in chasing ghosts.
>>>>
>>>> Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
>>>>
>>>> ---
>>>>    tools/perf/tests/builtin-test.c | 17 +++++++++++++++--
>>>>    1 file changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
>>>> index 3c34cb766724..3a02ba7a7a89 100644
>>>> --- a/tools/perf/tests/builtin-test.c
>>>> +++ b/tools/perf/tests/builtin-test.c
>>>> @@ -296,9 +296,22 @@ static const char *shell_test__description(char *description, size_t size,
>>>>    #define for_each_shell_test(entlist, nr, base, ent)	                \
>>>>    	for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++)	\
>>>> -		if (!is_directory(base, ent) && \
>>>> +		if (ent->d_name[0] != '.' && \
>>>> +			!is_directory(base, ent) && \
>>>>    			is_executable_file(base, ent) && \
>>>> -			ent->d_name[0] != '.')
>>>> +			is_shell_script(ent->d_name))
>>>
>>> Just nitpick: since multiple conditions are added, seems to me it's good
>>> to use a single function is_executable_shell_script() to make decision
>>> if a file is an executable shell script.
>>
>> I'd certainly make a function if this was being re-used, but as the "coding
>> pattern" was to do all the tests already inside the if() in only one place,
>> I kept with the style there and didn't change the code that didn't need
>> changing. I can rewrite this code and basically make a function that is just
>> an if ...:
>>
>> bool is_exe_shell_script(const char *base, struct dirent *ent) {
>>     return ent->d_name[0] != '.'         && !is_directory(base, ent) &&
>>            is_executable_file(base, ent) && is_shell_script(ent->d_name);
>> }
>>
>> And macro becomes:
>>
>> #define for_each_shell_test(entlist, nr, base, ent) \
>>    for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++) \
>>      if (is_shell(base, ent))
> 
> Sorry for long latency.

No problem.

> If the condition checking gets complex, seems to me it is reasonable to
> use a static function (or a macro?) to encapsulate the logics.

Well normally my rule i s - if it gets re-used then do it, otherwise it 
just involves more indirection to follow. :) But regardless of that, 
given some other things you ask for that kind of makes this discussion 
moot as it requires much bigger wholesale changes to the test infra 
which will make these patches a lot more work. I'll get to that later in 
mails.

>> But one catch... it really should be is_non_hidden_exe_shell_script() as
>> it's checking that it's not a hidden file AND is a shell script. Or do I
>> keep the hidden file test outside of the function in the if? If we're nit
>> picking then I need to know exactly what you want here as your suggested
>> name is actually incorrect.
> 
> I personally prefer to use the condition:
> 
>    if (is_exe_shell_script() && ent->d_name[0] != '.')
>        do_something...
> 
> The reason is the function is_exe_shell_script() is more common and we
> use it easily in wider scope.

As above - will probably have to redo a lot of the test infra involving 
the shell tests to handle some of your other requests, but if we don't 
go that way, I have got where you want to go and I can do this.

>>> And the condition checking 'ent->d_name[0] != '.'' would be redundant
>>> after we have checked the file suffix '.sh'.
>>
>> This isn't actually redundant. You can have .something.sh :) If the idea is
>> we skip anything with a . at the start first always... then the if (to me)
>> is obvious.
> 
> Yeah, I agree the checking the start char '.' is the right thing
> to do.

:)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated
  2022-05-26  8:20       ` Leo Yan
  2022-05-26 16:08         ` Leo Yan
@ 2022-06-13 14:15         ` Carsten Haitzler
  1 sibling, 0 replies; 19+ messages in thread
From: Carsten Haitzler @ 2022-06-13 14:15 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme



On 5/26/22 09:20, Leo Yan wrote:
> Hi Carsten,
> 
> Sorry for late response.
> 
> On Thu, Apr 21, 2022 at 06:38:33PM +0100, Carsten Haitzler wrote:
> 
> [...]
> 
>>> Very big change...  Why squash all patches form previous verion to this
>>> single one big patch?  Usually the format with small patches is much
>>> better for reviewing.
>>
>> I was asked to re-jig the tree and in doing so I also ended up cutting down
>> the size a lot so this just makes more sense together as a "here are the
>> tests" as adding infra without any tests makes no sense and the tests
>> themelves are self-contained in their own directories and source files and
>> "drivign scripts" thus it's essentially patch 1 appended to patch 2 to patch
>> 3 etc. and still broken up in the patch file by file.
> 
> I am not sure if I understand the meaning, seems to me you could
> organize the patch series like:
> 
> - Patch for common files (e.g. script lib/coresight.sh or some
>    Makefile changes);
> - Patches for enabling test cases, E.g.:
>    patch for asm_pure_loop;
>    patch for thread loop (include unroll loop);
>    patch for memcpy;
> - Patch for documentation.

Each of those last patches are simple a patch with added "fluff" to add 
a line to a makefile each time then entire new files. A single patch set 
already divides each of these into their own files in the same patch are 
already divided - the only difference is - do you get the added fluff 
fiddling to add files to the makefile or just have them all added at 
once and the rest is already broken up. If you scroll through a patch 
with 3 files, or have 3 patches with 1 file in each - it's the same in 
the end. It's divided either way, but creating a patch set where one 
patch after the other adds a new test and each adds a line to the same 
makefle is a more work as I can't just git add/commit the test files to 
each patch. I have to first remove all tests. Then add the parent 
makefile for the tools with 1 line in it for 1 tool and 1 set of patch 
scripts. I then have to add the next one and edit this parent makefile 
to add one line and so on - where it actually makes the total size of 
all the patches bigger as this makefile that lists all the child dirs 
keeps being modified only to just add a single line added to it and the 
rest is all stand-alone new files anyway. As all the changes to this 
parent makefile all modify the same context, they can't be  separated 
out and cherry-picked as they then get conflicts in the context anyways 
so the patches can't be chosen separately without resolving conflicts 
anyway, thus to me it makes no sense to do all the extra work for a 
bigger total patch set that can't be split out without conflicts anyway...

But I kind of give up. It seems there is a view (that I don't understand 
as over-all the data is more complex and larger to review when broken 
out) that more data in N patches is preferable to less data in a single 
patch (even though it's all broken out into different file blocks in the 
same patch so you scroll down and basically in essence "cat *.patch" and 
"cat onefile.patch" produce the same thing other than the "add one more 
line to this makefile"). I'll break it up - it just is more busy-work to 
create a bigger overall patch set with more changes to look at overall.

> If this is not comfortable for you, at least we can use three patches:
> 
> - Patch for common file (e.g. script lib/coresight.sh);
> - Patch for test cases;
> - Patch for documentation.

The documentation I can see breaking out - though that wants to be moved 
somewhere else in the tree now and not alongside the perf tool. I 
generally go for the view of documentation belongs with the thing it 
documents and if there is new information to document, it belongs with 
the patch introducing what it documents. :)

> [...]
> 
>>>> +If you see these above, then your system is tracing coresight data
>>>> +correctly.
>>>> +
>>>> +To compile perf with coresight support in the perf directory do
>>>> +
>>>> +    make CORESIGHT=1
>>>
>>> It is inaccurate that if we don't mention openCSD lib.
>>
>> Do you mean I need to mention that you need the opencsd library installed
>> too?
> 
> Yes, otherwise, users might directly build perf without opencsd lib,
> then finally they cannot use perf with Arm CoreSight.

OK - sure. Will document that OpenCSD is needed and where to possibly 
get it (git, packages etc.).

>>>> +This will compile the perf tool with coresight support as well as
>>>> +build some small test binaries for perf test. This requires you also
>>>> +be compiling for 64bit Arm (ARM64/aarch64). The tools run as part of
>>>> +perf coresight tracing are in tests/shell/tools/coresight.
>>>
>>> For build perf tool, I think above paragraphs are duplicate with the
>>> document Documentation/trace/coresight/coresight.rst.  Can we simply
>>> say:
>>>
>>> "The details for building perf tool with support Arm Coresight can be
>>> found in the "HOWTO.md" file of the openCSD gitHub repository:
>>> https://github.com/Linaro/opencsd.
>>
>> I can. I put this here as I didn't go clone OpencCSD first but used my
>> distro OpenCSD packages and thus of course didn't have the documentation in
>> front of me. I spent some time wondering why it wasn't building with
>> coresight support even though it detected OpenCSD when I compiled... I
>> didn't expect to have to go to some separate project git repository and read
>> docs there on how to build the perf tool here in the kernel. I wrote this
>> because it was an actual problem I hit and it's a lot less frustrating to
>> "end users" to give them the information they need in the relevant place
>> they need it instead of sending them around to other project trees. Building
>> perf with coresight support is handled by the perf tree int he kernel, not
>> OpenCSD, thus IMHO that is where the documentation belongs - alongside the
>> thing that determines how to build something.
> 
> Understand.

I'll move this over to the place that Matthieu and you mention 
(Documentation/trace/coresight), but I do think this kind of information 
is necessary as above, so thanks. :)

>>> And "HOWTO.md" file gives the information and examples for how to use
>>> perf tool to record and report Coresight trace data.  It's the
>>> prerequisite for this perf Coresight test."
>>>
>>>> +You will also want coresight support enabled in your kernel config.
>>>> +Ensure it is enabled with:
>>>> +
>>>> +    CONFIG_CORESIGHT=y
>>>> +
>>>> +There are various other coresight options you probably also want
>>>> +enabled like:
>>>> +
>>>> +    CONFIG_CORESIGHT_LINKS_AND_SINKS=y
>>>> +    CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
>>>> +    CONFIG_CORESIGHT_CATU=y
>>>> +    CONFIG_CORESIGHT_SINK_TPIU=y
>>>> +    CONFIG_CORESIGHT_SINK_ETBV10=y
>>>> +    CONFIG_CORESIGHT_SOURCE_ETM4X=y
>>>> +    CONFIG_CORESIGHT_STM=y
>>>> +    CONFIG_CORESIGHT_CPU_DEBUG=y
>>>> +    CONFIG_CORESIGHT_CTI=y
>>>> +    CONFIG_CORESIGHT_CTI_INTEGRATION_REGS=y
>>>> +
>>>> +Please refer to the kernel configuration help for more information.
>>>
>>> I prefer to remove these kernel configuration since they are not
>>> inconsistent on different platforms (e.g. ETBV10, ETM4X, etc), and
>>> some configurations might not necessary (e.g. CPU_DEBUG).
>>
>> Certainly there should be some documentation on which kernel configs you
>> might want to turn on then? Imagine someone new comes along and doesn't have
>> any idea what to possible enable at all and manages to build perf with
>> coresight support (as above) then finds it doesn't work because they didn't
>> enable enough config in the kernel? Sure - could probably trim these down a
>> bit but the point here is to alert the user to there being a range of
>> coresight config options that you need to turn on that you likely will find
>> are not turned on. They certainly are not turned on on distro kernels and a
>> lot of the time when you have a platform that already boots/works you start
>> with your distro kernel config file because you want everything enabled so
>> it actually boots. I've learned the hard way to do this as you manage to
>> forget to turn on some MMC driver or some other feature and your boot hangs
>> or doesn't find rootfs etc.
> 
> So far, we will have two documents in Linux kernel:
> 
> - Documentation/trace/coresight/coresight.rst;
> - tools/perf/Documentation/arm-coresight.txt.
> 
> We need to avoid overlap between these two files.  I think we could use
> the file Documentation/trace/coresight/coresight.rst to focus on
> CoreSight driver module relates stuffs and
> tools/perf/Documentation/arm-coresight.txt is more about the perf
> usages.

That's the case here with this patch set. To me - I would look for the 
perf docs where I put them - in the perf documentation directory, not 
somewhere else.

> But, the file Documentation/trace/coresight/coresight.rst doesn't give
> any info for kernel configs, I think which would be a better place to
> give information for building kernel modules.

Correct. I do think you're right. I think that the "core docs" should 
give this information and the perf tool docs should at least reference 
these so you know to look there. I think I should at least put a stub 
doc in the perf tool doc tree to tell someone to go look at the "core 
docs" for more information.

>> What would you recommend then as a "turn these on and coresight will almost
>> certainly work for you on your given hardware " then?
> 
> This would be fine.  Alternatively, we could add a section in the file
> Documentation/trace/coresight/coresight.rst to describe how to build
> CoreSight modules.

I think that is good/best and then "link" the docs so someone knows 
where to look for more information.

> How you think for this?   I also would like to get suggestions from
> CoreSight maintainers Suzuki/Mathieu/Mike.
> 
> [...]
> 
>>> Please update based on the latest test case names, at my side, I can
>>> see the testing case like:
>>>
>>>          Coresight / ASM Pure Loop
>>>          Coresight / Memcpy 16k 10 Threads
>>>          Coresight / Thread Loop 10 Threads - Check TID
>>>          Coresight / Thread Loop 2 Threads - Check TID
>>>          Coresight / Unroll Loop Thread 10
>>
>> Oh sorry - yeah. I wrote the docs based on the earlier tests. Will fix.
> 
> Thanks.
> 
>>>> +
>>>> +These perf record tests will not run if the tool binaries do not exist
>>>> +in tests/shell/tools/coresight/*/ and will be skipped. If you do not
>>>> +have coresight support in hardware then either do not build perf with
>>>> +coresight support or remove these binaries in order to not have these
>>>> +tests fail and have them skip instead.
>>>> +
>>>> +These tests will log historical results in the current working
>>>> +directory (e.g. tools/perf) and will be named stats-*.csv like:
>>>> +
>>>> +    stats-asm_pure_loop-out.csv
>>>> +    stats-bubble_sort-random.csv
>>>> +    ...
>>>> +
>>>> +These statistic files log some aspects of the AUX data sections in
>>>> +the perf data output counting some numbers of certain encodings (a
>>>> +good way to know that it's working in a very simple way). One problem
>>>> +with coresight is that given a large enough amount of data needing to
>>>> +be logged, some of it can be lost due to the processor not waking up
>>>> +in time to read out all the data from buffers etc.. You will notice
>>>> +that the amount of data collected can vary a lot per run of perf test.
>>>> +If you wish to see how this changes over time, simply run perf test
>>>> +multiple times and all these csv files will have more and more data
>>>> +appended to it that you can later examine, graph and otherwise use to
>>>> +figure out if things have become worse or better.
>>>
>>> I am confused by this narrative.  Does it try to remind that the final
>>> testing result (pass or fail) is not stable?  Or should we run for
>>> multiple times so have more chance to capture issues?
>>
>> That is correct. I thought I was clear that it's lossy. That is actually the
>> case. I have tests here that actually fail because there is no data
>> collected from some threads at all (missing CID blocks for some of the
>> threads that run in the test). The point is to have tests that may be
>> failing now but in future will improve. I lowered the minimum bar to pass
>> for most tests to have "at least just a little data" but most tests show
>> highly variable amount of captured data. the csv files are there to
>> over-time give you a good idea of the stability of the captured data.
> 
> Okay, this would be fine for me.  Though I am a bit worry that later if
> users report a failure, then how we can tell them this is a bug or it's
> just tracing quality issue?

Well this is kind of both a bug and a quality issue. Reality is I have 
tests that literally get no information in the perf trace for some 
threads. Reality is we have what I'd call "Quality of trace" bugs and I 
was doing these test as a way of beginning to explore that and quantify 
it as that is not something the existing tests were doing. We want to 
have tests that in future with new designs show the quality bug to be 
fixed. It's kind of test driven development. Expose the quality bug, 
document it as such and then in future show it to be fixed when that 
happens (v9 should fix this...). I probably should document this as such 
(as a quality bug and why such failures are to be expected) so I'll add 
that to my TODO to make sure it's clear that these tests might fail and 
it's a quality issue (I mention the CSV files that are generated as a 
way to over-time track that quality).

> [...]
> 
>>>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>>> new file mode 100644
>>>> index 000000000000..468673ac32e8
>>>> --- /dev/null
>>>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/.gitignore
>>>> @@ -0,0 +1 @@
>>>> +asm_pure_loop
>>>
>>> Do we really need there '.gitignore' files under the folder
>>> 'tools/perf/tests/shell/coresight/'.
>>
>> Where would you rather have them to ignore the generated binary tools?
> 
> It's interesting that I wanted to find a case to object you, so I tried
> to check the folder linux/samples/bpf, but it does use .gitignore file
> to ignore built binaries :)
> 
> Adding .gitignore is in practice and this would be fine for me.

:)

>>>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>>> new file mode 100644
>>>> index 000000000000..10c5a60cb71c
>>>> --- /dev/null
>>>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/Makefile
>>>> @@ -0,0 +1,30 @@
>>>> +# SPDX-License-Identifier: GPL-2.0
>>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>>> +
>>>> +include ../Makefile.miniconfig
>>>> +
>>>> +BIN=asm_pure_loop
>>>> +LIB=
>>>
>>> Remove the unused variable 'LIB='.
>>
>> I have this because I wanted to have a simple template to be able to re-use
>> for more tests over time. It's so much easier to maintain and extend if
>> every makefile and tool follow a similar pattern and you can almost copy &
>> paste between them as they don't have "exceptions". You really want me to
>> remove this?
> 
> It's fine to keep it.  Could you add a comment for this?

No problems! Can do.

> To be honest, I am not experienced for bash shell script, so I have no
> idea why write like this way.  If you think this is very common usage
> in shell, then you could keep it and don't need to add comment.

Well this is a makefile above (not shell), but having done lots of 
makefile and other higher level make tooling (autotools, cmake, meson) 
and having to maintain larger trees with "We build the same kind of 
thing N times but each module/tool needs slightly different 
linking/includes/tooling" ... I have learned the value of using 
templates that are the same but sometimes you have empty fields or 
fields that change. When you have to fix/change a "design pattern" issue 
it's easier to do when the files share more in common and it's easier 
for someone to start a new tool/module/subdir by taking an existing 
template and just modifying it as needed and everything they need is 
there. It really makes maintenance so much easier in the long-run as you 
expand the set of things something builds. Design patterns in your build 
system really help. :)

> [...]
> 
> 
>>> There have four sub folders under tools/perf/tests/shell/coresight:
>>>
>>>     asm_pure_loop
>>>     memcpy_thread
>>>     thread_loop
>>>     unroll_loop_thread
>>>
>>> And every folder has its own Makefile and every Makefile is quite
>>> close to each other.  I am just wandering if it's possible to
>>> remove the 4 Makefiles in these four sub folders, and simply use
>>> tools/perf/tests/shell/coresight/Makefile as the central place to
>>> build these assistant programs.
>>
>> I did this so it's easier to etxent over time. having a single parent
>> makefile that over time accumulates little ugly "if's" and exceptions makes
>> longer-term maintenance and extending harder. I did it this way to make this
>> easy - make a copy of a dir - add that dir to a parent makefile then modify
>> the makefile as needed (but only as needed).
> 
> Okay, let's keep the saperate makefiles.

:)

>>>> diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>>> new file mode 100644
>>>> index 000000000000..75cf084a927d
>>>> --- /dev/null
>>>> +++ b/tools/perf/tests/shell/coresight/asm_pure_loop/asm_pure_loop.S
>>>> @@ -0,0 +1,28 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>>> +/* Tamas Zsoldos <tamas.zsoldos@arm.com>, 2021 */
>>>> +
>>>> +.globl _start
>>>> +_start:
>>>> +	mov	x0, 0x0000ffff
>>>> +	mov	x1, xzr
>>>> +loop:
>>>> +	nop
>>>> +	nop
>>>> +	cbnz	x1, noskip
>>>> +	nop
>>>> +	nop
>>>> +	adrp	x2, skip
>>>> +	add 	x2, x2, :lo12:skip
>>>> +	br	x2
>>>> +	nop
>>>> +	nop
>>>> +noskip:
>>>> +	nop
>>>> +	nop
>>>> +skip:
>>>> +	sub	x0, x0, 1
>>>> +	cbnz	x0, loop
>>>> +
>>>> +	mov	x0, #0
>>>> +	mov	x8, #93 // __NR_exit syscall
>>>> +	svc	#0
>>>
>>> I tested the case "ASM Pure Loop" on my Juno board, and it complaints:
>>>
>>> root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
>>>    76: Coresight / ASM Pure Loop                                       :
>>> --- start ---
>>> test child forked, pid 9063
>>> failed to mmap with 12 (Cannot allocate memory)
>>> test child finished with -1
>>> ---- end ----
>>> Coresight / ASM Pure Loop: FAILED!
>>>
>>> Since I only setup the 1GB memory for the Linux kernel, it fails to
>>> allocate AUX ring buffer with the size 256MB.  So I manully change
>>> the buffer size to 8MB in tools/perf/tests/shell/lib/coresight.sh:
>>>
>>>     PERFRECMEM="-m ,8M"
>>>
>>> So finally I can see the test case is passed:
>>
>> This is artificial isn't it? limiting to 1GB. You certainly have far more
>> memory than that available. My testse were on a system with 4GB and I had no
>> issues.
> 
> Please see below comment.
> 
>>> root@debian:/mnt/export/arm-linux-kernel/tools/perf# ./perf test -v 76
>>>    76: Coresight / ASM Pure Loop                                       :
>>> --- start ---
>>> test child forked, pid 9481
>>> -m ,8M -e cs_etm//u
>>> [ perf record: Woken up 1 times to write data ]
>>> [ perf record: Captured and wrote 0.681 MB ./perf-asm_pure_loop-out.data ]
>>> test child finished with 0
>>> ---- end ----
>>> Coresight / ASM Pure Loop: Ok
>>>
>>> Do you think we really need to use 256MiB as the AUX buffer size?
>>> IIRC, it means we allocate 256MiB per CPU for this case, on the other
>>> hand, you could see the final perf data file size is small (0.681
>>> MiB).
>>>
>>> Seems to me, it's not necessary to allocate so big buffer for
>>> the test, and I tried to run below 4 cases with 8MiB, all of them can
>>> pass the testing :)
>>
>> I didn't think anyone with a system with coresight support that would be
>> running perf record locally would only have 1GB of ram... I knew junos had
>> 8GB and my dragonboard has 4GB ... so I know I was on the smaller side. I
>> thought a larger buffer == safer results (less chance of needing to write
>> out the buffer during capture). Admittdly I used 256Mb when my tests ran for
>> much longer and collected more data. I can try drop to 8 or 16gb and see.
> 
> Yes, my Juno board has 8GB but I also have DB410c with 1GB with quad
> coes [1].  I am still concern for 256MB buffer size, it's not friendly for
> embedded system, and even not good for server.  For example, if we run
> this testing on Arm server with 96 cores (like Hisilicon D06 board),
> then we need the buffer size is:
> 
>    256MiB * 96 = 16GiB

A bit more actually 24M - but I see you correct this already in a 
following mail, but your point remains the same. :)

> I agree usually 16GiB is not a problem for server, but seems to me
> it doesn't make much sense to consume huge memory resource for the
> testing.

Even 24GB is not a problem IMHO if you have 96 cores, but ... I see your 
point.

> In other words, if set 8MiB (or 16MiB, 32MiB) buffer size and doesn't
> see testing result regression, I think this would be good to decrease
> the buffer size.

I'll drop it down to something less - sure. Maybe between 16-64M and 
that should mean everything you have there is able to do this without 
issues. 1GB with 8 cores would only need 512M for buffers - you have a 
lot left over. So 64M or less would be just fine I think (and with 4 
cores... even more headroom).

> [1] https://www.96boards.org/product/dragonboard410c/
> 
> [...]
> 
>>>> diff --git a/tools/perf/tests/shell/coresight_asm_pure_loop.sh b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>>> new file mode 100755
>>>> index 000000000000..3f0dbefcad50
>>>> --- /dev/null
>>>> +++ b/tools/perf/tests/shell/coresight_asm_pure_loop.sh
>>>> @@ -0,0 +1,18 @@
>>>> +#!/bin/sh -e
>>>> +# Coresight / ASM Pure Loop
>>>> +
>>>> +# SPDX-License-Identifier: GPL-2.0
>>>> +# Carsten Haitzler <carsten.haitzler@arm.com>, 2021
>>>> +
>>>> +TEST="asm_pure_loop"
>>>> +. $(dirname $0)/lib/coresight.sh
>>>> +ARGS=""
>>>> +DATV="out"
>>>> +DATA="$DATD/perf-$TEST-$DATV.data"
>>>> +
>>>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>>>> +
>>>> +perf_dump_aux_verify "$DATA" 10 10 10
>>>> +
>>>> +err=$?
>>>> +exit $err
>>>
>>> Can we organize the shell scripts by moving them into the folder
>>> tools/perf/tests/shell/coresight?
>>
>> We can - but it comes with a fair few more changes.
>>
>>>     coresight_asm_pure_loop.sh
>>>     coresight_memcpy_thread_16k_10.sh
>>>     coresight_thread_loop_check_tid_10.sh
>>>     coresight_thread_loop_check_tid_2.sh
>>>     coresight_unroll_loop_thread_10.sh
>>>
>>> And we even can consider to move script test_arm_coresight.sh into
>>> the folder tools/perf/tests/shell/coresight and change its
>>> name as 'coresight_smoke_test.sh'.
>>
>> Indeed these other tests I left alone for now and had not thought about how
>> to marry these together yet - leaving this for another day and another patch
>> set rather than this patch set itself. That was my thoguht. I was trying to
>> make an "Easier to extend by just dropping a test into a dir" setup here to
>> make maintenance and expansion easier over time (and thus encourage testing
>> by having a simple repeatable test infra to duplicate). I ended up with a
>> dir per test tool you need to build and a driver script in the tests/shell
>> dir. I think this is certainly worth considering but perhaps as a separate
>> set of work to marry these?
> 
> Okay, it would be fine to use separate set for moving the script
> test_arm_coresight.sh, which is a simple case.

I think that breaking out test_arm_coresight.sh into sub-tests that are 
stand-alone is better. I've had to deal with test suites that have one 
test case and that test fails/succeeds BUT this test case actually tests 
30+ API calls or something like that and one of those fail. Which one? 
You don't immediately know. You have to now go digging through log files 
to find out. Breaking this out into less efficient but easier to see 
"one test per thing you test" is much better and long-run really saves 
time and headaches, but this is "let's do this another day" but it is 
why I am over-engineering the tests with the view that these will expand 
in number and move over such tests.

>> I piggybacked on the existing shell test infra but added a fair few more
>> scripts. To do what you suggest I'd need to modify the core shell test code
>> to walk subdirs recursively then looking for child scripts. The problem is
>> how does perf test's shell handling know about the coresight subdir vs the
>> lib subdir?
> 
> Yeah, now I understand your point.  How about file layout like below?
> 
>    tools/perf/tests/shell/coresight_test_hub.sh
>    tools/perf/tests/shell/coresight/coresight_asm_pure_loop.sh
>    tools/perf/tests/shell/coresight/coresight_memcpy_thread_16k_10.sh
>    tools/perf/tests/shell/coresight/coresight_thread_loop_check_tid_10.sh
>    tools/perf/tests/shell/coresight/coresight_thread_loop_check_tid_2.sh
>    tools/perf/tests/shell/coresight/coresight_unroll_loop_thread_10.sh
> 
> So we use tools/perf/tests/shell/coresight_test_hub.sh as an interface
> to hook with Perf test infrastructure, and then hub.sh file calls
> testing scripts under the sub folder.  Seems to me, this is also
> friendly for later's extension.

Aaaaah here I disagree. The current test infra will only see 
"coresight_test_hub.sh" and then that all passes, fails or skips as a 
single test (see above). ub-tests are hidden when you do "perf test". 
I've had to deal with this before and it is (IMHO) a horrible way to go 
as you spend time then re-running the tests by hand and digging through 
other logs to see what failed rather than the toplevel tests indicating 
right there pass, fail or skip. Over time it becomes a time-sink to hunt 
these and mystifies new people who don't know where to dig for the 
specific failure.

To break them up like you want into subdirs then requires me to go to 
perf test (builtin-script.c) and change how the walking of test subdirs 
works with shells. It seems to have some legacy code in there that walks 
some "lang" dirs with "bin" dirs in them that seemingly do not exist 
today in the kernel tree. The right way to do this if you want to break 
this out into a coresight dir would be to modify how perf test walks 
dirs and builds the list of tests. I'd have to walk all subdirs then 
specifically filter for tests it knows will exist (executable, end in 
.sh, maybe be named like parentdir/parentdir_XXX.sh like your above 
suggestion of coresight/coresight_XXX.sh etc.). This now involves 
re-jigging a lot more code and abstracting the walking into a single 
function that walks once then stores the data in an array of struct 
pef_shell_test or such which now is filtered (and sorted which we should 
do which we don't now), and thus clean this all up, remove the 
(seemingly) legacy lang/bin dir walk code (I need to dig through history 
to find out where this came in and why it was there) and so on.

This certainly then makes this set of patches bigger and I need to break 
out this "revamp" of the walking of test shell scripts a preparation 
patch, but just saying: It certainly raises the bar for work and I don't 
think having a single parent .sh that then walks all the children and 
produces a single test output is right. I could popen this test and read 
the output to generate multiple output tests but, IMHO, that is worse 
and more complex than fixing the tree walk above to build an index of 
all shell tests once etc. like:

struct script_file {
    char *dir;
    char *file;
    char *label;
}

struct script_file *list_script_files(void);

... so we run this whenever we want a list of shell tests and just 
iterate over it until a NULL member (dir, file, label are all NULL). 
Simple and does the job but moves all the walking into a single function 
instead of there being like 3 of them now.

>> Both contain *.sh shell scripts - the difference is the ones in
>> lib are not executable. Is this sufficiently different? I could also open
>> them to check the have #!/bin/... as the first line. Hardcoding just a
>> single coresight subdir just feels wrong and hacky to me, thus the generic
>> recursion solution I suggest here.
> 
> Agreed.  I also don't prefer this way.
> 
>> I can definitely see how extending to subdirs would make supporting testing
>> cleaner and divide things into their own domains (dirs).
> 
> Thanks a lot for the work!  The test cases are good for me (but I would
> say Mike is the best person for reviewing testing trace data quaility),
> I just want to make sure it's not hard for later maintenance.

Sure. Let me at least make a start on the shell-test walk code and 
cleaning it up. I've just been staring at this and trying to find a 
"better way" but I'm pretty much at "Do it my current way with no subdir 
and keep the patch simple" or "Make this a more complex a patch to try 
streamline the code and abstract the shell tests better and include 
sub-dir walking" as I describe above. Long term I think this is actually 
a very good idea to clean this up and allow tests then in general to go 
into domain-specific subdirs. It just brings forward this kind of work 
ahead of getting extra tests in. I can't really see another sensible option.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files
  2022-06-13 13:08         ` Carsten Haitzler
@ 2022-06-15  9:14           ` Leo Yan
  0 siblings, 0 replies; 19+ messages in thread
From: Leo Yan @ 2022-06-15  9:14 UTC (permalink / raw)
  To: Carsten Haitzler
  Cc: linux-kernel, coresight, suzuki.poulose, mathieu.poirier,
	mike.leach, linux-perf-users, acme

On Mon, Jun 13, 2022 at 02:08:30PM +0100, Carsten Haitzler wrote:

[...]

> > If the condition checking gets complex, seems to me it is reasonable to
> > use a static function (or a macro?) to encapsulate the logics.
> 
> Well normally my rule i s - if it gets re-used then do it, otherwise it just
> involves more indirection to follow. :) But regardless of that, given some
> other things you ask for that kind of makes this discussion moot as it
> requires much bigger wholesale changes to the test infra which will make
> these patches a lot more work. I'll get to that later in mails.

Your mentioned rule makes sense to me.

> > > But one catch... it really should be is_non_hidden_exe_shell_script() as
> > > it's checking that it's not a hidden file AND is a shell script. Or do I
> > > keep the hidden file test outside of the function in the if? If we're nit
> > > picking then I need to know exactly what you want here as your suggested
> > > name is actually incorrect.
> > 
> > I personally prefer to use the condition:
> > 
> >    if (is_exe_shell_script() && ent->d_name[0] != '.')
> >        do_something...
> > 
> > The reason is the function is_exe_shell_script() is more common and we
> > use it easily in wider scope.
> 
> As above - will probably have to redo a lot of the test infra involving the
> shell tests to handle some of your other requests, but if we don't go that
> way, I have got where you want to go and I can do this.

To be honest, I am not sure if this patch is related with refactoring
test infrastructure or not.  You could reconsider when you spin for next
patch set (as you said, might refactor test infra).

In case you still want to keep this patch as it is, it would be fine for
me and you could add my reviewed tag:

Reviewed-by: Leo Yan <leo.yan@linaro.org>

Thanks,
Leo

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-06-15  9:15 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-09 12:28 [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests carsten.haitzler
2022-03-09 12:28 ` [PATCH 2/3] perf test: Shell - only run .sh shell files to skip other files carsten.haitzler
2022-04-10  2:28   ` Leo Yan
2022-04-21 16:21     ` Carsten Haitzler
2022-05-26 10:14       ` Leo Yan
2022-06-13 13:08         ` Carsten Haitzler
2022-06-15  9:14           ` Leo Yan
2022-03-09 12:28 ` [PATCH 3/3] perf test: Add coresight tests to guage quality of data generated carsten.haitzler
2022-04-10  8:30   ` Leo Yan
2022-04-21 17:38     ` Carsten Haitzler
2022-05-26  8:20       ` Leo Yan
2022-05-26 16:08         ` Leo Yan
2022-06-13 14:15         ` Carsten Haitzler
2022-05-30 16:27   ` Mathieu Poirier
2022-05-30 16:47     ` Mathieu Poirier
2022-06-13 12:53       ` Carsten Haitzler
2022-06-13 13:00     ` Carsten Haitzler
2022-04-10  1:24 ` [PATCH 1/3] perf test: Shell - Limit to only run executable scripts in tests Leo Yan
2022-04-11 19:08   ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).