* [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis
@ 2020-02-15  1:11 Umesh Nerlige Ramappa
  2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 2/4] lib/i915/perf: Add support for loading perf configurations Umesh Nerlige Ramappa
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-02-15  1:11 UTC (permalink / raw)
  To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin

The tools provided here capture performance metrics from the i915
driver and are used in conjunction with the GPUvis software, available at:

https://github.com/mikesart/gpuvis

The changes required in GPUvis are a work in progress and will be posted once
these tools are merged.

For more information, see tools/i915-perf/README in this patch series.

Lionel Landwerlin (4):
  lib/i915/perf: Add i915_perf library
  lib/i915/perf: Add support for loading perf configurations
  tools/i915/perf: Add i915 perf recorder tool
  lib/i915/perf: Add i915 perf data reader

 lib/i915-perf.pc.in                           |    10 +
 lib/i915/perf-configs/README.md               |   115 +
 lib/i915/perf-configs/codegen.py              |    33 +
 lib/i915/perf-configs/guids.xml               |   282 +
 lib/i915/perf-configs/mdapi-xml-convert.py    |  1000 +
 lib/i915/perf-configs/oa-bdw.xml              | 15653 ++++++++++++++++
 lib/i915/perf-configs/oa-bxt.xml              |  9595 ++++++++++
 lib/i915/perf-configs/oa-cflgt2.xml           | 10866 +++++++++++
 lib/i915/perf-configs/oa-cflgt3.xml           | 10933 +++++++++++
 lib/i915/perf-configs/oa-chv.xml              |  9757 ++++++++++
 lib/i915/perf-configs/oa-cnl.xml              | 10411 ++++++++++
 lib/i915/perf-configs/oa-glk.xml              |  9346 +++++++++
 lib/i915/perf-configs/oa-hsw.xml              |  4615 +++++
 lib/i915/perf-configs/oa-icl.xml              | 11899 ++++++++++++
 lib/i915/perf-configs/oa-kblgt2.xml           | 10866 +++++++++++
 lib/i915/perf-configs/oa-kblgt3.xml           | 10933 +++++++++++
 lib/i915/perf-configs/oa-sklgt2.xml           | 11895 ++++++++++++
 lib/i915/perf-configs/oa-sklgt3.xml           | 10933 +++++++++++
 lib/i915/perf-configs/oa-sklgt4.xml           | 10956 +++++++++++
 lib/i915/perf-configs/perf-codegen.py         |   854 +
 lib/i915/perf-configs/update-guids.py         |   230 +
 lib/i915/perf.c                               |   332 +
 lib/i915/perf.h                               |   240 +
 lib/i915/perf_data.h                          |    88 +
 lib/i915/perf_data_reader.c                   |   330 +
 lib/i915/perf_data_reader.h                   |   103 +
 lib/meson.build                               |    67 +
 tools/i915-perf/README                        |    70 +
 tools/i915-perf/i915_perf_configs.c           |   277 +
 tools/i915-perf/i915_perf_control.c           |   133 +
 tools/i915-perf/i915_perf_recorder.c          |   931 +
 tools/i915-perf/i915_perf_recorder_commands.h |    39 +
 tools/i915-perf/meson.build                   |    17 +
 tools/meson.build                             |     1 +
 34 files changed, 153810 insertions(+)
 create mode 100644 lib/i915-perf.pc.in
 create mode 100644 lib/i915/perf-configs/README.md
 create mode 100644 lib/i915/perf-configs/codegen.py
 create mode 100644 lib/i915/perf-configs/guids.xml
 create mode 100755 lib/i915/perf-configs/mdapi-xml-convert.py
 create mode 100644 lib/i915/perf-configs/oa-bdw.xml
 create mode 100644 lib/i915/perf-configs/oa-bxt.xml
 create mode 100644 lib/i915/perf-configs/oa-cflgt2.xml
 create mode 100644 lib/i915/perf-configs/oa-cflgt3.xml
 create mode 100644 lib/i915/perf-configs/oa-chv.xml
 create mode 100644 lib/i915/perf-configs/oa-cnl.xml
 create mode 100644 lib/i915/perf-configs/oa-glk.xml
 create mode 100644 lib/i915/perf-configs/oa-hsw.xml
 create mode 100644 lib/i915/perf-configs/oa-icl.xml
 create mode 100644 lib/i915/perf-configs/oa-kblgt2.xml
 create mode 100644 lib/i915/perf-configs/oa-kblgt3.xml
 create mode 100644 lib/i915/perf-configs/oa-sklgt2.xml
 create mode 100644 lib/i915/perf-configs/oa-sklgt3.xml
 create mode 100644 lib/i915/perf-configs/oa-sklgt4.xml
 create mode 100755 lib/i915/perf-configs/perf-codegen.py
 create mode 100755 lib/i915/perf-configs/update-guids.py
 create mode 100644 lib/i915/perf.c
 create mode 100644 lib/i915/perf.h
 create mode 100644 lib/i915/perf_data.h
 create mode 100644 lib/i915/perf_data_reader.c
 create mode 100644 lib/i915/perf_data_reader.h
 create mode 100644 tools/i915-perf/README
 create mode 100644 tools/i915-perf/i915_perf_configs.c
 create mode 100644 tools/i915-perf/i915_perf_control.c
 create mode 100644 tools/i915-perf/i915_perf_recorder.c
 create mode 100644 tools/i915-perf/i915_perf_recorder_commands.h
 create mode 100644 tools/i915-perf/meson.build

-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] [PATCH i-g-t 2/4] lib/i915/perf: Add support for loading perf configurations
  2020-02-15  1:11 [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Umesh Nerlige Ramappa
@ 2020-02-15  1:11 ` Umesh Nerlige Ramappa
  2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 3/4] tools/i915/perf: Add i915 perf recorder tool Umesh Nerlige Ramappa
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-02-15  1:11 UTC (permalink / raw)
  To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Add support for loading perf configurations used by gpuvis.
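
For readers unfamiliar with the sysfs side of this, the sketch below shows
one way to read a numeric id from a small text file, which is the kind of
lookup the loader performs for each metric set directory. The helper name
sketch_read_uint64 is hypothetical and not part of this patch; the stricter
error handling (EINTR retry, empty-file and parse checks) is an assumption
of the sketch rather than what the patch does.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): read an unsigned integer
 * from a small text file such as a per-metric-set "id" file.
 */
static bool
sketch_read_uint64(const char *path, uint64_t *value)
{
	char buf[32], *end;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;

	/* Retry if the read is interrupted by a signal. */
	do {
		n = read(fd, buf, sizeof(buf) - 1);
	} while (n < 0 && errno == EINTR);
	close(fd);

	/* Treat read errors and empty files as failures. */
	if (n <= 0)
		return false;

	buf[n] = '\0';
	errno = 0;
	*value = strtoull(buf, &end, 0);

	/* Fail when no digits were consumed or the value overflowed. */
	return errno == 0 && end != buf;
}
```

A caller would pass the path of an id file and use the value only when the
function returns true.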

v2: rebase fixes for igt list (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 lib/i915/perf.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/i915/perf.h |   2 +
 2 files changed, 140 insertions(+)

diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 1627f102..ae786701 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -20,8 +20,18 @@
  * SOFTWARE.
  */
 
+#include <assert.h>
+#include <errno.h>
+#include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <dirent.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/sysmacros.h>
+#include <sys/types.h>
+#include <unistd.h>
 
 #include "intel_chipset.h"
 #include "perf.h"
@@ -192,3 +202,131 @@ intel_perf_add_metric_set(struct intel_perf *perf,
 {
 	igt_list_add_tail(&metric_set->link, &perf->metric_sets);
 }
+
+static bool
+read_file_uint64(const char *file, uint64_t *value)
+{
+	char buf[32];
+	int fd, n;
+
+	fd = open(file, 0);
+	if (fd < 0)
+		return false;
+	n = read(fd, buf, sizeof (buf) - 1);
+	close(fd);
+	if (n < 0)
+		return false;
+
+	buf[n] = '\0';
+	*value = strtoull(buf, 0, 0);
+
+	return true;
+}
+
+static int
+get_card_for_fd(int fd)
+{
+	struct stat sb;
+	int mjr, mnr;
+	char buffer[128];
+	DIR *drm_dir;
+	struct dirent *entry;
+	int retval = -1;
+
+	if (fstat(fd, &sb))
+		return -1;
+
+	mjr = major(sb.st_rdev);
+	mnr = minor(sb.st_rdev);
+
+	snprintf(buffer, sizeof(buffer), "/sys/dev/char/%d:%d/device/drm", mjr, mnr);
+
+	drm_dir = opendir(buffer);
+	assert(drm_dir != NULL);
+
+	while ((entry = readdir(drm_dir))) {
+		if (entry->d_type == DT_DIR && strncmp(entry->d_name, "card", 4) == 0) {
+			retval = strtoull(entry->d_name + 4, NULL, 10);
+			break;
+		}
+	}
+
+	closedir(drm_dir);
+
+	return retval;
+}
+
+static void
+load_metric_set_config(struct intel_perf_metric_set *metric_set, int drm_fd)
+{
+	struct drm_i915_perf_oa_config config;
+	int ret;
+
+	memset(&config, 0, sizeof(config));
+
+	memcpy(config.uuid, metric_set->hw_config_guid, sizeof(config.uuid));
+
+	config.n_mux_regs = metric_set->n_mux_regs;
+	config.mux_regs_ptr = (uintptr_t) metric_set->mux_regs;
+
+	config.n_boolean_regs = metric_set->n_b_counter_regs;
+	config.boolean_regs_ptr = (uintptr_t) metric_set->b_counter_regs;
+
+	config.n_flex_regs = metric_set->n_flex_regs;
+	config.flex_regs_ptr = (uintptr_t) metric_set->flex_regs;
+
+	while ((ret = ioctl(drm_fd, DRM_IOCTL_I915_PERF_ADD_CONFIG, &config)) < 0 &&
+	       (errno == EAGAIN || errno == EINTR));
+
+	metric_set->perf_oa_metrics_set = ret > 0 ? ret : 0;
+}
+
+void
+intel_perf_load_perf_configs(struct intel_perf *perf, int drm_fd)
+{
+	int drm_card = get_card_for_fd(drm_fd);
+	struct dirent *entry;
+	char metrics_path[128];
+	DIR *metrics_dir;
+	struct intel_perf_metric_set *metric_set;
+
+	snprintf(metrics_path, sizeof(metrics_path),
+		 "/sys/class/drm/card%d/metrics", drm_card);
+	metrics_dir = opendir(metrics_path);
+	if (!metrics_dir)
+		return;
+
+	while ((entry = readdir(metrics_dir))) {
+		char *metric_id_path;
+		uint64_t metric_id;
+
+		if (entry->d_type != DT_DIR)
+			continue;
+
+		asprintf(&metric_id_path, "%s/%s/id",
+			 metrics_path, entry->d_name);
+
+		if (!read_file_uint64(metric_id_path, &metric_id)) {
+			free(metric_id_path);
+			continue;
+		}
+
+		free(metric_id_path);
+
+		igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+			if (!strcmp(metric_set->hw_config_guid, entry->d_name)) {
+				metric_set->perf_oa_metrics_set = metric_id;
+				break;
+			}
+		}
+	}
+
+	closedir(metrics_dir);
+
+	igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+		if (metric_set->perf_oa_metrics_set)
+			continue;
+
+		load_metric_set_config(metric_set, drm_fd);
+	}
+}
diff --git a/lib/i915/perf.h b/lib/i915/perf.h
index 5a091c46..0b66efe1 100644
--- a/lib/i915/perf.h
+++ b/lib/i915/perf.h
@@ -231,6 +231,8 @@ void intel_perf_add_logical_counter(struct intel_perf *perf,
 void intel_perf_add_metric_set(struct intel_perf *perf,
 			       struct intel_perf_metric_set *metric_set);
 
+void intel_perf_load_perf_configs(struct intel_perf *perf, int drm_fd);
+
 #ifdef __cplusplus
 };
 #endif
-- 
2.20.1


* [igt-dev] [PATCH i-g-t 3/4] tools/i915/perf: Add i915 perf recorder tool
  2020-02-15  1:11 [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Umesh Nerlige Ramappa
  2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 2/4] lib/i915/perf: Add support for loading perf configurations Umesh Nerlige Ramappa
@ 2020-02-15  1:11 ` Umesh Nerlige Ramappa
  2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 4/4] lib/i915/perf: Add i915 perf data reader Umesh Nerlige Ramappa
  2020-02-17 13:42 ` [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Lionel Landwerlin
  3 siblings, 0 replies; 5+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-02-15  1:11 UTC (permalink / raw)
  To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

The i915 perf recorder tool captures OA perf data for a specific metric set
into a circular buffer of a specified size. The i915 perf control tool
dumps the data captured in the circular buffer to a trace file, which can
then be loaded in gpuvis to view the relevant events.
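
As a rough illustration of the circular buffer idea, here is a minimal
byte-oriented sketch that overwrites the oldest data once the buffer is
full. All names (sketch_ring, sketch_ring_write, sketch_ring_read) are
hypothetical. It is deliberately simpler than the recorder in this patch,
which discards whole records based on their header size rather than
individual bytes. The sketch assumes capacity is greater than zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ring buffer, not the structure used by the recorder. */
struct sketch_ring {
	char   *data;
	size_t  capacity; /* total storage, assumed > 0 */
	size_t  size;     /* bytes currently stored */
	size_t  begin;    /* offset of the oldest byte */
};

/* Append len bytes, discarding the oldest bytes when space runs out. */
static void
sketch_ring_write(struct sketch_ring *ring, const char *buf, size_t len)
{
	size_t end, first;

	/* Only the newest `capacity` bytes can survive. */
	if (len > ring->capacity) {
		buf += len - ring->capacity;
		len = ring->capacity;
	}

	/* Drop the oldest bytes to make room for the new ones. */
	if (ring->size + len > ring->capacity) {
		size_t drop = ring->size + len - ring->capacity;

		ring->begin = (ring->begin + drop) % ring->capacity;
		ring->size -= drop;
	}

	/* Copy in up to two chunks: up to the end of storage, then wrap. */
	end = (ring->begin + ring->size) % ring->capacity;
	first = ring->capacity - end;
	if (first > len)
		first = len;
	memcpy(ring->data + end, buf, first);
	memcpy(ring->data, buf + first, len - first);
	ring->size += len;
}

/* Remove up to len of the oldest bytes into out; returns bytes copied. */
static size_t
sketch_ring_read(struct sketch_ring *ring, char *out, size_t len)
{
	size_t first;

	if (len > ring->size)
		len = ring->size;

	first = ring->capacity - ring->begin;
	if (first > len)
		first = len;
	memcpy(out, ring->data + ring->begin, first);
	memcpy(out + first, ring->data, len - first);

	ring->begin = (ring->begin + len) % ring->capacity;
	ring->size -= len;

	return len;
}
```

Dropping at record granularity, as the recorder does, keeps the dumped
trace parseable because the buffer always starts on a record header.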

v2: (Umesh)
- rebase fixes for igt_list apis
- memset circular_buffer to 0 to initialize size, beginpos and endpos
- _FORTIFY_SOURCE=2 caused snprintf to go through __snprintf_chk, which
  falsely flagged a buffer overflow and aborted i915_perf_control when
  capturing traces. Undefine _FORTIFY_SOURCE selectively for the i915
  perf control tool.
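
For context on the message being built when the abort was seen, the sketch
below illustrates a header-plus-path command of the kind the control tool
writes to the fifo. The struct layout and the names sketch_command and
sketch_build_dump_command are assumptions made for illustration; they are
not the definitions in i915_perf_recorder_commands.h. It copies the path
with memcpy and an explicit length instead of snprintf. Whether that would
avoid the reported __snprintf_chk abort has not been verified.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wire format: fixed header followed by a path string. */
struct sketch_command {
	uint32_t command;
	uint32_t size;   /* total size in bytes, header included */
	char     path[]; /* NUL-terminated path */
};

/* Allocate and fill a command; the caller frees the result. */
static struct sketch_command *
sketch_build_dump_command(uint32_t command, const char *path)
{
	size_t path_len = strlen(path) + 1; /* include the NUL */
	struct sketch_command *cmd = malloc(sizeof(*cmd) + path_len);

	if (!cmd)
		return NULL;

	cmd->command = command;
	cmd->size = sizeof(*cmd) + path_len;
	memcpy(cmd->path, path, path_len);

	return cmd;
}
```

The receiver can read the fixed header first, then read size minus the
header size to obtain the path.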

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 lib/i915/perf_data.h                          |  88 ++
 lib/meson.build                               |   1 +
 tools/i915-perf/i915_perf_control.c           | 133 +++
 tools/i915-perf/i915_perf_recorder.c          | 931 ++++++++++++++++++
 tools/i915-perf/i915_perf_recorder_commands.h |  39 +
 tools/i915-perf/meson.build                   |  12 +
 6 files changed, 1204 insertions(+)
 create mode 100644 lib/i915/perf_data.h
 create mode 100644 tools/i915-perf/i915_perf_control.c
 create mode 100644 tools/i915-perf/i915_perf_recorder.c
 create mode 100644 tools/i915-perf/i915_perf_recorder_commands.h

diff --git a/lib/i915/perf_data.h b/lib/i915/perf_data.h
new file mode 100644
index 00000000..13791187
--- /dev/null
+++ b/lib/i915/perf_data.h
@@ -0,0 +1,88 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef PERF_DATA_H
+#define PERF_DATA_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* The structures below are embedded in the i915-perf stream so as to
+ * provide metadata. The types used in the
+ * drm_i915_perf_record_header.type are defined in
+ * intel_perf_record_type.
+ *
+ * Once defined, those structures cannot change. If you need to add
+ * new data, just define a new structure & record_type.
+ */
+
+#include <stdint.h>
+
+enum intel_perf_record_type {
+	/* Start at 65536, which is pretty safe since after 3 years the
+	 * kernel hasn't defined more than 3 entries.
+	 */
+
+	/* intel_perf_record_device_info */
+	INTEL_PERF_RECORD_TYPE_DEVICE_INFO = 1 << 16,
+
+	/* intel_perf_record_device_topology */
+	INTEL_PERF_RECORD_TYPE_DEVICE_TOPOLOGY,
+
+	/* intel_perf_record_timestamp_correlation */
+	INTEL_PERF_RECORD_TYPE_TIMESTAMP_CORRELATION,
+};
+
+struct intel_perf_record_device_info {
+	/* Frequency of the timestamps in the records. */
+	uint64_t timestamp_frequency;
+
+	/* PCI ID */
+	uint32_t device_id;
+
+	/* enum drm_i915_oa_format */
+	uint32_t oa_format;
+
+	/* Configuration identifier */
+	char uuid[40];
+};
+
+/* Topology as reported by i915. */
+struct intel_perf_record_device_topology {
+	struct drm_i915_query_topology_info topology;
+};
+
+/* Timestamp correlation between CPU/GPU. */
+struct intel_perf_record_timestamp_correlation {
+	/* In CLOCK_MONOTONIC */
+	uint64_t cpu_timestamp;
+
+	/* Engine timestamp associated with the OA unit */
+	uint64_t gpu_timestamp;
+};
+
+#ifdef __cplusplus
+};
+#endif
+
+#endif /* PERF_DATA_H */
diff --git a/lib/meson.build b/lib/meson.build
index edff8a67..6e935d45 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -217,6 +217,7 @@ install_headers(
   'igt_list.h',
   'intel_chipset.h',
   'i915/perf.h',
+  'i915/perf_data.h',
   subdir : 'i915-perf'
 )
 
diff --git a/tools/i915-perf/i915_perf_control.c b/tools/i915-perf/i915_perf_control.c
new file mode 100644
index 00000000..a8d0d30f
--- /dev/null
+++ b/tools/i915-perf/i915_perf_control.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <getopt.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "i915_perf_recorder_commands.h"
+
+static void
+usage(const char *name)
+{
+	fprintf(stdout,
+		"Usage: %s [options]\n"
+		"\n"
+		"     --help,               -h         Print this screen\n"
+		"     --command-fifo,       -f <path>  Path to a command fifo\n"
+		"     --dump,               -d <path>  Write the contents of the circular buffer to path\n",
+		name);
+}
+
+int
+main(int argc, char *argv[])
+{
+	const struct option long_options[] = {
+		{"help",                       no_argument, 0, 'h'},
+		{"dump",                 required_argument, 0, 'd'},
+		{"command-fifo",         required_argument, 0, 'f'},
+		{"quit",                       no_argument, 0, 'q'},
+		{0, 0, 0, 0}
+	};
+	const char *command_fifo = I915_PERF_RECORD_FIFO_PATH, *dump_file = NULL;
+	FILE *command_fifo_file;
+	int opt;
+	bool quit = false;
+
+	while ((opt = getopt_long(argc, argv, "hd:f:q", long_options, NULL)) != -1) {
+		switch (opt) {
+		case 'h':
+			usage(argv[0]);
+			return EXIT_SUCCESS;
+		case 'd':
+			dump_file = optarg;
+			break;
+		case 'f':
+			command_fifo = optarg;
+			break;
+		case 'q':
+			quit = true;
+			break;
+		default:
+			fprintf(stderr, "Internal error: "
+				"unexpected getopt value: %d\n", opt);
+			usage(argv[0]);
+			return EXIT_FAILURE;
+		}
+	}
+
+	if (!command_fifo)
+		return EXIT_FAILURE;
+
+	command_fifo_file = fopen(command_fifo, "r+");
+	if (!command_fifo_file) {
+		fprintf(stderr, "Unable to open command file\n");
+		return EXIT_FAILURE;
+	}
+
+	if (dump_file) {
+		if (dump_file[0] == '/') {
+			uint32_t total_len =
+				sizeof(struct recorder_command_base) + strlen(dump_file) + 1;
+			struct {
+				struct recorder_command_base base;
+				struct recorder_command_dump dump;
+			} *data = malloc(total_len);
+
+			data->base.command = RECORDER_COMMAND_DUMP;
+			data->base.size = total_len;
+			snprintf((char *) data->dump.path, strlen(dump_file) + 1, "%s", dump_file);
+
+			fwrite(data, total_len, 1, command_fifo_file);
+		} else {
+			char *cwd = get_current_dir_name();
+			uint32_t path_len = strlen(cwd) + 1 + strlen(dump_file) + 1;
+			uint32_t total_len = sizeof(struct recorder_command_base) + path_len;
+			struct {
+				struct recorder_command_base base;
+				struct recorder_command_dump dump;
+			} *data = malloc(total_len);
+
+			data->base.command = RECORDER_COMMAND_DUMP;
+			data->base.size = total_len;
+			snprintf((char *) data->dump.path, path_len, "%s/%s", cwd, dump_file);
+
+			fwrite(data, total_len, 1, command_fifo_file);
+		}
+	}
+
+	if (quit) {
+		struct recorder_command_base base = {
+			.command = RECORDER_COMMAND_QUIT,
+			.size = sizeof(base),
+		};
+
+		fwrite(&base, sizeof(base), 1, command_fifo_file);
+	}
+
+	fclose(command_fifo_file);
+
+	return EXIT_SUCCESS;
+}
diff --git a/tools/i915-perf/i915_perf_recorder.c b/tools/i915-perf/i915_perf_recorder.c
new file mode 100644
index 00000000..61bde5ba
--- /dev/null
+++ b/tools/i915-perf/i915_perf_recorder.c
@@ -0,0 +1,931 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <assert.h>
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <inttypes.h>
+#include <poll.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/sysmacros.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <i915_drm.h>
+
+#include "igt_core.h"
+#include "intel_chipset.h"
+#include "i915/perf.h"
+#include "i915/perf_data.h"
+
+#include "i915_perf_recorder_commands.h"
+
+#define ALIGN(v, a) (((v) + (a)-1) & ~((a)-1))
+#define ARRAY_SIZE(arr) (sizeof(arr)/sizeof((arr)[0]))
+#define MAX(a,b) ((a) > (b) ? (a) : (b))
+#define MIN(a,b) ((a) < (b) ? (a) : (b))
+
+struct circular_buffer {
+	char   *data;
+	size_t  allocated_size;
+	size_t  size;
+	size_t  beginpos;
+	size_t  endpos;
+};
+
+struct chunk {
+	char *data;
+	size_t len;
+};
+
+static size_t
+circular_available_size(const struct circular_buffer *buffer)
+{
+	assert(buffer->size <= buffer->allocated_size);
+	return buffer->allocated_size - buffer->size;
+}
+
+static void
+get_chunks(struct chunk *chunks, struct circular_buffer *buffer, bool write, size_t len)
+{
+	size_t offset = write ? buffer->endpos : buffer->beginpos;
+
+	if (write)
+		assert(circular_available_size(buffer) >= len);
+	else
+		assert(buffer->size >= len);
+
+	chunks[0].data = &buffer->data[offset];
+
+	if ((offset + len) > buffer->allocated_size) {
+		chunks[0].len = buffer->allocated_size - offset;
+		chunks[1].data = buffer->data;
+		chunks[1].len = len - (buffer->allocated_size - offset);
+	} else {
+		chunks[0].len = len;
+		chunks[1].data = NULL;
+		chunks[1].len = 0;
+	}
+}
+
+static ssize_t
+circular_buffer_read(void *c, char *buf, size_t size)
+{
+	struct circular_buffer *buffer = c;
+	struct chunk chunks[2];
+
+	if (buffer->size < size)
+		return -1;
+
+	get_chunks(chunks, buffer, false, size);
+
+	memcpy(buf, chunks[0].data, chunks[0].len);
+	memcpy(buf + chunks[0].len, chunks[1].data, chunks[1].len);
+	buffer->beginpos = (buffer->beginpos + size) % buffer->allocated_size;
+	buffer->size -= size;
+
+	return size;
+}
+
+static size_t
+peek_item_size(struct circular_buffer *buffer)
+{
+	struct drm_i915_perf_record_header header;
+	struct chunk chunks[2];
+
+	if (!buffer->size)
+		return 0;
+
+	assert(buffer->size >= sizeof(header));
+
+	get_chunks(chunks, buffer, false, sizeof(header));
+	memcpy(&header, chunks[0].data, chunks[0].len);
+	memcpy((char *) &header + chunks[0].len, chunks[1].data, chunks[1].len);
+
+	return header.size;
+}
+
+static void
+circular_shrink(struct circular_buffer *buffer, size_t size)
+{
+	size_t shrank = 0, item_size;
+
+	assert(size <= buffer->allocated_size);
+
+	while (shrank < size && buffer->size > (item_size = peek_item_size(buffer))) {
+		assert(item_size > 0 && item_size <= buffer->allocated_size);
+
+		buffer->beginpos = (buffer->beginpos + item_size) % buffer->allocated_size;
+		buffer->size -= item_size;
+
+		shrank += item_size;
+	}
+}
+
+static ssize_t
+circular_buffer_write(void *c, const char *buf, size_t _size)
+{
+	struct circular_buffer *buffer = c;
+	size_t size = _size;
+
+	while (size) {
+		size_t avail = circular_available_size(buffer), item_size;
+		struct chunk chunks[2];
+
+		/* Make space in the buffer if there is too much data. */
+		if (avail < size)
+			circular_shrink(buffer, size - avail);
+
+		item_size = MIN(circular_available_size(buffer), size);
+
+		get_chunks(chunks, buffer, true, item_size);
+
+		memcpy(chunks[0].data, buf, chunks[0].len);
+		memcpy(chunks[1].data, buf + chunks[0].len, chunks[1].len);
+
+		buf += item_size;
+		size -= item_size;
+
+		buffer->endpos = (buffer->endpos + item_size) % buffer->allocated_size;
+		buffer->size += item_size;
+	}
+
+	return _size;
+}
+
+static int
+circular_buffer_seek(void *c, off64_t *offset, int whence)
+{
+	return -1;
+}
+
+static int
+circular_buffer_close(void *c)
+{
+	return 0;
+}
+
+cookie_io_functions_t circular_buffer_functions = {
+	.read  = circular_buffer_read,
+	.write = circular_buffer_write,
+	.seek  = circular_buffer_seek,
+	.close = circular_buffer_close,
+};
+
+
+static bool
+read_file_uint64(const char *file, uint64_t *value)
+{
+	char buf[32];
+	int fd, n;
+
+	fd = open(file, 0);
+	if (fd < 0)
+		return false;
+	n = read(fd, buf, sizeof (buf) - 1);
+	close(fd);
+	if (n < 0)
+		return false;
+
+	buf[n] = '\0';
+	*value = strtoull(buf, 0, 0);
+
+	return true;
+}
+
+static uint32_t
+read_device_param(const char *stem, int id, const char *param)
+{
+	char *name;
+	int ret = asprintf(&name, "/sys/class/drm/%s%u/device/%s", stem, id, param);
+	uint64_t value;
+	bool success;
+
+	assert(ret != -1);
+
+	success = read_file_uint64(name, &value);
+	free(name);
+
+	return success ? value : 0;
+}
+
+static int
+find_intel_render_node(void)
+{
+	for (int i = 128; i < (128 + 16); i++) {
+		if (read_device_param("renderD", i, "vendor") == 0x8086)
+			return i;
+	}
+
+	return -1;
+}
+
+static int
+open_render_node(uint32_t *devid)
+{
+	char *name;
+	int ret;
+	int fd;
+
+	int render = find_intel_render_node();
+	if (render < 0)
+		return -1;
+
+	ret = asprintf(&name, "/dev/dri/renderD%u", render);
+	assert(ret != -1);
+
+	*devid = read_device_param("renderD", render, "device");
+
+	fd = open(name, O_RDWR);
+	free(name);
+
+	return fd;
+}
+
+static uint32_t
+oa_exponent_for_period(uint64_t device_timestamp_frequency, double period)
+{
+	uint64_t period_ns = 1000 * 1000 * 1000 * period;
+	uint64_t device_periods[32];
+
+	for (uint32_t i = 0; i < ARRAY_SIZE(device_periods); i++)
+		device_periods[i] = 1000000000ull * (1u << i) / device_timestamp_frequency;
+
+	for (uint32_t i = 1; i < ARRAY_SIZE(device_periods); i++) {
+		if (period_ns >= device_periods[i - 1] &&
+		    period_ns < device_periods[i]) {
+			if ((device_periods[i] - period_ns) >
+			    (period_ns - device_periods[i - 1]))
+				return i - 1;
+			return i;
+		}
+	}
+
+	return -1;
+}
+
+static int
+perf_ioctl(int fd, unsigned long request, void *arg)
+{
+	int ret;
+
+	do {
+		ret = ioctl(fd, request, arg);
+	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));
+
+	return ret;
+}
+
+static uint64_t
+get_device_timestamp_frequency(const struct intel_device_info *devinfo, int drm_fd)
+{
+	drm_i915_getparam_t gp;
+	int timestamp_frequency;
+
+	gp.param = I915_PARAM_CS_TIMESTAMP_FREQUENCY;
+	gp.value = &timestamp_frequency;
+	if (perf_ioctl(drm_fd, DRM_IOCTL_I915_GETPARAM, &gp) == 0)
+		return timestamp_frequency;
+
+	if (devinfo->gen > 9) {
+		fprintf(stderr, "Unable to query timestamp frequency from i915, please update kernel.\n");
+		return 0;
+	}
+
+	fprintf(stderr, "Warning: unable to query timestamp frequency from i915, guessing values...\n");
+
+	if (devinfo->gen <= 8)
+		return 12500000;
+	if (devinfo->is_broxton)
+		return 19200000;
+	return 12000000;
+}
+
+static int
+perf_open(int drm_fd,
+	  const struct intel_device_info *devinfo,
+	  uint32_t oa_exponent,
+	  const struct intel_perf_metric_set *metric_set)
+{
+	uint64_t properties[DRM_I915_PERF_PROP_MAX * 2];
+	struct drm_i915_perf_open_param param;
+	int p = 0, stream_fd;
+
+	properties[p++] = DRM_I915_PERF_PROP_SAMPLE_OA;
+	properties[p++] = true;
+
+	properties[p++] = DRM_I915_PERF_PROP_OA_METRICS_SET;
+	properties[p++] = metric_set->perf_oa_metrics_set;
+
+	properties[p++] = DRM_I915_PERF_PROP_OA_FORMAT;
+	properties[p++] = metric_set->perf_oa_format;
+
+	properties[p++] = DRM_I915_PERF_PROP_OA_EXPONENT;
+	properties[p++] = oa_exponent;
+
+	memset(&param, 0, sizeof(param));
+	param.flags = 0;
+	param.flags |= I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK;
+	param.properties_ptr = (uintptr_t)properties;
+	param.num_properties = p / 2;
+
+	stream_fd = perf_ioctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param);
+	return stream_fd;
+}
+
+static bool quit = false;
+
+static void
+sigint_handler(int val)
+{
+	quit = true;
+}
+
+static bool
+write_header(FILE *output,
+	     uint32_t device_id,
+	     uint64_t timestamp_frequency,
+	     const struct intel_perf_metric_set *metric_set)
+{
+	struct intel_perf_record_device_info info = {
+		.timestamp_frequency = timestamp_frequency,
+		.device_id = device_id,
+		.oa_format = metric_set->perf_oa_format,
+	};
+	struct drm_i915_perf_record_header header = {
+		.type = INTEL_PERF_RECORD_TYPE_DEVICE_INFO,
+		.size = sizeof(header) + sizeof(info),
+	};
+
+	snprintf(info.uuid, sizeof(info.uuid), "%s", metric_set->hw_config_guid);
+
+	if (fwrite(&header, sizeof(header), 1, output) != 1)
+		return false;
+
+	if (fwrite(&info, sizeof(info), 1, output) != 1)
+		return false;
+
+	return true;
+}
+
+static bool
+write_topology(FILE *output, int drm_fd)
+{
+	struct drm_i915_perf_record_header header = {
+		.type = INTEL_PERF_RECORD_TYPE_DEVICE_TOPOLOGY,
+	};
+	struct drm_i915_query query = {};
+	struct drm_i915_query_topology_info *topo_info;
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_TOPOLOGY_INFO,
+	};
+	int ret;
+
+	query.num_items = 1;
+	query.items_ptr = (uintptr_t) &item;
+
+	/* May not be available on older kernels. */
+	ret = perf_ioctl(drm_fd, DRM_IOCTL_I915_QUERY, &query);
+	if (ret < 0)
+		return true;
+
+	assert(item.length > 0);
+	topo_info = malloc(item.length);
+	item.data_ptr = (uintptr_t) topo_info;
+
+	ret = perf_ioctl(drm_fd, DRM_IOCTL_I915_QUERY, &query);
+	assert(ret == 0);
+
+	header.size = sizeof(header) + item.length;
+	if (fwrite(&header, sizeof(header), 1, output) != 1)
+		return false;
+
+	if (fwrite(topo_info, item.length, 1, output) != 1)
+		return false;
+
+	return true;
+}
+
+static bool
+write_i915_perf_data(FILE *output, int perf_fd)
+{
+	ssize_t ret;
+	char data[4096];
+
+	while ((ret = read(perf_fd, data, sizeof(data))) > 0 ||
+	       (ret < 0 && errno == EINTR)) {
+		if (ret > 0 && fwrite(data, ret, 1, output) != 1)
+			return false;
+	}
+
+	return true;
+}
+
+static uint64_t timespec_diff(struct timespec *begin,
+			      struct timespec *end)
+{
+	return 1000000000ull * (end->tv_sec - begin->tv_sec) + end->tv_nsec - begin->tv_nsec;
+}
+
+static clock_t correlation_clock_id = CLOCK_MONOTONIC;
+
+static bool
+get_correlation_timestamps(struct intel_perf_record_timestamp_correlation *corr, int drm_fd)
+{
+	struct drm_i915_reg_read reg_read;
+	struct {
+		struct timespec cpu_ts_begin;
+		struct timespec cpu_ts_end;
+		uint64_t gpu_ts;
+	} attempts[3];
+	uint32_t best = 0;
+
+#define RENDER_RING_TIMESTAMP 0x2358
+
+	reg_read.offset = RENDER_RING_TIMESTAMP | I915_REG_READ_8B_WA;
+
+	/* Gather 3 correlations. */
+	for (uint32_t i = 0; i < ARRAY_SIZE(attempts); i++) {
+		clock_gettime(correlation_clock_id, &attempts[i].cpu_ts_begin);
+		if (perf_ioctl(drm_fd, DRM_IOCTL_I915_REG_READ, &reg_read) < 0)
+			return false;
+		clock_gettime(correlation_clock_id, &attempts[i].cpu_ts_end);
+
+		attempts[i].gpu_ts = reg_read.val;
+	}
+
+	/* Now select the best. */
+	for (uint32_t i = 1; i < ARRAY_SIZE(attempts); i++) {
+		if (timespec_diff(&attempts[i].cpu_ts_begin,
+				  &attempts[i].cpu_ts_end) <
+		    timespec_diff(&attempts[best].cpu_ts_begin,
+				  &attempts[best].cpu_ts_end))
+			best = i;
+	}
+
+	corr->cpu_timestamp =
+		(attempts[best].cpu_ts_begin.tv_sec * 1000000000ull +
+		 attempts[best].cpu_ts_begin.tv_nsec) +
+		timespec_diff(&attempts[best].cpu_ts_begin,
+			      &attempts[best].cpu_ts_end) / 2;
+	corr->gpu_timestamp = attempts[best].gpu_ts;
+
+	return true;
+}
+
+static bool
+write_saved_correlation_timestamps(FILE *output,
+				   const struct intel_perf_record_timestamp_correlation *corr)
+{
+	struct drm_i915_perf_record_header header = {
+		.type = INTEL_PERF_RECORD_TYPE_TIMESTAMP_CORRELATION,
+		.size = sizeof(header) + sizeof(*corr),
+	};
+
+	if (fwrite(&header, sizeof(header), 1, output) != 1)
+		return false;
+
+	if (fwrite(corr, sizeof(*corr), 1, output) != 1)
+		return false;
+
+	return true;
+}
+
+static bool
+write_correlation_timestamps(FILE *output, int drm_fd)
+{
+	struct intel_perf_record_timestamp_correlation corr;
+
+	if (!get_correlation_timestamps(&corr, drm_fd))
+		return false;
+
+	return write_saved_correlation_timestamps(output, &corr);
+}
+
+static void
+read_command_file(int command_fd, FILE *output_stream, struct circular_buffer *buffer,
+		  int drm_fd, uint32_t devid, uint64_t timestamp_frequency,
+		  struct intel_perf_metric_set *metric_set)
+{
+	struct recorder_command_base header;
+	ssize_t ret = read(command_fd, &header, sizeof(header));
+
+	if (ret < 0)
+		return;
+
+	switch (header.command) {
+	case RECORDER_COMMAND_DUMP: {
+		uint32_t len = header.size - sizeof(header), offset = 0;
+		struct recorder_command_dump *dump = malloc(len);
+		FILE *file;
+
+		while (offset < len &&
+		       ((ret = read(command_fd, (void *) dump + offset, len - offset)) > 0
+			|| errno == EAGAIN)) {
+			if (ret > 0)
+				offset += ret;
+		}
+
+		fprintf(stdout, "Writing circular buffer to %s\n", dump->path);
+
+		file = fopen((const char *) dump->path, "w+");
+		if (file) {
+			struct chunk chunks[2];
+
+			fflush(output_stream);
+			get_chunks(chunks, buffer, false, buffer->size);
+
+			if (!write_header(file, devid, timestamp_frequency, metric_set) ||
+			    !write_topology(file, drm_fd) ||
+			    fwrite(chunks[0].data, chunks[0].len, 1, file) != 1 ||
+			    (chunks[1].len > 0 &&
+			     fwrite(chunks[1].data, chunks[1].len, 1, file) != 1) ||
+			    !write_correlation_timestamps(file, drm_fd)) {
+				fprintf(stderr, "Unable to write circular buffer data in file '%s'\n",
+					dump->path);
+			}
+			fclose(file);
+		} else
+			fprintf(stderr, "Unable to open dump file '%s'\n", dump->path);
+
+		free(dump);
+		break;
+	}
+	case RECORDER_COMMAND_QUIT:
+		quit = true;
+		break;
+	default:
+		fprintf(stderr, "Unknown command 0x%x\n", header.command);
+		break;
+	}
+}
+
+static void
+print_metric_sets(const struct intel_perf *perf)
+{
+	struct intel_perf_metric_set *metric_set;
+	uint32_t longest_name = 0;
+
+	igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+		longest_name = MAX(longest_name, strlen(metric_set->symbol_name));
+	}
+
+	igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+		fprintf(stdout, "%s:%*s%s\t\n",
+			metric_set->symbol_name,
+			(int) (longest_name - strlen(metric_set->symbol_name) + 1), " ",
+			metric_set->name);
+	}
+}
+
+static void
+print_metric_set_counters(const struct intel_perf_metric_set *metric_set)
+{
+	uint32_t longest_name = 0;
+	for (uint32_t i = 0; i < metric_set->n_counters; i++) {
+		longest_name = MAX(longest_name, strlen(metric_set->counters[i].name));
+	}
+
+	fprintf(stdout, "Metric set %s:\n", metric_set->name);
+	for (uint32_t i = 0; i < metric_set->n_counters; i++) {
+		struct intel_perf_logical_counter *counter = &metric_set->counters[i];
+
+		fprintf(stdout, "%s:%*s%s\n",
+			counter->name,
+			(int)(longest_name - strlen(counter->name) + 1), " ",
+			counter->desc);
+	}
+}
+
+static void
+usage(const char *name)
+{
+	fprintf(stdout,
+		"Usage: %s [options]\n"
+		"\n"
+		"     --help,               -h          Print this screen\n"
+		"     --correlation-period, -c <value>  Time period of timestamp correlation in seconds\n"
+		"                                       (default = 1.0)\n"
+		"     --perf-period,        -p <value>  Time period of i915-perf reports in seconds\n"
+		"                                       (default = 0.001)\n"
+		"     --metric,             -m <value>  i915 metric to sample with\n"
+		"     --counters,           -C          List counters for a given metric and exit\n"
+		"     --size,               -s <value>  Size of circular buffer to use in kilobytes\n"
+		"                                       If specified, a maximum amount of <value> data will\n"
+		"                                       be recorded.\n"
+		"     --command-fifo,       -f <path>   Path to a command fifo, implies circular buffer\n"
+		"                                       (To use with i915-perf-control)\n"
+		"     --output,             -o <path>   Output file (default = i915_perf.record)\n"
+		"     --cpu-clock,          -k <name>   CPU clock to use for correlations\n"
+		"                                       Values: boot, mono, mono_raw (default = mono)\n",
+		name);
+}
+
+int
+main(int argc, char *argv[])
+{
+	const struct option long_options[] = {
+		{"help",                       no_argument, 0, 'h'},
+		{"correlation-period",   required_argument, 0, 'c'},
+		{"perf-period",          required_argument, 0, 'p'},
+		{"metric",               required_argument, 0, 'm'},
+		{"counters",                   no_argument, 0, 'C'},
+		{"output",               required_argument, 0, 'o'},
+		{"size",                 required_argument, 0, 's'},
+		{"command-fifo",         required_argument, 0, 'f'},
+		{"cpu-clock",            required_argument, 0, 'k'},
+		{0, 0, 0, 0}
+	};
+	const struct {
+		clockid_t id;
+		const char *name;
+	} clock_names[] = {
+		{ CLOCK_BOOTTIME,      "boot" },
+		{ CLOCK_MONOTONIC,     "mono" },
+		{ CLOCK_MONOTONIC_RAW, "mono_raw" },
+	};
+	const struct intel_device_info *devinfo;
+	double corr_period = 1.0, perf_period = 0.001;
+	const char *metric_name = NULL, *output_file = "i915_perf.record", *command_fifo = I915_PERF_RECORD_FIFO_PATH;
+	struct intel_perf *perf;
+	struct intel_perf_metric_set *metric_set, *selected_metric_set = NULL;
+	struct intel_perf_record_timestamp_correlation initial_correlation;
+	struct circular_buffer circular_buffer;
+	struct timespec now;
+	uint64_t corr_period_ns, poll_time_ns, timestamp_frequency;
+	uint32_t devid = 0, oa_exponent;
+	uint32_t circular_size = 0;
+	int drm_fd, perf_fd, command_fifo_fd = -1;
+	int opt;
+	bool list_counters = false;
+	FILE *output = NULL, *output_stream;
+
+	while ((opt = getopt_long(argc, argv, "hc:p:m:Co:s:f:k:", long_options, NULL)) != -1) {
+		switch (opt) {
+		case 'h':
+			usage(argv[0]);
+			return EXIT_SUCCESS;
+		case 'c':
+			corr_period = atof(optarg);
+			break;
+		case 'p':
+			perf_period = atof(optarg);
+			break;
+		case 'm':
+			metric_name = optarg;
+			break;
+		case 'C':
+			list_counters = true;
+			break;
+		case 'o':
+			output_file = optarg;
+			break;
+		case 's':
+			circular_size = MAX(8, atoi(optarg)) * 1024;
+			break;
+		case 'f':
+			command_fifo = optarg;
+			circular_size = 8 * 1024 * 1024;
+			break;
+		case 'k': {
+			bool found = false;
+			for (uint32_t i = 0; i < ARRAY_SIZE(clock_names); i++) {
+				if (!strcmp(clock_names[i].name, optarg)) {
+					correlation_clock_id = clock_names[i].id;
+					found = true;
+					break;
+				}
+			}
+			if (!found) {
+				fprintf(stderr, "Unknown clock name '%s'\n", optarg);
+				return EXIT_FAILURE;
+			}
+			break;
+		}
+		default:
+			fprintf(stderr, "Internal error: "
+				"unexpected getopt value: %d\n", opt);
+			usage(argv[0]);
+			return EXIT_FAILURE;
+		}
+	}
+
+	drm_fd = open_render_node(&devid);
+
+	devinfo = intel_get_device_info(devid);
+	if (!devinfo) {
+		fprintf(stderr, "No device info found.\n");
+		return EXIT_FAILURE;
+	}
+
+	fprintf(stdout, "Device name=%s gen=%i gt=%i id=0x%x\n",
+		devinfo->codename, devinfo->gen, devinfo->gt, devid);
+
+	perf = intel_perf_for_devinfo(devinfo);
+	if (!perf) {
+		fprintf(stderr, "No perf data found.\n");
+		return EXIT_FAILURE;
+	}
+
+	if (!metric_name) {
+		print_metric_sets(perf);
+		return EXIT_FAILURE;
+	}
+
+	igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+		if (!strcasecmp(metric_set->symbol_name, metric_name)) {
+			selected_metric_set = metric_set;
+			break;
+		}
+	}
+
+	if (!selected_metric_set) {
+		fprintf(stderr, "Unknown metric set '%s'\n", metric_name);
+		print_metric_sets(perf);
+		return EXIT_FAILURE;
+	}
+
+	if (list_counters) {
+		print_metric_set_counters(selected_metric_set);
+		return EXIT_SUCCESS;
+	}
+
+	intel_perf_load_perf_configs(perf, drm_fd);
+
+	timestamp_frequency = get_device_timestamp_frequency(devinfo, drm_fd);
+
+	signal(SIGINT, sigint_handler);
+
+	if (command_fifo) {
+		if (mkfifo(command_fifo, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH) != 0) {
+			fprintf(stderr, "Unable to create command fifo '%s': %s\n",
+				command_fifo, strerror(errno));
+			return EXIT_FAILURE;
+		}
+
+		command_fifo_fd = open(command_fifo, O_RDWR);
+		if (command_fifo_fd < 0) {
+			fprintf(stderr, "Unable to open command fifo '%s': %s\n",
+				command_fifo, strerror(errno));
+			return EXIT_FAILURE;
+		}
+	} else {
+		output = fopen(output_file, "w+");
+		if (!output) {
+			fprintf(stderr, "Unable to open output file '%s'\n",
+				output_file);
+			return EXIT_FAILURE;
+		}
+	}
+
+	if (circular_size) {
+		memset(&circular_buffer, 0, sizeof(circular_buffer));
+		circular_buffer.allocated_size = circular_size;
+		circular_buffer.data = malloc(circular_size);
+		if (!circular_buffer.data) {
+			fprintf(stderr, "Unable to allocate circular buffer\n");
+			return EXIT_FAILURE;
+		}
+
+		output_stream = fopencookie(&circular_buffer, "w+",
+					    circular_buffer_functions);
+		if (!output_stream) {
+			fprintf(stderr, "Unable to create circular buffer\n");
+			return EXIT_FAILURE;
+		}
+
+		if (!get_correlation_timestamps(&initial_correlation, drm_fd)) {
+			fprintf(stderr, "Unable to get correlation timestamps\n");
+			return EXIT_FAILURE;
+		}
+
+		write_correlation_timestamps(output_stream, drm_fd);
+	} else {
+		if (!write_header(output, devid, timestamp_frequency, selected_metric_set) ||
+		    !write_topology(output, drm_fd) ||
+		    !write_correlation_timestamps(output, drm_fd)) {
+			fprintf(stderr, "Unable to write header in file '%s'\n",
+				output_file);
+			return EXIT_FAILURE;
+		}
+
+		output_stream = output;
+	}
+
+	if (selected_metric_set->perf_oa_metrics_set == 0) {
+		fprintf(stderr,
+			"Unable to load performance configuration, consider running:\n"
+			"   sysctl dev.i915.perf_stream_paranoid=0\n");
+		return EXIT_FAILURE;
+	}
+
+	oa_exponent = oa_exponent_for_period(timestamp_frequency, perf_period);
+	fprintf(stdout, "Opening perf stream with metric_id=%lu oa_exponent=%u\n",
+		selected_metric_set->perf_oa_metrics_set, oa_exponent);
+
+	perf_fd = perf_open(drm_fd, devinfo, oa_exponent, selected_metric_set);
+	if (perf_fd < 0) {
+		fprintf(stderr, "Unable to open i915 perf stream: %s\n",
+			strerror(errno));
+		return EXIT_FAILURE;
+	}
+
+	corr_period_ns = corr_period * 1000000000ul;
+	poll_time_ns = corr_period_ns;
+
+	while (!quit) {
+		struct pollfd pollfd[2] = {
+			{         perf_fd, POLLIN, 0 },
+			{ command_fifo_fd, POLLIN, 0 },
+		};
+		uint64_t elapsed_ns;
+		int ret;
+
+		igt_gettime(&now);
+		ret = poll(pollfd, command_fifo_fd != -1 ? 2 : 1, poll_time_ns / 1000000);
+		if (ret < 0 && errno != EINTR) {
+			fprintf(stderr, "Failed to poll i915-perf stream: %s\n",
+				strerror(errno));
+			break;
+		}
+
+		if (ret > 0) {
+			if (pollfd[0].revents & POLLIN) {
+				if (!write_i915_perf_data(output_stream, perf_fd)) {
+					fprintf(stderr, "Failed to write i915-perf data: %s\n",
+						strerror(errno));
+					break;
+				}
+			}
+
+			if (pollfd[1].revents & POLLIN) {
+				read_command_file(command_fifo_fd, output_stream,
+						  &circular_buffer,
+						  drm_fd, devid, timestamp_frequency,
+						  selected_metric_set);
+			}
+		}
+
+		elapsed_ns = igt_nsec_elapsed(&now);
+		if (elapsed_ns > poll_time_ns) {
+			poll_time_ns = corr_period_ns;
+			if (!write_correlation_timestamps(output_stream, drm_fd)) {
+				fprintf(stderr,
+					"Failed to write i915 timestamp correlation data: %s\n",
+					strerror(errno));
+				break;
+			}
+		} else {
+			poll_time_ns -= elapsed_ns;
+		}
+	}
+
+	fprintf(stdout, "Exiting...\n");
+
+	if (!write_correlation_timestamps(output_stream, drm_fd)) {
+		fprintf(stderr,
+			"Failed to write final i915 timestamp correlation data: %s\n",
+			strerror(errno));
+	}
+
+	fclose(output_stream);
+
+	if (command_fifo)
+		unlink(command_fifo);
+
+	free(circular_buffer.data);
+
+	close(drm_fd);
+
+	return EXIT_SUCCESS;
+}
diff --git a/tools/i915-perf/i915_perf_recorder_commands.h b/tools/i915-perf/i915_perf_recorder_commands.h
new file mode 100644
index 00000000..4855d80f
--- /dev/null
+++ b/tools/i915-perf/i915_perf_recorder_commands.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <stdint.h>
+
+#define I915_PERF_RECORD_FIFO_PATH "/tmp/.i915-perf-record"
+
+enum recorder_command {
+	RECORDER_COMMAND_DUMP = 1,
+	RECORDER_COMMAND_QUIT,
+};
+
+struct recorder_command_base {
+	uint32_t command;
+	uint32_t size;
+};
+
+struct recorder_command_dump {
+	uint8_t path[0];
+};
diff --git a/tools/i915-perf/meson.build b/tools/i915-perf/meson.build
index 0ebdd185..1be3ab22 100644
--- a/tools/i915-perf/meson.build
+++ b/tools/i915-perf/meson.build
@@ -3,3 +3,15 @@ executable('i915-perf-configs',
            include_directories: inc,
            dependencies: [lib_igt_chipset, lib_igt_i915_perf],
            install: true)
+
+executable('i915-perf-recorder',
+           [ 'i915_perf_recorder.c' ],
+           include_directories: inc,
+           dependencies: [lib_igt, lib_igt_i915_perf],
+           install: true)
+
+executable('i915-perf-control',
+           [ 'i915_perf_control.c' ],
+           c_args: '-U_FORTIFY_SOURCE',
+           include_directories: inc,
+           install: true)
-- 
2.20.1
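
For readers following get_correlation_timestamps() above: the function brackets each GPU register read with two CPU clock reads, keeps the attempt with the narrowest CPU window, and attributes the GPU value to the midpoint of that window. A minimal sketch of just the selection step is below. The struct and function names are illustrative only and do not appear in the patch, and timestamps are assumed to be already converted to nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One bracketed read: CPU time before, CPU time after, GPU value between.
 * Hypothetical type for illustration; the patch uses struct timespec. */
struct sample {
	uint64_t cpu_begin_ns;
	uint64_t cpu_end_ns;
	uint64_t gpu_ts;
};

/* Return the index of the sample with the narrowest CPU window.
 * A narrow window bounds the uncertainty on when the GPU read happened. */
size_t pick_best(const struct sample *s, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++) {
		if (s[i].cpu_end_ns - s[i].cpu_begin_ns <
		    s[best].cpu_end_ns - s[best].cpu_begin_ns)
			best = i;
	}
	return best;
}

/* CPU time attributed to a sample: the midpoint of its window. */
uint64_t sample_cpu_midpoint(const struct sample *s)
{
	return s->cpu_begin_ns + (s->cpu_end_ns - s->cpu_begin_ns) / 2;
}
```

With windows of 900 ns, 300 ns and 500 ns, the second sample would be selected and its CPU timestamp would be its begin time plus 150 ns.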


* [igt-dev] [PATCH i-g-t 4/4] lib/i915/perf: Add i915 perf data reader
From: Umesh Nerlige Ramappa @ 2020-02-15  1:11 UTC (permalink / raw)
  To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Read perf OA records and correlate timestamps between the GPU and CPU.

v2: (Umesh)
- Add README on usage
- rebase fixes for igt_list

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
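
Note on the correlation math: correlate_gpu_timestamp() maps a 32-bit OA timestamp onto the CPU clock by interpolating linearly between two neighbouring correlation points. The sketch below shows that interpolation in isolation. It is a simplified variant and not the code in this patch: the struct and function names are made up for illustration, it computes the tick delta modulo 2^32 instead of comparing masked values, and it assumes the two points are close enough that the multiplication does not overflow 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of one correlation point (not the patch's struct). */
struct corr_point {
	uint64_t cpu_ts;  /* CPU time in nanoseconds */
	uint64_t gpu_ts;  /* full-width GPU timestamp in ticks */
};

/*
 * Map a 32-bit OA timestamp onto the CPU clock by interpolating linearly
 * between two correlation points a and b that bracket it.
 */
uint64_t interpolate_cpu_ts(uint32_t oa_ts,
			    const struct corr_point *a,
			    const struct corr_point *b)
{
	uint64_t gpu_span = b->gpu_ts - a->gpu_ts;
	uint64_t cpu_span = b->cpu_ts - a->cpu_ts;
	/* Ticks since a, modulo 2^32, so a wrap of the low bits is harmless. */
	uint32_t delta = oa_ts - (uint32_t)a->gpu_ts;

	assert(gpu_span > 0 && delta <= gpu_span);
	return a->cpu_ts + (uint64_t)delta * cpu_span / gpu_span;
}
```

An OA timestamp that equals the low 32 bits of a's GPU timestamp maps to a's CPU timestamp, and one that equals the low 32 bits of b's maps to b's.
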
 lib/i915/perf_data_reader.c | 330 ++++++++++++++++++++++++++++++++++++
 lib/i915/perf_data_reader.h | 103 +++++++++++
 lib/meson.build             |   2 +
 tools/i915-perf/README      |  70 ++++++++
 4 files changed, 505 insertions(+)
 create mode 100644 lib/i915/perf_data_reader.c
 create mode 100644 lib/i915/perf_data_reader.h
 create mode 100644 tools/i915-perf/README
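
Note on the file layout: parse_data() walks the mmapped recording as a sequence of records, each starting with a header that carries the record type and its total size. A minimal sketch of such a walk is below. The struct is a local mirror of the kernel's drm_i915_perf_record_header layout, count_records() is a hypothetical helper, and the bounds checks are an addition of this sketch (the patch trusts header->size as written).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the kernel's drm_i915_perf_record_header layout. */
struct rec_header {
	uint32_t type;
	uint16_t pad;
	uint16_t size;  /* total record size in bytes, header included */
};

/*
 * Count records of a given type in buf[0..len). Returns -1 if a record
 * is truncated or claims a size smaller than its own header.
 */
long count_records(const uint8_t *buf, size_t len, uint32_t type)
{
	size_t off = 0;
	long count = 0;

	while (off < len) {
		struct rec_header hdr;

		if (len - off < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + off, sizeof(hdr)); /* avoids unaligned reads */
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -1;
		if (hdr.type == type)
			count++;
		off += hdr.size;
	}
	return count;
}
```

Rejecting a size smaller than the header also guards against a zero-sized record, which would otherwise keep the loop from advancing.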

diff --git a/lib/i915/perf_data_reader.c b/lib/i915/perf_data_reader.c
new file mode 100644
index 00000000..43683331
--- /dev/null
+++ b/lib/i915/perf_data_reader.c
@@ -0,0 +1,330 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "intel_chipset.h"
+#include "perf.h"
+#include "perf_data_reader.h"
+
+#define MAX(a,b) ((a) > (b) ? (a) : (b))
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+static inline bool
+oa_report_ctx_is_valid(const struct intel_perf_devinfo *devinfo,
+		       const uint8_t *_report)
+{
+	const uint32_t *report = (const uint32_t *) _report;
+
+	if (devinfo->gen < 8) {
+		return false; /* TODO */
+	} else if (devinfo->gen == 8) {
+		return report[0] & (1ul << 25);
+	} else if (devinfo->gen > 8) {
+		return report[0] & (1ul << 16);
+	}
+
+	return false;
+}
+
+static uint32_t
+oa_report_ctx_id(const struct intel_perf_devinfo *devinfo, const uint8_t *report)
+{
+	if (!oa_report_ctx_is_valid(devinfo, report))
+		return 0xffffffff;
+	return ((const uint32_t *) report)[2];
+}
+
+static inline uint64_t
+oa_report_timestamp(const uint8_t *report)
+{
+	return ((const uint32_t *)report)[1];
+}
+
+static void
+append_record(struct intel_perf_data_reader *reader,
+	      const struct drm_i915_perf_record_header *header)
+{
+	if (reader->n_records >= reader->n_allocated_records) {
+		reader->n_allocated_records = MAX(100, 2 * reader->n_allocated_records);
+		reader->records =
+			(const struct drm_i915_perf_record_header **)
+			realloc((void *) reader->records,
+				reader->n_allocated_records *
+				sizeof(*reader->records));
+		assert(reader->records);
+	}
+
+	reader->records[reader->n_records++] = header;
+}
+
+static void
+append_timestamp_correlation(struct intel_perf_data_reader *reader,
+			     const struct intel_perf_record_timestamp_correlation *corr)
+{
+	if (reader->n_correlations >= reader->n_allocated_correlations) {
+		reader->n_allocated_correlations = MAX(100, 2 * reader->n_allocated_correlations);
+		reader->correlations =
+			(const struct intel_perf_record_timestamp_correlation **)
+			realloc((void *) reader->correlations,
+				reader->n_allocated_correlations *
+				sizeof(*reader->correlations));
+		assert(reader->correlations);
+	}
+
+	reader->correlations[reader->n_correlations++] = corr;
+}
+
+static struct intel_perf_metric_set *
+find_metric_set(struct intel_perf *perf, const char *uuid)
+{
+	struct intel_perf_metric_set *metric_set;
+
+	igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+		if (!strcmp(uuid, metric_set->hw_config_guid))
+			return metric_set;
+	}
+
+	return NULL;
+}
+
+static void
+init_devinfo(struct intel_perf_devinfo *perf_devinfo,
+	     const struct intel_device_info *devinfo,
+	     uint32_t devid,
+	     uint64_t timestamp_frequency)
+{
+	perf_devinfo->devid = devid;
+	perf_devinfo->gen = devinfo->gen;
+	perf_devinfo->timestamp_frequency = timestamp_frequency;
+}
+
+static bool
+parse_data(struct intel_perf_data_reader *reader)
+{
+	const uint8_t *end = reader->mmap_data + reader->mmap_size;
+	const uint8_t *iter = reader->mmap_data;
+	while (iter < end) {
+		const struct drm_i915_perf_record_header *header =
+			(const struct drm_i915_perf_record_header *) iter;
+
+		switch (header->type) {
+		case DRM_I915_PERF_RECORD_SAMPLE:
+			append_record(reader, header);
+			break;
+
+		case DRM_I915_PERF_RECORD_OA_REPORT_LOST:
+		case DRM_I915_PERF_RECORD_OA_BUFFER_LOST:
+			assert(header->size == sizeof(*header));
+			break;
+
+		case INTEL_PERF_RECORD_TYPE_DEVICE_INFO: {
+			const struct intel_device_info *devinfo;
+
+			reader->record_info =
+				(const struct intel_perf_record_device_info *) (header + 1);
+			assert(header->size == (sizeof(*reader->record_info) + sizeof(*header)));
+			devinfo = intel_get_device_info(reader->record_info->device_id);
+			if (!devinfo)
+				return false;
+			init_devinfo(&reader->devinfo, devinfo,
+				     reader->record_info->device_id,
+				     reader->record_info->timestamp_frequency);
+			reader->perf = intel_perf_for_devinfo(devinfo);
+			reader->metric_set = find_metric_set(reader->perf, reader->record_info->uuid);
+			break;
+		}
+
+		case INTEL_PERF_RECORD_TYPE_TIMESTAMP_CORRELATION: {
+			append_timestamp_correlation(reader,
+						     (const struct intel_perf_record_timestamp_correlation *) (header + 1));
+			break;
+		}
+		}
+
+		iter += header->size;
+	}
+
+	return true;
+}
+
+static uint64_t
+correlate_gpu_timestamp(struct intel_perf_data_reader *reader,
+			uint64_t gpu_ts)
+{
+	/* OA reports only have the lower 32bits of the timestamp
+	 * register, while our correlation data has the whole 36bits.
+	 * Try to figure what portion of the correlation data the
+	 * 32bit timestamp belongs to.
+	 */
+	uint64_t mask = 0xffffffff;
+	int corr_idx = -1;
+
+	for (uint32_t i = 0; i < reader->n_correlation_chunks; i++) {
+		if (gpu_ts >= (reader->correlation_chunks[i].gpu_ts_begin & mask) &&
+		    gpu_ts <= (reader->correlation_chunks[i].gpu_ts_end & mask)) {
+			corr_idx = reader->correlation_chunks[i].idx;
+			break;
+		}
+	}
+
+	/* Not found? Assume prior to the first timestamp correlation.
+	 */
+	if (corr_idx < 0) {
+		return reader->correlations[0]->cpu_timestamp -
+			((reader->correlations[0]->gpu_timestamp & mask) - gpu_ts) *
+			(reader->correlations[1]->cpu_timestamp - reader->correlations[0]->cpu_timestamp) /
+			(reader->correlations[1]->gpu_timestamp - reader->correlations[0]->gpu_timestamp);
+	}
+
+	for (uint32_t i = corr_idx; i < (reader->n_correlations - 1); i++) {
+		if (gpu_ts >= (reader->correlations[i]->gpu_timestamp & mask) &&
+		    gpu_ts < (reader->correlations[i + 1]->gpu_timestamp & mask)) {
+			return reader->correlations[i]->cpu_timestamp +
+				(gpu_ts - (reader->correlations[i]->gpu_timestamp & mask)) *
+				(reader->correlations[i + 1]->cpu_timestamp - reader->correlations[i]->cpu_timestamp) /
+				(reader->correlations[i + 1]->gpu_timestamp - reader->correlations[i]->gpu_timestamp);
+		}
+	}
+
+	/* This is a bit harsh, but the recording tool should ensure we have
+	 * sampling points on either side of the bag of OA reports.
+	 */
+	assert(0);
+}
+
+static void
+append_timeline_event(struct intel_perf_data_reader *reader,
+		      uint64_t ts_start, uint64_t ts_end,
+		      uint32_t record_start, uint32_t record_end,
+		      uint32_t hw_id)
+{
+	if (reader->n_timelines >= reader->n_allocated_timelines) {
+		reader->n_allocated_timelines = MAX(100, 2 * reader->n_allocated_timelines);
+		reader->timelines =
+			(struct intel_perf_timeline_item *)
+			realloc((void *) reader->timelines,
+				reader->n_allocated_timelines *
+				sizeof(*reader->timelines));
+		assert(reader->timelines);
+	}
+
+	reader->timelines[reader->n_timelines].ts_start = ts_start;
+	reader->timelines[reader->n_timelines].ts_end = ts_end;
+	reader->timelines[reader->n_timelines].cpu_ts_start =
+		correlate_gpu_timestamp(reader, ts_start);
+	reader->timelines[reader->n_timelines].cpu_ts_end =
+		correlate_gpu_timestamp(reader, ts_end);
+	reader->timelines[reader->n_timelines].record_start = record_start;
+	reader->timelines[reader->n_timelines].record_end = record_end;
+	reader->timelines[reader->n_timelines].hw_id = hw_id;
+	reader->n_timelines++;
+}
+
+static void
+generate_cpu_events(struct intel_perf_data_reader *reader)
+{
+	uint32_t last_header_idx = 0;
+	const struct drm_i915_perf_record_header *last_header = reader->records[0];
+
+	for (uint32_t i = 1; i < reader->n_records; i++) {
+		const struct drm_i915_perf_record_header *current_header =
+			reader->records[i];
+		const uint8_t *start_report = (const uint8_t *) (last_header + 1),
+			*end_report = (const uint8_t *) (current_header + 1);
+		uint32_t last_ctx_id = oa_report_ctx_id(&reader->devinfo, start_report),
+			current_ctx_id = oa_report_ctx_id(&reader->devinfo, end_report);
+		uint64_t gpu_ts_start = oa_report_timestamp(start_report),
+			gpu_ts_end = oa_report_timestamp(end_report);
+
+		if (last_ctx_id == current_ctx_id)
+			continue;
+
+		append_timeline_event(reader, gpu_ts_start, gpu_ts_end, last_header_idx, i, last_ctx_id);
+
+		last_header = current_header;
+		last_header_idx = i;
+	}
+}
+
+static void
+compute_correlation_chunks(struct intel_perf_data_reader *reader)
+{
+	uint64_t mask = ~0xffffffffull;
+	uint32_t last_idx = 0;
+	uint64_t last_ts = reader->correlations[last_idx]->gpu_timestamp;
+
+	for (uint32_t i = 0; i < reader->n_correlations; i++) {
+		if (!reader->n_correlation_chunks ||
+		    (last_ts & mask) != (reader->correlations[i]->gpu_timestamp & mask)) {
+			assert(reader->n_correlation_chunks < ARRAY_SIZE(reader->correlation_chunks));
+			reader->correlation_chunks[reader->n_correlation_chunks].gpu_ts_begin = last_ts;
+			reader->correlation_chunks[reader->n_correlation_chunks].gpu_ts_end = last_ts | ~mask;
+			reader->correlation_chunks[reader->n_correlation_chunks].idx = last_idx;
+			last_ts = reader->correlation_chunks[reader->n_correlation_chunks].gpu_ts_end + 1;
+			last_idx = i;
+			reader->n_correlation_chunks++;
+		}
+	}
+}
+
+bool
+intel_perf_data_reader_init(struct intel_perf_data_reader *reader,
+			    int perf_file_fd)
+{
+	struct stat st;
+	if (fstat(perf_file_fd, &st) != 0)
+		return false;
+
+	memset(reader, 0, sizeof(*reader));
+
+	reader->mmap_size = st.st_size;
+	reader->mmap_data = (const uint8_t *) mmap(NULL, st.st_size,
+						   PROT_READ, MAP_PRIVATE,
+						   perf_file_fd, 0);
+	if (reader->mmap_data == MAP_FAILED)
+		return false;
+
+	if (!parse_data(reader))
+		return false;
+
+	compute_correlation_chunks(reader);
+	generate_cpu_events(reader);
+
+	return true;
+}
+
+void
+intel_perf_data_reader_fini(struct intel_perf_data_reader *reader)
+{
+	intel_perf_free(reader->perf);
+	free(reader->records);
+	free(reader->timelines);
+	free(reader->correlations);
+	munmap((void *)reader->mmap_data, reader->mmap_size);
+}
diff --git a/lib/i915/perf_data_reader.h b/lib/i915/perf_data_reader.h
new file mode 100644
index 00000000..f75e96dd
--- /dev/null
+++ b/lib/i915/perf_data_reader.h
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef PERF_DATA_READER_H
+#define PERF_DATA_READER_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Helper to read an i915-perf recording. */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <i915_drm.h>
+
+#include "perf.h"
+#include "perf_data.h"
+
+struct intel_device_info;
+
+struct intel_perf_timeline_item {
+	uint64_t ts_start;
+	uint64_t ts_end;
+	uint64_t cpu_ts_start;
+	uint64_t cpu_ts_end;
+
+	/* Offsets into intel_perf_data_reader.records */
+	uint32_t record_start;
+	uint32_t record_end;
+
+	uint32_t hw_id;
+
+	/* User data associated with a given item on the i915 perf
+	 * timeline.
+	 */
+	void *user_data;
+};
+
+struct intel_perf_data_reader {
+	/* Array of pointers into the mmapped i915 perf file. */
+	const struct drm_i915_perf_record_header **records;
+	uint32_t n_records;
+	uint32_t n_allocated_records;
+
+	/**/
+	struct intel_perf_timeline_item *timelines;
+	uint32_t n_timelines;
+	uint32_t n_allocated_timelines;
+
+	/**/
+	const struct intel_perf_record_timestamp_correlation **correlations;
+	uint32_t n_correlations;
+	uint32_t n_allocated_correlations;
+
+	struct {
+		uint64_t gpu_ts_begin;
+		uint64_t gpu_ts_end;
+		uint32_t idx;
+	} correlation_chunks[4];
+	uint32_t n_correlation_chunks;
+
+	/**/
+	const struct intel_perf_record_device_info *record_info;
+
+	struct intel_perf_devinfo devinfo;
+
+	struct intel_perf *perf;
+	struct intel_perf_metric_set *metric_set;
+
+	const uint8_t *mmap_data;
+	size_t mmap_size;
+};
+
+bool intel_perf_data_reader_init(struct intel_perf_data_reader *reader,
+				 int perf_file_fd);
+void intel_perf_data_reader_fini(struct intel_perf_data_reader *reader);
+
+#ifdef __cplusplus
+};
+#endif
+
+#endif /* PERF_DATA_READER_H */
diff --git a/lib/meson.build b/lib/meson.build
index 6e935d45..f241bff7 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -173,6 +173,7 @@ lib_igt_perf = declare_dependency(link_with : lib_igt_perf_build,
 
 i915_perf_files = [
   'i915/perf.c',
+  'i915/perf_data_reader.c',
 ]
 
 i915_perf_hardware = [
@@ -218,6 +219,7 @@ install_headers(
   'intel_chipset.h',
   'i915/perf.h',
   'i915/perf_data.h',
+  'i915/perf_data_reader.h',
   subdir : 'i915-perf'
 )
 
diff --git a/tools/i915-perf/README b/tools/i915-perf/README
new file mode 100644
index 00000000..e9822345
--- /dev/null
+++ b/tools/i915-perf/README
@@ -0,0 +1,70 @@
+======================
+i915 perf tools for OA
+======================
+
+The tools provided here enable capturing performance metrics from the i915
+driver and are used in conjunction with the GPUvis software, available at:
+
+https://github.com/mikesart/gpuvis
+
+Tools in IGT
+------------
+
+The following tools are generated in build/tools/i915-perf
+
+i915-perf-configs
+i915-perf-control
+i915-perf-recorder
+
+Usage in IGT
+------------
+
+Launching i915-perf-recorder with no arguments lists all available metrics.
+Once installed, the IGT recorder tool can be used to record metrics in a
+circular buffer. The example below captures RenderBasic metrics with an 8MB
+circular buffer.
+
+i915-perf-recorder -m RenderBasic -s 8192
+
+The circular buffer can be dumped to a given location from another terminal
+using the i915-perf-control tool:
+
+i915-perf-control -d /tmp/recording.perf
+
+Integration with GPUvis
+-----------------------
+
+GPUvis provides sample scripts in the gpuvis/sample directory that can be
+modified and used to capture the required metrics.
+
+1. Set up the recording by launching the following scripts from the
+   gpuvis/sample directory:
+
+        trace-cmd-setup.sh
+        trace-cmd-start-tracing.sh
+
+This will set up a recording in a circular buffer.
+
+2. Start using the system for a specific task you want to record.
+
+3. Once the task is completed, save the circular buffer into a capture file with
+   the following script:
+
+        trace-cmd-capture.sh
+
+4. Once finished, tear down the circular buffer recording with:
+
+        trace-cmd-stop-tracing.sh
+
+Inspecting data captured in GPUvis
+----------------------------------
+ 
+The capture script generates two files, for instance:
+
+        trace_09-26-2019_01-22-40.dat
+        trace_09-26-2019_01-22-40.i915-dat
+
+The first one contains ftrace data, the other i915-perf data. To inspect the
+data, launch gpuvis with the two files as arguments:
+
+        gpuvis trace_09-26-2019_01-22-40.dat trace_09-26-2019_01-22-40.i915-dat
-- 
2.20.1
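
For readers of the header above, here is a minimal sketch of how an "array of
pointers into the mmapped i915 perf file" could be built. It is not the code
from perf_data_reader.c. It assumes the file is a plain sequence of records,
each starting with a header laid out like the kernel's
struct drm_i915_perf_record_header (type, pad, size), where size covers the
header plus the payload. The names record_header, record_index,
record_index_append, record_index_build and record_index_fini are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of the drm_i915_perf_record_header layout, declared here
 * only so the sketch compiles on its own (assumption, see lead-in). */
struct record_header {
	uint32_t type;
	uint16_t pad;
	uint16_t size; /* total record size in bytes, header included */
};

/* Hypothetical, trimmed-down stand-in for struct intel_perf_data_reader. */
struct record_index {
	const struct record_header **records;
	uint32_t n_records;
	uint32_t n_allocated_records;
};

/* Append one pointer, doubling the array when it is full. */
static bool record_index_append(struct record_index *idx,
				const struct record_header *hdr)
{
	if (idx->n_records == idx->n_allocated_records) {
		uint32_t n = idx->n_allocated_records ?
			     idx->n_allocated_records * 2 : 64;
		const struct record_header **grown =
			realloc(idx->records, n * sizeof(*grown));

		if (!grown)
			return false;
		idx->records = grown;
		idx->n_allocated_records = n;
	}
	idx->records[idx->n_records++] = hdr;
	return true;
}

/* Walk the buffer record by record and store a pointer to each header.
 * Returns false on a truncated or malformed record. The buffer must be
 * suitably aligned for struct record_header and must outlive the index. */
static bool record_index_build(struct record_index *idx,
			       const uint8_t *data, size_t size)
{
	size_t offset = 0;

	assert(idx != NULL && data != NULL);

	while (size - offset >= sizeof(struct record_header)) {
		const struct record_header *hdr =
			(const struct record_header *)(data + offset);

		if (hdr->size < sizeof(*hdr) || hdr->size > size - offset)
			return false;
		if (!record_index_append(idx, hdr))
			return false;
		offset += hdr->size;
	}

	/* Trailing bytes too short to hold a header count as an error. */
	return offset == size;
}

/* Release the pointer array; the underlying buffer is not owned. */
static void record_index_fini(struct record_index *idx)
{
	free(idx->records);
	idx->records = NULL;
	idx->n_records = 0;
	idx->n_allocated_records = 0;
}
```

Because the index only stores pointers, nothing is copied out of the mapping,
and a timeline item can refer to a range of records with two integer offsets,
as record_start and record_end do in the header above.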

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* Re: [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis
  2020-02-15  1:11 [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Umesh Nerlige Ramappa
                   ` (2 preceding siblings ...)
  2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 4/4] lib/i915/perf: Add i915 perf data reader Umesh Nerlige Ramappa
@ 2020-02-17 13:42 ` Lionel Landwerlin
  3 siblings, 0 replies; 5+ messages in thread
From: Lionel Landwerlin @ 2020-02-17 13:42 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa, igt-dev, Joonas Lahtinen, Ashutosh Dixit

There was a small communication hiccup. Umesh thought I was no longer working
on this, but I actually picked it up again last week.

Sending an update with more changes.

Sorry for the confusion.

-Lionel

On 15/02/2020 03:11, Umesh Nerlige Ramappa wrote:
> The tools provided here enable capturing performance metrics from the i915
> driver and are used in conjunction with the GPUvis software here -
>
> https://github.com/mikesart/gpuvis
>
> The changes required in GPUvis are wip and will be posted following the merge of
> these tools.
>
> For more information, view tools/i915-perf/README in this patch series
>
> Lionel Landwerlin (4):
>    lib/i915/perf: Add i915_perf library
>    lib/i915/perf: Add support for loading perf configurations
>    tools/i915/perf: Add i915 perf recorder tool
>    lib/i915/perf: Add i915 perf data reader
>
>   lib/i915-perf.pc.in                           |    10 +
>   lib/i915/perf-configs/README.md               |   115 +
>   lib/i915/perf-configs/codegen.py              |    33 +
>   lib/i915/perf-configs/guids.xml               |   282 +
>   lib/i915/perf-configs/mdapi-xml-convert.py    |  1000 +
>   lib/i915/perf-configs/oa-bdw.xml              | 15653 ++++++++++++++++
>   lib/i915/perf-configs/oa-bxt.xml              |  9595 ++++++++++
>   lib/i915/perf-configs/oa-cflgt2.xml           | 10866 +++++++++++
>   lib/i915/perf-configs/oa-cflgt3.xml           | 10933 +++++++++++
>   lib/i915/perf-configs/oa-chv.xml              |  9757 ++++++++++
>   lib/i915/perf-configs/oa-cnl.xml              | 10411 ++++++++++
>   lib/i915/perf-configs/oa-glk.xml              |  9346 +++++++++
>   lib/i915/perf-configs/oa-hsw.xml              |  4615 +++++
>   lib/i915/perf-configs/oa-icl.xml              | 11899 ++++++++++++
>   lib/i915/perf-configs/oa-kblgt2.xml           | 10866 +++++++++++
>   lib/i915/perf-configs/oa-kblgt3.xml           | 10933 +++++++++++
>   lib/i915/perf-configs/oa-sklgt2.xml           | 11895 ++++++++++++
>   lib/i915/perf-configs/oa-sklgt3.xml           | 10933 +++++++++++
>   lib/i915/perf-configs/oa-sklgt4.xml           | 10956 +++++++++++
>   lib/i915/perf-configs/perf-codegen.py         |   854 +
>   lib/i915/perf-configs/update-guids.py         |   230 +
>   lib/i915/perf.c                               |   332 +
>   lib/i915/perf.h                               |   240 +
>   lib/i915/perf_data.h                          |    88 +
>   lib/i915/perf_data_reader.c                   |   330 +
>   lib/i915/perf_data_reader.h                   |   103 +
>   lib/meson.build                               |    67 +
>   tools/i915-perf/README                        |    70 +
>   tools/i915-perf/i915_perf_configs.c           |   277 +
>   tools/i915-perf/i915_perf_control.c           |   133 +
>   tools/i915-perf/i915_perf_recorder.c          |   931 +
>   tools/i915-perf/i915_perf_recorder_commands.h |    39 +
>   tools/i915-perf/meson.build                   |    17 +
>   tools/meson.build                             |     1 +
>   34 files changed, 153810 insertions(+)
>   create mode 100644 lib/i915-perf.pc.in
>   create mode 100644 lib/i915/perf-configs/README.md
>   create mode 100644 lib/i915/perf-configs/codegen.py
>   create mode 100644 lib/i915/perf-configs/guids.xml
>   create mode 100755 lib/i915/perf-configs/mdapi-xml-convert.py
>   create mode 100644 lib/i915/perf-configs/oa-bdw.xml
>   create mode 100644 lib/i915/perf-configs/oa-bxt.xml
>   create mode 100644 lib/i915/perf-configs/oa-cflgt2.xml
>   create mode 100644 lib/i915/perf-configs/oa-cflgt3.xml
>   create mode 100644 lib/i915/perf-configs/oa-chv.xml
>   create mode 100644 lib/i915/perf-configs/oa-cnl.xml
>   create mode 100644 lib/i915/perf-configs/oa-glk.xml
>   create mode 100644 lib/i915/perf-configs/oa-hsw.xml
>   create mode 100644 lib/i915/perf-configs/oa-icl.xml
>   create mode 100644 lib/i915/perf-configs/oa-kblgt2.xml
>   create mode 100644 lib/i915/perf-configs/oa-kblgt3.xml
>   create mode 100644 lib/i915/perf-configs/oa-sklgt2.xml
>   create mode 100644 lib/i915/perf-configs/oa-sklgt3.xml
>   create mode 100644 lib/i915/perf-configs/oa-sklgt4.xml
>   create mode 100755 lib/i915/perf-configs/perf-codegen.py
>   create mode 100755 lib/i915/perf-configs/update-guids.py
>   create mode 100644 lib/i915/perf.c
>   create mode 100644 lib/i915/perf.h
>   create mode 100644 lib/i915/perf_data.h
>   create mode 100644 lib/i915/perf_data_reader.c
>   create mode 100644 lib/i915/perf_data_reader.h
>   create mode 100644 tools/i915-perf/README
>   create mode 100644 tools/i915-perf/i915_perf_configs.c
>   create mode 100644 tools/i915-perf/i915_perf_control.c
>   create mode 100644 tools/i915-perf/i915_perf_recorder.c
>   create mode 100644 tools/i915-perf/i915_perf_recorder_commands.h
>   create mode 100644 tools/i915-perf/meson.build
>


Thread overview: 5+ messages
2020-02-15  1:11 [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Umesh Nerlige Ramappa
2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 2/4] lib/i915/perf: Add support for loading perf configurations Umesh Nerlige Ramappa
2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 3/4] tools/i915/perf: Add i915 perf recorder tool Umesh Nerlige Ramappa
2020-02-15  1:11 ` [igt-dev] [PATCH i-g-t 4/4] lib/i915/perf: Add i915 perf data reader Umesh Nerlige Ramappa
2020-02-17 13:42 ` [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Lionel Landwerlin
