[PATCH igt v4 00/13] corresponding changes for i915-perf interface

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH igt v4 00/13] corresponding changes for i915-perf interface
@ 2016-11-14 20:51 Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 01/13] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

The i915-perf series affects the command parser and itself includes new uapi
which these i-g-t changes try to cover.

This update follows up on review from both Matt and Chris.

A minor change I ended up making was to rename the hsw_oa_formats[] array to
oa_formats[] indexed by the format ID so the test could more easily get to the
C counter offset for a given format which I think helps make test_oa_exponents
clearer, avoiding using '224' with a /* C offset */ comment. The same array may
be extended later for testing gen8+ so I've removed the hsw_ prefix for now.

I noticed the existing handle and fd global variables could be made local to
main() and so added a minor patch for that.

perf.c no longer depends on the i915_drm.h copy in libdrm including the i915
perf interface since the test itself includes a guarded copy of the new uapi
for now.

- Robert

Robert Bragg (13):
  igt/perf: add i915 perf stream tests for Haswell
  igt/gem_exec_parse: some minor cleanups
  igt/gem_exec_parse: move hsw_load_register_reg down
  igt/gem_exec_parse: update hsw_load_register_reg
  igt/gem_exec_parse: make global vars local to main()
  igt/gem_exec_parse: init global parser_version in fixture
  igt/gem_exec_parse: req. v < 9 for oacontrol tracking test
  igt/gem_exec_parse: make basic-rejected version agnostic
  igt/gem_exec_parse: update bitmasks test for v >=8
  igt/gem_exec_parse: update cmd-crossing-page for >= v8
  igt/gem_exec_parse: update hsw_load_register_reg for v >= 8
  igt/gem_exec_parse: update registers test for v >= 8
  igt/gem_exec_parse: check oacontrol lri bad for >= v9

 tests/Makefile.sources |    1 +
 tests/gem_exec_parse.c |  568 ++++++------
 tests/perf.c           | 2373 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 2656 insertions(+), 286 deletions(-)
 create mode 100644 tests/perf.c

-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH igt v4 01/13] igt/perf: add i915 perf stream tests for Haswell
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 02/13] igt/gem_exec_parse: some minor cleanups Robert Bragg
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/Makefile.sources |    1 +
 tests/perf.c           | 2373 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 2374 insertions(+)
 create mode 100644 tests/perf.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 65e0792..a818963 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -214,6 +214,7 @@ TESTS_progs = \
 	kms_pwrite_crc \
 	kms_sink_crc_basic \
 	prime_udl \
+	perf \
 	$(NULL)
 
 # IMPORTANT: The ZZ_ tests need to be run last!
diff --git a/tests/perf.c b/tests/perf.c
new file mode 100644
index 0000000..1aafca9
--- /dev/null
+++ b/tests/perf.c
@@ -0,0 +1,2373 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/times.h>
+#include <sys/types.h>
+#include <dirent.h>
+#include <time.h>
+#include <poll.h>
+#include <math.h>
+
+#include "igt.h"
+#include "drm.h"
+
+IGT_TEST_DESCRIPTION("Test the i915 perf metrics streaming interface");
+
+#define GEN6_MI_REPORT_PERF_COUNT ((0x28 << 23) | (3 - 2))
+
+#define GFX_OP_PIPE_CONTROL     ((3 << 29) | (3 << 27) | (2 << 24))
+#define PIPE_CONTROL_CS_STALL	   (1 << 20)
+#define PIPE_CONTROL_GLOBAL_SNAPSHOT_COUNT_RESET	(1 << 19)
+#define PIPE_CONTROL_TLB_INVALIDATE     (1 << 18)
+#define PIPE_CONTROL_SYNC_GFDT	  (1 << 17)
+#define PIPE_CONTROL_MEDIA_STATE_CLEAR  (1 << 16)
+#define PIPE_CONTROL_NO_WRITE	   (0 << 14)
+#define PIPE_CONTROL_WRITE_IMMEDIATE    (1 << 14)
+#define PIPE_CONTROL_WRITE_DEPTH_COUNT  (2 << 14)
+#define PIPE_CONTROL_WRITE_TIMESTAMP    (3 << 14)
+#define PIPE_CONTROL_DEPTH_STALL	(1 << 13)
+#define PIPE_CONTROL_RENDER_TARGET_FLUSH (1 << 12)
+#define PIPE_CONTROL_INSTRUCTION_INVALIDATE (1 << 11)
+#define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE   (1 << 10) /* GM45+ only */
+#define PIPE_CONTROL_ISP_DIS	    (1 << 9)
+#define PIPE_CONTROL_INTERRUPT_ENABLE   (1 << 8)
+#define PIPE_CONTROL_FLUSH_ENABLE       (1 << 7) /* Gen7+ only */
+/* GT */
+#define PIPE_CONTROL_DATA_CACHE_INVALIDATE      (1 << 5)
+#define PIPE_CONTROL_VF_CACHE_INVALIDATE	(1 << 4)
+#define PIPE_CONTROL_CONST_CACHE_INVALIDATE     (1 << 3)
+#define PIPE_CONTROL_STATE_CACHE_INVALIDATE     (1 << 2)
+#define PIPE_CONTROL_STALL_AT_SCOREBOARD	(1 << 1)
+#define PIPE_CONTROL_DEPTH_CACHE_FLUSH	  (1 << 0)
+#define PIPE_CONTROL_PPGTT_WRITE	(0 << 2)
+#define PIPE_CONTROL_GLOBAL_GTT_WRITE   (1 << 2)
+
+#define NSEC_PER_SEC 1000000000ull
+
+/* Temporarily copy i915-perf uapi here to avoid a dependency on libdrm's
+ * i915_drm.h copy being updated with the i915-perf interface before this
+ * test can land in i-g-t.
+ *
+ * TODO: remove this once the interface lands in libdrm
+ */
+#ifndef DRM_I915_PERF_OPEN
+#define DRM_I915_PERF_OPEN		0x36
+#define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
+
+enum drm_i915_oa_format {
+       I915_OA_FORMAT_A13 = 1,
+       I915_OA_FORMAT_A29,
+       I915_OA_FORMAT_A13_B8_C8,
+       I915_OA_FORMAT_B4_C8,
+       I915_OA_FORMAT_A45_B8_C8,
+       I915_OA_FORMAT_B4_C8_A16,
+       I915_OA_FORMAT_C4_B8,
+
+       I915_OA_FORMAT_MAX /* non-ABI */
+};
+
+enum drm_i915_perf_property_id {
+       DRM_I915_PERF_PROP_CTX_HANDLE = 1,
+       DRM_I915_PERF_PROP_SAMPLE_OA,
+       DRM_I915_PERF_PROP_OA_METRICS_SET,
+       DRM_I915_PERF_PROP_OA_FORMAT,
+       DRM_I915_PERF_PROP_OA_EXPONENT,
+
+       DRM_I915_PERF_PROP_MAX /* non-ABI */
+};
+
+struct drm_i915_perf_open_param {
+       __u32 flags;
+#define I915_PERF_FLAG_FD_CLOEXEC	(1<<0)
+#define I915_PERF_FLAG_FD_NONBLOCK	(1<<1)
+#define I915_PERF_FLAG_DISABLED		(1<<2)
+
+       __u32 num_properties;
+       __u64 properties_ptr;
+};
+
+#define I915_PERF_IOCTL_ENABLE _IO('i', 0x0)
+#define I915_PERF_IOCTL_DISABLE	_IO('i', 0x1)
+
+struct drm_i915_perf_record_header {
+       __u32 type;
+       __u16 pad;
+       __u16 size;
+};
+
+enum drm_i915_perf_record_type {
+       DRM_I915_PERF_RECORD_SAMPLE = 1,
+       DRM_I915_PERF_RECORD_OA_REPORT_LOST = 2,
+       DRM_I915_PERF_RECORD_OA_BUFFER_LOST = 3,
+
+       DRM_I915_PERF_RECORD_MAX /* non-ABI */
+};
+#endif /* !DRM_I915_PERF_OPEN */
+
+static struct {
+	const char *name;
+	size_t size;
+	int a_off; /* bytes */
+	int n_a;
+	int first_a;
+	int b_off;
+	int n_b;
+	int c_off;
+	int n_c;
+} oa_formats[I915_OA_FORMAT_MAX] = {
+	[I915_OA_FORMAT_A13] = {
+		"A13", .size = 64,
+		.a_off = 12, .n_a = 13 },
+	[I915_OA_FORMAT_A29] = {
+		"A29", .size = 128,
+		.a_off = 12, .n_a = 29 },
+	[I915_OA_FORMAT_A13_B8_C8] = {
+		"A13_B8_C8", .size = 128,
+		.a_off = 12, .n_a = 13,
+		.b_off = 64, .n_b = 8,
+		.c_off = 96, .n_c = 8 },
+	[I915_OA_FORMAT_A45_B8_C8] = {
+		"A45_B8_C8", .size = 256,
+		.a_off = 12,  .n_a = 45,
+		.b_off = 192, .n_b = 8,
+		.c_off = 224, .n_c = 8 },
+	[I915_OA_FORMAT_B4_C8] = {
+		"B4_C8", .size = 64,
+		.b_off = 16, .n_b = 4,
+		.c_off = 32, .n_c = 8 },
+	[I915_OA_FORMAT_B4_C8_A16] = {
+		"B4_C8_A16", .size = 128,
+		.b_off = 16, .n_b = 4,
+		.c_off = 32, .n_c = 8,
+		.a_off = 60, .n_a = 16, .first_a = 29 },
+	[I915_OA_FORMAT_C4_B8] = {
+		"C4_B8", .size = 64,
+		.c_off = 16, .n_c = 4,
+		.b_off = 28, .n_b = 8 },
+};
+
+static bool hsw_undefined_a_counters[45] = {
+	[4] = true,
+	[6] = true,
+	[9] = true,
+	[11] = true,
+	[14] = true,
+	[16] = true,
+	[19] = true,
+	[21] = true,
+	[24] = true,
+	[26] = true,
+	[29] = true,
+	[31] = true,
+	[34] = true,
+	[43] = true,
+	[44] = true,
+};
+
+static int drm_fd = -1;
+static uint32_t devid;
+static int device = -1;
+
+static uint64_t hsw_render_basic_id = UINT64_MAX;
+static uint64_t gt_min_freq_mhz_saved = 0;
+static uint64_t gt_max_freq_mhz_saved = 0;
+static uint64_t gt_min_freq_mhz = 0;
+static uint64_t gt_max_freq_mhz = 0;
+
+static uint64_t timestamp_frequency = 12500000;
+
+static igt_render_copyfunc_t render_copy = NULL;
+
+static int
+__perf_open(int fd, struct drm_i915_perf_open_param *param)
+{
+	int ret = igt_ioctl(fd, DRM_IOCTL_I915_PERF_OPEN, param);
+
+	igt_assert(ret >= 0);
+	errno = 0;
+
+	return ret;
+}
+
+static int
+lookup_format(int i915_perf_fmt_id)
+{
+	igt_assert(i915_perf_fmt_id < I915_OA_FORMAT_MAX);
+	igt_assert(oa_formats[i915_perf_fmt_id].name);
+
+	return i915_perf_fmt_id;
+}
+
+static bool
+try_read_u64_file(const char *file, uint64_t *val)
+{
+	char buf[32];
+	int fd, n;
+
+	fd = open(file, O_RDONLY);
+	if (fd < 0)
+		return false;
+
+	while ((n = read(fd, buf, sizeof(buf) - 1)) < 0 && errno == EINTR)
+		;
+	igt_assert(n >= 0);
+
+	close(fd);
+
+	buf[n] = '\0';
+	*val = strtoull(buf, NULL, 0);
+
+	return true;
+}
+
+static uint64_t
+read_u64_file(const char *file)
+{
+	uint64_t val;
+
+	igt_assert_eq(try_read_u64_file(file, &val), true);
+
+	return val;
+}
+
+static void
+write_u64_file(const char *file, uint64_t val)
+{
+	char buf[32];
+	int fd, len, ret;
+
+	fd = open(file, O_WRONLY);
+	igt_assert(fd >= 0);
+
+	len = snprintf(buf, sizeof(buf), "%"PRIu64, val);
+	igt_assert(len > 0);
+
+	while ((ret = write(fd, buf, len)) < 0 && errno == EINTR)
+		;
+	igt_assert_eq(ret, len);
+
+	close(fd);
+}
+
+static uint64_t
+sysfs_read(const char *file)
+{
+	char buf[512];
+
+	snprintf(buf, sizeof(buf), "/sys/class/drm/card%d/%s", device, file);
+
+	return read_u64_file(buf);
+}
+
+static void
+sysfs_write(const char *file, uint64_t val)
+{
+	char buf[512];
+
+	snprintf(buf, sizeof(buf), "/sys/class/drm/card%d/%s", device, file);
+
+	write_u64_file(buf, val);
+}
+
+static char *
+read_debugfs_record(const char *file, const char *key)
+{
+	FILE *fp;
+	char *line = NULL;
+	size_t line_buf_size = 0;
+	int len = 0;
+	int key_len = strlen(key);
+	char *value = NULL;
+
+	fp = igt_debugfs_fopen(file, "r");
+	igt_require(fp);
+
+	while ((len = getline(&line, &line_buf_size, fp)) > 0) {
+
+		if (line[len - 1] == '\n')
+			line[len - 1] = '\0';
+
+		if (strncmp(key, line, key_len) == 0 &&
+		    line[key_len] == ':' &&
+		    line[key_len + 1] == ' ')
+		{
+			value = strdup(line + key_len + 2);
+			goto done;
+		}
+	}
+
+	igt_assert(!"reached");
+done:
+	free(line);
+	if (fp)
+		fclose(fp);
+	return value;
+}
+
+static uint64_t
+read_debugfs_u64_record(const char *file, const char *key)
+{
+	char *str_val = read_debugfs_record(file, key);
+	uint64_t val;
+
+	igt_require(str_val);
+
+	val = strtoull(str_val, NULL, 0);
+	free(str_val);
+
+	return val;
+}
+
+static bool
+lookup_hsw_render_basic_id(void)
+{
+	char buf[256];
+
+	igt_assert_neq(device, -1);
+
+	snprintf(buf, sizeof(buf),
+		 "/sys/class/drm/card%d/metrics/403d8832-1a27-4aa6-a64e-f5389ce7b212/id",
+		 device);
+
+	return try_read_u64_file(buf, &hsw_render_basic_id);
+}
+
+static void
+gt_frequency_range_save(void)
+{
+	gt_min_freq_mhz_saved = sysfs_read("gt_min_freq_mhz");
+	gt_max_freq_mhz_saved = sysfs_read("gt_max_freq_mhz");
+
+	gt_min_freq_mhz = gt_min_freq_mhz_saved;
+	gt_max_freq_mhz = gt_max_freq_mhz_saved;
+}
+
+static void
+gt_frequency_pin(int gt_freq_mhz)
+{
+	igt_debug("requesting pinned GT freq = %dmhz\n", gt_freq_mhz);
+
+	if (gt_freq_mhz > gt_max_freq_mhz) {
+		sysfs_write("gt_max_freq_mhz", gt_freq_mhz);
+		sysfs_write("gt_min_freq_mhz", gt_freq_mhz);
+	} else {
+		sysfs_write("gt_min_freq_mhz", gt_freq_mhz);
+		sysfs_write("gt_max_freq_mhz", gt_freq_mhz);
+	}
+	gt_min_freq_mhz = gt_freq_mhz;
+	gt_max_freq_mhz = gt_freq_mhz;
+}
+
+static void
+gt_frequency_range_restore(void)
+{
+	igt_debug("restoring GT frequency range: min = %dmhz, max =%dmhz, current: min=%dmhz, max=%dmhz\n",
+		  (int)gt_min_freq_mhz_saved,
+		  (int)gt_max_freq_mhz_saved,
+		  (int)gt_min_freq_mhz,
+		  (int)gt_max_freq_mhz);
+
+	/* Assume current min/max are the same */
+	if (gt_min_freq_mhz_saved > gt_max_freq_mhz) {
+		sysfs_write("gt_max_freq_mhz", gt_max_freq_mhz_saved);
+		sysfs_write("gt_min_freq_mhz", gt_min_freq_mhz_saved);
+	} else {
+		sysfs_write("gt_min_freq_mhz", gt_min_freq_mhz_saved);
+		sysfs_write("gt_max_freq_mhz", gt_max_freq_mhz_saved);
+	}
+
+	gt_min_freq_mhz = gt_min_freq_mhz_saved;
+	gt_max_freq_mhz = gt_max_freq_mhz_saved;
+}
+
+static uint64_t
+timebase_scale(uint32_t u32_delta)
+{
+	return ((uint64_t)u32_delta * NSEC_PER_SEC) / timestamp_frequency;
+}
+
+/* CAP_SYS_ADMIN is required to open system wide metrics, unless the system
+ * control parameter dev.i915.perf_stream_paranoid == 0 */
+static void
+test_system_wide_paranoid(void)
+{
+	igt_fork(child, 1) {
+		uint64_t properties[] = {
+			/* Include OA reports in samples */
+			DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+			/* OA unit configuration */
+			DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+			DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+			DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+		};
+		struct drm_i915_perf_open_param param = {
+			.flags = I915_PERF_FLAG_FD_CLOEXEC |
+				I915_PERF_FLAG_FD_NONBLOCK,
+			.num_properties = sizeof(properties) / 16,
+			.properties_ptr = (uint64_t)properties,
+		};
+
+		write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+		igt_drop_root();
+
+		do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
+	}
+
+	igt_waitchildren();
+
+	igt_fork(child, 1) {
+		uint64_t properties[] = {
+			/* Include OA reports in samples */
+			DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+			/* OA unit configuration */
+			DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+			DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+			DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+		};
+		struct drm_i915_perf_open_param param = {
+			.flags = I915_PERF_FLAG_FD_CLOEXEC |
+				I915_PERF_FLAG_FD_NONBLOCK,
+			.num_properties = sizeof(properties) / 16,
+			.properties_ptr = (uint64_t)properties,
+		};
+		int stream_fd;
+
+		write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 0);
+
+		igt_drop_root();
+
+		stream_fd = __perf_open(drm_fd, &param);
+		close(stream_fd);
+	}
+
+	igt_waitchildren();
+
+	/* leave in paranoid state */
+	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+}
+
+static void
+test_invalid_open_flags(void)
+{
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = ~0, /* Undefined flag bits set! */
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+test_invalid_oa_metric_set_id(void)
+{
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, UINT64_MAX,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC |
+			I915_PERF_FLAG_FD_NONBLOCK,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd;
+
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+	properties[ARRAY_SIZE(properties) - 1] = 0; /* ID 0 is also be reserved as invalid */
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+	/* Check that we aren't just seeing false positives... */
+	properties[ARRAY_SIZE(properties) - 1] = hsw_render_basic_id;
+	stream_fd = __perf_open(drm_fd, &param);
+	close(stream_fd);
+
+	/* There's no valid default OA metric set ID... */
+	param.num_properties--;
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+test_invalid_oa_format_id(void)
+{
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+		DRM_I915_PERF_PROP_OA_FORMAT, UINT64_MAX,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC |
+			I915_PERF_FLAG_FD_NONBLOCK,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd;
+
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+	properties[ARRAY_SIZE(properties) - 1] = 0; /* ID 0 is also be reserved as invalid */
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+	/* Check that we aren't just seeing false positives... */
+	properties[ARRAY_SIZE(properties) - 1] = I915_OA_FORMAT_A45_B8_C8;
+	stream_fd = __perf_open(drm_fd, &param);
+	close(stream_fd);
+
+	/* There's no valid default OA format... */
+	param.num_properties--;
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+test_missing_sample_flags(void)
+{
+	uint64_t properties[] = {
+		/* No _PROP_SAMPLE_xyz flags */
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+
+	do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+read_2_oa_reports(int stream_fd,
+		  int format_id,
+		  int exponent,
+		  uint32_t *oa_report0,
+		  uint32_t *oa_report1,
+		  bool timer_only)
+{
+	size_t format_size = oa_formats[format_id].size;
+	size_t sample_size = (sizeof(struct drm_i915_perf_record_header) +
+			      format_size);
+	const struct drm_i915_perf_record_header *header;
+	uint32_t exponent_mask = (1 << (exponent + 1)) - 1;
+
+	/* Note: we allocate a large buffer so that each read() iteration
+	 * should scrape *all* pending records.
+	 *
+	 * The largest buffer the OA unit supports is 16MB and the smallest
+	 * OA report format is 64bytes allowing up to 262144 reports to
+	 * be buffered.
+	 *
+	 * Being sure we are fetching all buffered reports allows us to
+	 * potentially throw away / skip all reports whenever we see
+	 * a _REPORT_LOST notification as a way of being sure are
+	 * measurements aren't skewed by a lost report.
+	 *
+	 * Note: that is is useful for some tests but also not something
+	 * applications would be expected to resort to. Lost reports are
+	 * somewhat unpredictable but typically don't pose a problem - except
+	 * to indicate that the OA unit may be over taxed if lots of reports
+	 * are being lost.
+	 */
+	int buf_size = 262144 * (64 + sizeof(struct drm_i915_perf_record_header));
+	uint8_t *buf = malloc(buf_size);
+	int n = 0;
+
+	for (int i = 0; i < 1000; i++) {
+		ssize_t len;
+
+		while ((len = read(stream_fd, buf, buf_size)) < 0 &&
+		       errno == EINTR)
+			;
+
+		igt_assert(len > 0);
+
+		for (size_t offset = 0; offset < len; offset += header->size) {
+			const uint32_t *report;
+
+			header = (void *)(buf + offset);
+
+			igt_assert_eq(header->pad, 0); /* Reserved */
+
+			/* Currently the only test that should ever expect to
+			 * see a _BUFFER_LOST error is the buffer_fill test,
+			 * otherwise something bad has probably happened...
+			 */
+			igt_assert_neq(header->type, DRM_I915_PERF_RECORD_OA_BUFFER_LOST);
+
+			/* At high sampling frequencies the OA HW might not be
+			 * able to cope with all write requests and will notify
+			 * us that a report was lost. We restart our read of
+			 * two sequential reports due to the timeline blip this
+			 * implies
+			 */
+			if (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST) {
+				igt_debug("read restart: OA trigger collision / report lost\n");
+				n = 0;
+
+				/* XXX: break, because we don't know where
+				 * within the series of already read reports
+				 * there could be a blip from the lost report.
+				 */
+				break;
+			}
+
+			/* Currently the only other record type expected is a
+			 * _SAMPLE. Notably this test will need updating if
+			 * i915-perf is extended in the future with additional
+			 * record types.
+			 */
+			igt_assert_eq(header->type, DRM_I915_PERF_RECORD_SAMPLE);
+
+			igt_assert_eq(header->size, sample_size);
+
+			report = (const void *)(header + 1);
+
+			igt_debug("read report: reason = %x, timestamp = %x, exponent mask=%x\n",
+				  report[0], report[1], exponent_mask);
+
+			/* Don't expect zero for timestamps */
+			igt_assert_neq(report[1], 0);
+
+			if (timer_only) {
+				/* For Haswell we don't have a documented
+				 * report reason field (though empirically
+				 * report[0] bit 10 does seem to correlate with
+				 * a timer trigger reason) so we instead infer
+				 * which reports are timer triggered by
+				 * checking if the least significant bits are
+				 * zero and the exponent bit is set.
+				 */
+				if ((report[1] & exponent_mask) != (1 << exponent)) {
+					igt_debug("skipping non timer report reason=%x\n",
+						  report[0]);
+
+					/* Also assert our hypothesis about the
+					 * reason bit...
+					 */
+					igt_assert_eq(report[0] & (1 << 10), 0);
+					continue;
+				}
+			}
+
+			if (n++ == 0)
+				memcpy(oa_report0, report, format_size);
+			else {
+				memcpy(oa_report1, report, format_size);
+				free(buf);
+				return;
+			}
+		}
+	}
+
+	free(buf);
+
+	igt_assert(!"reached");
+}
+
+static void
+open_and_read_2_oa_reports(int format_id,
+			   int exponent,
+			   uint32_t *oa_report0,
+			   uint32_t *oa_report1,
+			   bool timer_only)
+{
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, format_id,
+		DRM_I915_PERF_PROP_OA_EXPONENT, exponent,
+
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+
+	read_2_oa_reports(stream_fd, format_id, exponent,
+			  oa_report0, oa_report1, timer_only);
+
+	close(stream_fd);
+}
+
+static void
+print_reports(uint32_t *oa_report0, uint32_t *oa_report1, int fmt)
+{
+	uint32_t *a0, *b0, *c0;
+	uint32_t *a1, *b1, *c1;
+
+	/* Not ideal naming here with a0 or a1
+	 * differentiating report0 or 1 not A counter 0 or 1....
+	 */
+	a0 = (uint32_t *)(((uint8_t *)oa_report0) + oa_formats[fmt].a_off);
+	b0 = (uint32_t *)(((uint8_t *)oa_report0) + oa_formats[fmt].b_off);
+	c0 = (uint32_t *)(((uint8_t *)oa_report0) + oa_formats[fmt].c_off);
+
+	a1 = (uint32_t *)(((uint8_t *)oa_report1) + oa_formats[fmt].a_off);
+	b1 = (uint32_t *)(((uint8_t *)oa_report1) + oa_formats[fmt].b_off);
+	c1 = (uint32_t *)(((uint8_t *)oa_report1) + oa_formats[fmt].c_off);
+
+	igt_debug("TIMESTAMP: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+		  oa_report0[1], oa_report1[1], oa_report1[1] - oa_report0[1]);
+
+	if (oa_formats[fmt].n_c) {
+		igt_debug("CLOCK: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+			  c0[2], c1[2], c1[2] - c0[2]);
+	} else
+		igt_debug("CLOCK = N/A\n");
+
+	for (int j = oa_formats[fmt].first_a;
+	     j < oa_formats[fmt].n_a;
+	     j++)
+	{
+		uint32_t delta = a1[j] - a0[j];
+
+		if (hsw_undefined_a_counters[j])
+			continue;
+
+		igt_debug("A%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+			  j, a0[j], a1[j], delta);
+	}
+
+	for (int j = 0; j < oa_formats[fmt].n_b; j++) {
+		uint32_t delta = b1[j] - b0[j];
+		igt_debug("B%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+			  j, b0[j], b1[j], delta);
+	}
+
+	for (int j = 0; j < oa_formats[fmt].n_c; j++) {
+		uint32_t delta = c1[j] - c0[j];
+		igt_debug("C%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+			  j, c0[j], c1[j], delta);
+	}
+}
+
+static void
+test_oa_formats(void)
+{
+	int oa_exponent = 13;
+
+	for (int i = 0; i < ARRAY_SIZE(oa_formats); i++) {
+		uint32_t oa_report0[64];
+		uint32_t oa_report1[64];
+		uint32_t *a0, *b0, *c0;
+		uint32_t *a1, *b1, *c1;
+		uint32_t time_delta;
+		uint32_t clock_delta;
+		uint32_t max_delta;
+
+		if (!oa_formats[i].name) /* sparse, indexed by ID */
+			continue;
+
+		igt_debug("Checking OA format %s\n", oa_formats[i].name);
+
+		open_and_read_2_oa_reports(i,
+					   oa_exponent,
+					   oa_report0,
+					   oa_report1,
+					   false); /* timer reports only */
+
+		print_reports(oa_report0, oa_report1, i);
+
+		a0 = (uint32_t *)(((uint8_t *)oa_report0) + oa_formats[i].a_off);
+		b0 = (uint32_t *)(((uint8_t *)oa_report0) + oa_formats[i].b_off);
+		c0 = (uint32_t *)(((uint8_t *)oa_report0) + oa_formats[i].c_off);
+
+		a1 = (uint32_t *)(((uint8_t *)oa_report1) + oa_formats[i].a_off);
+		b1 = (uint32_t *)(((uint8_t *)oa_report1) + oa_formats[i].b_off);
+		c1 = (uint32_t *)(((uint8_t *)oa_report1) + oa_formats[i].c_off);
+
+		time_delta = timebase_scale(oa_report1[1] - oa_report0[1]);
+		igt_assert_neq(time_delta, 0);
+
+		/* C2 corresponds to a clock counter for the Haswell render
+		 * basic metric set but it's not included in all of the
+		 * formats.
+		 */
+		if (oa_formats[i].n_c) {
+			uint64_t freq;
+
+			/* The first report might have a clock count of zero
+			 * but we wouldn't expect that in the second report...
+			 */
+			igt_assert_neq(c1[2], 0);
+
+			clock_delta = c1[2] - c0[2];
+			igt_assert_neq(clock_delta, 0);
+
+			freq = ((uint64_t)clock_delta * 1000) / time_delta;
+			igt_debug("freq = %"PRIu64"\n", freq);
+
+			igt_assert(freq <= gt_max_freq_mhz);
+		} else {
+			/* Assume running at max freq for sake of
+			 * below sanity check on counters... */
+			clock_delta = (gt_max_freq_mhz *
+				       (uint64_t)time_delta) / 1000;
+		}
+
+		igt_debug("clock delta = %"PRIu32"\n", clock_delta);
+
+		/* The maximum rate for any HSW counter =
+		 *   clock_delta * 40 EUs
+		 *
+		 * Sanity check that no counters exceed this delta.
+		 */
+		max_delta = clock_delta * 40;
+
+		for (int j = oa_formats[i].first_a;
+		     j < oa_formats[i].n_a;
+		     j++)
+		{
+			uint32_t delta = a1[j] - a0[j];
+
+			if (hsw_undefined_a_counters[j])
+				continue;
+
+			igt_debug("A%d: delta = %"PRIu32"\n", j, delta);
+			igt_assert(delta <= max_delta);
+		}
+
+		for (int j = 0; j < oa_formats[i].n_b; j++) {
+			uint32_t delta = b1[j] - b0[j];
+			igt_debug("B%d: delta = %"PRIu32"\n", j, delta);
+			igt_assert(delta <= max_delta);
+		}
+
+		for (int j = 0; j < oa_formats[i].n_c; j++) {
+			uint32_t delta = c1[j] - c0[j];
+			igt_debug("C%d: delta = %"PRIu32"\n", j, delta);
+			igt_assert(delta <= max_delta);
+		}
+	}
+}
+
+static void
+test_oa_exponents(int gt_freq_mhz)
+{
+	/* This test tries to use the sysfs interface for pinning the GT
+	 * frequency so we have another point of reference for comparing with
+	 * the clock frequency as derived from OA reports.
+	 *
+	 * This test has been finicky to stabilise while the
+	 * gt_min/max_freq_mhz files in sysfs don't seem to be a reliable
+	 * mechanism for fixing the gpu frequency.
+	 *
+	 * Since these unit tests are focused on the OA unit not the ability to
+	 * pin the frequency via sysfs we make the test account for pinning not
+	 * being reliable and read back the current frequency for each
+	 * iteration of this test to take this into account.
+	 */
+	gt_frequency_pin(gt_freq_mhz);
+
+	igt_debug("Testing OA timer exponents with requested GT frequency = %dmhz\n",
+		  gt_freq_mhz);
+
+	/* It's asking a lot to sample with a 160 nanosecond period and the
+	 * test can fail due to buffer overflows if it wasn't possible to
+	 * keep up, so we don't start from an exponent of zero...
+	 */
+	for (int i = 2; i < 20; i++) {
+		uint32_t expected_timestamp_delta;
+		uint32_t timestamp_delta;
+		uint32_t oa_report0[64];
+		uint32_t oa_report1[64];
+		uint32_t *c0, *c1;
+		uint32_t time_delta;
+		uint32_t clock_delta;
+		uint32_t freq;
+		int n_tested = 0;
+		int n_freq_matches = 0;
+
+		/* The exponent is effectively selecting a bit in the timestamp
+		 * to trigger reports on and so in practice we expect the raw
+		 * timestamp deltas for periodic reports to exactly match the
+		 * value of next bit.
+		 */
+		expected_timestamp_delta = 2 << i;
+
+		for (int j = 0; n_tested < 10 && j < 100; j++) {
+			int gt_freq_mhz_0, gt_freq_mhz_1;
+			int c_off;
+
+			gt_freq_mhz_0 = sysfs_read("gt_act_freq_mhz");
+
+			igt_debug("ITER %d: testing OA exponent %d with sysfs GT freq = %dmhz\n",
+				  j, i, gt_freq_mhz_0);
+
+			open_and_read_2_oa_reports(I915_OA_FORMAT_A45_B8_C8,
+						   i, /* exponent */
+						   oa_report0,
+						   oa_report1,
+						   true); /* timer triggered
+							     reports only */
+
+			gt_freq_mhz_1 = sysfs_read("gt_act_freq_mhz");
+
+			/* If it looks like the frequency has changed according
+			 * to sysfs then skip looking at this pair of reports
+			 */
+			if (gt_freq_mhz_0 != gt_freq_mhz_1) {
+				igt_debug("skipping OA reports pair due to GT frequency change according to sysfs\n");
+				continue;
+			}
+
+			timestamp_delta = oa_report1[1] - oa_report0[1];
+			igt_assert_neq(timestamp_delta, 0);
+
+			if (timestamp_delta != expected_timestamp_delta) {
+				igt_debug("timestamp0 = %u/0x%x\n",
+					  oa_report0[1], oa_report0[1]);
+				igt_debug("timestamp1 = %u/0x%x\n",
+					  oa_report1[1], oa_report1[1]);
+			}
+
+			igt_assert_eq(timestamp_delta, expected_timestamp_delta);
+
+			/* NB: for the render basic metric set opened above by
+			 * open_and_read_2_oa_reports(), the C2 counter is
+			 * configured as the gpu clock counter...
+			 */
+			c_off = oa_formats[I915_OA_FORMAT_A45_B8_C8].c_off;
+			igt_assert(c_off);
+			c0 = (uint32_t *)(((uint8_t *)oa_report0) + c_off);
+			c1 = (uint32_t *)(((uint8_t *)oa_report1) + c_off);
+			clock_delta = c1[2] - c0[2];
+
+			time_delta = timebase_scale(timestamp_delta);
+
+			freq = ((uint64_t)clock_delta * 1000) / time_delta;
+			igt_debug("ITER %d: time delta = %"PRIu32"(ns) clock delta = %"PRIu32" freq = %"PRIu32"(mhz)\n",
+				  j, time_delta, clock_delta, freq);
+
+			if (freq == gt_freq_mhz_1)
+				n_freq_matches++;
+
+			n_tested++;
+		}
+
+		if (n_tested < 10)
+			igt_debug("sysfs frequency pinning too unstable for cross-referencing with OA derived frequency");
+		igt_assert_eq(n_tested, 10);
+
+		igt_debug("number of iterations with expected clock frequency = %d\n",
+			  n_freq_matches);
+
+		/* Don't assert the calculated frequency for extremely short
+		 * durations.
+		 *
+		 * Allow some mismatches since can't be can't be sure about
+		 * frequency changes between sysfs reads.
+		 */
+		if (i > 3)
+			igt_assert(n_freq_matches >= 7);
+	}
+
+	gt_frequency_range_restore();
+}
+
+/* The OA exponent selects a timestamp counter bit to trigger reports on.
+ *
+ * With a 64bit timestamp and least significant bit approx == 80ns then the MSB
+ * equates to > 40 thousand years and isn't exposed via the i915 perf interface.
+ *
+ * The max exponent exposed is expected to be 31, which is still a fairly
+ * ridiculous period (>5min) but is the maximum exponent where it's still
+ * possible to use periodic sampling as a means for tracking the overflow of
+ * 32bit OA report timestamps.
+ */
+static void
+test_invalid_oa_exponent(void)
+{
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, 31, /* maximum exponent expected
+						       to be accepted */
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+
+	close(stream_fd);
+
+	for (int i = 32; i < 65; i++) {
+		properties[7] = i;
+		do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+	}
+}
+
+/* Return the largest OA exponent that will still result in a sampling
+ * frequency higher than the given frequency.
+ */
+static int
+max_oa_exponent_for_higher_freq(uint64_t freq)
+{
+	/* NB: timebase_scale() takes a uint32_t and an exponent of 30
+	 * would already represent a period of ~3 minutes so there's
+	 * really no need to consider higher exponents.
+	 */
+	for (int i = 0; i < 30; i++) {
+		uint64_t oa_period = timebase_scale(2 << i);
+		uint32_t oa_freq = NSEC_PER_SEC / oa_period;
+
+		if (oa_freq <= freq)
+			return max(0, i - 1);
+	}
+
+	igt_assert(!"reached");
+	return -1;
+}
+
+/* The lowest periodic sampling exponent equates to a period of 160 nanoseconds
+ * or a frequency of 6.25MHz which is only possible to request as root by
+ * default. By default the maximum OA sampling rate is 100KHz
+ */
+static void
+test_low_oa_exponent_permissions(void)
+{
+	int max_freq = read_u64_file("/proc/sys/dev/i915/oa_max_sample_rate");
+	int bad_exponent = max_oa_exponent_for_higher_freq(max_freq);
+	int ok_exponent = bad_exponent + 1;
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, bad_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	uint64_t oa_period, oa_freq;
+
+	igt_assert_eq(max_freq, 100000);
+
+	/* Avoid EACCES errors opening a stream without CAP_SYS_ADMIN */
+	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 0);
+
+	igt_fork(child, 1) {
+		igt_drop_root();
+
+		do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
+	}
+
+	igt_waitchildren();
+
+	properties[7] = ok_exponent;
+
+	igt_fork(child, 1) {
+		int stream_fd;
+
+		igt_drop_root();
+
+		stream_fd = __perf_open(drm_fd, &param);
+		close(stream_fd);
+	}
+
+	igt_waitchildren();
+
+	oa_period = timebase_scale(2 << ok_exponent);
+	oa_freq = NSEC_PER_SEC / oa_period;
+	write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", oa_freq - 100);
+
+	igt_fork(child, 1) {
+		igt_drop_root();
+
+		do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
+	}
+
+	igt_waitchildren();
+
+	/* restore the defaults */
+	write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
+	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+}
+
+static void
+test_per_context_mode_unprivileged(void)
+{
+	uint64_t properties[] = {
+		/* Single context sampling */
+		DRM_I915_PERF_PROP_CTX_HANDLE, UINT64_MAX, /* updated below */
+
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+
+	/* should be default, but just to be sure... */
+	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+	igt_fork(child, 1) {
+		drm_intel_context *context;
+		drm_intel_bufmgr *bufmgr;
+		int stream_fd;
+
+		igt_drop_root();
+
+		bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
+		context = drm_intel_gem_context_create(bufmgr);
+
+		igt_assert(context);
+
+		properties[1] = drm_intel_gem_context_get_context_id(context);
+
+		stream_fd = __perf_open(drm_fd, &param);
+		close(stream_fd);
+
+		drm_intel_gem_context_destroy(context);
+		drm_intel_bufmgr_destroy(bufmgr);
+	}
+
+	igt_waitchildren();
+}
+
+static int64_t
+get_time(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+
+	return ts.tv_sec * 1000000000 + ts.tv_nsec;
+}
+
+/* Note: The interface doesn't currently provide strict guarantees or control
+ * over the upper bound for how long it might take for a POLLIN event after
+ * some OA report is written by the OA unit.
+ *
+ * The plan is to add a property later that gives some control over the maximum
+ * latency, but for now we expect it is tuned for a fairly low latency
+ * suitable for applications wanting to provide live feedback for captured
+ * metrics.
+ *
+ * At the time of writing this test the driver was using a fixed 200Hz hrtimer
+ * regardless of the OA sampling exponent.
+ *
+ * There is no lower bound since a stream configured for periodic sampling may
+ * still contain other automatically triggered reports.
+ *
+ * What we try and check for here is that blocking reads don't return EAGAIN
+ * and that we aren't spending any significant time burning the cpu in
+ * kernelspace.
+ */
+static void
+test_blocking(void)
+{
+	/* 40 milliseconds
+	 *
+	 * Having a period somewhat > sysconf(_SC_CLK_TCK) helps to stop
+	 * scheduling (liable to kick in when we make blocking poll()s/reads)
+	 * from interfering with the test.
+	 */
+	int oa_exponent = 18;
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	uint8_t buf[1024 * 1024];
+	struct tms start_times;
+	struct tms end_times;
+	int64_t user_ns, kernel_ns;
+	int64_t tick_ns = 1000000000 / sysconf(_SC_CLK_TCK);
+	int64_t start;
+	int n = 0;
+
+	times(&start_times);
+
+	/* Loop for 600ms performing blocking reads while the HW is sampling at
+	 * ~25Hz, with the expectation that we spend most of our time blocked
+	 * in the kernel, and shouldn't be burning cpu cycles in the kernel in
+	 * association with this process (verified by looking at stime before
+	 * and after loop).
+	 */
+	for (start = get_time(); (get_time() - start) < 600000000; /* nop */) {
+		int ret;
+
+		while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
+		       errno == EINTR)
+			;
+
+		igt_assert(ret > 0);
+
+		n++;
+	}
+
+	times(&end_times);
+
+	/* Using nanosecond units is fairly silly here, given the tick in-
+	 * precision - ah well, it's consistent with the get_time() units.
+	 */
+	user_ns = (end_times.tms_utime - start_times.tms_utime) * tick_ns;
+	kernel_ns = (end_times.tms_stime - start_times.tms_stime) * tick_ns;
+
+	igt_debug("%d blocking reads in 500 milliseconds, with 1KHz OA sampling\n", n);
+	igt_debug("time in userspace = %"PRIu64"ns (start utime = %d, end = %d, ns ticks per sec = %d)\n",
+		  user_ns, (int)start_times.tms_utime, (int)end_times.tms_utime, (int)tick_ns);
+	igt_debug("time in kernelspace = %"PRIu64"ns (start stime = %d, end = %d, ns ticks per sec = %d)\n",
+		  kernel_ns, (int)start_times.tms_stime, (int)end_times.tms_stime, (int)tick_ns);
+
+	/* With completely broken blocking (but also not returning an error) we
+	 * could end up with an open loop, hopefully recognisable with > 15
+	 * (600/40)iterations.
+	 */
+	igt_assert(n <= 15);
+
+	/* It's a bit tricky to put a lower limit here, but we expect a
+	 * relatively low latency for seeing reports, while we don't currently
+	 * give any control over this in the api.
+	 *
+	 * Limited to a 5 millisecond latency and 45ms (worst case)
+	 * per-iteration that could give 13.3 iterations. Rounding gives a tiny
+	 * bit more latency slack (6ms)...
+	 */
+	igt_assert(n > 13);
+
+	/* A bit tricky to put a number on this, but we don't expect the kernel
+	 * to use any significant cpu while waiting and given the in precision
+	 * of stime (multiple of CLK_TCK) we expect this to round to zero.
+	 */
+	igt_assert_eq(kernel_ns, 0);
+
+	close(stream_fd);
+}
+
+static void
+test_polling(void)
+{
+	/* 40 milliseconds
+	 *
+	 * Having a period somewhat > sysconf(_SC_CLK_TCK) helps to stop
+	 * scheduling (liable to kick in when we make blocking poll()s/reads)
+	 * from interfering with the test.
+	 */
+	int oa_exponent = 18;
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC |
+			I915_PERF_FLAG_FD_NONBLOCK,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	uint8_t buf[1024 * 1024];
+	struct tms start_times;
+	struct tms end_times;
+	int64_t user_ns, kernel_ns;
+	int64_t tick_ns = 1000000000 / sysconf(_SC_CLK_TCK);
+	int64_t start;
+	int n = 0;
+
+	times(&start_times);
+
+	/* Loop for 600ms performing blocking polls while the HW is sampling at
+	 * ~25Hz, with the expectation that we spend most of our time blocked
+	 * in the kernel, and shouldn't be burning cpu cycles in the kernel in
+	 * association with this process (verified by looking at stime before
+	 * and after loop).
+	 */
+	for (start = get_time(); (get_time() - start) < 600000000; /* nop */) {
+		struct pollfd pollfd = { .fd = stream_fd, .events = POLLIN };
+		int ret;
+
+		while ((ret = poll(&pollfd, 1, -1)) < 0 &&
+		       errno == EINTR)
+			;
+		igt_assert_eq(ret, 1);
+		igt_assert(pollfd.revents & POLLIN);
+
+		while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
+		       errno == EINTR)
+			;
+
+		/* Don't expect to see EAGAIN if we've had a POLLIN event
+		 *
+		 * XXX: actually this is technically overly strict since we do
+		 * knowingly allow false positive POLLIN events. At least in
+		 * the future when supporting context filtering of metrics for
+		 * Gen8+ handled in the kernel then POLLIN events may be
+		 * delivered when we know there are pending reports to process
+		 * but before we've done any filtering to know for certain that
+		 * any reports are destined to be copied to userspace.
+		 *
+		 * Still, for now it's a reasonable sanity check.
+		 */
+		if (ret < 0)
+			igt_debug("Unexpected error when reading after poll = %d\n", errno);
+		igt_assert_neq(ret, -1);
+
+		/* At this point, after consuming pending reports (and hoping
+		 * the scheduler hasn't stopped us for too long we now
+		 * expect EAGAIN on read.
+		 */
+		while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
+		       errno == EINTR)
+			;
+		igt_assert_eq(ret, -1);
+		igt_assert_eq(errno, EAGAIN);
+
+		n++;
+	}
+
+	times(&end_times);
+
+	/* Using nanosecond units is fairly silly here, given the tick in-
+	 * precision - ah well, it's consistent with the get_time() units.
+	 */
+	user_ns = (end_times.tms_utime - start_times.tms_utime) * tick_ns;
+	kernel_ns = (end_times.tms_stime - start_times.tms_stime) * tick_ns;
+
+	igt_debug("%d blocking poll()s in 600 milliseconds, with 25Hz OA sampling\n", n);
+	igt_debug("time in userspace = %"PRIu64"ns (start utime = %d, end = %d, ns ticks per sec = %d)\n",
+		  user_ns, (int)start_times.tms_utime, (int)end_times.tms_utime, (int)tick_ns);
+	igt_debug("time in kernelspace = %"PRIu64"ns (start stime = %d, end = %d, ns ticks per sec = %d)\n",
+		  kernel_ns, (int)start_times.tms_stime, (int)end_times.tms_stime, (int)tick_ns);
+
+	/* With completely broken blocking while polling (but still somehow
+	 * reporting a POLLIN event) we could end up with an open loop,
+	 * hopefully recognisable with > 15 (600/40)iterations.
+	 */
+	igt_assert(n <= 15);
+
+	/* It's a bit tricky to put a lower limit here, but we expect a
+	 * relatively low latency for seeing reports, while we don't currently
+	 * give any control over this in the api.
+	 *
+	 * Limited to a 5 millisecond latency and 45ms (worst case)
+	 * per-iteration that could give 13.3 iterations. Rounding gives a tiny
+	 * bit more latency slack (6ms)...
+	 */
+	igt_assert(n > 13);
+
+	/* A bit tricky to put a number on this, but we don't expect the kernel
+	 * to use any significant cpu while waiting and given the in precision
+	 * of stime (multiple of CLK_TCK) we expect this to round to zero.
+	 */
+	igt_assert_eq(kernel_ns, 0);
+
+	close(stream_fd);
+}
+
+static void
+test_buffer_fill(void)
+{
+	int oa_exponent = 5; /* 5 micro seconds */
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	int buf_size = 65536 * (256 + sizeof(struct drm_i915_perf_record_header));
+	uint8_t *buf = malloc(buf_size);
+
+
+	for (int i = 0; i < 5; i++) {
+		struct drm_i915_perf_record_header *header;
+		bool overflow_seen;
+		int offset = 0;
+		int len;
+
+		/* It should take ~330 milliseconds to fill a 16MB OA buffer with a
+		 * 5 microsecond sampling period and 256 byte reports. */
+		nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+		while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+			;
+
+		igt_assert_neq(len, -1);
+
+		overflow_seen = false;
+		for (offset = 0; offset < len; offset += header->size) {
+			header = (void *)(buf + offset);
+
+			if (header->type == DRM_I915_PERF_RECORD_OA_BUFFER_LOST)
+				overflow_seen = true;
+		}
+
+		igt_assert_eq(overflow_seen, true);
+
+		nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
+
+		while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+			;
+
+		igt_assert_neq(len, -1);
+
+		/* expect ~ 200 records in 1 millisecond */
+		igt_assert(len > 256 * 150);
+
+		overflow_seen = false;
+		for (offset = 0; offset < len; offset += header->size) {
+			header = (void *)(buf + offset);
+
+			if (header->type == DRM_I915_PERF_RECORD_OA_BUFFER_LOST)
+				overflow_seen = true;
+		}
+
+		igt_assert_eq(overflow_seen, false);
+	}
+
+	free(buf);
+
+	close(stream_fd);
+}
+
+static void
+test_enable_disable(void)
+{
+	int oa_exponent = 5; /* 5 micro seconds */
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC |
+			 I915_PERF_FLAG_DISABLED, /* Verify we start disabled */
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	int buf_size = 65536 * (256 + sizeof(struct drm_i915_perf_record_header));
+	uint8_t *buf = malloc(buf_size);
+
+
+	for (int i = 0; i < 5; i++) {
+		int len;
+
+		/* If the stream were enabled then it would take ~330
+		 * milliseconds to fill a 16MB OA buffer with a 5 microsecond
+		 * sampling period and 256 byte reports.
+		 *
+		 * Giving enough time for an overflow might help catch whether
+		 * the OA unit has been enabled even if the driver might at
+		 * least avoid copying reports while disabled.
+		 */
+		nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+		while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+			;
+
+		igt_assert_eq(len, -1);
+		igt_assert_eq(errno, EIO);
+
+		do_ioctl(stream_fd, I915_PERF_IOCTL_ENABLE, 0);
+
+		nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
+
+		while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+			;
+
+		igt_assert_neq(len, -1);
+
+		/* expect ~ 200 records in 1 millisecond */
+		igt_assert(len > 256 * 150 && len < 256 * 2000);
+
+		do_ioctl(stream_fd, I915_PERF_IOCTL_DISABLE, 0);
+
+		/* It's considered an error to read a stream while it's disabled
+		 * since it would block indefinitely...
+		 */
+		len = read(stream_fd, buf, buf_size);
+
+		igt_assert_eq(len, -1);
+		igt_assert_eq(errno, EIO);
+	}
+
+	free(buf);
+
+	close(stream_fd);
+}
+
+static void
+test_short_reads(void)
+{
+	int oa_exponent = 5; /* 5 micro seconds */
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	size_t record_size = 256 + sizeof(struct drm_i915_perf_record_header);
+	size_t page_size = sysconf(_SC_PAGE_SIZE);
+	int zero_fd = open("/dev/zero", O_RDWR|O_CLOEXEC);
+	uint8_t *pages = mmap(NULL, page_size * 2,
+			      PROT_READ|PROT_WRITE, MAP_PRIVATE, zero_fd, 0);
+	struct drm_i915_perf_record_header *header;
+	int stream_fd;
+	int ret;
+
+	igt_assert_neq(zero_fd, -1);
+	close(zero_fd);
+	zero_fd = -1;
+
+	igt_assert(pages);
+
+	ret = mprotect(pages + page_size, page_size, PROT_NONE);
+	igt_assert_eq(ret, 0);
+
+	stream_fd = __perf_open(drm_fd, &param);
+
+	nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 5000000 }, NULL);
+
+	/* At this point there should be lots of pending reports to read */
+
+	/* A read that can return at least one record should result in a short
+	 * read not an EFAULT if the buffer is smaller than the requested read
+	 * size...
+	 *
+	 * Expect to see a sample record here, but at least skip over any
+	 * _RECORD_LOST notifications.
+	 */
+	do {
+		header = (void *)(pages + page_size - record_size);
+		ret = read(stream_fd,
+			   header,
+			   page_size);
+		igt_assert(ret > 0);
+	} while (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+
+	igt_assert_eq(ret, record_size);
+
+	/* A read that can't return a single record because it would result
+	 * in a fault on buffer overrun should result in an EFAULT error...
+	 */
+	ret = read(stream_fd, pages + page_size - 16, page_size);
+	igt_assert_eq(ret, -1);
+	igt_assert_eq(errno, EFAULT);
+
+	/* A read that can't return a single record because the buffer is too
+	 * small should result in an ENOSPC error..
+	 *
+	 * Again, skip over _RECORD_LOST records (smaller than record_size/2)
+	 */
+	do {
+		header = (void *)(pages + page_size - record_size / 2);
+		ret = read(stream_fd,
+			   header,
+			   record_size / 2);
+	} while (ret > 0 && header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+
+	igt_assert_eq(ret, -1);
+	igt_assert_eq(errno, ENOSPC);
+
+	close(stream_fd);
+
+	munmap(pages, page_size * 2);
+}
+
+static void
+test_non_sampling_read_error(void)
+{
+	uint64_t properties[] = {
+		/* XXX: even without periodic sampling we have to
+		 * specify at least one sample layout property...
+		 */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+
+		/* XXX: no sampling exponent */
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	uint8_t buf[1024];
+
+	int ret = read(stream_fd, buf, sizeof(buf));
+	igt_assert_eq(ret, -1);
+	igt_assert_eq(errno, EIO);
+
+	close(stream_fd);
+}
+
+/* Check that attempts to read from a stream while it is disable will return
+ * EIO instead of blocking indefinitely.
+ */
+static void
+test_disabled_read_error(void)
+{
+	int oa_exponent = 5; /* 5 micro seconds */
+	uint64_t properties[] = {
+		/* XXX: even without periodic sampling we have to
+		 * specify at least one sample layout property...
+		 */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC |
+			 I915_PERF_FLAG_DISABLED, /* XXX: open disabled */
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	uint32_t oa_report0[64];
+	uint32_t oa_report1[64];
+	uint32_t buf[128] = { 0 };
+	int ret;
+
+
+	ret = read(stream_fd, buf, sizeof(buf));
+	igt_assert_eq(ret, -1);
+	igt_assert_eq(errno, EIO);
+
+	close(stream_fd);
+
+
+	param.flags &= ~I915_PERF_FLAG_DISABLED;
+	stream_fd = __perf_open(drm_fd, &param);
+
+	read_2_oa_reports(stream_fd,
+			  I915_OA_FORMAT_A45_B8_C8,
+			  oa_exponent,
+			  oa_report0,
+			  oa_report1,
+			  false); /* not just timer reports */
+
+	do_ioctl(stream_fd, I915_PERF_IOCTL_DISABLE, 0);
+
+	ret = read(stream_fd, buf, sizeof(buf));
+	igt_assert_eq(ret, -1);
+	igt_assert_eq(errno, EIO);
+
+	do_ioctl(stream_fd, I915_PERF_IOCTL_ENABLE, 0);
+
+	read_2_oa_reports(stream_fd,
+			  I915_OA_FORMAT_A45_B8_C8,
+			  oa_exponent,
+			  oa_report0,
+			  oa_report1,
+			  false); /* not just timer reports */
+
+	close(stream_fd);
+}
+
+static void
+test_mi_rpc(void)
+{
+	uint64_t properties[] = {
+		/* Note: we have to specify at least one sample property even
+		 * though we aren't interested in samples in this case.
+		 */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+
+		/* Note: no OA exponent specified in this case */
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	drm_intel_bufmgr *bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
+	drm_intel_context *context;
+	struct intel_batchbuffer *batch;
+	drm_intel_bo *bo;
+	uint32_t *report32;
+	int ret;
+
+	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+
+	context = drm_intel_gem_context_create(bufmgr);
+	igt_assert(context);
+
+	batch = intel_batchbuffer_alloc(bufmgr, devid);
+
+	bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+
+	ret = drm_intel_bo_map(bo, true);
+	igt_assert_eq(ret, 0);
+
+	memset(bo->virtual, 0x80, 4096);
+	drm_intel_bo_unmap(bo);
+
+	BEGIN_BATCH(3, 1);
+	OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
+	OUT_RELOC(bo, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+		  0); /* offset in bytes */
+	OUT_BATCH(0xdeadbeef); /* report ID */
+	ADVANCE_BATCH();
+
+	intel_batchbuffer_flush_with_context(batch, context);
+
+	ret = drm_intel_bo_map(bo, false /* write enable */);
+	igt_assert_eq(ret, 0);
+
+	report32 = bo->virtual;
+	igt_assert_eq(report32[0], 0xdeadbeef); /* report ID */
+	igt_assert_neq(report32[1], 0); /* timestamp */
+
+	igt_assert_neq(report32[63], 0x80808080); /* end of report */
+	igt_assert_eq(report32[64], 0x80808080); /* after 256 byte report */
+
+	drm_intel_bo_unmap(bo);
+	drm_intel_bo_unreference(bo);
+	intel_batchbuffer_free(batch);
+	drm_intel_gem_context_destroy(context);
+	drm_intel_bufmgr_destroy(bufmgr);
+	close(stream_fd);
+}
+
+static void
+scratch_buf_init(drm_intel_bufmgr *bufmgr,
+		 struct igt_buf *buf,
+		 int width, int height,
+		 uint32_t color)
+{
+	size_t stride = width * 4;
+	size_t size = stride * height;
+	drm_intel_bo *bo = drm_intel_bo_alloc(bufmgr, "", size, 4096);
+	int ret;
+
+	ret = drm_intel_bo_map(bo, true /* writable */);
+	igt_assert_eq(ret, 0);
+
+	for (int i = 0; i < width * height; i++)
+		((uint32_t *)bo->virtual)[i] = color;
+
+	drm_intel_bo_unmap(bo);
+
+	buf->bo = bo;
+	buf->stride = stride;
+	buf->tiling = I915_TILING_NONE;
+	buf->size = size;
+}
+
+static void
+emit_stall_timestamp_and_rpc(struct intel_batchbuffer *batch,
+			     drm_intel_bo *dst,
+			     int timestamp_offset,
+			     int report_dst_offset,
+			     uint32_t report_id)
+{
+	uint32_t pipe_ctl_flags = (PIPE_CONTROL_CS_STALL |
+				   PIPE_CONTROL_RENDER_TARGET_FLUSH |
+				   PIPE_CONTROL_WRITE_TIMESTAMP);
+
+	BEGIN_BATCH(5, 1);
+	OUT_BATCH(GFX_OP_PIPE_CONTROL | (5 - 2));
+	OUT_BATCH(pipe_ctl_flags);
+	OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+		  timestamp_offset);
+	OUT_BATCH(0); /* imm lower */
+	OUT_BATCH(0); /* imm upper */
+	ADVANCE_BATCH();
+
+	BEGIN_BATCH(3, 1);
+	OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
+	OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+		  report_dst_offset);
+	OUT_BATCH(report_id);
+	ADVANCE_BATCH();
+}
+
+/* Tests the INTEL_performance_query use case where an unprivileged process
+ * should be able to configure the OA unit for per-context metrics (for a
+ * context associated with that process' drm file descriptor) and the counters
+ * should only relate to that specific context.
+ */
+static void
+test_per_ctx_mi_rpc(void)
+{
+	uint64_t properties[] = {
+		DRM_I915_PERF_PROP_CTX_HANDLE, UINT64_MAX, /* updated below */
+
+		/* Note: we have to specify at least one sample property even
+		 * though we aren't interested in samples in this case
+		 */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+
+		/* Note: no OA exponent specified in this case */
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+
+	/* should be default, but just to be sure... */
+	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+	igt_fork(child, 1) {
+		drm_intel_bufmgr *bufmgr;
+		drm_intel_context *context0, *context1;
+		int stream_fd;
+		struct intel_batchbuffer *batch;
+		struct igt_buf src, dst;
+		drm_intel_bo *bo;
+		uint32_t *report0_32, *report1_32;
+		uint64_t timestamp0_64, timestamp1_64;
+		uint32_t delta_ts64, delta_oa32;
+		uint64_t delta_ts64_ns, delta_oa32_ns;
+		uint32_t delta_delta;
+		int n_samples_written;
+		int width = 800;
+		int height = 600;
+		int ret;
+
+		igt_drop_root();
+
+		bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
+		drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+
+		scratch_buf_init(bufmgr, &src, width, height, 0xff0000ff);
+		scratch_buf_init(bufmgr, &dst, width, height, 0x00ff00ff);
+
+		batch = intel_batchbuffer_alloc(bufmgr, devid);
+
+		context0 = drm_intel_gem_context_create(bufmgr);
+		igt_assert(context0);
+
+		context1 = drm_intel_gem_context_create(bufmgr);
+		igt_assert(context1);
+
+		igt_debug("submitting warm up render_copy\n");
+
+		/* Submit some early, unmeasured, work to the context we want
+		 * to measure to try and catch issues with i915-perf
+		 * initializing the HW context ID for filtering.
+		 *
+		 * We do this because i915-perf single context filtering had
+		 * previously only relied on a hook into context pinning to
+		 * initialize the HW context ID, instead of also trying to
+		 * determine the HW ID while opening the stream, in case it
+		 * has already been pinned.
+		 *
+		 * This wasn't noticed by the previous unit test because we
+		 * were opening the stream while the context hadn't been
+		 * touched or pinned yet and so it worked out correctly to wait
+		 * for the pinning hook.
+		 *
+		 * Now a buggy version of i915-perf will fail to measure
+		 * anything for context0 once this initial render_copy() ends
+		 * up pinning the context since there won't ever be a pinning
+		 * hook callback.
+		 */
+		render_copy(batch,
+			    context0,
+			    &src, 0, 0, width, height,
+			    &dst, 0, 0);
+
+		properties[1] = drm_intel_gem_context_get_context_id(context0);
+
+		igt_debug("opening i915-perf stream\n");
+		stream_fd = __perf_open(drm_fd, &param);
+
+		bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+
+		ret = drm_intel_bo_map(bo, true /* write enable */);
+		igt_assert_eq(ret, 0);
+
+		memset(bo->virtual, 0x80, 4096);
+		drm_intel_bo_unmap(bo);
+
+		emit_stall_timestamp_and_rpc(batch,
+					     bo,
+					     512 /* timestamp offset */,
+					     0, /* report dst offset */
+					     0xdeadbeef); /* report id */
+
+		/* Explicitly flush here (even though the render_copy() call
+		 * will itself flush before/after the copy) to clarify that
+		 * that the PIPE_CONTROL + MI_RPC commands will be in a
+		 * separate batch from the copy.
+		 */
+		intel_batchbuffer_flush_with_context(batch, context0);
+
+		render_copy(batch,
+			    context0,
+			    &src, 0, 0, width, height,
+			    &dst, 0, 0);
+
+		/* Another redundant flush to clarify batch bo is free to reuse */
+		intel_batchbuffer_flush_with_context(batch, context0);
+
+		/* submit two copies on the other context to avoid a false
+		 * positive in case the driver somehow ended up filtering for
+		 * context1
+		 */
+		render_copy(batch,
+			    context1,
+			    &src, 0, 0, width, height,
+			    &dst, 0, 0);
+
+		render_copy(batch,
+			    context1,
+			    &src, 0, 0, width, height,
+			    &dst, 0, 0);
+
+		/* And another */
+		intel_batchbuffer_flush_with_context(batch, context1);
+
+		emit_stall_timestamp_and_rpc(batch,
+					     bo,
+					     520 /* timestamp offset */,
+					     256, /* report dst offset */
+					     0xbeefbeef); /* report id */
+
+		intel_batchbuffer_flush_with_context(batch, context0);
+
+		ret = drm_intel_bo_map(bo, false /* write enable */);
+		igt_assert_eq(ret, 0);
+
+		report0_32 = bo->virtual;
+		igt_assert_eq(report0_32[0], 0xdeadbeef); /* report ID */
+		igt_assert_neq(report0_32[1], 0); /* timestamp */
+
+		report1_32 = report0_32 + 64;
+		igt_assert_eq(report1_32[0], 0xbeefbeef); /* report ID */
+		igt_assert_neq(report1_32[1], 0); /* timestamp */
+
+		print_reports(report0_32, report1_32,
+			      lookup_format(I915_OA_FORMAT_A45_B8_C8));
+
+		/* A40 == N samples written to all render targets */
+		n_samples_written = report1_32[43] - report0_32[43];
+		igt_debug("n samples written = %d\n", n_samples_written);
+		igt_assert_eq(n_samples_written, width * height);
+
+		igt_debug("timestamp32 0 = %u\n", report0_32[1]);
+		igt_debug("timestamp32 1 = %u\n", report1_32[1]);
+
+		timestamp0_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 512);
+		timestamp1_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 520);
+
+		igt_debug("timestamp64 0 = %"PRIu64"\n", timestamp0_64);
+		igt_debug("timestamp64 1 = %"PRIu64"\n", timestamp1_64);
+
+		delta_ts64 = timestamp1_64 - timestamp0_64;
+		delta_oa32 = report1_32[1] - report0_32[1];
+
+		/* sanity check that we can pass the delta to timebase_scale */
+		igt_assert(delta_ts64 < UINT32_MAX);
+		delta_oa32_ns = timebase_scale(delta_oa32);
+		delta_ts64_ns = timebase_scale(delta_ts64);
+
+		igt_debug("ts32 delta = %u, = %uns\n",
+			  delta_oa32, (unsigned)delta_oa32_ns);
+		igt_debug("ts64 delta = %u, = %uns\n",
+			  delta_ts64, (unsigned)delta_ts64_ns);
+
+		/* The delta as calculated via the PIPE_CONTROL timestamp or
+		 * the OA report timestamps should be almost identical but
+		 * allow a 320 nanoseconds margin.
+		 */
+		delta_delta = delta_ts64_ns > delta_oa32_ns ?
+			(delta_ts64_ns - delta_oa32_ns) :
+			(delta_oa32_ns - delta_ts64_ns);
+		igt_assert(delta_delta <= 320);
+
+		drm_intel_bo_unreference(src.bo);
+		drm_intel_bo_unreference(dst.bo);
+
+		drm_intel_bo_unmap(bo);
+		drm_intel_bo_unreference(bo);
+		intel_batchbuffer_free(batch);
+		drm_intel_gem_context_destroy(context0);
+		drm_intel_gem_context_destroy(context1);
+		drm_intel_bufmgr_destroy(bufmgr);
+		close(stream_fd);
+	}
+
+	igt_waitchildren();
+}
+
+static void
+test_rc6_disable(void)
+{
+	int oa_exponent = 13; /* 1 millisecond */
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	int stream_fd = __perf_open(drm_fd, &param);
+	uint64_t n_events_start = read_debugfs_u64_record("i915_drpc_info",
+							  "RC6 residency since boot");
+	uint64_t n_events_end;
+
+	nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+	n_events_end = read_debugfs_u64_record("i915_drpc_info",
+					       "RC6 residency since boot");
+
+	igt_assert_eq(n_events_end - n_events_start, 0);
+
+	close(stream_fd);
+
+	n_events_start = read_debugfs_u64_record("i915_drpc_info",
+						 "RC6 residency since boot");
+
+	nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+	n_events_end = read_debugfs_u64_record("i915_drpc_info",
+					       "RC6 residency since boot");
+
+	igt_assert_neq(n_events_end - n_events_start, 0);
+}
+
+static unsigned
+read_i915_module_ref(void)
+{
+	FILE *fp = fopen("/proc/modules", "r");
+	char *line = NULL;
+	size_t line_buf_size = 0;
+	int len = 0;
+	unsigned ref_count;
+
+	igt_assert(fp);
+
+	while ((len = getline(&line, &line_buf_size, fp)) > 0) {
+		if (strncmp(line, "i915 ", 5) == 0) {
+			unsigned long mem;
+			int ret = sscanf(line + 5, "%lu %u", &mem, &ref_count);
+			igt_assert(ret == 2);
+			goto done;
+		}
+	}
+
+	igt_assert(!"reached");
+
+done:
+	free(line);
+	fclose(fp);
+	return ref_count;
+}
+
+/* check that an open i915 perf stream holds a reference on the drm i915 module
+ * including in the corner case where the original drm fd has been closed.
+ */
+static void
+test_i915_ref_count(void)
+{
+	int oa_exponent = 13; /* 1 millisecond */
+	uint64_t properties[] = {
+		/* Include OA reports in samples */
+		DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+		/* OA unit configuration */
+		DRM_I915_PERF_PROP_OA_METRICS_SET, 0 /* updated below */,
+		DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+		DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+	};
+	struct drm_i915_perf_open_param param = {
+		.flags = I915_PERF_FLAG_FD_CLOEXEC,
+		.num_properties = sizeof(properties) / 16,
+		.properties_ptr = (uint64_t)properties,
+	};
+	unsigned baseline, ref_count0, ref_count1;
+	int stream_fd;
+	uint32_t oa_report0[64];
+	uint32_t oa_report1[64];
+
+	/* This should be the first test before the first fixture so no drm_fd
+	 * should have been opened so far...
+	 */
+	igt_assert_eq(drm_fd, -1);
+
+	baseline = read_i915_module_ref();
+	igt_debug("baseline ref count (drm fd closed) = %u\n", baseline);
+
+	drm_fd = __drm_open_driver(DRIVER_INTEL);
+	devid = intel_get_drm_devid(drm_fd);
+	device = drm_get_card();
+
+	igt_require(IS_HASWELL(devid));
+	igt_require(lookup_hsw_render_basic_id());
+	properties[3] = hsw_render_basic_id;
+
+	ref_count0 = read_i915_module_ref();
+	igt_debug("initial ref count with drm_fd open = %u\n", ref_count0);
+	igt_assert(ref_count0 > baseline);
+
+	stream_fd = __perf_open(drm_fd, &param);
+	ref_count1 = read_i915_module_ref();
+	igt_debug("ref count after opening i915 perf stream = %u\n", ref_count1);
+	igt_assert(ref_count1 > ref_count0);
+
+	close(drm_fd);
+	drm_fd = -1;
+	ref_count0 = read_i915_module_ref();
+	igt_debug("ref count after closing drm fd = %u\n", ref_count0);
+
+	igt_assert(ref_count0 > baseline);
+
+	read_2_oa_reports(stream_fd,
+			  I915_OA_FORMAT_A45_B8_C8,
+			  oa_exponent,
+			  oa_report0,
+			  oa_report1,
+			  false); /* not just timer reports */
+
+	close(stream_fd);
+	ref_count0 = read_i915_module_ref();
+	igt_debug("ref count after closing i915 perf stream fd = %u\n", ref_count0);
+	igt_assert_eq(ref_count0, baseline);
+}
+
+static void
+test_sysctl_defaults(void)
+{
+	int paranoid = read_u64_file("/proc/sys/dev/i915/perf_stream_paranoid");
+	int max_freq = read_u64_file("/proc/sys/dev/i915/oa_max_sample_rate");
+
+	igt_assert_eq(paranoid, 1);
+	igt_assert_eq(max_freq, 100000);
+}
+
+igt_main
+{
+	igt_skip_on_simulation();
+
+	igt_fixture {
+		struct stat sb;
+
+		igt_require(stat("/proc/sys/dev/i915/perf_stream_paranoid", &sb)
+			    == 0);
+		igt_require(stat("/proc/sys/dev/i915/oa_max_sample_rate", &sb)
+			    == 0);
+	}
+
+	igt_subtest("i915-ref-count")
+		test_i915_ref_count();
+
+	igt_subtest("sysctl-defaults")
+		test_sysctl_defaults();
+
+	igt_fixture {
+		/* We expect that the ref count test before these fixtures
+		 * should have closed drm_fd...
+		 */
+		igt_assert_eq(drm_fd, -1);
+		drm_fd = drm_open_driver_render(DRIVER_INTEL);
+		devid = intel_get_drm_devid(drm_fd);
+		device = drm_get_card();
+
+		igt_require(IS_HASWELL(devid));
+		igt_require(lookup_hsw_render_basic_id());
+
+		gt_frequency_range_save();
+
+		write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+		write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
+
+		render_copy = igt_get_render_copyfunc(devid);
+		igt_require_f(render_copy, "no render-copy function\n");
+	}
+
+	igt_subtest("non-system-wide-paranoid")
+		test_system_wide_paranoid();
+
+	igt_subtest("invalid-open-flags")
+		test_invalid_open_flags();
+
+	igt_subtest("invalid-oa-metric-set-id")
+		test_invalid_oa_metric_set_id();
+
+	igt_subtest("invalid-oa-format-id")
+		test_invalid_oa_format_id();
+
+	igt_subtest("missing-sample-flags")
+		test_missing_sample_flags();
+
+	igt_subtest("oa-formats")
+		test_oa_formats();
+
+	igt_subtest("invalid-oa-exponent")
+		test_invalid_oa_exponent();
+	igt_subtest("low-oa-exponent-permissions")
+		test_low_oa_exponent_permissions();
+	igt_subtest("oa-exponents") {
+		test_oa_exponents(450);
+		test_oa_exponents(550);
+	}
+
+	igt_subtest("per-context-mode-unprivileged")
+		test_per_context_mode_unprivileged();
+
+	igt_subtest("buffer-fill")
+		test_buffer_fill();
+
+	igt_subtest("disabled-read-error")
+		test_disabled_read_error();
+	igt_subtest("non-sampling-read-error")
+		test_non_sampling_read_error();
+
+	igt_subtest("enable-disable")
+		test_enable_disable();
+
+	igt_subtest("blocking")
+		test_blocking();
+
+	igt_subtest("polling")
+		test_polling();
+
+	igt_subtest("short-reads")
+		test_short_reads();
+
+	igt_subtest("mi-rpc")
+		test_mi_rpc();
+
+	igt_subtest("mi-rpc-per-ctx")
+		test_per_ctx_mi_rpc();
+
+	igt_subtest("rc6-disable")
+		test_rc6_disable();
+
+	igt_fixture {
+		/* leave sysctl options in their default state... */
+		write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
+		write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+		gt_frequency_range_restore();
+
+		close(drm_fd);
+	}
+}
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 02/13] igt/gem_exec_parse: some minor cleanups
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 01/13] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 03/13] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This normalizes the execbuf utilities in this file to all use memset to
clear obj, reloc and execbuf structures and set them up in the same
order. As I was debugging some unpredictable test failures I was getting
unsure that all these structures were being fully initialized.

The same I915_GEM_DOMAIN_COMMAND domain is now used with all relocs.

The register/command defines have been moved to the top of the file to
be available to all tests/utilities.

The handle + fd variables are now static.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 196 +++++++++++++++++--------------------------------
 1 file changed, 66 insertions(+), 130 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 2739b2c..c81bbc7 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -37,6 +37,20 @@
 #define OACONTROL 0x2360
 #define DERRMR 0x44050
 
+#define HSW_CS_GPR(n) (0x2600 + 8*(n))
+#define HSW_CS_GPR0 HSW_CS_GPR(0)
+#define HSW_CS_GPR1 HSW_CS_GPR(1)
+
+#define MI_LOAD_REGISTER_REG (0x2a << 23)
+#define MI_STORE_REGISTER_MEM (0x24 << 23)
+#define MI_ARB_ON_OFF (0x8 << 23)
+#define MI_DISPLAY_FLIP ((0x14 << 23) | 1)
+
+#define GFX_OP_PIPE_CONTROL	((0x3<<29)|(0x3<<27)|(0x2<<24)|2)
+#define   PIPE_CONTROL_QW_WRITE	(1<<14)
+#define   PIPE_CONTROL_LRI_POST_OP (1<<23)
+
+
 static int command_parser_version(int fd)
 {
 	int version = -1;
@@ -51,12 +65,6 @@ static int command_parser_version(int fd)
 	return -1;
 }
 
-#define HSW_CS_GPR(n) (0x2600 + 8*(n))
-#define HSW_CS_GPR0 HSW_CS_GPR(0)
-#define HSW_CS_GPR1 HSW_CS_GPR(1)
-
-#define MI_LOAD_REGISTER_REG (0x2a << 23)
-#define MI_STORE_REGISTER_MEM (0x24 << 23)
 static void hsw_load_register_reg(void)
 {
 	uint32_t buf[16] = {
@@ -100,13 +108,13 @@ static void hsw_load_register_reg(void)
 	reloc[0].offset = 7*sizeof(uint32_t);
 	reloc[0].target_handle = obj[0].handle;
 	reloc[0].delta = 0;
-	reloc[0].read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-	reloc[0].write_domain = I915_GEM_DOMAIN_INSTRUCTION;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
 	reloc[1].offset = 13*sizeof(uint32_t);
 	reloc[1].target_handle = obj[0].handle;
 	reloc[1].delta = sizeof(uint32_t);
-	reloc[1].read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-	reloc[1].write_domain = I915_GEM_DOMAIN_INSTRUCTION;
+	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
 	obj[1].relocs_ptr = (uintptr_t)&reloc;
 	obj[1].relocation_count = 2;
 
@@ -152,7 +160,7 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			       int size, int patch_offset, uint64_t expected_value)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[2];
+	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_relocation_entry reloc[1];
 
 	uint32_t target_bo = gem_create(fd, 4096);
@@ -160,42 +168,24 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 
 	gem_write(fd, cmd_bo, 0, cmds, size);
 
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = target_bo;
+	obj[1].handle = cmd_bo;
+
+	memset(reloc, 0, sizeof(reloc));
 	reloc[0].offset = patch_offset;
+	reloc[0].target_handle = obj[0].handle;
 	reloc[0].delta = 0;
-	reloc[0].target_handle = target_bo;
-	reloc[0].read_domains = I915_GEM_DOMAIN_RENDER;
-	reloc[0].write_domain = I915_GEM_DOMAIN_RENDER;
-	reloc[0].presumed_offset = 0;
-
-	objs[0].handle = target_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
-
-	objs[1].handle = cmd_bo;
-	objs[1].relocation_count = 1;
-	objs[1].relocs_ptr = (uintptr_t)reloc;
-	objs[1].alignment = 0;
-	objs[1].offset = 0;
-	objs[1].flags = 0;
-	objs[1].rsvd1 = 0;
-	objs[1].rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
+	obj[1].relocs_ptr = (uintptr_t)reloc;
+	obj[1].relocation_count = 1;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 2;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = size;
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = I915_EXEC_RENDER;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	gem_execbuf(fd, &execbuf);
 	gem_sync(fd, cmd_bo);
@@ -210,30 +200,18 @@ static int __exec_batch(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			int size, int ring)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[1];
+	struct drm_i915_gem_exec_object2 obj[1];
 
 	gem_write(fd, cmd_bo, 0, cmds, size);
 
-	objs[0].handle = cmd_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = cmd_bo;
 
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = size;
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = ring;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	return __gem_execbuf(fd, &execbuf);
 }
@@ -244,7 +222,7 @@ static void exec_split_batch(int fd, uint32_t *cmds,
 			     int size, int ring, int expected_ret)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[1];
+	struct drm_i915_gem_exec_object2 obj[1];
 	uint32_t cmd_bo;
 	uint32_t noop[1024] = { 0 };
 	const int alloc_size = 4096 * 2;
@@ -261,16 +239,11 @@ static void exec_split_batch(int fd, uint32_t *cmds,
 	 */
 	gem_write(fd, cmd_bo, actual_start_offset, cmds, size);
 
-	objs[0].handle = cmd_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = cmd_bo;
 
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 1;
 	/* NB: We want batch_start_offset and batch_len to point to the block
 	 * of the actual commands (i.e. at the last dword of the first page),
@@ -281,13 +254,7 @@ static void exec_split_batch(int fd, uint32_t *cmds,
 	execbuf.batch_len =
 		ALIGN(size + actual_start_offset - execbuf.batch_start_offset,
 		      0x8);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = ring;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	igt_assert_eq(__gem_execbuf(fd, &execbuf), expected_ret);
 
@@ -300,8 +267,8 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			       uint64_t expected_value)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[3];
-	struct drm_i915_gem_relocation_entry reloc;
+	struct drm_i915_gem_exec_object2 obj[3];
+	struct drm_i915_gem_relocation_entry reloc[1];
 	struct drm_i915_gem_relocation_entry first_level_reloc;
 
 	uint32_t target_bo = gem_create(fd, 4096);
@@ -322,58 +289,34 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
 		  first_level_cmds, sizeof(first_level_cmds));
 	gem_write(fd, cmd_bo, 0, cmds, size);
 
-	reloc.offset = patch_offset;
-	reloc.delta = 0;
-	reloc.target_handle = target_bo;
-	reloc.read_domains = I915_GEM_DOMAIN_RENDER;
-	reloc.write_domain = I915_GEM_DOMAIN_RENDER;
-	reloc.presumed_offset = 0;
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = target_bo;
+	obj[1].handle = cmd_bo;
+	obj[2].handle = first_level_bo;
+
+	memset(reloc, 0, sizeof(reloc));
+	reloc[0].offset = patch_offset;
+	reloc[0].delta = 0;
+	reloc[0].target_handle = target_bo;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
+	obj[1].relocation_count = 1;
+	obj[1].relocs_ptr = (uintptr_t)&reloc;
 
+	memset(&first_level_reloc, 0, sizeof(first_level_reloc));
 	first_level_reloc.offset = 4;
 	first_level_reloc.delta = 0;
 	first_level_reloc.target_handle = cmd_bo;
-	first_level_reloc.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
+	first_level_reloc.read_domains = I915_GEM_DOMAIN_COMMAND;
 	first_level_reloc.write_domain = 0;
-	first_level_reloc.presumed_offset = 0;
-
-	objs[0].handle = target_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
-
-	objs[1].handle = cmd_bo;
-	objs[1].relocation_count = 1;
-	objs[1].relocs_ptr = (uintptr_t)&reloc;
-	objs[1].alignment = 0;
-	objs[1].offset = 0;
-	objs[1].flags = 0;
-	objs[1].rsvd1 = 0;
-	objs[1].rsvd2 = 0;
-
-	objs[2].handle = first_level_bo;
-	objs[2].relocation_count = 1;
-	objs[2].relocs_ptr = (uintptr_t)&first_level_reloc;
-	objs[2].alignment = 0;
-	objs[2].offset = 0;
-	objs[2].flags = 0;
-	objs[2].rsvd1 = 0;
-	objs[2].rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	obj[2].relocation_count = 1;
+	obj[2].relocs_ptr = (uintptr_t)&first_level_reloc;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 3;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = sizeof(first_level_cmds);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = I915_EXEC_RENDER;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	gem_execbuf(fd, &execbuf);
 	gem_sync(fd, cmd_bo);
@@ -441,15 +384,8 @@ static void test_allocations(int fd)
 	}
 }
 
-uint32_t handle;
-int fd;
-
-#define MI_ARB_ON_OFF (0x8 << 23)
-#define MI_DISPLAY_FLIP ((0x14 << 23) | 1)
-
-#define GFX_OP_PIPE_CONTROL	((0x3<<29)|(0x3<<27)|(0x2<<24)|2)
-#define   PIPE_CONTROL_QW_WRITE	(1<<14)
-#define   PIPE_CONTROL_LRI_POST_OP (1<<23)
+static uint32_t handle;
+static int fd;
 
 igt_main
 {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 03/13] igt/gem_exec_parse: move hsw_load_register_reg down
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 01/13] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 02/13] igt/gem_exec_parse: some minor cleanups Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 04/13] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

No functional change, just moving hsw_load_regster_reg test code down
below the execbuf utilities in preparation for updating to use them.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 182 ++++++++++++++++++++++++-------------------------
 1 file changed, 91 insertions(+), 91 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index c81bbc7..7a9dc82 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -65,97 +65,6 @@ static int command_parser_version(int fd)
 	return -1;
 }
 
-static void hsw_load_register_reg(void)
-{
-	uint32_t buf[16] = {
-		MI_LOAD_REGISTER_IMM | (5 - 2),
-		HSW_CS_GPR0,
-		0xabcdabcd,
-		HSW_CS_GPR1,
-		0xdeadbeef,
-
-		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		0, /* address0 */
-
-		MI_LOAD_REGISTER_REG | (3 - 2),
-		HSW_CS_GPR0,
-		HSW_CS_GPR1,
-
-		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		4, /* address1 */
-
-		MI_BATCH_BUFFER_END,
-	};
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 obj[2];
-	struct drm_i915_gem_relocation_entry reloc[2];
-	int fd;
-
-	/* Open again to get a non-master file descriptor */
-	fd = drm_open_driver(DRIVER_INTEL);
-
-	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
-	igt_require(command_parser_version(fd) >= 7);
-
-	memset(obj, 0, sizeof(obj));
-	obj[0].handle = gem_create(fd, 4096);
-	obj[1].handle = gem_create(fd, 4096);
-	gem_write(fd, obj[1].handle, 0, buf, sizeof(buf));
-
-	memset(reloc, 0, sizeof(reloc));
-	reloc[0].offset = 7*sizeof(uint32_t);
-	reloc[0].target_handle = obj[0].handle;
-	reloc[0].delta = 0;
-	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].offset = 13*sizeof(uint32_t);
-	reloc[1].target_handle = obj[0].handle;
-	reloc[1].delta = sizeof(uint32_t);
-	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
-	obj[1].relocs_ptr = (uintptr_t)&reloc;
-	obj[1].relocation_count = 2;
-
-	memset(&execbuf, 0, sizeof(execbuf));
-	execbuf.buffers_ptr = (uintptr_t)obj;
-	execbuf.buffer_count = 2;
-	execbuf.batch_len = sizeof(buf);
-	execbuf.flags = I915_EXEC_RENDER;
-	gem_execbuf(fd, &execbuf);
-	gem_close(fd, obj[1].handle);
-
-	gem_read(fd, obj[0].handle, 0, buf, 2*sizeof(buf[0]));
-	igt_assert_eq_u32(buf[0], 0xdeadbeef); /* before copy */
-	igt_assert_eq_u32(buf[1], 0xabcdabcd); /* after copy */
-
-	/* Now a couple of negative tests that should be filtered */
-	execbuf.buffer_count = 1;
-	execbuf.batch_len = 4*sizeof(buf[0]);
-
-	buf[0] = MI_LOAD_REGISTER_REG | (3 - 2);
-	buf[1] = HSW_CS_GPR0;
-	buf[2] = 0;
-	buf[3] = MI_BATCH_BUFFER_END;
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = OACONTROL; /* filtered */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = DERRMR; /* master only */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = 0x2038; /* RING_START: invalid */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	close(fd);
-}
-
 static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			       int size, int patch_offset, uint64_t expected_value)
 {
@@ -384,6 +293,97 @@ static void test_allocations(int fd)
 	}
 }
 
+static void hsw_load_register_reg(void)
+{
+	uint32_t buf[16] = {
+		MI_LOAD_REGISTER_IMM | (5 - 2),
+		HSW_CS_GPR0,
+		0xabcdabcd,
+		HSW_CS_GPR1,
+		0xdeadbeef,
+
+		MI_STORE_REGISTER_MEM | (3 - 2),
+		HSW_CS_GPR1,
+		0, /* address0 */
+
+		MI_LOAD_REGISTER_REG | (3 - 2),
+		HSW_CS_GPR0,
+		HSW_CS_GPR1,
+
+		MI_STORE_REGISTER_MEM | (3 - 2),
+		HSW_CS_GPR1,
+		4, /* address1 */
+
+		MI_BATCH_BUFFER_END,
+	};
+	struct drm_i915_gem_execbuffer2 execbuf;
+	struct drm_i915_gem_exec_object2 obj[2];
+	struct drm_i915_gem_relocation_entry reloc[2];
+	int fd;
+
+	/* Open again to get a non-master file descriptor */
+	fd = drm_open_driver(DRIVER_INTEL);
+
+	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
+	igt_require(command_parser_version(fd) >= 7);
+
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = gem_create(fd, 4096);
+	obj[1].handle = gem_create(fd, 4096);
+	gem_write(fd, obj[1].handle, 0, buf, sizeof(buf));
+
+	memset(reloc, 0, sizeof(reloc));
+	reloc[0].offset = 7*sizeof(uint32_t);
+	reloc[0].target_handle = obj[0].handle;
+	reloc[0].delta = 0;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
+	reloc[1].offset = 13*sizeof(uint32_t);
+	reloc[1].target_handle = obj[0].handle;
+	reloc[1].delta = sizeof(uint32_t);
+	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
+	obj[1].relocs_ptr = (uintptr_t)&reloc;
+	obj[1].relocation_count = 2;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
+	execbuf.buffer_count = 2;
+	execbuf.batch_len = sizeof(buf);
+	execbuf.flags = I915_EXEC_RENDER;
+	gem_execbuf(fd, &execbuf);
+	gem_close(fd, obj[1].handle);
+
+	gem_read(fd, obj[0].handle, 0, buf, 2*sizeof(buf[0]));
+	igt_assert_eq_u32(buf[0], 0xdeadbeef); /* before copy */
+	igt_assert_eq_u32(buf[1], 0xabcdabcd); /* after copy */
+
+	/* Now a couple of negative tests that should be filtered */
+	execbuf.buffer_count = 1;
+	execbuf.batch_len = 4*sizeof(buf[0]);
+
+	buf[0] = MI_LOAD_REGISTER_REG | (3 - 2);
+	buf[1] = HSW_CS_GPR0;
+	buf[2] = 0;
+	buf[3] = MI_BATCH_BUFFER_END;
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	buf[2] = OACONTROL; /* filtered */
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	buf[2] = DERRMR; /* master only */
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	buf[2] = 0x2038; /* RING_START: invalid */
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	close(fd);
+}
+
 static uint32_t handle;
 static int fd;
 
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 04/13] igt/gem_exec_parse: update hsw_load_register_reg
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (2 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 03/13] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 05/13] igt/gem_exec_parse: make global vars local to main() Robert Bragg
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This generalises hsw_load_register_reg to loop through an array of
allowed and disallowed registers and to use the exec_batch[_patched]
utilities.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 137 +++++++++++++++++++++++--------------------------
 1 file changed, 64 insertions(+), 73 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 7a9dc82..dcf39a2 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -34,8 +34,9 @@
 #define I915_PARAM_CMD_PARSER_VERSION       28
 #endif
 
-#define OACONTROL 0x2360
 #define DERRMR 0x44050
+#define OACONTROL 0x2360
+#define SO_WRITE_OFFSET_0 0x5280
 
 #define HSW_CS_GPR(n) (0x2600 + 8*(n))
 #define HSW_CS_GPR0 HSW_CS_GPR(0)
@@ -65,8 +66,8 @@ static int command_parser_version(int fd)
 	return -1;
 }
 
-static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
-			       int size, int patch_offset, uint64_t expected_value)
+static uint64_t __exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
+				     int size, int patch_offset)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
 	struct drm_i915_gem_exec_object2 obj[2];
@@ -100,9 +101,19 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 	gem_sync(fd, cmd_bo);
 
 	gem_read(fd,target_bo, 0, &actual_value, sizeof(actual_value));
-	igt_assert_eq(expected_value, actual_value);
 
 	gem_close(fd, target_bo);
+
+	return actual_value;
+}
+
+static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
+			       int size, int patch_offset,
+			       uint64_t expected_value)
+{
+	igt_assert_eq(__exec_batch_patched(fd, cmd_bo, cmds,
+					   size, patch_offset),
+		      expected_value);
 }
 
 static int __exec_batch(int fd, uint32_t cmd_bo, uint32_t *cmds,
@@ -295,31 +306,36 @@ static void test_allocations(int fd)
 
 static void hsw_load_register_reg(void)
 {
-	uint32_t buf[16] = {
-		MI_LOAD_REGISTER_IMM | (5 - 2),
+	uint32_t init_gpr0[16] = {
+		MI_LOAD_REGISTER_IMM | (3 - 2),
 		HSW_CS_GPR0,
-		0xabcdabcd,
-		HSW_CS_GPR1,
-		0xdeadbeef,
-
+		0xabcdabc0, /* leave [1:0] zero */
+		MI_BATCH_BUFFER_END,
+	};
+	uint32_t store_gpr0[16] = {
 		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		0, /* address0 */
-
-		MI_LOAD_REGISTER_REG | (3 - 2),
 		HSW_CS_GPR0,
-		HSW_CS_GPR1,
-
-		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		4, /* address1 */
-
+		0, /* reloc*/
 		MI_BATCH_BUFFER_END,
 	};
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 obj[2];
-	struct drm_i915_gem_relocation_entry reloc[2];
+	uint32_t do_lrr[16] = {
+		MI_LOAD_REGISTER_REG | (3 - 2),
+		0, /* [1] = src */
+		HSW_CS_GPR0, /* dst */
+		MI_BATCH_BUFFER_END,
+	};
+	uint32_t allowed_regs[] = {
+		HSW_CS_GPR1,
+		SO_WRITE_OFFSET_0,
+	};
+	uint32_t disallowed_regs[] = {
+		0,
+		OACONTROL, /* filtered */
+		DERRMR, /* master only */
+		0x2038, /* RING_START: invalid */
+	};
 	int fd;
+	uint32_t handle;
 
 	/* Open again to get a non-master file descriptor */
 	fd = drm_open_driver(DRIVER_INTEL);
@@ -327,59 +343,34 @@ static void hsw_load_register_reg(void)
 	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
 	igt_require(command_parser_version(fd) >= 7);
 
-	memset(obj, 0, sizeof(obj));
-	obj[0].handle = gem_create(fd, 4096);
-	obj[1].handle = gem_create(fd, 4096);
-	gem_write(fd, obj[1].handle, 0, buf, sizeof(buf));
+	handle = gem_create(fd, 4096);
 
-	memset(reloc, 0, sizeof(reloc));
-	reloc[0].offset = 7*sizeof(uint32_t);
-	reloc[0].target_handle = obj[0].handle;
-	reloc[0].delta = 0;
-	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].offset = 13*sizeof(uint32_t);
-	reloc[1].target_handle = obj[0].handle;
-	reloc[1].delta = sizeof(uint32_t);
-	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
-	obj[1].relocs_ptr = (uintptr_t)&reloc;
-	obj[1].relocation_count = 2;
+	for (int i = 0 ; i < ARRAY_SIZE(allowed_regs); i++) {
+		uint32_t var;
 
-	memset(&execbuf, 0, sizeof(execbuf));
-	execbuf.buffers_ptr = (uintptr_t)obj;
-	execbuf.buffer_count = 2;
-	execbuf.batch_len = sizeof(buf);
-	execbuf.flags = I915_EXEC_RENDER;
-	gem_execbuf(fd, &execbuf);
-	gem_close(fd, obj[1].handle);
-
-	gem_read(fd, obj[0].handle, 0, buf, 2*sizeof(buf[0]));
-	igt_assert_eq_u32(buf[0], 0xdeadbeef); /* before copy */
-	igt_assert_eq_u32(buf[1], 0xabcdabcd); /* after copy */
+		exec_batch(fd, handle, init_gpr0, sizeof(init_gpr0),
+			   I915_EXEC_RENDER,
+			   0);
+		exec_batch_patched(fd, handle,
+				   store_gpr0, sizeof(store_gpr0),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xabcdabc0);
+		do_lrr[1] = allowed_regs[i];
+		exec_batch(fd, handle, do_lrr, sizeof(do_lrr),
+			   I915_EXEC_RENDER,
+			   0);
+		var = __exec_batch_patched(fd, handle,
+					   store_gpr0, sizeof(store_gpr0),
+					   2 * sizeof(uint32_t)); /* reloc */
+		igt_assert_neq(var, 0xabcdabc0);
+	}
 
-	/* Now a couple of negative tests that should be filtered */
-	execbuf.buffer_count = 1;
-	execbuf.batch_len = 4*sizeof(buf[0]);
-
-	buf[0] = MI_LOAD_REGISTER_REG | (3 - 2);
-	buf[1] = HSW_CS_GPR0;
-	buf[2] = 0;
-	buf[3] = MI_BATCH_BUFFER_END;
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = OACONTROL; /* filtered */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = DERRMR; /* master only */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = 0x2038; /* RING_START: invalid */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+	for (int i = 0 ; i < ARRAY_SIZE(disallowed_regs); i++) {
+		do_lrr[1] = disallowed_regs[i];
+		exec_batch(fd, handle, do_lrr, sizeof(do_lrr),
+			   I915_EXEC_RENDER,
+			   -EINVAL);
+	}
 
 	close(fd);
 }
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 05/13] igt/gem_exec_parse: make global vars local to main()
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (3 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 04/13] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 21:33   ` Matthew Auld
  2016-11-14 20:51 ` [PATCH igt v4 06/13] igt/gem_exec_parse: init global parser_version in fixture Robert Bragg
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index dcf39a2..ebcd092 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -375,11 +375,11 @@ static void hsw_load_register_reg(void)
 	close(fd);
 }
 
-static uint32_t handle;
-static int fd;
-
 igt_main
 {
+	uint32_t handle;
+	int fd;
+
 	igt_fixture {
 		int parser_version = 0;
                 drm_i915_getparam_t gp;
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 06/13] igt/gem_exec_parse: init global parser_version in fixture
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (4 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 05/13] igt/gem_exec_parse: make global vars local to main() Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 21:33   ` Matthew Auld
  2016-11-14 20:51 ` [PATCH igt v4 07/13] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This adds a static global int parser_version that can be referenced by
all subtests without needing multiple GETPARAM requests.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index ebcd092..9e57ff5 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -51,6 +51,7 @@
 #define   PIPE_CONTROL_QW_WRITE	(1<<14)
 #define   PIPE_CONTROL_LRI_POST_OP (1<<23)
 
+static int parser_version;
 
 static int command_parser_version(int fd)
 {
@@ -341,7 +342,7 @@ static void hsw_load_register_reg(void)
 	fd = drm_open_driver(DRIVER_INTEL);
 
 	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
-	igt_require(command_parser_version(fd) >= 7);
+	igt_require(parser_version >= 7);
 
 	handle = gem_create(fd, 4096);
 
@@ -381,16 +382,10 @@ igt_main
 	int fd;
 
 	igt_fixture {
-		int parser_version = 0;
-                drm_i915_getparam_t gp;
-		int rc;
-
 		fd = drm_open_driver(DRIVER_INTEL);
 
-		gp.param = I915_PARAM_CMD_PARSER_VERSION;
-		gp.value = &parser_version;
-		rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
-		igt_require(!rc && parser_version > 0);
+		parser_version = command_parser_version(fd);
+		igt_require(parser_version != -1);
 
 		igt_require(gem_uses_ppgtt(fd));
 
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 07/13] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (5 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 06/13] igt/gem_exec_parse: init global parser_version in fixture Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 08/13] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This limits testing the oacontrol tracking (required pairing of oa
enable/disable per batch buffer) to version <= 8 of the command parser.

Version 9 of the command parser removes all special handling for
OACONTROL which is now going to be managed by i915-perf and not
programmed from userspace.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 9e57ff5..6dc9998 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -551,6 +551,9 @@ igt_main
 			0x31337000,
 			MI_BATCH_BUFFER_END,
 		};
+
+		igt_require(parser_version < 9);
+
 		exec_batch(fd, handle,
 			   lri_ok, sizeof(lri_ok),
 			   I915_EXEC_RENDER,
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 08/13] igt/gem_exec_parse: make basic-rejected version agnostic
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (6 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 07/13] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 09/13] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This adapts the basic-rejected test to focus on invalid commands that
will result in an EINVAL errno being returned to userspace even with the
upcoming version 8 parser change to stop reporting access violations as
EINVAL errors.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 6dc9998..03e4d0e 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -42,6 +42,10 @@
 #define HSW_CS_GPR0 HSW_CS_GPR(0)
 #define HSW_CS_GPR1 HSW_CS_GPR(1)
 
+/* To help craft commands known to be invalid across all engines */
+#define INSTR_CLIENT_SHIFT	29
+#define   INSTR_INVALID_CLIENT  0x7
+
 #define MI_LOAD_REGISTER_REG (0x2a << 23)
 #define MI_STORE_REGISTER_MEM (0x24 << 23)
 #define MI_ARB_ON_OFF (0x8 << 23)
@@ -411,33 +415,38 @@ igt_main
 	}
 
 	igt_subtest("basic-rejected") {
-		uint32_t arb_on_off[] = {
-			MI_ARB_ON_OFF,
+		uint32_t invalid_cmd[] = {
+			INSTR_INVALID_CLIENT << INSTR_CLIENT_SHIFT,
 			MI_BATCH_BUFFER_END,
 		};
-		uint32_t display_flip[] = {
-			MI_DISPLAY_FLIP,
-			0, 0, 0,
+		uint32_t invalid_set_context[] = {
+			MI_SET_CONTEXT | 32, /* invalid length */
 			MI_BATCH_BUFFER_END,
-			0
 		};
 		exec_batch(fd, handle,
-			   arb_on_off, sizeof(arb_on_off),
+			   invalid_cmd, sizeof(invalid_cmd),
 			   I915_EXEC_RENDER,
 			   -EINVAL);
 		exec_batch(fd, handle,
-			   arb_on_off, sizeof(arb_on_off),
+			   invalid_cmd, sizeof(invalid_cmd),
 			   I915_EXEC_BSD,
 			   -EINVAL);
+		if (gem_has_blt(fd)) {
+			exec_batch(fd, handle,
+				   invalid_cmd, sizeof(invalid_cmd),
+				   I915_EXEC_BLT,
+				   -EINVAL);
+		}
 		if (gem_has_vebox(fd)) {
 			exec_batch(fd, handle,
-				   arb_on_off, sizeof(arb_on_off),
+				   invalid_cmd, sizeof(invalid_cmd),
 				   I915_EXEC_VEBOX,
 				   -EINVAL);
 		}
+
 		exec_batch(fd, handle,
-			   display_flip, sizeof(display_flip),
-			   I915_EXEC_BLT,
+			   invalid_set_context, sizeof(invalid_set_context),
+			   I915_EXEC_RENDER,
 			   -EINVAL);
 	}
 
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 09/13] igt/gem_exec_parse: update bitmasks test for v >=8
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (7 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 08/13] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 10/13] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

With v8 of the command parser (where we won't get an EINVAL for an
access violation) this updates the bitmasks test to explicitly confirm
that the command became a NOOP by reading back from where the QW_WRITE
would have otherwise landed.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 03e4d0e..d5769e4 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -504,10 +504,20 @@ igt_main
 			0,
 			MI_BATCH_BUFFER_END,
 		};
-		exec_batch(fd, handle,
-			   pc, sizeof(pc),
-			   I915_EXEC_RENDER,
-			   -EINVAL);
+		if (parser_version >= 8) {
+			/* Expect to read back zero since the command should be
+			 * squashed to a NOOP
+			 */
+			exec_batch_patched(fd, handle,
+					   pc, sizeof(pc),
+					   8, /* patch offset, */
+					   0x0);
+		} else {
+			exec_batch(fd, handle,
+				   pc, sizeof(pc),
+				   I915_EXEC_RENDER,
+				   -EINVAL);
+		}
 	}
 
 	igt_subtest("batch-without-end") {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 10/13] igt/gem_exec_parse: update cmd-crossing-page for >= v8
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (8 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 09/13] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 11/13] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

Since an access violation won't return an error to userspace for v >= 8
of the command parser this updates the cmd-crossing-page test to
explicitly read back from SO_WRITE_OFFSET[0] to see that the command
wasn't squashed to a NOOP.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index d5769e4..5c67e12 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -531,14 +531,25 @@ igt_main
 	igt_subtest("cmd-crossing-page") {
 		uint32_t lri_ok[] = {
 			MI_LOAD_REGISTER_IMM,
-			0x5280, /* allowed register address (SO_WRITE_OFFSET[0]) */
-			0x1,
+			SO_WRITE_OFFSET_0, /* allowed register address */
+			0xdcbaabc0, /* [1:0] MBZ */
+			MI_BATCH_BUFFER_END,
+		};
+		uint32_t store_reg[] = {
+			MI_STORE_REGISTER_MEM | (3 - 2),
+			SO_WRITE_OFFSET_0,
+			0, /* reloc */
 			MI_BATCH_BUFFER_END,
 		};
 		exec_split_batch(fd,
 				 lri_ok, sizeof(lri_ok),
 				 I915_EXEC_RENDER,
 				 0);
+		exec_batch_patched(fd, handle,
+				   store_reg,
+				   sizeof(store_reg),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xdcbaabc0);
 	}
 
 	igt_subtest("oacontrol-tracking") {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 11/13] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (9 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 10/13] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test " Robert Bragg
  2016-11-14 20:51 ` [PATCH igt v4 13/13] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This updates the checking of disallowed loads to set a distinguishable
value before the load and explicitly check the load was a NOOP by
reading back the final value.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 5c67e12..2bc6340 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -341,6 +341,7 @@ static void hsw_load_register_reg(void)
 	};
 	int fd;
 	uint32_t handle;
+	int bad_lrr_errno = parser_version >= 8 ? 0 : -EINVAL;
 
 	/* Open again to get a non-master file descriptor */
 	fd = drm_open_driver(DRIVER_INTEL);
@@ -371,10 +372,21 @@ static void hsw_load_register_reg(void)
 	}
 
 	for (int i = 0 ; i < ARRAY_SIZE(disallowed_regs); i++) {
+		exec_batch(fd, handle, init_gpr0, sizeof(init_gpr0),
+			   I915_EXEC_RENDER,
+			   0);
+		exec_batch_patched(fd, handle,
+				   store_gpr0, sizeof(store_gpr0),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xabcdabc0);
 		do_lrr[1] = disallowed_regs[i];
 		exec_batch(fd, handle, do_lrr, sizeof(do_lrr),
 			   I915_EXEC_RENDER,
-			   -EINVAL);
+			   bad_lrr_errno);
+		exec_batch_patched(fd, handle,
+				   store_gpr0, sizeof(store_gpr0),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xabcdabc0);
 	}
 
 	close(fd);
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test for v >= 8
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (10 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 11/13] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  2016-11-22 15:45   ` Chris Wilson
  2016-11-14 20:51 ` [PATCH igt v4 13/13] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg
  12 siblings, 1 reply; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

This combines some parts of the recently added store_lri test with the
registers test to be able to first load a distinguishable value before
the LRI and explicitly read back the register to determine if the
command succeeded or was a NOOP.

For now though we won't look at OACONTROL without checking for version 9
of the command parser.

This updates the 'bad' test to check the OASTATUS2 register so that we
can explicitly read back from the register to check it becomes a NOOP.

This adds a struct test_lri for associating a mask with the init/test
values so we ignore things like hw status bits that might interfere
with the result.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 94 ++++++++++++++++++++++++++------------------------
 1 file changed, 49 insertions(+), 45 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 2bc6340..43f25ce 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -35,6 +35,7 @@
 #endif
 
 #define DERRMR 0x44050
+#define OASTATUS2 0x2368
 #define OACONTROL 0x2360
 #define SO_WRITE_OFFSET_0 0x5280
 
@@ -253,27 +254,35 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
 	gem_close(fd, target_bo);
 }
 
-static void stray_lri(int fd, uint32_t handle)
+/* Be careful to take into account what register bits we can store and read
+ * from...
+ */
+struct test_lri {
+	uint32_t reg, read_mask, init_val, test_val;
+};
+
+static void
+test_lri(int fd, uint32_t handle,
+	 struct test_lri *test, int expected_errno, uint32_t expect)
 {
-	/* Ideally this would test all once whitelisted registers */
 	uint32_t lri[] = {
 		MI_LOAD_REGISTER_IMM,
-		OACONTROL,
-		0x31337000,
+		test->reg,
+		test->test_val,
 		MI_BATCH_BUFFER_END,
 	};
-	int err;
 
-	igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
+	intel_register_write(test->reg, test->init_val);
 
-	err = __exec_batch(fd, handle, lri, sizeof(lri), I915_EXEC_RENDER);
-	if (err == -EINVAL)
-		return;
-
-	igt_assert_eq(err, 0);
+	exec_batch(fd, handle,
+		   lri, sizeof(lri),
+		   I915_EXEC_RENDER,
+		   expected_errno);
 	gem_sync(fd, handle);
 
-	igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
+	igt_assert_eq_u32((intel_register_read(test->reg) &
+			   test->read_mask),
+			  expect);
 }
 
 static void test_allocations(int fd)
@@ -462,50 +471,45 @@ igt_main
 			   -EINVAL);
 	}
 
+	igt_subtest("basic-allocation") {
+		test_allocations(fd);
+	}
+
 	igt_subtest_group {
 		igt_fixture {
 			intel_register_access_init(intel_get_pci_device(), 0);
-
-			intel_register_write(OACONTROL, 0xdeadbeef);
-			igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
 		}
 
-		igt_subtest("basic-stray-lri")
-			stray_lri(fd, handle);
+		igt_subtest("registers") {
+			struct test_lri bad_lris[] = {
+				/* dummy head pointer */
+				{ OASTATUS2, 0xffffff80, 0xdeadf000, 0xbeeff000 }
+			};
+			struct test_lri ok_lris[] = {
+				/* NB: [1:0] MBZ */
+				{ SO_WRITE_OFFSET_0, 0xfffffffc,
+				  0xabcdabc0, 0xbeefbee0 }
+			};
+			int bad_lri_errno = parser_version >= 8 ? 0 : -EINVAL;
+
+			for (int i = 0; i < ARRAY_SIZE(ok_lris); i++) {
+				test_lri(fd, handle,
+					 ok_lris + i, 0,
+					 ok_lris[i].test_val);
+			}
+
+			for (int i = 0; i < ARRAY_SIZE(bad_lris); i++) {
+				test_lri(fd, handle,
+					 bad_lris + i, bad_lri_errno,
+					 bad_lris[i].init_val);
+			}
+		}
 
 		igt_fixture {
-			intel_register_write(OACONTROL, 0);
 			intel_register_access_fini();
 		}
 	}
 
-	igt_subtest("basic-allocation") {
-		test_allocations(fd);
-	}
-
-	igt_subtest("registers") {
-		uint32_t lri_bad[] = {
-			MI_LOAD_REGISTER_IMM,
-			0, /* disallowed register address */
-			0x12000000,
-			MI_BATCH_BUFFER_END,
-		};
-		uint32_t lri_ok[] = {
-			MI_LOAD_REGISTER_IMM,
-			0x5280, /* allowed register address (SO_WRITE_OFFSET[0]) */
-			0x1,
-			MI_BATCH_BUFFER_END,
-		};
-		exec_batch(fd, handle,
-			   lri_bad, sizeof(lri_bad),
-			   I915_EXEC_RENDER,
-			   -EINVAL);
-		exec_batch(fd, handle,
-			   lri_ok, sizeof(lri_ok),
-			   I915_EXEC_RENDER,
-			   0);
-	}
-
 	igt_subtest("bitmasks") {
 		uint32_t pc[] = {
 			GFX_OP_PIPE_CONTROL,
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH igt v4 13/13] igt/gem_exec_parse: check oacontrol lri bad for >= v9
  2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
                   ` (11 preceding siblings ...)
  2016-11-14 20:51 ` [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test " Robert Bragg
@ 2016-11-14 20:51 ` Robert Bragg
  12 siblings, 0 replies; 22+ messages in thread
From: Robert Bragg @ 2016-11-14 20:51 UTC (permalink / raw)
  To: intel-gfx

OACONTROL is no longer white listed in the command parser so this checks
at attempted LRI will be disallowed and (more importantly) checks that
userspace doesn't get an EINVAL error for an attempted OACONTROL LRI.
This is important becase Mesa application attempt OACONTROL LRIs while
initializing and will abort for any execbuf error.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 tests/gem_exec_parse.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 43f25ce..cc2103a 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -485,6 +485,22 @@ igt_main
 				/* dummy head pointer */
 				{ OASTATUS2, 0xffffff80, 0xdeadf000, 0xbeeff000 }
 			};
+			struct test_lri v9_bad_lris[] = {
+				/* It's really important for us to check that
+				 * an LRI to OACONTROL doesn't result in an
+				 * EINVAL error because Mesa attempts writing
+				 * to OACONTROL to determine what extensions to
+				 * expose and will abort() for execbuffer()
+				 * errors.
+				 *
+				 * Mesa can gracefully recognise and handle the
+				 * LRI becoming a NOOP.
+				 *
+				 * The test values represent dummy context IDs
+				 * while leaving the OA unit disabled
+				 */
+				{ OACONTROL, 0xfffff000, 0xfeed0000, 0x31337000 }
+			};
 			struct test_lri ok_lris[] = {
 				/* NB: [1:0] MBZ */
 				{ SO_WRITE_OFFSET_0, 0xfffffffc,
@@ -503,6 +519,15 @@ igt_main
 					 bad_lris + i, bad_lri_errno,
 					 bad_lris[i].init_val);
 			}
+
+			if (parser_version >= 9) {
+				for (int i = 0; i < ARRAY_SIZE(v9_bad_lris); i++) {
+					test_lri(fd, handle,
+						 v9_bad_lris + i,
+						 0,
+						 v9_bad_lris[i].init_val);
+				}
+			}
 		}
 
 		igt_fixture {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH igt v4 05/13] igt/gem_exec_parse: make global vars local to main()
  2016-11-14 20:51 ` [PATCH igt v4 05/13] igt/gem_exec_parse: make global vars local to main() Robert Bragg
@ 2016-11-14 21:33   ` Matthew Auld
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2016-11-14 21:33 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 14 November 2016 at 20:51, Robert Bragg <robert@sixbynine.org> wrote:
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH igt v4 06/13] igt/gem_exec_parse: init global parser_version in fixture
  2016-11-14 20:51 ` [PATCH igt v4 06/13] igt/gem_exec_parse: init global parser_version in fixture Robert Bragg
@ 2016-11-14 21:33   ` Matthew Auld
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2016-11-14 21:33 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 14 November 2016 at 20:51, Robert Bragg <robert@sixbynine.org> wrote:
> This adds a static global int parser_version that can be referenced by
> all subtests without needing multiple GETPARAM requests.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test for v >= 8
  2016-11-14 20:51 ` [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test " Robert Bragg
@ 2016-11-22 15:45   ` Chris Wilson
  2016-11-22 16:06     ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2016-11-22 15:45 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

On Mon, Nov 14, 2016 at 08:51:21PM +0000, Robert Bragg wrote:
> This combines some parts of the recently added store_lri test with the
> registers test to be able to first load a distinguishable value before
> the LRI and explicitly read back the register to determine if the
> command succeeded or was a NOOP.
> 
> For now though we won't look at OACONTROL without checking for version 9
> of the command parser.
> 
> This updates the 'bad' test to check the OASTATUS2 register so that we
> can explicitly read back from the register to check it becomes a NOOP.
> 
> This adds a struct test_lri for associating a mask with the init/test
> values so we ignore things like hw status bits that might interfere
> with the result.
> 
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  tests/gem_exec_parse.c | 94 ++++++++++++++++++++++++++------------------------
>  1 file changed, 49 insertions(+), 45 deletions(-)
> 
> diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> index 2bc6340..43f25ce 100644
> --- a/tests/gem_exec_parse.c
> +++ b/tests/gem_exec_parse.c
> @@ -35,6 +35,7 @@
>  #endif
>  
>  #define DERRMR 0x44050
> +#define OASTATUS2 0x2368
>  #define OACONTROL 0x2360
>  #define SO_WRITE_OFFSET_0 0x5280
>  
> @@ -253,27 +254,35 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
>  	gem_close(fd, target_bo);
>  }
>  
> -static void stray_lri(int fd, uint32_t handle)

Ahem. Why did you remove that agnostic test without replacing it?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test for v >= 8
  2016-11-22 15:45   ` Chris Wilson
@ 2016-11-22 16:06     ` Chris Wilson
  2016-11-22 16:50       ` [PATCH igt] igt/gem_exec_parse: test_lri check init + add debug msg Robert Bragg
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2016-11-22 16:06 UTC (permalink / raw)
  To: Robert Bragg, intel-gfx

On Tue, Nov 22, 2016 at 03:45:59PM +0000, Chris Wilson wrote:
> On Mon, Nov 14, 2016 at 08:51:21PM +0000, Robert Bragg wrote:
> > This combines some parts of the recently added store_lri test with the
> > registers test to be able to first load a distinguishable value before
> > the LRI and explicitly read back the register to determine if the
> > command succeeded or was a NOOP.
> > 
> > For now though we won't look at OACONTROL without checking for version 9
> > of the command parser.
> > 
> > This updates the 'bad' test to check the OASTATUS2 register so that we
> > can explicitly read back from the register to check it becomes a NOOP.
> > 
> > This adds a struct test_lri for associating a mask with the init/test
> > values so we ignore things like hw status bits that might interfere
> > with the result.
> > 
> > Signed-off-by: Robert Bragg <robert@sixbynine.org>
> > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> > ---
> >  tests/gem_exec_parse.c | 94 ++++++++++++++++++++++++++------------------------
> >  1 file changed, 49 insertions(+), 45 deletions(-)
> > 
> > diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> > index 2bc6340..43f25ce 100644
> > --- a/tests/gem_exec_parse.c
> > +++ b/tests/gem_exec_parse.c
> > @@ -35,6 +35,7 @@
> >  #endif
> >  
> >  #define DERRMR 0x44050
> > +#define OASTATUS2 0x2368
> >  #define OACONTROL 0x2360
> >  #define SO_WRITE_OFFSET_0 0x5280
> >  
> > @@ -253,27 +254,35 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
> >  	gem_close(fd, target_bo);
> >  }
> >  
> > -static void stray_lri(int fd, uint32_t handle)
> 
> Ahem. Why did you remove that agnostic test without replacing it?

In a later patch, that was annoying. But it is not being tested for the
rejection. test_lri() doesn't check the register write took (always
useful to test the setup first), nor does it provide a good hint as to
which lri failed. Either that needs expanding upon, or the multiple
tests extracted into separate subtests.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH igt] igt/gem_exec_parse: test_lri check init + add debug msg
  2016-11-22 16:06     ` Chris Wilson
@ 2016-11-22 16:50       ` Robert Bragg
  2016-11-22 17:03         ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Robert Bragg @ 2016-11-22 16:50 UTC (permalink / raw)
  To: intel-gfx

Just to note I haven't tested yet as I don't have hsw to hand, but seems
simple enough to send out anyway...

--- >8 ---

To make it clear on failure what register was being tested the test_lri
helper now uses igt_debug to log the register address and value being
tested. The test_lri helper now also double checks that the initial
intel_register_write() takes before issuing the LRI.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index cc2103a..534a933 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -272,8 +272,15 @@ test_lri(int fd, uint32_t handle,
 		MI_BATCH_BUFFER_END,
 	};
 
+	igt_debug("testing lri, reg=%x, val=%x, expected errno=%d\n",
+		  test->reg, test->test_val, expected_errno);
+
 	intel_register_write(test->reg, test->init_val);
 
+	igt_assert_eq_u32((intel_register_read(test->reg) &
+			   test->read_mask),
+			  test->init_val);
+
 	exec_batch(fd, handle,
 		   lri, sizeof(lri),
 		   I915_EXEC_RENDER,
-- 
2.10.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH igt] igt/gem_exec_parse: test_lri check init + add debug msg
  2016-11-22 16:50       ` [PATCH igt] igt/gem_exec_parse: test_lri check init + add debug msg Robert Bragg
@ 2016-11-22 17:03         ` Chris Wilson
  2016-11-24 13:47           ` [PATCH igt] igt/gem_exec_parse: generalise test_lri + debug info Robert Bragg
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2016-11-22 17:03 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

On Tue, Nov 22, 2016 at 04:50:35PM +0000, Robert Bragg wrote:
> Just to note I haven't tested yet as I don't have hsw to hand, but seems
> simple enough to send out anyway...
> 
> --- >8 ---
> 
> To make it clear on failure what register was being tested the test_lri
> helper now uses igt_debug to log the register address and value being
> tested. The test_lri helper now also double checks that the initial
> intel_register_write() takes before issuing the LRI.
> 
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> ---
>  tests/gem_exec_parse.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> index cc2103a..534a933 100644
> --- a/tests/gem_exec_parse.c
> +++ b/tests/gem_exec_parse.c
> @@ -272,8 +272,15 @@ test_lri(int fd, uint32_t handle,
>  		MI_BATCH_BUFFER_END,
>  	};
>  
> +	igt_debug("testing lri, reg=%x, val=%x, expected errno=%d\n",
> +		  test->reg, test->test_val, expected_errno);
> +
>  	intel_register_write(test->reg, test->init_val);
>  
> +	igt_assert_eq_u32((intel_register_read(test->reg) &
> +			   test->read_mask),
> +			  test->init_val);

igt_assert_f((intel_register_read(test->reg) & test->read_mask) == test->init_val,
	     "LRI reg=%x overwrote initial value of %x with %x [LRI value=%x]\n",
	     test->reg, test->init_val, intel_register_read(test->reg) & test->read_mask, test->val);

Something like that. I was also hoping to pass in test->name [name of
the register for convenience], which equally lends itself to
igt_subtest_f("test-lri-%s", test->name)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH igt] igt/gem_exec_parse: generalise test_lri + debug info
  2016-11-22 17:03         ` Chris Wilson
@ 2016-11-24 13:47           ` Robert Bragg
  2016-11-24 14:28             ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Robert Bragg @ 2016-11-24 13:47 UTC (permalink / raw)
  To: intel-gfx

This further generalises the description passed to test_lri so we only
need one loop over the entries with test_lri deducing the exected errno
and value based on whether the register is marked as whitelisted and
depending on the current command parser version.

Each tested register LRI now test gets its own subtest like:
igt_subtest_f("test-lri-%s", reg_name)

The test_lri helper now also double checks that the initial
intel_register_write() takes before issuing the LRI.

In case of a failure the test_lri helper now uses igt_debug to log the
register name, address and value being tested.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_exec_parse.c | 102 +++++++++++++++++++++++++------------------------
 1 file changed, 52 insertions(+), 50 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index cc2103a..0cd7053 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -258,12 +258,17 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
  * from...
  */
 struct test_lri {
-	uint32_t reg, read_mask, init_val, test_val;
+	const char *name; /* register name for debug info */
+	uint32_t reg; /* address to test */
+	uint32_t read_mask; /* ignore things like HW status bits */
+	uint32_t init_val; /* initial identifiable value to set without LRI */
+	uint32_t test_val; /* value to attempt loading via LRI command */
+	bool whitelisted; /* expect to become NOOP / fail if not whitelisted */
+	int min_ver; /* required command parser version to test */
 };
 
 static void
-test_lri(int fd, uint32_t handle,
-	 struct test_lri *test, int expected_errno, uint32_t expect)
+test_lri(int fd, uint32_t handle, struct test_lri *test)
 {
 	uint32_t lri[] = {
 		MI_LOAD_REGISTER_IMM,
@@ -271,9 +276,20 @@ test_lri(int fd, uint32_t handle,
 		test->test_val,
 		MI_BATCH_BUFFER_END,
 	};
+	int bad_lri_errno = parser_version >= 8 ? 0 : -EINVAL;
+	int expected_errno = test->whitelisted ? 0 : bad_lri_errno;
+	uint32_t expect = test->whitelisted ? test->test_val : test->init_val;
+
+	igt_debug("Testing %s LRI: addr=%x, val=%x, expected errno=%d, expected val=%x\n",
+		  test->name, test->reg, test->test_val,
+		  expected_errno, expect);
 
 	intel_register_write(test->reg, test->init_val);
 
+	igt_assert_eq_u32((intel_register_read(test->reg) &
+			   test->read_mask),
+			  test->init_val);
+
 	exec_batch(fd, handle,
 		   lri, sizeof(lri),
 		   I915_EXEC_RENDER,
@@ -476,57 +492,43 @@ igt_main
 	}
 
 	igt_subtest_group {
+#define REG(R, MSK, INI, V, OK, MIN_V) { #R, R, MSK, INI, V, OK, MIN_V }
+		struct test_lri lris[] = {
+			/* dummy head pointer */
+			REG(OASTATUS2,
+			    0xffffff80, 0xdeadf000, 0xbeeff000, false, 0),
+			/* NB: [1:0] MBZ */
+			REG(SO_WRITE_OFFSET_0,
+			    0xfffffffc, 0xabcdabc0, 0xbeefbee0, true, 0),
+
+			/* It's really important for us to check that
+			 * an LRI to OACONTROL doesn't result in an
+			 * EINVAL error because Mesa attempts writing
+			 * to OACONTROL to determine what extensions to
+			 * expose and will abort() for execbuffer()
+			 * errors.
+			 *
+			 * Mesa can gracefully recognise and handle the
+			 * LRI becoming a NOOP.
+			 *
+			 * The test values represent dummy context IDs
+			 * while leaving the OA unit disabled
+			 */
+			REG(OACONTROL,
+			    0xfffff000, 0xfeed0000, 0x31337000, false, 9)
+		};
+#undef REG
+
 		igt_fixture {
 			intel_register_access_init(intel_get_pci_device(), 0);
 		}
 
-		igt_subtest("registers") {
-			struct test_lri bad_lris[] = {
-				/* dummy head pointer */
-				{ OASTATUS2, 0xffffff80, 0xdeadf000, 0xbeeff000 }
-			};
-			struct test_lri v9_bad_lris[] = {
-				/* It's really important for us to check that
-				 * an LRI to OACONTROL doesn't result in an
-				 * EINVAL error because Mesa attempts writing
-				 * to OACONTROL to determine what extensions to
-				 * expose and will abort() for execbuffer()
-				 * errors.
-				 *
-				 * Mesa can gracefully recognise and handle the
-				 * LRI becoming a NOOP.
-				 *
-				 * The test values represent dummy context IDs
-				 * while leaving the OA unit disabled
-				 */
-				{ OACONTROL, 0xfffff000, 0xfeed0000, 0x31337000 }
-			};
-			struct test_lri ok_lris[] = {
-				/* NB: [1:0] MBZ */
-				{ SO_WRITE_OFFSET_0, 0xfffffffc,
-				  0xabcdabc0, 0xbeefbee0 }
-			};
-			int bad_lri_errno = parser_version >= 8 ? 0 : -EINVAL;
-
-			for (int i = 0; i < ARRAY_SIZE(ok_lris); i++) {
-				test_lri(fd, handle,
-					 ok_lris + i, 0,
-					 ok_lris[i].test_val);
-			}
-
-			for (int i = 0; i < ARRAY_SIZE(bad_lris); i++) {
-				test_lri(fd, handle,
-					 bad_lris + i, bad_lri_errno,
-					 bad_lris[i].init_val);
-			}
-
-			if (parser_version >= 9) {
-				for (int i = 0; i < ARRAY_SIZE(v9_bad_lris); i++) {
-					test_lri(fd, handle,
-						 v9_bad_lris + i,
-						 0,
-						 v9_bad_lris[i].init_val);
-				}
+		for (int i = 0; i < ARRAY_SIZE(lris); i++) {
+			igt_subtest_f("test-lri-%s", lris[i].name) {
+				igt_require_f(parser_version >= lris[i].min_ver,
+					      "minimum required parser version for test = %d\n",
+					      lris[i].min_ver);
+				test_lri(fd, handle, lris + i);
 			}
 		}
 
-- 
2.10.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH igt] igt/gem_exec_parse: generalise test_lri + debug info
  2016-11-24 13:47           ` [PATCH igt] igt/gem_exec_parse: generalise test_lri + debug info Robert Bragg
@ 2016-11-24 14:28             ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2016-11-24 14:28 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

On Thu, Nov 24, 2016 at 01:47:46PM +0000, Robert Bragg wrote:
> This further generalises the description passed to test_lri so we only
> need one loop over the entries with test_lri deducing the exected errno
> and value based on whether the register is marked as whitelisted and
> depending on the current command parser version.
> 
> Each tested register LRI now test gets its own subtest like:
> igt_subtest_f("test-lri-%s", reg_name)
> 
> The test_lri helper now also double checks that the initial
> intel_register_write() takes before issuing the LRI.
> 
> In case of a failure the test_lri helper now uses igt_debug to log the
> register name, address and value being tested.

That looks like it should help immensely if one of these should ever
fire. Thanks.
 
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

> ---
>  tests/gem_exec_parse.c | 102 +++++++++++++++++++++++++------------------------
>  1 file changed, 52 insertions(+), 50 deletions(-)
> 
> diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> index cc2103a..0cd7053 100644
> --- a/tests/gem_exec_parse.c
> +++ b/tests/gem_exec_parse.c
> @@ -258,12 +258,17 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
>   * from...
>   */
>  struct test_lri {
> -	uint32_t reg, read_mask, init_val, test_val;
> +	const char *name; /* register name for debug info */
> +	uint32_t reg; /* address to test */

I would use offset, but we often mix the two anyway.

> +	uint32_t read_mask; /* ignore things like HW status bits */
> +	uint32_t init_val; /* initial identifiable value to set without LRI */
> +	uint32_t test_val; /* value to attempt loading via LRI command */
> +	bool whitelisted; /* expect to become NOOP / fail if not whitelisted */
> +	int min_ver; /* required command parser version to test */
>  };
>  
>  static void
> -test_lri(int fd, uint32_t handle,
> -	 struct test_lri *test, int expected_errno, uint32_t expect)
> +test_lri(int fd, uint32_t handle, struct test_lri *test)
>  {
>  	uint32_t lri[] = {
>  		MI_LOAD_REGISTER_IMM,
> @@ -271,9 +276,20 @@ test_lri(int fd, uint32_t handle,
>  		test->test_val,
>  		MI_BATCH_BUFFER_END,
>  	};
> +	int bad_lri_errno = parser_version >= 8 ? 0 : -EINVAL;
> +	int expected_errno = test->whitelisted ? 0 : bad_lri_errno;
> +	uint32_t expect = test->whitelisted ? test->test_val : test->init_val;
> +
> +	igt_debug("Testing %s LRI: addr=%x, val=%x, expected errno=%d, expected val=%x\n",
> +		  test->name, test->reg, test->test_val,
> +		  expected_errno, expect);
>  
>  	intel_register_write(test->reg, test->init_val);
>  
> +	igt_assert_eq_u32((intel_register_read(test->reg) &
> +			   test->read_mask),
> +			  test->init_val);
> +
>  	exec_batch(fd, handle,
>  		   lri, sizeof(lri),
>  		   I915_EXEC_RENDER,

I would still be verbose in the assert, it is the first thing you read
on seeing the error before digging through the debug log.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-11-24 14:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-14 20:51 [PATCH igt v4 00/13] corresponding changes for i915-perf interface Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 01/13] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 02/13] igt/gem_exec_parse: some minor cleanups Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 03/13] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 04/13] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 05/13] igt/gem_exec_parse: make global vars local to main() Robert Bragg
2016-11-14 21:33   ` Matthew Auld
2016-11-14 20:51 ` [PATCH igt v4 06/13] igt/gem_exec_parse: init global parser_version in fixture Robert Bragg
2016-11-14 21:33   ` Matthew Auld
2016-11-14 20:51 ` [PATCH igt v4 07/13] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 08/13] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 09/13] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 10/13] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 11/13] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
2016-11-14 20:51 ` [PATCH igt v4 12/13] igt/gem_exec_parse: update registers test " Robert Bragg
2016-11-22 15:45   ` Chris Wilson
2016-11-22 16:06     ` Chris Wilson
2016-11-22 16:50       ` [PATCH igt] igt/gem_exec_parse: test_lri check init + add debug msg Robert Bragg
2016-11-22 17:03         ` Chris Wilson
2016-11-24 13:47           ` [PATCH igt] igt/gem_exec_parse: generalise test_lri + debug info Robert Bragg
2016-11-24 14:28             ` Chris Wilson
2016-11-14 20:51 ` [PATCH igt v4 13/13] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.