* [PATCH igt v3 00/11] corresponding changes for i915-perf interface
@ 2016-11-09 16:15 Robert Bragg
  2016-11-09 16:15 ` [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

The i915-perf series affects the command parser and also introduces new uapi;
these i-g-t changes try to cover both.

As well as splitting up the gem_exec_parse changes, this version maintains
support for testing version 7 of the command parser.

- Robert

Robert Bragg (6):
  igt/perf: add i915 perf stream tests for Haswell
  igt/gem_exec_parse: remove oacontrol checks
  igt/gem_exec_parse: some minor cleanups
  igt/gem_exec_parse: move hsw_load_register_reg down
  igt/gem_exec_parse: update hsw_load_register_reg
  igt/gem_exec_parse: update for version 8 changes

 tests/Makefile.sources |    1 +
 tests/gem_exec_parse.c |  519 +++++------
 tests/perf.c           | 2220 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 2429 insertions(+), 311 deletions(-)
 create mode 100644 tests/perf.c

-- 
2.10.1

Robert Bragg (11):
  igt/perf: add i915 perf stream tests for Haswell
  igt/gem_exec_parse: some minor cleanups
  igt/gem_exec_parse: move hsw_load_register_reg down
  igt/gem_exec_parse: update hsw_load_register_reg
  igt/gem_exec_parse: req. v < 9 for oacontrol tracking test
  igt/gem_exec_parse: make basic-rejected version agnostic
  igt/gem_exec_parse: update bitmasks test for v >=8
  igt/gem_exec_parse: update cmd-crossing-page for >= v8
  igt/gem_exec_parse: update hsw_load_register_reg for v >= 8
  igt/gem_exec_parse: update registers test for v >= 8
  igt/gem_exec_parse: check oacontrol lri bad for >= v9

 tests/Makefile.sources |    1 +
 tests/gem_exec_parse.c |  555 ++++++------
 tests/perf.c           | 2220 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 2501 insertions(+), 275 deletions(-)
 create mode 100644 tests/perf.c

-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-09 16:33   ` Chris Wilson
  2016-11-10 23:03   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 02/11] igt/gem_exec_parse: some minor cleanups Robert Bragg
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/Makefile.sources |    1 +
 tests/perf.c           | 2220 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 2221 insertions(+)
 create mode 100644 tests/perf.c
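
For readers following the timer_only path in read_2_oa_reports() further
down: the test infers timer-triggered reports from the timestamp alone. A
minimal standalone sketch of that arithmetic, assuming the OA unit samples
every 2^(exponent + 1) timestamp ticks and using the 12.5MHz Haswell
timestamp frequency hard-coded in this patch. oa_exponent_to_ns() and
looks_like_timer_report() are hypothetical names for illustration, not
helpers from tests/perf.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Assumption: Haswell timestamp frequency, as hard-coded in tests/perf.c */
static const uint64_t timestamp_frequency = 12500000;

/* Hypothetical helper: sampling period in nanoseconds for an OA exponent,
 * assuming one periodic report every 2^(exponent + 1) timestamp ticks.
 */
static uint64_t
oa_exponent_to_ns(int exponent)
{
        assert(exponent >= 0 && exponent < 31);

        return ((2ull << exponent) * NSEC_PER_SEC) / timestamp_frequency;
}

/* Hypothetical helper mirroring the timer_only check: a report is treated
 * as timer triggered when, within the low (exponent + 1) timestamp bits,
 * only the exponent bit itself is set.
 */
static bool
looks_like_timer_report(uint32_t timestamp, int exponent)
{
        uint32_t exponent_mask;

        assert(exponent >= 0 && exponent < 31);

        exponent_mask = (1u << (exponent + 1)) - 1;

        return (timestamp & exponent_mask) == (1u << exponent);
}
```

With exponent 13 this gives a period of 1310720ns, which is the value the
tests annotate as roughly 1 millisecond.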

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 6d081c3..7c6de2f 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -211,6 +211,7 @@ TESTS_progs = \
 	kms_pwrite_crc \
 	kms_sink_crc_basic \
 	prime_udl \
+	perf \
 	$(NULL)
 
 # IMPORTANT: The ZZ_ tests need to be run last!
diff --git a/tests/perf.c b/tests/perf.c
new file mode 100644
index 0000000..4762e36
--- /dev/null
+++ b/tests/perf.c
@@ -0,0 +1,2220 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/times.h>
+#include <sys/types.h>
+#include <dirent.h>
+#include <time.h>
+#include <poll.h>
+#include <math.h>
+
+#include "igt.h"
+#include "drm.h"
+
+IGT_TEST_DESCRIPTION("Test the i915 perf metrics streaming interface");
+
+#define GEN6_MI_REPORT_PERF_COUNT ((0x28 << 23) | (3 - 2))
+
+#define GFX_OP_PIPE_CONTROL     ((3 << 29) | (3 << 27) | (2 << 24))
+#define PIPE_CONTROL_CS_STALL           (1 << 20)
+#define PIPE_CONTROL_GLOBAL_SNAPSHOT_COUNT_RESET        (1 << 19)
+#define PIPE_CONTROL_TLB_INVALIDATE     (1 << 18)
+#define PIPE_CONTROL_SYNC_GFDT          (1 << 17)
+#define PIPE_CONTROL_MEDIA_STATE_CLEAR  (1 << 16)
+#define PIPE_CONTROL_NO_WRITE           (0 << 14)
+#define PIPE_CONTROL_WRITE_IMMEDIATE    (1 << 14)
+#define PIPE_CONTROL_WRITE_DEPTH_COUNT  (2 << 14)
+#define PIPE_CONTROL_WRITE_TIMESTAMP    (3 << 14)
+#define PIPE_CONTROL_DEPTH_STALL        (1 << 13)
+#define PIPE_CONTROL_RENDER_TARGET_FLUSH (1 << 12)
+#define PIPE_CONTROL_INSTRUCTION_INVALIDATE (1 << 11)
+#define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE   (1 << 10) /* GM45+ only */
+#define PIPE_CONTROL_ISP_DIS            (1 << 9)
+#define PIPE_CONTROL_INTERRUPT_ENABLE   (1 << 8)
+#define PIPE_CONTROL_FLUSH_ENABLE       (1 << 7) /* Gen7+ only */
+/* GT */
+#define PIPE_CONTROL_DATA_CACHE_INVALIDATE      (1 << 5)
+#define PIPE_CONTROL_VF_CACHE_INVALIDATE        (1 << 4)
+#define PIPE_CONTROL_CONST_CACHE_INVALIDATE     (1 << 3)
+#define PIPE_CONTROL_STATE_CACHE_INVALIDATE     (1 << 2)
+#define PIPE_CONTROL_STALL_AT_SCOREBOARD        (1 << 1)
+#define PIPE_CONTROL_DEPTH_CACHE_FLUSH          (1 << 0)
+#define PIPE_CONTROL_PPGTT_WRITE        (0 << 2)
+#define PIPE_CONTROL_GLOBAL_GTT_WRITE   (1 << 2)
+
+#define NSEC_PER_SEC 1000000000ull
+
+static struct {
+        const char *name;
+        uint64_t id;
+        size_t size;
+        int a_off; /* bytes */
+        int n_a;
+        int first_a;
+        int b_off;
+        int n_b;
+        int c_off;
+        int n_c;
+} hsw_oa_formats[] = {
+        { "A13", I915_OA_FORMAT_A13, .size = 64,
+                .a_off = 12, .n_a = 13 },
+        { "A29", I915_OA_FORMAT_A29, .size = 128,
+                .a_off = 12, .n_a = 29 },
+        { "A13_B8_C8", I915_OA_FORMAT_A13_B8_C8, .size = 128,
+                .a_off = 12, .n_a = 13,
+                .b_off = 64, .n_b = 8,
+                .c_off = 96, .n_c = 8 },
+        { "A45_B8_C8", I915_OA_FORMAT_A45_B8_C8, .size = 256,
+                .a_off = 12,  .n_a = 45,
+                .b_off = 192, .n_b = 8,
+                .c_off = 224, .n_c = 8 },
+        { "B4_C8", I915_OA_FORMAT_B4_C8, .size = 64,
+                .b_off = 16, .n_b = 4,
+                .c_off = 32, .n_c = 8 },
+        { "B4_C8_A16", I915_OA_FORMAT_B4_C8_A16, .size = 128,
+                .b_off = 16, .n_b = 4,
+                .c_off = 32, .n_c = 8,
+                .a_off = 60, .n_a = 16, .first_a = 29 },
+        { "C4_B8", I915_OA_FORMAT_C4_B8, .size = 64,
+                .c_off = 16, .n_c = 4,
+                .b_off = 28, .n_b = 8 },
+};
+
+static bool hsw_undefined_a_counters[45] = {
+        [4] = true,
+        [6] = true,
+        [9] = true,
+        [11] = true,
+        [14] = true,
+        [16] = true,
+        [19] = true,
+        [21] = true,
+        [24] = true,
+        [26] = true,
+        [29] = true,
+        [31] = true,
+        [34] = true,
+        [43] = true,
+        [44] = true,
+};
+
+static int drm_fd;
+static uint32_t devid;
+static int device;
+
+static uint64_t hsw_render_basic_id = UINT64_MAX;
+static uint64_t gt_min_freq_mhz_saved = 0;
+static uint64_t gt_max_freq_mhz_saved = 0;
+static uint64_t gt_min_freq_mhz = 0;
+static uint64_t gt_max_freq_mhz = 0;
+
+static uint64_t timestamp_frequency = 12500000;
+
+static igt_render_copyfunc_t render_copy = NULL;
+
+static int
+__perf_open(int fd, struct drm_i915_perf_open_param *param)
+{
+        int ret = igt_ioctl(fd, DRM_IOCTL_I915_PERF_OPEN, param);
+
+        igt_assert(ret >= 0);
+        errno = 0;
+
+        return ret;
+}
+
+static int
+lookup_format(int i915_perf_fmt_id)
+{
+        for (int i = 0; i < ARRAY_SIZE(hsw_oa_formats); i++)
+                if (hsw_oa_formats[i].id == i915_perf_fmt_id)
+                        return i;
+
+        igt_assert(!"reached");
+}
+
+static bool
+try_read_u64_file(const char *file, uint64_t *val)
+{
+        char buf[32];
+        int fd, n;
+
+        fd = open(file, O_RDONLY);
+        if (fd < 0)
+                return false;
+
+        while ((n = read(fd, buf, sizeof(buf) - 1)) < 0 && errno == EINTR)
+                ;
+        igt_assert(n >= 0);
+
+        close(fd);
+
+        buf[n] = '\0';
+        *val = strtoull(buf, NULL, 0);
+
+        return true;
+}
+
+static uint64_t
+read_u64_file(const char *file)
+{
+        uint64_t val;
+
+        igt_assert_eq(try_read_u64_file(file, &val), true);
+
+        return val;
+}
+
+static void
+write_u64_file(const char *file, uint64_t val)
+{
+        char buf[32];
+        int fd, len, ret;
+
+        fd = open(file, O_WRONLY);
+        igt_assert(fd >= 0);
+
+        len = snprintf(buf, sizeof(buf), "%"PRIu64, val);
+        igt_assert(len > 0);
+
+        while ((ret = write(fd, buf, len)) < 0 && errno == EINTR)
+                ;
+        igt_assert_eq(ret, len);
+
+        close(fd);
+}
+
+static uint64_t
+sysfs_read(const char *file)
+{
+        char buf[512];
+
+        snprintf(buf, sizeof(buf), "/sys/class/drm/card%d/%s", device, file);
+
+        return read_u64_file(buf);
+}
+
+static void
+sysfs_write(const char *file, uint64_t val)
+{
+        char buf[512];
+
+        snprintf(buf, sizeof(buf), "/sys/class/drm/card%d/%s", device, file);
+
+        write_u64_file(buf, val);
+}
+
+static char *
+read_debugfs_record(const char *file, const char *key)
+{
+        FILE *fp;
+        char *line = NULL;
+        size_t line_buf_size = 0;
+        int len = 0;
+        int key_len = strlen(key);
+        char *value = NULL;
+
+        fp = igt_debugfs_fopen(file, "r");
+        igt_require(fp);
+
+        while ((len = getline(&line, &line_buf_size, fp)) > 0) {
+
+                if (line[len - 1] == '\n')
+                        line[len - 1] = '\0';
+
+                if (strncmp(key, line, key_len) == 0 &&
+                    line[key_len] == ':' &&
+                    line[key_len + 1] == ' ')
+                {
+                        value = strdup(line + key_len + 2);
+                        goto done;
+                }
+        }
+
+done:
+        free(line);
+        if (fp)
+                fclose(fp);
+        return value;
+}
+
+static uint64_t
+read_debugfs_u64_record(const char *file, const char *key)
+{
+        char *str_val = read_debugfs_record(file, key);
+        uint64_t val;
+
+        igt_require(str_val);
+
+        val = strtoull(str_val, NULL, 0);
+        free(str_val);
+
+        return val;
+}
+
+static bool
+lookup_hsw_render_basic_id(void)
+{
+        char buf[256];
+
+        snprintf(buf, sizeof(buf),
+                 "/sys/class/drm/card%d/metrics/403d8832-1a27-4aa6-a64e-f5389ce7b212/id",
+                 device);
+
+        return try_read_u64_file(buf, &hsw_render_basic_id);
+}
+
+static void
+gt_frequency_range_save(void)
+{
+        gt_min_freq_mhz_saved = sysfs_read("gt_min_freq_mhz");
+        gt_max_freq_mhz_saved = sysfs_read("gt_max_freq_mhz");
+
+        gt_min_freq_mhz = gt_min_freq_mhz_saved;
+        gt_max_freq_mhz = gt_max_freq_mhz_saved;
+}
+
+static void
+gt_frequency_pin(int gt_freq_mhz)
+{
+        igt_debug("requesting pinned GT freq = %dmhz\n", gt_freq_mhz);
+
+        if (gt_freq_mhz > gt_max_freq_mhz) {
+                sysfs_write("gt_max_freq_mhz", gt_freq_mhz);
+                sysfs_write("gt_min_freq_mhz", gt_freq_mhz);
+        } else {
+                sysfs_write("gt_min_freq_mhz", gt_freq_mhz);
+                sysfs_write("gt_max_freq_mhz", gt_freq_mhz);
+        }
+        gt_min_freq_mhz = gt_freq_mhz;
+        gt_max_freq_mhz = gt_freq_mhz;
+}
+
+static void
+gt_frequency_range_restore(void)
+{
+        igt_debug("restoring GT frequency range: min = %dmhz, max = %dmhz, current: min = %dmhz, max = %dmhz\n",
+                  (int)gt_min_freq_mhz_saved,
+                  (int)gt_max_freq_mhz_saved,
+                  (int)gt_min_freq_mhz,
+                  (int)gt_max_freq_mhz);
+
+        /* Assume current min/max are the same */
+        if (gt_min_freq_mhz_saved > gt_max_freq_mhz) {
+                sysfs_write("gt_max_freq_mhz", gt_max_freq_mhz_saved);
+                sysfs_write("gt_min_freq_mhz", gt_min_freq_mhz_saved);
+        } else {
+                sysfs_write("gt_min_freq_mhz", gt_min_freq_mhz_saved);
+                sysfs_write("gt_max_freq_mhz", gt_max_freq_mhz_saved);
+        }
+
+        gt_min_freq_mhz = gt_min_freq_mhz_saved;
+        gt_max_freq_mhz = gt_max_freq_mhz_saved;
+}
+
+static uint64_t
+timebase_scale(uint32_t u32_delta)
+{
+        return ((uint64_t)u32_delta * NSEC_PER_SEC) / timestamp_frequency;
+}
+
+/* CAP_SYS_ADMIN is required to open system wide metrics, unless the system
+ * control parameter dev.i915.perf_stream_paranoid == 0 */
+static void
+test_system_wide_paranoid(void)
+{
+        igt_fork(child, 1) {
+                uint64_t properties[] = {
+                        /* Include OA reports in samples */
+                        DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                        /* OA unit configuration */
+                        DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                        DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                        DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+                };
+                struct drm_i915_perf_open_param param = {
+                        .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                                I915_PERF_FLAG_FD_NONBLOCK,
+                        .num_properties = sizeof(properties) / 16,
+                        .properties_ptr = (uint64_t)properties,
+                };
+
+                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+                igt_drop_root();
+
+                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
+        }
+
+        igt_waitchildren();
+
+        igt_fork(child, 1) {
+                uint64_t properties[] = {
+                        /* Include OA reports in samples */
+                        DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                        /* OA unit configuration */
+                        DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                        DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                        DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+                };
+                struct drm_i915_perf_open_param param = {
+                        .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                                I915_PERF_FLAG_FD_NONBLOCK,
+                        .num_properties = sizeof(properties) / 16,
+                        .properties_ptr = (uint64_t)properties,
+                };
+                int stream_fd;
+
+                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 0);
+
+                igt_drop_root();
+
+                stream_fd = __perf_open(drm_fd, &param);
+                close(stream_fd);
+        }
+
+        igt_waitchildren();
+
+        /* leave in paranoid state */
+        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+}
+
+static void
+test_invalid_open_flags(void)
+{
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = ~0, /* Undefined flag bits set! */
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+test_invalid_oa_metric_set_id(void)
+{
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, UINT64_MAX,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                        I915_PERF_FLAG_FD_NONBLOCK,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd;
+
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+        properties[ARRAY_SIZE(properties) - 1] = 0; /* ID 0 is also reserved as invalid */
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+        /* Check that we aren't just seeing false positives... */
+        properties[ARRAY_SIZE(properties) - 1] = hsw_render_basic_id;
+        stream_fd = __perf_open(drm_fd, &param);
+        close(stream_fd);
+
+        /* There's no valid default OA metric set ID... */
+        param.num_properties--;
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+test_invalid_oa_format_id(void)
+{
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+                DRM_I915_PERF_PROP_OA_FORMAT, UINT64_MAX,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                        I915_PERF_FLAG_FD_NONBLOCK,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd;
+
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+        properties[ARRAY_SIZE(properties) - 1] = 0; /* ID 0 is also reserved as invalid */
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+
+        /* Check that we aren't just seeing false positives... */
+        properties[ARRAY_SIZE(properties) - 1] = I915_OA_FORMAT_A45_B8_C8;
+        stream_fd = __perf_open(drm_fd, &param);
+        close(stream_fd);
+
+        /* There's no valid default OA format... */
+        param.num_properties--;
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+test_missing_sample_flags(void)
+{
+        uint64_t properties[] = {
+                /* No _PROP_SAMPLE_xyz flags */
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+
+        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+}
+
+static void
+read_2_oa_reports(int stream_fd,
+                  uint64_t format_id,
+                  size_t format_size,
+                  int exponent,
+                  uint32_t *oa_report0,
+                  uint32_t *oa_report1,
+                  bool timer_only)
+{
+        size_t sample_size = (sizeof(struct drm_i915_perf_record_header) +
+                              format_size);
+        const struct drm_i915_perf_record_header *header;
+        uint32_t exponent_mask = (1 << (exponent + 1)) - 1;
+
+        /* Note: we allocate a large buffer so that each read() iteration
+         * should scrape *all* pending records.
+         *
+         * The largest buffer the OA unit supports is 16MB and the smallest
+         * OA report format is 64 bytes, allowing up to 262144 reports to
+         * be buffered.
+         *
+         * Being sure we are fetching all buffered reports allows us to
+         * potentially throw away / skip all reports whenever we see
+         * a _REPORT_LOST notification as a way of being sure our
+         * measurements aren't skewed by a lost report.
+         *
+         * Note: this is useful for some tests but is not something
+         * applications would be expected to resort to. Lost reports are
+         * somewhat unpredictable but typically don't pose a problem - except
+         * to indicate that the OA unit may be overtaxed if lots of reports
+         * are being lost.
+         */
+        int buf_size = 262144 * (64 + sizeof(struct drm_i915_perf_record_header));
+        uint8_t *buf = malloc(buf_size);
+        int n = 0;
+
+        for (int i = 0; i < 1000; i++) {
+                ssize_t len;
+
+                while ((len = read(stream_fd, buf, buf_size)) < 0 &&
+                       errno == EINTR)
+                        ;
+
+                igt_assert(len > 0);
+
+                for (size_t offset = 0; offset < len; offset += header->size) {
+                        const uint32_t *report;
+
+                        header = (void *)(buf + offset);
+
+                        igt_assert_eq(header->pad, 0); /* Reserved */
+
+                        /* Currently the only test that should ever expect to
+                         * see a _BUFFER_LOST error is the buffer_fill test,
+                         * otherwise something bad has probably happened...
+                         */
+                        igt_assert_neq(header->type, DRM_I915_PERF_RECORD_OA_BUFFER_LOST);
+
+                        /* At high sampling frequencies the OA HW might not be
+                         * able to cope with all write requests and will notify
+                         * us that a report was lost. We restart our read of
+                         * two sequential reports due to the timeline blip this
+                         * implies
+                         */
+                        if (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST) {
+                                igt_debug("read restart: OA trigger collision / report lost\n");
+                                n = 0;
+
+                                /* XXX: break, because we don't know where
+                                 * within the series of already read reports
+                                 * there could be a blip from the lost report.
+                                 */
+                                break;
+                        }
+
+                        /* Currently the only other record type expected is a
+                         * _SAMPLE. Notably this test will need updating if
+                         * i915-perf is extended in the future with additional
+                         * record types.
+                         */
+                        igt_assert_eq(header->type, DRM_I915_PERF_RECORD_SAMPLE);
+
+                        igt_assert_eq(header->size, sample_size);
+
+                        report = (const void *)(header + 1);
+
+                        igt_debug("read report: reason = %x, timestamp = %x, exponent mask=%x\n",
+                                  report[0], report[1], exponent_mask);
+
+                        /* Don't expect zero for timestamps */
+                        igt_assert_neq(report[1], 0);
+
+                        if (timer_only) {
+                                /* For Haswell we don't have a documented
+                                 * report reason field (though empirically
+                                 * report[0] bit 10 does seem to correlate with
+                                 * a timer trigger reason) so we instead infer
+                                 * which reports are timer triggered by
+                                 * checking if the least significant bits are
+                                 * zero and the exponent bit is set.
+                                 */
+                                if ((report[1] & exponent_mask) != (1 << exponent)) {
+                                        igt_debug("skipping non timer report reason=%x\n",
+                                                  report[0]);
+
+                                        /* Also assert our hypothesis about the
+                                         * reason bit...
+                                         */
+                                        igt_assert_eq(report[0] & (1 << 10), 0);
+                                        continue;
+                                }
+                        }
+
+                        if (n++ == 0)
+                                memcpy(oa_report0, report, format_size);
+                        else {
+                                memcpy(oa_report1, report, format_size);
+                                free(buf);
+                                return;
+                        }
+                }
+        }
+
+        free(buf);
+
+        igt_assert(!"reached");
+}
+
+static void
+open_and_read_2_oa_reports(uint64_t format_id,
+                           size_t format_size,
+                           int exponent,
+                           uint32_t *oa_report0,
+                           uint32_t *oa_report1,
+                           bool timer_only)
+{
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, format_id,
+                DRM_I915_PERF_PROP_OA_EXPONENT, exponent,
+
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+
+        read_2_oa_reports(stream_fd, format_id, format_size, exponent,
+                          oa_report0, oa_report1, timer_only);
+
+        close(stream_fd);
+}
+
+static void
+print_reports(uint32_t *oa_report0, uint32_t *oa_report1, int fmt)
+{
+        uint32_t *a0, *b0, *c0;
+        uint32_t *a1, *b1, *c1;
+
+        /* Not ideal naming here: a0 and a1 distinguish report 0 from
+         * report 1, not A counter 0 from A counter 1...
+         */
+        a0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[fmt].a_off);
+        b0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[fmt].b_off);
+        c0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[fmt].c_off);
+
+        a1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[fmt].a_off);
+        b1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[fmt].b_off);
+        c1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[fmt].c_off);
+
+        igt_debug("TIMESTAMP: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+                  oa_report0[1], oa_report1[1], oa_report1[1] - oa_report0[1]);
+
+        if (hsw_oa_formats[fmt].n_c) {
+                igt_debug("CLOCK: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+                          c0[2], c1[2], c1[2] - c0[2]);
+        } else
+                igt_debug("CLOCK = N/A\n");
+
+        for (int j = hsw_oa_formats[fmt].first_a;
+             j < hsw_oa_formats[fmt].n_a;
+             j++)
+        {
+                uint32_t delta = a1[j] - a0[j];
+
+                if (hsw_undefined_a_counters[j])
+                        continue;
+
+                igt_debug("A%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+                          j, a0[j], a1[j], delta);
+        }
+
+        for (int j = 0; j < hsw_oa_formats[fmt].n_b; j++) {
+                uint32_t delta = b1[j] - b0[j];
+                igt_debug("B%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+                          j, b0[j], b1[j], delta);
+        }
+
+        for (int j = 0; j < hsw_oa_formats[fmt].n_c; j++) {
+                uint32_t delta = c1[j] - c0[j];
+                igt_debug("C%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
+                          j, c0[j], c1[j], delta);
+        }
+}
+
+static void
+test_oa_formats(void)
+{
+        int oa_exponent = 13;
+
+        for (int i = 0; i < ARRAY_SIZE(hsw_oa_formats); i++) {
+                uint32_t oa_report0[64];
+                uint32_t oa_report1[64];
+                uint32_t *a0, *b0, *c0;
+                uint32_t *a1, *b1, *c1;
+                uint32_t time_delta;
+                uint32_t clock_delta;
+                uint32_t max_delta;
+
+                igt_debug("Checking OA format %s\n", hsw_oa_formats[i].name);
+
+                open_and_read_2_oa_reports(hsw_oa_formats[i].id,
+                                           hsw_oa_formats[i].size,
+                                           oa_exponent,
+                                           oa_report0,
+                                           oa_report1,
+                                           false); /* timer_only = false: accept any report */
+
+                print_reports(oa_report0, oa_report1, i);
+
+                a0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[i].a_off);
+                b0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[i].b_off);
+                c0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[i].c_off);
+
+                a1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[i].a_off);
+                b1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[i].b_off);
+                c1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[i].c_off);
+
+                time_delta = timebase_scale(oa_report1[1] - oa_report0[1]);
+                igt_assert_neq(time_delta, 0);
+
+                /* C2 corresponds to a clock counter for this metric set but
+                 * it's not included in all of the formats. */
+                if (hsw_oa_formats[i].n_c) {
+                        uint64_t freq;
+
+                        /* The first report might have a clock count of zero
+                         * but we wouldn't expect that in the second report...
+                         */
+                        igt_assert_neq(c1[2], 0);
+
+                        clock_delta = c1[2] - c0[2];
+                        igt_assert_neq(clock_delta, 0);
+
+                        freq = ((uint64_t)clock_delta * 1000) / time_delta;
+                        igt_debug("freq = %"PRIu64"\n", freq);
+
+                        igt_assert(freq <= gt_max_freq_mhz);
+                } else {
+                        /* Assume running at max freq for sake of
+                         * below sanity check on counters... */
+                        clock_delta = (gt_max_freq_mhz *
+                                       (uint64_t)time_delta) / 1000;
+                }
+
+                igt_debug("clock delta = %"PRIu32"\n", clock_delta);
+
+                /* The maximum rate for any HSW counter =
+                 *   clock_delta * 40 EUs
+                 *
+                 * Sanity check that no counters exceed this delta.
+                 */
+                max_delta = clock_delta * 40;
+
+                for (int j = hsw_oa_formats[i].first_a;
+                     j < hsw_oa_formats[i].n_a;
+                     j++)
+                {
+                        uint32_t delta = a1[j] - a0[j];
+
+                        if (hsw_undefined_a_counters[j])
+                                continue;
+
+                        igt_debug("A%d: delta = %"PRIu32"\n", j, delta);
+                        igt_assert(delta <= max_delta);
+                }
+
+                for (int j = 0; j < hsw_oa_formats[i].n_b; j++) {
+                        uint32_t delta = b1[j] - b0[j];
+                        igt_debug("B%d: delta = %"PRIu32"\n", j, delta);
+                        igt_assert(delta <= max_delta);
+                }
+
+                for (int j = 0; j < hsw_oa_formats[i].n_c; j++) {
+                        uint32_t delta = c1[j] - c0[j];
+                        igt_debug("C%d: delta = %"PRIu32"\n", j, delta);
+                        igt_assert(delta <= max_delta);
+                }
+        }
+}
+
+static void
+test_oa_exponents(int gt_freq_mhz)
+{
+        /* XXX: Note that gt_min/max_freq_mhz don't seem to be a reliable
+         * mechanism for fixing the gpu frequency and since these unit tests
+         * are focused on the OA unit not the ability to pin the frequency we
+         * read back the current frequency for each iteration of this test to
+         * take this into account.
+         */
+        gt_frequency_pin(gt_freq_mhz);
+
+        igt_debug("Testing OA timer exponents with requested GT frequency = %dmhz\n",
+                  gt_freq_mhz);
+
+        /* It's asking a lot to sample with a 160 nanosecond period and the
+         * test can fail due to buffer overflows if it wasn't possible to
+         * keep up, so we don't start from an exponent of zero...
+         */
+        for (int i = 2; i < 20; i++) {
+                uint32_t expected_timestamp_delta;
+                uint32_t timestamp_delta;
+                uint32_t oa_report0[64];
+                uint32_t oa_report1[64];
+                uint32_t *c0, *c1;
+                uint32_t time_delta;
+                uint32_t clock_delta;
+                uint32_t freq;
+                int n_freq_matches = 0;
+
+                /* The exponent is effectively selecting a bit in the timestamp
+                 * to trigger reports on and so in practice we expect the raw
+                 * timestamp deltas for periodic reports to exactly match the
+                 * value of the next bit.
+                 */
+                expected_timestamp_delta = 2 << i;
+
+                for (int j = 0; j < 10; j++) {
+                        gt_freq_mhz = sysfs_read("gt_act_freq_mhz");
+
+                        igt_debug("ITER %d: testing OA exponent %d with GT freq = %dmhz\n",
+                                  j, i, gt_freq_mhz);
+
+                        open_and_read_2_oa_reports(I915_OA_FORMAT_A45_B8_C8, 256,
+                                                   i, /* exponent */
+                                                   oa_report0,
+                                                   oa_report1,
+                                                   true); /* timer triggered
+                                                             reports only */
+
+                        timestamp_delta = oa_report1[1] - oa_report0[1];
+                        igt_assert_neq(timestamp_delta, 0);
+
+                        if (timestamp_delta != expected_timestamp_delta) {
+                                igt_debug("timestamp0 = %u/0x%x\n",
+                                          oa_report0[1], oa_report0[1]);
+                                igt_debug("timestamp1 = %u/0x%x\n",
+                                          oa_report1[1], oa_report1[1]);
+                        }
+
+                        igt_assert_eq(timestamp_delta, expected_timestamp_delta);
+
+                        c0 = (uint32_t *)(((uint8_t *)oa_report0) + 224 /* C offset */);
+                        c1 = (uint32_t *)(((uint8_t *)oa_report1) + 224 /* C offset */);
+                        clock_delta = c1[2] - c0[2];
+
+                        time_delta = timebase_scale(timestamp_delta);
+
+                        freq = ((uint64_t)clock_delta * 1000) / time_delta;
+                        igt_debug("ITER %d: time delta = %"PRIu32"(ns) clock delta = %"PRIu32" freq = %"PRIu32"(mhz)\n",
+                                  j, time_delta, clock_delta, freq);
+
+                        if (freq == gt_freq_mhz)
+                                n_freq_matches++;
+                }
+
+                igt_debug("number of iterations with expected clock frequency = %d\n",
+                          n_freq_matches);
+
+                /* Don't assert the calculated frequency for extremely short
+                 * durations... */
+                if (i > 3)
+                        igt_assert(n_freq_matches >= 7);
+        }
+
+        gt_frequency_range_restore();
+}
+
+/* The OA exponent selects a timestamp counter bit to trigger reports on.
+ *
+ * With a 64bit timestamp and least significant bit approx == 80ns then the MSB
+ * equates to > 40 thousand years and isn't exposed via the i915 perf interface.
+ *
+ * The max exponent exposed is expected to be 31, which is still a fairly
+ * ridiculous period (>5min) but is the maximum exponent where it's still
+ * possible to use periodic sampling as a means for tracking the overflow of
+ * 32bit OA report timestamps.
+ */
+static void
+test_invalid_oa_exponent(void)
+{
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, 31, /* maximum exponent expected
+                                                       to be accepted */
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+
+        close(stream_fd);
+
+        for (int i = 32; i < 65; i++) {
+                properties[7] = i;
+                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
+        }
+}
+
+/* Return the largest OA exponent that will still result in a sampling
+ * frequency higher than the given frequency.
+ */
+static int
+max_oa_exponent_for_higher_freq(uint64_t freq)
+{
+        /* NB: timebase_scale() takes a uint32_t and an exponent of 30
+         * would already represent a period of ~3 minutes so there's
+         * really no need to consider higher exponents.
+         */
+        for (int i = 0; i < 30; i++) {
+                uint64_t oa_period = timebase_scale(2 << i);
+                uint32_t oa_freq = NSEC_PER_SEC / oa_period;
+
+                if (oa_freq <= freq)
+                        return max(0, i - 1);
+        }
+
+        igt_assert(!"reached");
+        return -1;
+}
+
+/* The lowest periodic sampling exponent equates to a period of 160 nanoseconds
+ * or a frequency of 6.25MHz which is only possible to request as root by
+ * default. By default the maximum OA sampling rate is 100KHz.
+ */
+static void
+test_low_oa_exponent_permissions(void)
+{
+        int max_freq = read_u64_file("/proc/sys/dev/i915/oa_max_sample_rate");
+        int bad_exponent = max_oa_exponent_for_higher_freq(max_freq);
+        int ok_exponent = bad_exponent + 1;
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, bad_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        uint64_t oa_period, oa_freq;
+
+        igt_assert_eq(max_freq, 100000);
+
+        /* Avoid EACCES errors opening a stream without CAP_SYS_ADMIN */
+        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 0);
+
+        igt_fork(child, 1) {
+                igt_drop_root();
+
+                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
+        }
+
+        igt_waitchildren();
+
+        properties[7] = ok_exponent;
+
+        igt_fork(child, 1) {
+                int stream_fd;
+
+                igt_drop_root();
+
+                stream_fd = __perf_open(drm_fd, &param);
+                close(stream_fd);
+        }
+
+        igt_waitchildren();
+
+        oa_period = timebase_scale(2 << ok_exponent);
+        oa_freq = NSEC_PER_SEC / oa_period;
+        write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", oa_freq - 100);
+
+        igt_fork(child, 1) {
+                igt_drop_root();
+
+                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
+        }
+
+        igt_waitchildren();
+
+        /* restore the defaults */
+        write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
+        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+}
+
+static void
+test_per_context_mode_unprivileged(void)
+{
+        uint64_t properties[] = {
+                /* Single context sampling */
+                DRM_I915_PERF_PROP_CTX_HANDLE, UINT64_MAX, /* updated below */
+
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+
+        /* should be default, but just to be sure... */
+        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+        igt_fork(child, 1) {
+                drm_intel_context *context;
+                drm_intel_bufmgr *bufmgr;
+                int stream_fd;
+
+                igt_drop_root();
+
+                bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
+                context = drm_intel_gem_context_create(bufmgr);
+
+                igt_assert(context);
+
+                properties[1] = drm_intel_gem_context_get_context_id(context);
+
+                stream_fd = __perf_open(drm_fd, &param);
+                close(stream_fd);
+        }
+
+        igt_waitchildren();
+}
+
+static int64_t
+get_time(void)
+{
+        struct timespec ts;
+
+        clock_gettime(CLOCK_MONOTONIC, &ts);
+
+        return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
+}
+
+/* Note: The interface doesn't currently provide strict guarantees or control
+ * over the upper bound for how long it might take for a POLLIN event after
+ * some OA report is written by the OA unit.
+ *
+ * The plan is to add a property later that gives some control over the maximum
+ * latency, but for now we expect it is tuned for a fairly low latency
+ * suitable for applications wanting to provide live feedback for captured
+ * metrics.
+ *
+ * At the time of writing this test the driver was using a fixed 200Hz hrtimer
+ * regardless of the OA sampling exponent.
+ *
+ * There is no lower bound since a stream configured for periodic sampling may
+ * still contain other automatically triggered reports.
+ *
+ * What we try and check for here is that blocking reads don't return EAGAIN
+ * and that we aren't spending any significant time burning the cpu in
+ * kernelspace.
+ */
+static void
+test_blocking(void)
+{
+        /* 40 milliseconds
+         *
+         * Having a period somewhat > one tick (1s / sysconf(_SC_CLK_TCK))
+         * helps to stop scheduling (liable to kick in when we make blocking
+         * poll()s/reads) from interfering with the test.
+         */
+        int oa_exponent = 18;
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        uint8_t buf[1024 * 1024];
+        struct tms start_times;
+        struct tms end_times;
+        int64_t user_ns, kernel_ns;
+        int64_t tick_ns = 1000000000 / sysconf(_SC_CLK_TCK);
+        int64_t start;
+        int n = 0;
+
+        times(&start_times);
+
+        /* Loop for 600ms performing blocking reads while the HW is sampling at
+         * ~25Hz, with the expectation that we spend most of our time blocked
+         * in the kernel, and shouldn't be burning cpu cycles in the kernel in
+         * association with this process (verified by looking at stime before
+         * and after loop).
+         */
+        for (start = get_time(); (get_time() - start) < 600000000; /* nop */) {
+                int ret;
+
+                while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
+                       errno == EINTR)
+                        ;
+
+                igt_assert(ret > 0);
+
+                n++;
+        }
+
+        times(&end_times);
+
+        /* Using nanosecond units is fairly silly here, given the tick
+         * imprecision - ah well, it's consistent with the get_time() units.
+         */
+        user_ns = (end_times.tms_utime - start_times.tms_utime) * tick_ns;
+        kernel_ns = (end_times.tms_stime - start_times.tms_stime) * tick_ns;
+
+        igt_debug("%d blocking reads in 600 milliseconds, with 25Hz OA sampling\n", n);
+        igt_debug("time in userspace = %"PRId64"ns (start utime = %d, end = %d, ns per tick = %d)\n",
+                  user_ns, (int)start_times.tms_utime, (int)end_times.tms_utime, (int)tick_ns);
+        igt_debug("time in kernelspace = %"PRId64"ns (start stime = %d, end = %d, ns per tick = %d)\n",
+                  kernel_ns, (int)start_times.tms_stime, (int)end_times.tms_stime, (int)tick_ns);
+
+        /* With completely broken blocking (but also not returning an error) we
+         * could end up with an open loop, hopefully recognisable with > 15
+         * (600/40) iterations.
+         */
+        igt_assert(n <= 15);
+
+        /* It's a bit tricky to put a lower limit here, but we expect a
+         * relatively low latency for seeing reports, while we don't currently
+         * give any control over this in the api.
+         *
+         * Limited to a 5 millisecond latency and 45ms (worst case)
+         * per-iteration that could give 13.3 iterations. Rounding gives a tiny
+         * bit more latency slack (6ms)...
+         */
+        igt_assert(n > 13);
+
+        /* A bit tricky to put a number on this, but we don't expect the kernel
+         * to use any significant cpu while waiting and given the imprecision
+         * of stime (multiple of CLK_TCK) we expect this to round to zero.
+         */
+        igt_assert_eq(kernel_ns, 0);
+
+        close(stream_fd);
+}
+
+static void
+test_polling(void)
+{
+        /* 40 milliseconds
+         *
+         * Having a period somewhat > one tick (1s / sysconf(_SC_CLK_TCK))
+         * helps to stop scheduling (liable to kick in when we make blocking
+         * poll()s/reads) from interfering with the test.
+         */
+        int oa_exponent = 18;
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                        I915_PERF_FLAG_FD_NONBLOCK,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        uint8_t buf[1024 * 1024];
+        struct tms start_times;
+        struct tms end_times;
+        int64_t user_ns, kernel_ns;
+        int64_t tick_ns = 1000000000 / sysconf(_SC_CLK_TCK);
+        int64_t start;
+        int n = 0;
+
+        times(&start_times);
+
+        /* Loop for 600ms performing blocking polls while the HW is sampling at
+         * ~25Hz, with the expectation that we spend most of our time blocked
+         * in the kernel, and shouldn't be burning cpu cycles in the kernel in
+         * association with this process (verified by looking at stime before
+         * and after loop).
+         */
+        for (start = get_time(); (get_time() - start) < 600000000; /* nop */) {
+                struct pollfd pollfd = { .fd = stream_fd, .events = POLLIN };
+                int ret;
+
+                while ((ret = poll(&pollfd, 1, -1)) < 0 &&
+                       errno == EINTR)
+                        ;
+                igt_assert_eq(ret, 1);
+                igt_assert(pollfd.revents & POLLIN);
+
+                while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
+                       errno == EINTR)
+                        ;
+
+                /* Don't expect to see EAGAIN if we've had a POLLIN event
+                 *
+                 * XXX: actually this is technically overly strict since we do
+                 * knowingly allow false positive POLLIN events. At least in
+                 * the future when supporting context filtering of metrics for
+                 * Gen8+ handled in the kernel then POLLIN events may be
+                 * delivered when we know there are pending reports to process
+                 * but before we've done any filtering to know for certain that
+                 * any reports are destined to be copied to userspace.
+                 *
+                 * Still, for now it's a reasonable sanity check.
+                 */
+                if (ret < 0)
+                        igt_debug("Unexpected error when reading after poll = %d\n", errno);
+                igt_assert_neq(ret, -1);
+
+                /* At this point, after consuming pending reports (and hoping
+                 * the scheduler hasn't stopped us for too long), we now
+                 * expect EAGAIN on read.
+                 */
+                while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
+                       errno == EINTR)
+                        ;
+                igt_assert_eq(ret, -1);
+                igt_assert_eq(errno, EAGAIN);
+
+                n++;
+        }
+
+        times(&end_times);
+
+        /* Using nanosecond units is fairly silly here, given the tick
+         * imprecision - ah well, it's consistent with the get_time() units.
+         */
+        user_ns = (end_times.tms_utime - start_times.tms_utime) * tick_ns;
+        kernel_ns = (end_times.tms_stime - start_times.tms_stime) * tick_ns;
+
+        igt_debug("%d blocking poll()s in 600 milliseconds, with 25Hz OA sampling\n", n);
+        igt_debug("time in userspace = %"PRId64"ns (start utime = %d, end = %d, ns per tick = %d)\n",
+                  user_ns, (int)start_times.tms_utime, (int)end_times.tms_utime, (int)tick_ns);
+        igt_debug("time in kernelspace = %"PRId64"ns (start stime = %d, end = %d, ns per tick = %d)\n",
+                  kernel_ns, (int)start_times.tms_stime, (int)end_times.tms_stime, (int)tick_ns);
+
+        /* With completely broken blocking while polling (but still somehow
+         * reporting a POLLIN event) we could end up with an open loop,
+         * hopefully recognisable with > 15 (600/40) iterations.
+         */
+        igt_assert(n <= 15);
+
+        /* It's a bit tricky to put a lower limit here, but we expect a
+         * relatively low latency for seeing reports, while we don't currently
+         * give any control over this in the api.
+         *
+         * Limited to a 5 millisecond latency and 45ms (worst case)
+         * per-iteration that could give 13.3 iterations. Rounding gives a tiny
+         * bit more latency slack (6ms)...
+         */
+        igt_assert(n > 13);
+
+        /* A bit tricky to put a number on this, but we don't expect the kernel
+         * to use any significant cpu while waiting and given the imprecision
+         * of stime (multiple of CLK_TCK) we expect this to round to zero.
+         */
+        igt_assert_eq(kernel_ns, 0);
+
+        close(stream_fd);
+}
+
+static void
+test_buffer_fill(void)
+{
+        int oa_exponent = 5; /* 5 microseconds */
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        int buf_size = 65536 * (256 + sizeof(struct drm_i915_perf_record_header));
+        uint8_t *buf = malloc(buf_size);
+
+
+        for (int i = 0; i < 5; i++) {
+                struct drm_i915_perf_record_header *header;
+                bool overflow_seen;
+                int offset = 0;
+                int len;
+
+                /* It should take ~330 milliseconds to fill a 16MB OA buffer with a
+                 * 5 microsecond sampling period and 256 byte reports. */
+                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+                        ;
+
+                igt_assert_neq(len, -1);
+
+                overflow_seen = false;
+                for (offset = 0; offset < len; offset += header->size) {
+                        header = (void *)(buf + offset);
+
+                        if (header->type == DRM_I915_PERF_RECORD_OA_BUFFER_LOST)
+                                overflow_seen = true;
+                }
+
+                igt_assert_eq(overflow_seen, true);
+
+                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
+
+                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+                        ;
+
+                igt_assert_neq(len, -1);
+
+                /* expect ~ 200 records in 1 millisecond */
+                igt_assert(len > 256 * 150);
+
+                overflow_seen = false;
+                for (offset = 0; offset < len; offset += header->size) {
+                        header = (void *)(buf + offset);
+
+                        if (header->type == DRM_I915_PERF_RECORD_OA_BUFFER_LOST)
+                                overflow_seen = true;
+                }
+
+                igt_assert_eq(overflow_seen, false);
+        }
+
+        free(buf);
+
+        close(stream_fd);
+}
+
+static void
+test_enable_disable(void)
+{
+        int oa_exponent = 5; /* 5 microseconds */
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                         I915_PERF_FLAG_DISABLED, /* Verify we start disabled */
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        int buf_size = 65536 * (256 + sizeof(struct drm_i915_perf_record_header));
+        uint8_t *buf = malloc(buf_size);
+
+
+        for (int i = 0; i < 5; i++) {
+                int len;
+
+                /* If the stream were enabled then it would take ~330
+                 * milliseconds to fill a 16MB OA buffer with a 5 microsecond
+                 * sampling period and 256 byte reports.
+                 *
+                 * Giving enough time for an overflow might help catch whether
+                 * the OA unit has been enabled even if the driver might at
+                 * least avoid copying reports while disabled.
+                 */
+                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+                        ;
+
+                igt_assert_eq(len, -1);
+                igt_assert_eq(errno, EIO);
+
+                do_ioctl(stream_fd, I915_PERF_IOCTL_ENABLE, 0);
+
+                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
+
+                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
+                        ;
+
+                igt_assert_neq(len, -1);
+
+                /* expect ~ 200 records in 1 millisecond */
+                igt_assert(len > 256 * 150 && len < 256 * 2000);
+
+                do_ioctl(stream_fd, I915_PERF_IOCTL_DISABLE, 0);
+
+                /* It's considered an error to read a stream while it's disabled
+                 * since it would block indefinitely...
+                 */
+                len = read(stream_fd, buf, buf_size);
+
+                igt_assert_eq(len, -1);
+                igt_assert_eq(errno, EIO);
+        }
+
+        free(buf);
+
+        close(stream_fd);
+}
+
+static void
+test_short_reads(void)
+{
+        int oa_exponent = 5; /* 5 microseconds */
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        size_t record_size = 256 + sizeof(struct drm_i915_perf_record_header);
+        size_t page_size = sysconf(_SC_PAGE_SIZE);
+        int zero_fd = open("/dev/zero", O_RDWR|O_CLOEXEC);
+        uint8_t *pages = mmap(NULL, page_size * 2,
+                              PROT_READ|PROT_WRITE, MAP_PRIVATE, zero_fd, 0);
+        struct drm_i915_perf_record_header *header;
+        int stream_fd;
+        int ret;
+
+        igt_assert(pages != MAP_FAILED);
+
+        ret = mprotect(pages + page_size, page_size, PROT_NONE);
+        igt_assert_eq(ret, 0);
+
+        stream_fd = __perf_open(drm_fd, &param);
+
+        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 5000000 }, NULL);
+
+        /* At this point there should be lots of pending reports to read */
+
+        /* A read that can return at least one record should result in a short
+         * read not an EFAULT if the buffer is smaller than the requested read
+         * size...
+         *
+         * Expect to see a sample record here, but at least skip over any
+         * _REPORT_LOST notifications.
+         */
+        do {
+                header = (void *)(pages + page_size - record_size);
+                ret = read(stream_fd,
+                           header,
+                           page_size);
+                igt_assert(ret > 0);
+        } while (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+
+        igt_assert_eq(ret, record_size);
+
+        /* A read that can't return a single record because it would result
+         * in a fault on buffer overrun should result in an EFAULT error...
+         */
+        ret = read(stream_fd, pages + page_size - 16, page_size);
+        igt_assert_eq(ret, -1);
+        igt_assert_eq(errno, EFAULT);
+
+        /* A read that can't return a single record because the buffer is too
+         * small should result in an ENOSPC error...
+         *
+         * Again, skip over _REPORT_LOST records (smaller than record_size/2)
+         */
+        do {
+                header = (void *)(pages + page_size - record_size / 2);
+                ret = read(stream_fd,
+                           header,
+                           record_size / 2);
+        } while (ret > 0 && header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+
+        igt_assert_eq(ret, -1);
+        igt_assert_eq(errno, ENOSPC);
+
+        close(stream_fd);
+}
+
+static void
+test_non_sampling_read_error(void)
+{
+        uint64_t properties[] = {
+                /* XXX: even without periodic sampling we have to
+                 * specify at least one sample layout property...
+                 */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+
+                /* XXX: no sampling exponent */
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        uint8_t buf[1024];
+
+        int ret = read(stream_fd, buf, sizeof(buf));
+        igt_assert_eq(ret, -1);
+        igt_assert_eq(errno, EIO);
+
+        close(stream_fd);
+}
+
+/* Check that attempts to read from a stream while it is disabled will return
+ * EIO instead of blocking indefinitely.
+ */
+static void
+test_disabled_read_error(void)
+{
+        int oa_exponent = 5; /* 5 microseconds */
+        uint64_t properties[] = {
+                /* XXX: even without periodic sampling we have to
+                 * specify at least one sample layout property...
+                 */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC |
+                         I915_PERF_FLAG_DISABLED, /* XXX: open disabled */
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        uint32_t oa_report0[64];
+        uint32_t oa_report1[64];
+        uint32_t buf[128] = { 0 };
+        int ret;
+
+
+        ret = read(stream_fd, buf, sizeof(buf));
+        igt_assert_eq(ret, -1);
+        igt_assert_eq(errno, EIO);
+
+        close(stream_fd);
+
+
+        param.flags &= ~I915_PERF_FLAG_DISABLED;
+        stream_fd = __perf_open(drm_fd, &param);
+
+        read_2_oa_reports(stream_fd,
+                          I915_OA_FORMAT_A45_B8_C8, 256,
+                          oa_exponent,
+                          oa_report0,
+                          oa_report1,
+                          false); /* not just timer reports */
+
+        do_ioctl(stream_fd, I915_PERF_IOCTL_DISABLE, 0);
+
+        ret = read(stream_fd, buf, sizeof(buf));
+        igt_assert_eq(ret, -1);
+        igt_assert_eq(errno, EIO);
+
+        do_ioctl(stream_fd, I915_PERF_IOCTL_ENABLE, 0);
+
+        read_2_oa_reports(stream_fd,
+                          I915_OA_FORMAT_A45_B8_C8, 256,
+                          oa_exponent,
+                          oa_report0,
+                          oa_report1,
+                          false); /* not just timer reports */
+
+        close(stream_fd);
+}
+
+static void
+test_mi_rpc(void)
+{
+        uint64_t properties[] = {
+                /* Note: we have to specify at least one sample property even
+                 * though we aren't interested in samples in this case.
+                 */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+
+                /* Note: no OA exponent specified in this case */
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        drm_intel_bufmgr *bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
+        drm_intel_context *context;
+        struct intel_batchbuffer *batch;
+        drm_intel_bo *bo;
+        uint32_t *report32;
+        int ret;
+
+        drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+
+        context = drm_intel_gem_context_create(bufmgr);
+        igt_assert(context);
+
+        batch = intel_batchbuffer_alloc(bufmgr, devid);
+
+        bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+
+        ret = drm_intel_bo_map(bo, true);
+        igt_assert_eq(ret, 0);
+
+        memset(bo->virtual, 0x80, 4096);
+        drm_intel_bo_unmap(bo);
+
+        BEGIN_BATCH(3, 1);
+        OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
+        OUT_RELOC(bo, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+                  0); /* offset in bytes */
+        OUT_BATCH(0xdeadbeef); /* report ID */
+        ADVANCE_BATCH();
+
+        intel_batchbuffer_flush_with_context(batch, context);
+
+        ret = drm_intel_bo_map(bo, false /* write enable */);
+        igt_assert_eq(ret, 0);
+
+        report32 = bo->virtual;
+        igt_assert_eq(report32[0], 0xdeadbeef); /* report ID */
+        igt_assert_neq(report32[1], 0); /* timestamp */
+
+        igt_assert_neq(report32[63], 0x80808080); /* end of report */
+        igt_assert_eq(report32[64], 0x80808080); /* after 256 byte report */
+
+        intel_batchbuffer_free(batch);
+        drm_intel_gem_context_destroy(context);
+        drm_intel_bufmgr_destroy(bufmgr);
+        close(stream_fd);
+}
+
+static void
+scratch_buf_init(drm_intel_bufmgr *bufmgr,
+                 struct igt_buf *buf,
+                 int width, int height,
+                 uint32_t color)
+{
+        size_t stride = width * 4;
+        size_t size = stride * height;
+        drm_intel_bo *bo = drm_intel_bo_alloc(bufmgr, "", size, 4096);
+        int ret;
+
+        ret = drm_intel_bo_map(bo, true /* writable */);
+        igt_assert_eq(ret, 0);
+
+        for (int i = 0; i < width * height; i++)
+                ((uint32_t *)bo->virtual)[i] = color;
+
+        drm_intel_bo_unmap(bo);
+
+        buf->bo = bo;
+        buf->stride = stride;
+        buf->tiling = I915_TILING_NONE;
+        buf->size = size;
+}
+
+static void
+emit_stall_timestamp_and_rpc(struct intel_batchbuffer *batch,
+                             drm_intel_bo *dst,
+                             int timestamp_offset,
+                             int report_dst_offset,
+                             uint32_t report_id)
+{
+        uint32_t pipe_ctl_flags = (PIPE_CONTROL_CS_STALL |
+                                   PIPE_CONTROL_RENDER_TARGET_FLUSH |
+                                   PIPE_CONTROL_WRITE_TIMESTAMP);
+
+        BEGIN_BATCH(5, 1);
+        OUT_BATCH(GFX_OP_PIPE_CONTROL | (5 - 2));
+        OUT_BATCH(pipe_ctl_flags);
+        OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+                  timestamp_offset);
+        OUT_BATCH(0); /* imm lower */
+        OUT_BATCH(0); /* imm upper */
+        ADVANCE_BATCH();
+
+        BEGIN_BATCH(3, 1);
+        OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
+        OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+                  report_dst_offset);
+        OUT_BATCH(report_id);
+        ADVANCE_BATCH();
+}
+
+/* Tests the INTEL_performance_query use case where an unprivileged process
+ * should be able to configure the OA unit for per-context metrics (for a
+ * context associated with that process' drm file descriptor) and the counters
+ * should only relate to that specific context.
+ */
+static void
+test_per_ctx_mi_rpc(void)
+{
+        uint64_t properties[] = {
+                DRM_I915_PERF_PROP_CTX_HANDLE, UINT64_MAX, /* updated below */
+
+                /* Note: we have to specify at least one sample property even
+                 * though we aren't interested in samples in this case
+                 */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+
+                /* Note: no OA exponent specified in this case */
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+
+        /* should be default, but just to be sure... */
+        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+        igt_fork(child, 1) {
+                drm_intel_bufmgr *bufmgr;
+                drm_intel_context *context0, *context1;
+                int stream_fd;
+                struct intel_batchbuffer *batch;
+                struct igt_buf src, dst;
+                drm_intel_bo *bo;
+                uint32_t *report0_32, *report1_32;
+                uint64_t timestamp0_64, timestamp1_64;
+                uint32_t delta_ts64, delta_oa32;
+                uint64_t delta_ts64_ns, delta_oa32_ns;
+                uint32_t delta_delta;
+                int n_samples_written;
+                int width = 800;
+                int height = 600;
+                int ret;
+
+                igt_drop_root();
+
+                bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
+                drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+
+                scratch_buf_init(bufmgr, &src, width, height, 0xff0000ff);
+                scratch_buf_init(bufmgr, &dst, width, height, 0x00ff00ff);
+
+                batch = intel_batchbuffer_alloc(bufmgr, devid);
+
+                context0 = drm_intel_gem_context_create(bufmgr);
+                igt_assert(context0);
+
+                context1 = drm_intel_gem_context_create(bufmgr);
+                igt_assert(context1);
+
+                igt_debug("submitting warm up render_copy\n");
+
+                /* Submit some early, unmeasured, work to the context we want
+                 * to measure to try and catch issues with i915-perf
+                 * initializing the HW context ID for filtering.
+                 *
+                 * We do this because i915-perf single context filtering had
+                 * previously only relied on a hook into context pinning to
+                 * initialize the HW context ID, instead of also trying to
+                 * determine the HW ID while opening the stream, in case it
+                 * has already been pinned.
+                 *
+                 * This wasn't noticed by the previous unit test because we
+                 * were opening the stream while the context hadn't been
+                 * touched or pinned yet and so it worked out correctly to wait
+                 * for the pinning hook.
+                 *
+                 * Now a buggy version of i915-perf will fail to measure
+                 * anything for context0 once this initial render_copy() ends
+                 * up pinning the context since there won't ever be a pinning
+                 * hook callback.
+                 */
+                render_copy(batch,
+                            context0,
+                            &src, 0, 0, width, height,
+                            &dst, 0, 0);
+
+                properties[1] = drm_intel_gem_context_get_context_id(context0);
+
+                igt_debug("opening i915-perf stream\n");
+                stream_fd = __perf_open(drm_fd, &param);
+
+                bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+
+                ret = drm_intel_bo_map(bo, true /* write enable */);
+                igt_assert_eq(ret, 0);
+
+                memset(bo->virtual, 0x80, 4096);
+                drm_intel_bo_unmap(bo);
+
+                emit_stall_timestamp_and_rpc(batch,
+                                             bo,
+                                             512 /* timestamp offset */,
+                                             0, /* report dst offset */
+                                             0xdeadbeef); /* report id */
+
+                /* Explicitly flush here (even though the render_copy() call
+                 * will itself flush before/after the copy) to clarify that
+                 * the PIPE_CONTROL + MI_RPC commands will be in a
+                 * separate batch from the copy.
+                 */
+                intel_batchbuffer_flush_with_context(batch, context0);
+
+                render_copy(batch,
+                            context0,
+                            &src, 0, 0, width, height,
+                            &dst, 0, 0);
+
+                /* Another redundant flush to clarify batch bo is free to reuse */
+                intel_batchbuffer_flush_with_context(batch, context0);
+
+                /* submit two copies on the other context to avoid a false
+                 * positive in case the driver somehow ended up filtering for
+                 * context1
+                 */
+                render_copy(batch,
+                            context1,
+                            &src, 0, 0, width, height,
+                            &dst, 0, 0);
+
+                render_copy(batch,
+                            context1,
+                            &src, 0, 0, width, height,
+                            &dst, 0, 0);
+
+                /* And another */
+                intel_batchbuffer_flush_with_context(batch, context1);
+
+                emit_stall_timestamp_and_rpc(batch,
+                                             bo,
+                                             520 /* timestamp offset */,
+                                             256, /* report dst offset */
+                                             0xbeefbeef); /* report id */
+
+                intel_batchbuffer_flush_with_context(batch, context0);
+
+                ret = drm_intel_bo_map(bo, false /* write enable */);
+                igt_assert_eq(ret, 0);
+
+                report0_32 = bo->virtual;
+                igt_assert_eq(report0_32[0], 0xdeadbeef); /* report ID */
+                igt_assert_neq(report0_32[1], 0); /* timestamp */
+
+                report1_32 = report0_32 + 64;
+                igt_assert_eq(report1_32[0], 0xbeefbeef); /* report ID */
+                igt_assert_neq(report1_32[1], 0); /* timestamp */
+
+                print_reports(report0_32, report1_32,
+                              lookup_format(I915_OA_FORMAT_A45_B8_C8));
+
+                /* A40 == N samples written to all render targets */
+                n_samples_written = report1_32[43] - report0_32[43];
+                igt_debug("n samples written = %d\n", n_samples_written);
+                igt_assert_eq(n_samples_written, width * height);
+
+                igt_debug("timestamp32 0 = %u\n", report0_32[1]);
+                igt_debug("timestamp32 1 = %u\n", report1_32[1]);
+
+                timestamp0_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 512);
+                timestamp1_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 520);
+
+                igt_debug("timestamp64 0 = %"PRIu64"\n", timestamp0_64);
+                igt_debug("timestamp64 1 = %"PRIu64"\n", timestamp1_64);
+
+                delta_ts64 = timestamp1_64 - timestamp0_64;
+                delta_oa32 = report1_32[1] - report0_32[1];
+
+                /* sanity check that we can pass the delta to timebase_scale */
+                igt_assert(delta_ts64 < UINT32_MAX);
+                delta_oa32_ns = timebase_scale(delta_oa32);
+                delta_ts64_ns = timebase_scale(delta_ts64);
+
+                igt_debug("ts32 delta = %u, = %uns\n",
+                          delta_oa32, (unsigned)delta_oa32_ns);
+                igt_debug("ts64 delta = %u, = %uns\n",
+                          delta_ts64, (unsigned)delta_ts64_ns);
+
+                /* The delta as calculated via the PIPE_CONTROL timestamp or
+                 * the OA report timestamps should be almost identical but
+                 * allow a 320 nanoseconds margin.
+                 */
+                delta_delta = delta_ts64_ns > delta_oa32_ns ?
+                        (delta_ts64_ns - delta_oa32_ns) :
+                        (delta_oa32_ns - delta_ts64_ns);
+                igt_assert(delta_delta <= 320);
+
+                intel_batchbuffer_free(batch);
+                drm_intel_gem_context_destroy(context0);
+                drm_intel_gem_context_destroy(context1);
+                drm_intel_bufmgr_destroy(bufmgr);
+                close(stream_fd);
+        }
+
+        igt_waitchildren();
+}
+
+static void
+test_rc6_disable(void)
+{
+        int oa_exponent = 13; /* 1 millisecond */
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        int stream_fd = __perf_open(drm_fd, &param);
+        uint64_t n_events_start = read_debugfs_u64_record("i915_drpc_info",
+                                                          "RC6 residency since boot");
+        uint64_t n_events_end;
+
+        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+        n_events_end = read_debugfs_u64_record("i915_drpc_info",
+                                               "RC6 residency since boot");
+
+        igt_assert_eq(n_events_end - n_events_start, 0);
+
+        close(stream_fd);
+
+        n_events_start = read_debugfs_u64_record("i915_drpc_info",
+                                                 "RC6 residency since boot");
+
+        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
+
+        n_events_end = read_debugfs_u64_record("i915_drpc_info",
+                                               "RC6 residency since boot");
+
+        igt_assert_neq(n_events_end - n_events_start, 0);
+}
+
+static unsigned
+read_i915_module_ref(void)
+{
+        FILE *fp = fopen("/proc/modules", "r");
+        char *line = NULL;
+        size_t line_buf_size = 0;
+        int len = 0;
+        unsigned ref_count;
+
+        igt_assert(fp);
+
+        while ((len = getline(&line, &line_buf_size, fp)) > 0) {
+                if (strncmp(line, "i915 ", 5) == 0) {
+                        unsigned long mem;
+                        int ret = sscanf(line + 5, "%lu %u", &mem, &ref_count);
+                        igt_assert(ret == 2);
+                        goto done;
+                }
+        }
+
+        igt_assert(!"reached");
+
+done:
+        free(line);
+        fclose(fp);
+        return ref_count;
+}
+
+/* check that an open i915 perf stream holds a reference on the drm i915 module
+ * including in the corner case where the original drm fd has been closed.
+ */
+static void
+test_i915_ref_count(void)
+{
+        int oa_exponent = 13; /* 1 millisecond */
+        uint64_t properties[] = {
+                /* Include OA reports in samples */
+                DRM_I915_PERF_PROP_SAMPLE_OA, true,
+
+                /* OA unit configuration */
+                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
+                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
+                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
+        };
+        struct drm_i915_perf_open_param param = {
+                .flags = I915_PERF_FLAG_FD_CLOEXEC,
+                .num_properties = sizeof(properties) / 16,
+                .properties_ptr = (uint64_t)properties,
+        };
+        unsigned baseline, ref_count0, ref_count1;
+        int stream_fd;
+        uint32_t oa_report0[64];
+        uint32_t oa_report1[64];
+
+        close(drm_fd);
+        baseline = read_i915_module_ref();
+        igt_debug("baseline ref count (drm fd closed) = %u\n", baseline);
+
+        drm_fd = drm_open_driver_render(DRIVER_INTEL);
+        ref_count0 = read_i915_module_ref();
+        igt_debug("initial ref count with drm_fd open = %u\n", ref_count0);
+        igt_assert(ref_count0 > baseline);
+
+        stream_fd = __perf_open(drm_fd, &param);
+        ref_count1 = read_i915_module_ref();
+        igt_debug("ref count after opening i915 perf stream = %u\n", ref_count1);
+        igt_assert(ref_count1 > ref_count0);
+
+        close(drm_fd);
+        ref_count0 = read_i915_module_ref();
+        igt_debug("ref count after closing drm fd = %u\n", ref_count0);
+
+        igt_assert(ref_count0 > baseline);
+
+        read_2_oa_reports(stream_fd,
+                          I915_OA_FORMAT_A45_B8_C8, 256,
+                          oa_exponent,
+                          oa_report0,
+                          oa_report1,
+                          false); /* not just timer reports */
+
+        close(stream_fd);
+        ref_count0 = read_i915_module_ref();
+        igt_debug("ref count after closing i915 perf stream fd = %u\n", ref_count0);
+        igt_assert_eq(ref_count0, baseline);
+
+        drm_fd = drm_open_driver_render(DRIVER_INTEL);
+}
+
+igt_main
+{
+        igt_skip_on_simulation();
+
+        igt_fixture {
+                struct stat sb;
+                int ret;
+
+                drm_fd = drm_open_driver_render(DRIVER_INTEL);
+                devid = intel_get_drm_devid(drm_fd);
+                device = drm_get_card();
+
+                igt_require(IS_HASWELL(devid));
+                igt_require(lookup_hsw_render_basic_id());
+
+                ret = stat("/proc/sys/dev/i915/perf_stream_paranoid", &sb);
+                igt_require(ret == 0);
+                ret = stat("/proc/sys/dev/i915/oa_max_sample_rate", &sb);
+                igt_require(ret == 0);
+
+                gt_frequency_range_save();
+
+                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+                render_copy = igt_get_render_copyfunc(devid);
+                igt_require_f(render_copy, "no render-copy function\n");
+        }
+
+        igt_subtest("non-system-wide-paranoid")
+                test_system_wide_paranoid();
+
+        igt_subtest("invalid-open-flags")
+                test_invalid_open_flags();
+
+        igt_subtest("invalid-oa-metric-set-id")
+                test_invalid_oa_metric_set_id();
+
+        igt_subtest("invalid-oa-format-id")
+                test_invalid_oa_format_id();
+
+        igt_subtest("missing-sample-flags")
+                test_missing_sample_flags();
+
+        igt_subtest("oa-formats")
+                test_oa_formats();
+
+        igt_subtest("invalid-oa-exponent")
+                test_invalid_oa_exponent();
+        igt_subtest("low-oa-exponent-permissions")
+                test_low_oa_exponent_permissions();
+        igt_subtest("oa-exponents") {
+                test_oa_exponents(450);
+                test_oa_exponents(550);
+        }
+
+        igt_subtest("per-context-mode-unprivileged")
+                test_per_context_mode_unprivileged();
+
+        igt_subtest("buffer-fill")
+                test_buffer_fill();
+
+        igt_subtest("disabled-read-error")
+                test_disabled_read_error();
+        igt_subtest("non-sampling-read-error")
+                test_non_sampling_read_error();
+
+        igt_subtest("enable-disable")
+                test_enable_disable();
+
+        igt_subtest("blocking")
+                test_blocking();
+
+        igt_subtest("polling")
+                test_polling();
+
+        igt_subtest("short-reads")
+                test_short_reads();
+
+        igt_subtest("mi-rpc")
+                test_mi_rpc();
+
+        igt_subtest("mi-rpc-per-ctx")
+                test_per_ctx_mi_rpc();
+
+        igt_subtest("i915-ref-count")
+                test_i915_ref_count();
+
+        igt_subtest("rc6-disable")
+                test_rc6_disable();
+
+        igt_fixture {
+                /* leave sysctl options in their default state... */
+                write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
+                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
+
+                gt_frequency_range_restore();
+
+                close(drm_fd);
+        }
+}
-- 
2.10.1
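
For reference, read_i915_module_ref() in the patch above relies on each
/proc/modules line starting with "<name> <size> <refcount>". The sketch
below shows the same parsing on a single line of text so that it can be
tried without a loaded i915 module. parse_module_ref() is a hypothetical
helper written for illustration and is not part of i-g-t.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse "<name> <size> <refcount> ..." from one
 * /proc/modules style line. Returns 1 and fills *ref_count if the line
 * describes the module called name, otherwise returns 0.
 */
static int parse_module_ref(const char *line, const char *name,
			    unsigned *ref_count)
{
	size_t name_len;
	unsigned long mem;

	assert(line && name && ref_count);

	name_len = strlen(name);

	/* Require a space after the name so "i915" won't match "i915_foo". */
	if (strncmp(line, name, name_len) != 0 || line[name_len] != ' ')
		return 0;

	/* The size is parsed only to step over it; the ref count follows. */
	return sscanf(line + name_len + 1, "%lu %u", &mem, ref_count) == 2;
}
```

Feeding each line returned by getline() to such a helper would give the
same result as the open-coded strncmp() + sscanf() in the test.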

* [PATCH igt v3 02/11] igt/gem_exec_parse: some minor cleanups
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
  2016-11-09 16:15 ` [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-11 21:49   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 03/11] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

This normalizes the execbuf utilities in this file so that they all use
memset to clear the obj, reloc and execbuf structures and set them up in
the same order. While debugging some unpredictable test failures I became
unsure whether all of these structures were being fully initialized.
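
As a minimal sketch of that pattern (not code from this patch): struct
fake_exec_object and setup_objs() below are hypothetical stand-ins, not
the real struct drm_i915_gem_exec_object2 or any helper in this file. The
point is only that clearing with memset first zeroes every field,
including reserved ones, so only the non-zero fields need an explicit
assignment afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a uapi struct with reserved fields. */
struct fake_exec_object {
	uint32_t handle;
	uint32_t relocation_count;
	uint64_t relocs_ptr;
	uint64_t alignment;
	uint64_t offset;
	uint64_t flags;
	uint64_t rsvd1;
	uint64_t rsvd2;
};

/* Clear the whole array first, then set only the fields that matter. */
static void setup_objs(struct fake_exec_object *obj, size_t count,
		       uint32_t first_handle)
{
	assert(obj != NULL);

	memset(obj, 0, count * sizeof(*obj));
	for (size_t i = 0; i < count; i++)
		obj[i].handle = first_handle + (uint32_t)i;
}
```

Compared with assigning every field by hand, this cannot leave a field
with stale stack contents when a field is forgotten or later added.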

The same I915_GEM_DOMAIN_COMMAND domain is now used with all relocs.

The register/command defines have been moved to the top of the file to
be available to all tests/utilities.
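
For readers unfamiliar with these defines, here is a small sketch of how
they are built. The three macros are copied from this patch; mi_opcode()
and hsw_cs_gpr_udw() are hypothetical helpers added only for
illustration. It assumes the usual MI command layout with the command
type in bits 31:29 (0 for MI) and the opcode in bits 28:23, and it
assumes the 8 byte GPR stride means each GPR is a pair of dwords.

```c
#include <assert.h>
#include <stdint.h>

/* Defines copied from the patch. */
#define HSW_CS_GPR(n) (0x2600 + 8*(n))
#define MI_LOAD_REGISTER_REG (0x2a << 23)
#define MI_STORE_REGISTER_MEM (0x24 << 23)

/* Hypothetical helper: extract opcode bits 28:23 of an MI command header. */
static uint32_t mi_opcode(uint32_t header)
{
	/* MI commands have a command type of 0 in bits 31:29. */
	assert((header >> 29) == 0);
	return (header >> 23) & 0x3f;
}

/* Hypothetical helper: offset of the upper dword of GPR n, assuming
 * each GPR occupies two consecutive dwords.
 */
static uint32_t hsw_cs_gpr_udw(unsigned int n)
{
	return HSW_CS_GPR(n) + 4;
}
```

Low bits OR'ed into a header, such as the length field, do not disturb
the opcode, so mi_opcode(MI_STORE_REGISTER_MEM | 1) still yields 0x24.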

The handle + fd variables are now static.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 196 +++++++++++++++++--------------------------------
 1 file changed, 66 insertions(+), 130 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index a39db3e..72d7c7b 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -37,6 +37,20 @@
 #define OACONTROL 0x2360
 #define DERRMR 0x44050
 
+#define HSW_CS_GPR(n) (0x2600 + 8*(n))
+#define HSW_CS_GPR0 HSW_CS_GPR(0)
+#define HSW_CS_GPR1 HSW_CS_GPR(1)
+
+#define MI_LOAD_REGISTER_REG (0x2a << 23)
+#define MI_STORE_REGISTER_MEM (0x24 << 23)
+#define MI_ARB_ON_OFF (0x8 << 23)
+#define MI_DISPLAY_FLIP ((0x14 << 23) | 1)
+
+#define GFX_OP_PIPE_CONTROL	((0x3<<29)|(0x3<<27)|(0x2<<24)|2)
+#define   PIPE_CONTROL_QW_WRITE	(1<<14)
+#define   PIPE_CONTROL_LRI_POST_OP (1<<23)
+
+
 static int command_parser_version(int fd)
 {
 	int version = -1;
@@ -51,12 +65,6 @@ static int command_parser_version(int fd)
 	return -1;
 }
 
-#define HSW_CS_GPR(n) (0x2600 + 8*(n))
-#define HSW_CS_GPR0 HSW_CS_GPR(0)
-#define HSW_CS_GPR1 HSW_CS_GPR(1)
-
-#define MI_LOAD_REGISTER_REG (0x2a << 23)
-#define MI_STORE_REGISTER_MEM (0x24 << 23)
 static void hsw_load_register_reg(void)
 {
 	uint32_t buf[16] = {
@@ -100,13 +108,13 @@ static void hsw_load_register_reg(void)
 	reloc[0].offset = 7*sizeof(uint32_t);
 	reloc[0].target_handle = obj[0].handle;
 	reloc[0].delta = 0;
-	reloc[0].read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-	reloc[0].write_domain = I915_GEM_DOMAIN_INSTRUCTION;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
 	reloc[1].offset = 13*sizeof(uint32_t);
 	reloc[1].target_handle = obj[0].handle;
 	reloc[1].delta = sizeof(uint32_t);
-	reloc[1].read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-	reloc[1].write_domain = I915_GEM_DOMAIN_INSTRUCTION;
+	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
 	obj[1].relocs_ptr = (uintptr_t)&reloc;
 	obj[1].relocation_count = 2;
 
@@ -152,7 +160,7 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			       int size, int patch_offset, uint64_t expected_value)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[2];
+	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_relocation_entry reloc[1];
 
 	uint32_t target_bo = gem_create(fd, 4096);
@@ -160,42 +168,24 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 
 	gem_write(fd, cmd_bo, 0, cmds, size);
 
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = target_bo;
+	obj[1].handle = cmd_bo;
+
+	memset(reloc, 0, sizeof(reloc));
 	reloc[0].offset = patch_offset;
+	reloc[0].target_handle = obj[0].handle;
 	reloc[0].delta = 0;
-	reloc[0].target_handle = target_bo;
-	reloc[0].read_domains = I915_GEM_DOMAIN_RENDER;
-	reloc[0].write_domain = I915_GEM_DOMAIN_RENDER;
-	reloc[0].presumed_offset = 0;
-
-	objs[0].handle = target_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
-
-	objs[1].handle = cmd_bo;
-	objs[1].relocation_count = 1;
-	objs[1].relocs_ptr = (uintptr_t)reloc;
-	objs[1].alignment = 0;
-	objs[1].offset = 0;
-	objs[1].flags = 0;
-	objs[1].rsvd1 = 0;
-	objs[1].rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
+	obj[1].relocs_ptr = (uintptr_t)reloc;
+	obj[1].relocation_count = 1;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 2;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = size;
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = I915_EXEC_RENDER;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	gem_execbuf(fd, &execbuf);
 	gem_sync(fd, cmd_bo);
@@ -210,30 +200,18 @@ static int __exec_batch(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			int size, int ring)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[1];
+	struct drm_i915_gem_exec_object2 obj[1];
 
 	gem_write(fd, cmd_bo, 0, cmds, size);
 
-	objs[0].handle = cmd_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = cmd_bo;
 
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = size;
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = ring;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	return __gem_execbuf(fd, &execbuf);
 }
@@ -244,7 +222,7 @@ static void exec_split_batch(int fd, uint32_t *cmds,
 			     int size, int ring, int expected_ret)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[1];
+	struct drm_i915_gem_exec_object2 obj[1];
 	uint32_t cmd_bo;
 	uint32_t noop[1024] = { 0 };
 	const int alloc_size = 4096 * 2;
@@ -261,16 +239,11 @@ static void exec_split_batch(int fd, uint32_t *cmds,
 	 */
 	gem_write(fd, cmd_bo, actual_start_offset, cmds, size);
 
-	objs[0].handle = cmd_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = cmd_bo;
 
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 1;
 	/* NB: We want batch_start_offset and batch_len to point to the block
 	 * of the actual commands (i.e. at the last dword of the first page),
@@ -281,13 +254,7 @@ static void exec_split_batch(int fd, uint32_t *cmds,
 	execbuf.batch_len =
 		ALIGN(size + actual_start_offset - execbuf.batch_start_offset,
 		      0x8);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = ring;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	igt_assert_eq(__gem_execbuf(fd, &execbuf), expected_ret);
 
@@ -300,8 +267,8 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			       uint64_t expected_value)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 objs[3];
-	struct drm_i915_gem_relocation_entry reloc;
+	struct drm_i915_gem_exec_object2 obj[3];
+	struct drm_i915_gem_relocation_entry reloc[1];
 	struct drm_i915_gem_relocation_entry first_level_reloc;
 
 	uint32_t target_bo = gem_create(fd, 4096);
@@ -322,58 +289,34 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
 		  first_level_cmds, sizeof(first_level_cmds));
 	gem_write(fd, cmd_bo, 0, cmds, size);
 
-	reloc.offset = patch_offset;
-	reloc.delta = 0;
-	reloc.target_handle = target_bo;
-	reloc.read_domains = I915_GEM_DOMAIN_RENDER;
-	reloc.write_domain = I915_GEM_DOMAIN_RENDER;
-	reloc.presumed_offset = 0;
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = target_bo;
+	obj[1].handle = cmd_bo;
+	obj[2].handle = first_level_bo;
+
+	memset(reloc, 0, sizeof(reloc));
+	reloc[0].offset = patch_offset;
+	reloc[0].delta = 0;
+	reloc[0].target_handle = target_bo;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
+	obj[1].relocation_count = 1;
+	obj[1].relocs_ptr = (uintptr_t)&reloc;
 
+	memset(&first_level_reloc, 0, sizeof(first_level_reloc));
 	first_level_reloc.offset = 4;
 	first_level_reloc.delta = 0;
 	first_level_reloc.target_handle = cmd_bo;
-	first_level_reloc.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
+	first_level_reloc.read_domains = I915_GEM_DOMAIN_COMMAND;
 	first_level_reloc.write_domain = 0;
-	first_level_reloc.presumed_offset = 0;
-
-	objs[0].handle = target_bo;
-	objs[0].relocation_count = 0;
-	objs[0].relocs_ptr = 0;
-	objs[0].alignment = 0;
-	objs[0].offset = 0;
-	objs[0].flags = 0;
-	objs[0].rsvd1 = 0;
-	objs[0].rsvd2 = 0;
-
-	objs[1].handle = cmd_bo;
-	objs[1].relocation_count = 1;
-	objs[1].relocs_ptr = (uintptr_t)&reloc;
-	objs[1].alignment = 0;
-	objs[1].offset = 0;
-	objs[1].flags = 0;
-	objs[1].rsvd1 = 0;
-	objs[1].rsvd2 = 0;
-
-	objs[2].handle = first_level_bo;
-	objs[2].relocation_count = 1;
-	objs[2].relocs_ptr = (uintptr_t)&first_level_reloc;
-	objs[2].alignment = 0;
-	objs[2].offset = 0;
-	objs[2].flags = 0;
-	objs[2].rsvd1 = 0;
-	objs[2].rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)objs;
+	obj[2].relocation_count = 1;
+	obj[2].relocs_ptr = (uintptr_t)&first_level_reloc;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
 	execbuf.buffer_count = 3;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = sizeof(first_level_cmds);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = I915_EXEC_RENDER;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
 
 	gem_execbuf(fd, &execbuf);
 	gem_sync(fd, cmd_bo);
@@ -408,15 +351,8 @@ static void stray_lri(int fd, uint32_t handle)
 	igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
 }
 
-uint32_t handle;
-int fd;
-
-#define MI_ARB_ON_OFF (0x8 << 23)
-#define MI_DISPLAY_FLIP ((0x14 << 23) | 1)
-
-#define GFX_OP_PIPE_CONTROL	((0x3<<29)|(0x3<<27)|(0x2<<24)|2)
-#define   PIPE_CONTROL_QW_WRITE	(1<<14)
-#define   PIPE_CONTROL_LRI_POST_OP (1<<23)
+static uint32_t handle;
+static int fd;
 
 igt_main
 {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH igt v3 03/11] igt/gem_exec_parse: move hsw_load_register_reg down
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
  2016-11-09 16:15 ` [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
  2016-11-09 16:15 ` [PATCH igt v3 02/11] igt/gem_exec_parse: some minor cleanups Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-11 21:51   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 04/11] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

No functional change: this just moves the hsw_load_register_reg test
code down below the execbuf utilities in preparation for updating it to
use them.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 182 ++++++++++++++++++++++++-------------------------
 1 file changed, 91 insertions(+), 91 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 72d7c7b..c530bb6 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -65,97 +65,6 @@ static int command_parser_version(int fd)
 	return -1;
 }
 
-static void hsw_load_register_reg(void)
-{
-	uint32_t buf[16] = {
-		MI_LOAD_REGISTER_IMM | (5 - 2),
-		HSW_CS_GPR0,
-		0xabcdabcd,
-		HSW_CS_GPR1,
-		0xdeadbeef,
-
-		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		0, /* address0 */
-
-		MI_LOAD_REGISTER_REG | (3 - 2),
-		HSW_CS_GPR0,
-		HSW_CS_GPR1,
-
-		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		4, /* address1 */
-
-		MI_BATCH_BUFFER_END,
-	};
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 obj[2];
-	struct drm_i915_gem_relocation_entry reloc[2];
-	int fd;
-
-	/* Open again to get a non-master file descriptor */
-	fd = drm_open_driver(DRIVER_INTEL);
-
-	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
-	igt_require(command_parser_version(fd) >= 7);
-
-	memset(obj, 0, sizeof(obj));
-	obj[0].handle = gem_create(fd, 4096);
-	obj[1].handle = gem_create(fd, 4096);
-	gem_write(fd, obj[1].handle, 0, buf, sizeof(buf));
-
-	memset(reloc, 0, sizeof(reloc));
-	reloc[0].offset = 7*sizeof(uint32_t);
-	reloc[0].target_handle = obj[0].handle;
-	reloc[0].delta = 0;
-	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].offset = 13*sizeof(uint32_t);
-	reloc[1].target_handle = obj[0].handle;
-	reloc[1].delta = sizeof(uint32_t);
-	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
-	obj[1].relocs_ptr = (uintptr_t)&reloc;
-	obj[1].relocation_count = 2;
-
-	memset(&execbuf, 0, sizeof(execbuf));
-	execbuf.buffers_ptr = (uintptr_t)obj;
-	execbuf.buffer_count = 2;
-	execbuf.batch_len = sizeof(buf);
-	execbuf.flags = I915_EXEC_RENDER;
-	gem_execbuf(fd, &execbuf);
-	gem_close(fd, obj[1].handle);
-
-	gem_read(fd, obj[0].handle, 0, buf, 2*sizeof(buf[0]));
-	igt_assert_eq_u32(buf[0], 0xdeadbeef); /* before copy */
-	igt_assert_eq_u32(buf[1], 0xabcdabcd); /* after copy */
-
-	/* Now a couple of negative tests that should be filtered */
-	execbuf.buffer_count = 1;
-	execbuf.batch_len = 4*sizeof(buf[0]);
-
-	buf[0] = MI_LOAD_REGISTER_REG | (3 - 2);
-	buf[1] = HSW_CS_GPR0;
-	buf[2] = 0;
-	buf[3] = MI_BATCH_BUFFER_END;
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = OACONTROL; /* filtered */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = DERRMR; /* master only */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = 0x2038; /* RING_START: invalid */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	close(fd);
-}
-
 static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 			       int size, int patch_offset, uint64_t expected_value)
 {
@@ -351,6 +260,97 @@ static void stray_lri(int fd, uint32_t handle)
 	igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
 }
 
+static void hsw_load_register_reg(void)
+{
+	uint32_t buf[16] = {
+		MI_LOAD_REGISTER_IMM | (5 - 2),
+		HSW_CS_GPR0,
+		0xabcdabcd,
+		HSW_CS_GPR1,
+		0xdeadbeef,
+
+		MI_STORE_REGISTER_MEM | (3 - 2),
+		HSW_CS_GPR1,
+		0, /* address0 */
+
+		MI_LOAD_REGISTER_REG | (3 - 2),
+		HSW_CS_GPR0,
+		HSW_CS_GPR1,
+
+		MI_STORE_REGISTER_MEM | (3 - 2),
+		HSW_CS_GPR1,
+		4, /* address1 */
+
+		MI_BATCH_BUFFER_END,
+	};
+	struct drm_i915_gem_execbuffer2 execbuf;
+	struct drm_i915_gem_exec_object2 obj[2];
+	struct drm_i915_gem_relocation_entry reloc[2];
+	int fd;
+
+	/* Open again to get a non-master file descriptor */
+	fd = drm_open_driver(DRIVER_INTEL);
+
+	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
+	igt_require(command_parser_version(fd) >= 7);
+
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = gem_create(fd, 4096);
+	obj[1].handle = gem_create(fd, 4096);
+	gem_write(fd, obj[1].handle, 0, buf, sizeof(buf));
+
+	memset(reloc, 0, sizeof(reloc));
+	reloc[0].offset = 7*sizeof(uint32_t);
+	reloc[0].target_handle = obj[0].handle;
+	reloc[0].delta = 0;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
+	reloc[1].offset = 13*sizeof(uint32_t);
+	reloc[1].target_handle = obj[0].handle;
+	reloc[1].delta = sizeof(uint32_t);
+	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
+	obj[1].relocs_ptr = (uintptr_t)&reloc;
+	obj[1].relocation_count = 2;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
+	execbuf.buffer_count = 2;
+	execbuf.batch_len = sizeof(buf);
+	execbuf.flags = I915_EXEC_RENDER;
+	gem_execbuf(fd, &execbuf);
+	gem_close(fd, obj[1].handle);
+
+	gem_read(fd, obj[0].handle, 0, buf, 2*sizeof(buf[0]));
+	igt_assert_eq_u32(buf[0], 0xdeadbeef); /* before copy */
+	igt_assert_eq_u32(buf[1], 0xabcdabcd); /* after copy */
+
+	/* Now a couple of negative tests that should be filtered */
+	execbuf.buffer_count = 1;
+	execbuf.batch_len = 4*sizeof(buf[0]);
+
+	buf[0] = MI_LOAD_REGISTER_REG | (3 - 2);
+	buf[1] = HSW_CS_GPR0;
+	buf[2] = 0;
+	buf[3] = MI_BATCH_BUFFER_END;
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	buf[2] = OACONTROL; /* filtered */
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	buf[2] = DERRMR; /* master only */
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	buf[2] = 0x2038; /* RING_START: invalid */
+	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+
+	close(fd);
+}
+
 static uint32_t handle;
 static int fd;
 
-- 
2.10.1


* [PATCH igt v3 04/11] igt/gem_exec_parse: update hsw_load_register_reg
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (2 preceding siblings ...)
  2016-11-09 16:15 ` [PATCH igt v3 03/11] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-11 22:01   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 05/11] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

This generalises hsw_load_register_reg to loop through an array of
allowed and disallowed registers and to use the exec_batch[_patched]
utilities.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 139 +++++++++++++++++++++++--------------------------
 1 file changed, 66 insertions(+), 73 deletions(-)
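
Note (not part of the commit): the table-driven pattern described in the
commit message can be sketched standalone as below. This is only a rough
model under stated assumptions: mock_whitelist, mock_submit_lrr and
count_mismatches are hypothetical names, and the mock stands in for the
real command parser as it behaves at version 7. Only ARRAY_LEN and the
register offsets are taken from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_LEN(A) (sizeof(A) / sizeof((A)[0]))

/* Register offsets as defined in tests/gem_exec_parse.c by this patch. */
#define DERRMR            0x44050
#define OACONTROL         0x2360
#define SO_WRITE_OFFSET_0 0x5280
#define HSW_CS_GPR(n)     (0x2600 + 8 * (n))

/* Hypothetical whitelist, for illustration only (not the kernel's table). */
static const uint32_t mock_whitelist[] = {
	HSW_CS_GPR(0),
	HSW_CS_GPR(1),
	SO_WRITE_OFFSET_0,
};

/* Mock of submitting an LRR that reads src_reg: 0 if allowed, else -EINVAL. */
static int mock_submit_lrr(uint32_t src_reg)
{
	for (size_t i = 0; i < ARRAY_LEN(mock_whitelist); i++) {
		if (mock_whitelist[i] == src_reg)
			return 0;
	}
	return -EINVAL;
}

/* Count the registers whose mock result differs from expected_ret. */
static int count_mismatches(const uint32_t *regs, size_t n, int expected_ret)
{
	int mismatches = 0;

	assert(regs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (mock_submit_lrr(regs[i]) != expected_ret)
			mismatches++;
	}
	return mismatches;
}
```

Passing the allowed_regs table with an expected result of 0, and the
disallowed_regs table with -EINVAL, mirrors the two loops in the test;
adding a register then only means extending one of the arrays.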

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index c530bb6..2fea060 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -34,8 +34,11 @@
 #define I915_PARAM_CMD_PARSER_VERSION       28
 #endif
 
-#define OACONTROL 0x2360
+#define ARRAY_LEN(A) (sizeof(A) / sizeof(A[0]))
+
 #define DERRMR 0x44050
+#define OACONTROL 0x2360
+#define SO_WRITE_OFFSET_0 0x5280
 
 #define HSW_CS_GPR(n) (0x2600 + 8*(n))
 #define HSW_CS_GPR0 HSW_CS_GPR(0)
@@ -65,8 +68,8 @@ static int command_parser_version(int fd)
 	return -1;
 }
 
-static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
-			       int size, int patch_offset, uint64_t expected_value)
+static uint64_t __exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
+				     int size, int patch_offset)
 {
 	struct drm_i915_gem_execbuffer2 execbuf;
 	struct drm_i915_gem_exec_object2 obj[2];
@@ -100,9 +103,19 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
 	gem_sync(fd, cmd_bo);
 
 	gem_read(fd,target_bo, 0, &actual_value, sizeof(actual_value));
-	igt_assert_eq(expected_value, actual_value);
 
 	gem_close(fd, target_bo);
+
+	return actual_value;
+}
+
+static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
+			       int size, int patch_offset,
+			       uint64_t expected_value)
+{
+	igt_assert_eq(__exec_batch_patched(fd, cmd_bo, cmds,
+					   size, patch_offset),
+		      expected_value);
 }
 
 static int __exec_batch(int fd, uint32_t cmd_bo, uint32_t *cmds,
@@ -262,31 +275,36 @@ static void stray_lri(int fd, uint32_t handle)
 
 static void hsw_load_register_reg(void)
 {
-	uint32_t buf[16] = {
-		MI_LOAD_REGISTER_IMM | (5 - 2),
+	uint32_t init_gpr0[16] = {
+		MI_LOAD_REGISTER_IMM | (3 - 2),
 		HSW_CS_GPR0,
-		0xabcdabcd,
-		HSW_CS_GPR1,
-		0xdeadbeef,
-
+		0xabcdabc0, /* leave [1:0] zero */
+		MI_BATCH_BUFFER_END,
+	};
+	uint32_t store_gpr0[16] = {
 		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		0, /* address0 */
-
-		MI_LOAD_REGISTER_REG | (3 - 2),
 		HSW_CS_GPR0,
-		HSW_CS_GPR1,
-
-		MI_STORE_REGISTER_MEM | (3 - 2),
-		HSW_CS_GPR1,
-		4, /* address1 */
-
+		0, /* reloc*/
 		MI_BATCH_BUFFER_END,
 	};
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 obj[2];
-	struct drm_i915_gem_relocation_entry reloc[2];
+	uint32_t do_lrr[16] = {
+		MI_LOAD_REGISTER_REG | (3 - 2),
+		0, /* [1] = src */
+		HSW_CS_GPR0, /* dst */
+		MI_BATCH_BUFFER_END,
+	};
+	uint32_t allowed_regs[] = {
+		HSW_CS_GPR1,
+		SO_WRITE_OFFSET_0,
+	};
+	uint32_t disallowed_regs[] = {
+		0,
+		OACONTROL, /* filtered */
+		DERRMR, /* master only */
+		0x2038, /* RING_START: invalid */
+	};
 	int fd;
+	uint32_t handle;
 
 	/* Open again to get a non-master file descriptor */
 	fd = drm_open_driver(DRIVER_INTEL);
@@ -294,59 +312,34 @@ static void hsw_load_register_reg(void)
 	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
 	igt_require(command_parser_version(fd) >= 7);
 
-	memset(obj, 0, sizeof(obj));
-	obj[0].handle = gem_create(fd, 4096);
-	obj[1].handle = gem_create(fd, 4096);
-	gem_write(fd, obj[1].handle, 0, buf, sizeof(buf));
-
-	memset(reloc, 0, sizeof(reloc));
-	reloc[0].offset = 7*sizeof(uint32_t);
-	reloc[0].target_handle = obj[0].handle;
-	reloc[0].delta = 0;
-	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[0].write_domain = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].offset = 13*sizeof(uint32_t);
-	reloc[1].target_handle = obj[0].handle;
-	reloc[1].delta = sizeof(uint32_t);
-	reloc[1].read_domains = I915_GEM_DOMAIN_COMMAND;
-	reloc[1].write_domain = I915_GEM_DOMAIN_COMMAND;
-	obj[1].relocs_ptr = (uintptr_t)&reloc;
-	obj[1].relocation_count = 2;
+	handle = gem_create(fd, 4096);
 
-	memset(&execbuf, 0, sizeof(execbuf));
-	execbuf.buffers_ptr = (uintptr_t)obj;
-	execbuf.buffer_count = 2;
-	execbuf.batch_len = sizeof(buf);
-	execbuf.flags = I915_EXEC_RENDER;
-	gem_execbuf(fd, &execbuf);
-	gem_close(fd, obj[1].handle);
+	for (int i = 0 ; i < ARRAY_LEN(allowed_regs); i++) {
+		uint32_t var;
 
-	gem_read(fd, obj[0].handle, 0, buf, 2*sizeof(buf[0]));
-	igt_assert_eq_u32(buf[0], 0xdeadbeef); /* before copy */
-	igt_assert_eq_u32(buf[1], 0xabcdabcd); /* after copy */
+		exec_batch(fd, handle, init_gpr0, sizeof(init_gpr0),
+			   I915_EXEC_RENDER,
+			   0);
+		exec_batch_patched(fd, handle,
+				   store_gpr0, sizeof(store_gpr0),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xabcdabc0);
+		do_lrr[1] = allowed_regs[i];
+		exec_batch(fd, handle, do_lrr, sizeof(do_lrr),
+			   I915_EXEC_RENDER,
+			   0);
+		var = __exec_batch_patched(fd, handle,
+					   store_gpr0, sizeof(store_gpr0),
+					   2 * sizeof(uint32_t)); /* reloc */
+		igt_assert_neq(var, 0xabcdabc0);
+	}
 
-	/* Now a couple of negative tests that should be filtered */
-	execbuf.buffer_count = 1;
-	execbuf.batch_len = 4*sizeof(buf[0]);
-
-	buf[0] = MI_LOAD_REGISTER_REG | (3 - 2);
-	buf[1] = HSW_CS_GPR0;
-	buf[2] = 0;
-	buf[3] = MI_BATCH_BUFFER_END;
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = OACONTROL; /* filtered */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = DERRMR; /* master only */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
-
-	buf[2] = 0x2038; /* RING_START: invalid */
-	gem_write(fd, obj[0].handle, 0, buf, execbuf.batch_len);
-	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EINVAL);
+	for (int i = 0 ; i < ARRAY_LEN(disallowed_regs); i++) {
+		do_lrr[1] = disallowed_regs[i];
+		exec_batch(fd, handle, do_lrr, sizeof(do_lrr),
+			   I915_EXEC_RENDER,
+			   -EINVAL);
+	}
 
 	close(fd);
 }
-- 
2.10.1


* [PATCH igt v3 05/11] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (3 preceding siblings ...)
  2016-11-09 16:15 ` [PATCH igt v3 04/11] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-11 22:07   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 06/11] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

This limits testing of the OACONTROL tracking (the required pairing of
OA enable/disable per batch buffer) to versions below 9 of the command
parser.

Version 9 of the command parser removes all special handling for
OACONTROL, which is now going to be managed by i915-perf and not
programmed from userspace.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 2fea060..368f30b 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -349,8 +349,9 @@ static int fd;
 
 igt_main
 {
+	int parser_version = 0;
+
 	igt_fixture {
-		int parser_version = 0;
                 drm_i915_getparam_t gp;
 		int rc;
 
@@ -521,6 +522,9 @@ igt_main
 			0x31337000,
 			MI_BATCH_BUFFER_END,
 		};
+
+		igt_require(parser_version < 9);
+
 		exec_batch(fd, handle,
 			   lri_ok, sizeof(lri_ok),
 			   I915_EXEC_RENDER,
-- 
2.10.1


* [PATCH igt v3 06/11] igt/gem_exec_parse: make basic-rejected version agnostic
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (4 preceding siblings ...)
  2016-11-09 16:15 ` [PATCH igt v3 05/11] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-14 18:57   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 07/11] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

This adapts the basic-rejected test to focus on invalid commands that
will result in an EINVAL errno being returned to userspace even with the
upcoming version 8 parser change to stop reporting access violations as
EINVAL errors.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)
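
Note (not part of the commit): a small sketch of why invalid_cmd[0] is
expected to be rejected on every engine. It assumes, as the comment in
the diff suggests, that bits 31:29 of a command header select the command
type. cmd_type, RESERVED_TYPE_CMD and check_cmd_types are hypothetical
names; the header values are copied from the diffs in this series, with a
`u` suffix added so that the shift into bit 31 is done on an unsigned
value.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 31:29 of a command header: the field the patch sets to 0x7. */
#define CMD_TYPE_SHIFT 29

/* Header values as they appear in tests/gem_exec_parse.c. */
#define MI_ARB_ON_OFF       (0x8u << 23)
#define GFX_OP_PIPE_CONTROL ((0x3u << 29) | (0x3u << 27) | (0x2u << 24) | 2)

/* Hypothetical name for the first dword of invalid_cmd[]. */
#define RESERVED_TYPE_CMD   (0x7u << 29)

/* Extract the command type field from a header dword. */
static uint32_t cmd_type(uint32_t header)
{
	return header >> CMD_TYPE_SHIFT;
}

/* Worked examples for the three headers above. */
void check_cmd_types(void)
{
	assert(cmd_type(MI_ARB_ON_OFF) == 0x0);
	assert(cmd_type(GFX_OP_PIPE_CONTROL) == 0x3);
	assert(cmd_type(RESERVED_TYPE_CMD) == 0x7);
}
```

Because the type field itself is invalid, the rejection does not depend
on any per-engine command table, which is what lets the test reuse the
same batch for the RENDER, BSD, BLT and VEBOX rings.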

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 368f30b..63f4efe 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -386,33 +386,39 @@ igt_main
 	}
 
 	igt_subtest("basic-rejected") {
-		uint32_t arb_on_off[] = {
-			MI_ARB_ON_OFF,
+		uint32_t invalid_cmd[] = {
+			(0x7<<29), /* Reserved command type,
+				      across all engines */
 			MI_BATCH_BUFFER_END,
 		};
-		uint32_t display_flip[] = {
-			MI_DISPLAY_FLIP,
-			0, 0, 0,
+		uint32_t invalid_set_context[] = {
+			MI_SET_CONTEXT | 32, /* invalid length */
 			MI_BATCH_BUFFER_END,
-			0
 		};
 		exec_batch(fd, handle,
-			   arb_on_off, sizeof(arb_on_off),
+			   invalid_cmd, sizeof(invalid_cmd),
 			   I915_EXEC_RENDER,
 			   -EINVAL);
 		exec_batch(fd, handle,
-			   arb_on_off, sizeof(arb_on_off),
+			   invalid_cmd, sizeof(invalid_cmd),
 			   I915_EXEC_BSD,
 			   -EINVAL);
+		if (gem_has_blt(fd)) {
+			exec_batch(fd, handle,
+				   invalid_cmd, sizeof(invalid_cmd),
+				   I915_EXEC_BLT,
+				   -EINVAL);
+		}
 		if (gem_has_vebox(fd)) {
 			exec_batch(fd, handle,
-				   arb_on_off, sizeof(arb_on_off),
+				   invalid_cmd, sizeof(invalid_cmd),
 				   I915_EXEC_VEBOX,
 				   -EINVAL);
 		}
+
 		exec_batch(fd, handle,
-			   display_flip, sizeof(display_flip),
-			   I915_EXEC_BLT,
+			   invalid_set_context, sizeof(invalid_set_context),
+			   I915_EXEC_RENDER,
 			   -EINVAL);
 	}
 
-- 
2.10.1


* [PATCH igt v3 07/11] igt/gem_exec_parse: update bitmasks test for v >=8
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (5 preceding siblings ...)
  2016-11-09 16:15 ` [PATCH igt v3 06/11] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-11 22:08   ` Matthew Auld
  2016-11-09 16:15 ` [PATCH igt v3 08/11] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

With v8 of the command parser (where we won't get an EINVAL for an
access violation) this updates the bitmasks test to explicitly confirm
that the command became a NOOP by reading back from where the QW_WRITE
would have otherwise landed.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 63f4efe..1aa5114 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -472,10 +472,20 @@ igt_main
 			0,
 			MI_BATCH_BUFFER_END,
 		};
-		exec_batch(fd, handle,
-			   pc, sizeof(pc),
-			   I915_EXEC_RENDER,
-			   -EINVAL);
+		if (parser_version >= 8) {
+			/* Expect to read back zero since the command should be
+			 * squashed to a NOOP
+			 */
+			exec_batch_patched(fd, handle,
+					   pc, sizeof(pc),
+					   8, /* patch offset, */
+					   0x0);
+		} else {
+			exec_batch(fd, handle,
+				   pc, sizeof(pc),
+				   I915_EXEC_RENDER,
+				   -EINVAL);
+		}
 	}
 
 	igt_subtest("batch-without-end") {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH igt v3 08/11] igt/gem_exec_parse: update cmd-crossing-page for >= v8
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (6 preceding siblings ...)
  2016-11-09 16:15 ` [PATCH igt v3 07/11] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
@ 2016-11-09 16:15 ` Robert Bragg
  2016-11-11 22:10   ` Matthew Auld
  2016-11-09 16:16 ` [PATCH igt v3 09/11] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:15 UTC (permalink / raw)
  To: intel-gfx

Since an access violation won't return an error to userspace for v >= 8
of the command parser, this updates the cmd-crossing-page test to
explicitly read back from SO_WRITE_OFFSET[0] to see that the command
wasn't squashed to a NOOP.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 1aa5114..d6a2885 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -499,14 +499,25 @@ igt_main
 	igt_subtest("cmd-crossing-page") {
 		uint32_t lri_ok[] = {
 			MI_LOAD_REGISTER_IMM,
-			0x5280, /* allowed register address (SO_WRITE_OFFSET[0]) */
-			0x1,
+			SO_WRITE_OFFSET_0, /* allowed register address */
+			0xdcbaabc0, /* [1:0] MBZ */
+			MI_BATCH_BUFFER_END,
+		};
+		uint32_t store_reg[] = {
+			MI_STORE_REGISTER_MEM | (3 - 2),
+			SO_WRITE_OFFSET_0,
+			0, /* reloc */
 			MI_BATCH_BUFFER_END,
 		};
 		exec_split_batch(fd,
 				 lri_ok, sizeof(lri_ok),
 				 I915_EXEC_RENDER,
 				 0);
+		exec_batch_patched(fd, handle,
+				   store_reg,
+				   sizeof(store_reg),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xdcbaabc0);
 	}
 
 	igt_subtest("oacontrol-tracking") {
-- 
2.10.1


* [PATCH igt v3 09/11] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (7 preceding siblings ...)
  2016-11-09 16:15 ` [PATCH igt v3 08/11] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
@ 2016-11-09 16:16 ` Robert Bragg
  2016-11-11 22:14   ` Matthew Auld
  2016-11-09 16:16 ` [PATCH igt v3 10/11] igt/gem_exec_parse: update registers test " Robert Bragg
  2016-11-09 16:16 ` [PATCH igt v3 11/11] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:16 UTC (permalink / raw)
  To: intel-gfx

This updates the checking of disallowed loads to set a distinguishable
value before the load and explicitly check the load was a NOOP by
reading back the final value.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index d6a2885..f241ab7 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -305,12 +305,17 @@ static void hsw_load_register_reg(void)
 	};
 	int fd;
 	uint32_t handle;
+	int parser_version;
+	int bad_lrr_errno;
 
 	/* Open again to get a non-master file descriptor */
 	fd = drm_open_driver(DRIVER_INTEL);
 
+	parser_version = command_parser_version(fd);
+	bad_lrr_errno = parser_version >= 8 ? 0 : -EINVAL;
+
 	igt_require(IS_HASWELL(intel_get_drm_devid(fd)));
-	igt_require(command_parser_version(fd) >= 7);
+	igt_require(parser_version >= 7);
 
 	handle = gem_create(fd, 4096);
 
@@ -335,10 +340,21 @@ static void hsw_load_register_reg(void)
 	}
 
 	for (int i = 0 ; i < ARRAY_LEN(disallowed_regs); i++) {
+		exec_batch(fd, handle, init_gpr0, sizeof(init_gpr0),
+			   I915_EXEC_RENDER,
+			   0);
+		exec_batch_patched(fd, handle,
+				   store_gpr0, sizeof(store_gpr0),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xabcdabc0);
 		do_lrr[1] = disallowed_regs[i];
 		exec_batch(fd, handle, do_lrr, sizeof(do_lrr),
 			   I915_EXEC_RENDER,
-			   -EINVAL);
+			   bad_lrr_errno);
+		exec_batch_patched(fd, handle,
+				   store_gpr0, sizeof(store_gpr0),
+				   2 * sizeof(uint32_t), /* reloc */
+				   0xabcdabc0);
 	}
 
 	close(fd);
-- 
2.10.1


* [PATCH igt v3 10/11] igt/gem_exec_parse: update registers test for v >= 8
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (8 preceding siblings ...)
  2016-11-09 16:16 ` [PATCH igt v3 09/11] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
@ 2016-11-09 16:16 ` Robert Bragg
  2016-11-11 22:28   ` Matthew Auld
  2016-11-09 16:16 ` [PATCH igt v3 11/11] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:16 UTC (permalink / raw)
  To: intel-gfx

This combines some parts of the recently added store_lri test with the
registers test to be able to first load a distinguishable value before
the LRI and explicitly read back the register to determine if the
command succeeded or was a NOOP.

For now though we won't look at OACONTROL without checking for version 9
of the command parser.

This updates the 'bad' test to check the OASTATUS2 register so that we
can explicitly read back from the register to check it becomes a NOOP.

This adds a struct test_lri for associating a mask with the init/test
values so we ignore things like hw status bits that might interfere
with the result.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 86 ++++++++++++++++++++++++++------------------------
 1 file changed, 45 insertions(+), 41 deletions(-)
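
Note (not part of the commit): a standalone sketch of the masked
comparison that struct test_lri enables. The struct layout and the
OASTATUS2 values are copied from this patch; readback_matches is a
hypothetical helper, and the raw read-back values used with it are made
up to show the effect of the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as the struct added by this patch. */
struct test_lri {
	uint32_t reg, read_mask, init_val, test_val;
};

/*
 * Compare a raw register read-back against the expected value, ignoring
 * any bits outside read_mask (for example hardware status bits).
 */
static bool readback_matches(const struct test_lri *test,
			     uint32_t raw_readback, uint32_t expect)
{
	assert(test != NULL);
	return (raw_readback & test->read_mask) == expect;
}
```

For the OASTATUS2 entry { 0x2368, 0xffffff80, 0xdeadf000, 0xbeeff000 }, a
raw read of 0xdeadf07f still matches init_val, since the low seven bits
are masked off, while a raw read of 0xbeeff000 does not, which is how a
disallowed LRI that took effect would be caught.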

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index f241ab7..0fa3af8 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -37,6 +37,7 @@
 #define ARRAY_LEN(A) (sizeof(A) / sizeof(A[0]))
 
 #define DERRMR 0x44050
+#define OASTATUS2 0x2368
 #define OACONTROL 0x2360
 #define SO_WRITE_OFFSET_0 0x5280
 
@@ -250,27 +251,35 @@ static void exec_batch_chained(int fd, uint32_t cmd_bo, uint32_t *cmds,
 	gem_close(fd, target_bo);
 }
 
-static void stray_lri(int fd, uint32_t handle)
+/* Be careful to take into account what register bits we can store and read
+ * from...
+ */
+struct test_lri {
+	uint32_t reg, read_mask, init_val, test_val;
+};
+
+static void
+test_lri(int fd, uint32_t handle,
+	 struct test_lri *test, int expected_errno, uint32_t expect)
 {
-	/* Ideally this would test all once whitelisted registers */
 	uint32_t lri[] = {
 		MI_LOAD_REGISTER_IMM,
-		OACONTROL,
-		0x31337000,
+		test->reg,
+		test->test_val,
 		MI_BATCH_BUFFER_END,
 	};
-	int err;
-
-	igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
 
-	err = __exec_batch(fd, handle, lri, sizeof(lri), I915_EXEC_RENDER);
-	if (err == -EINVAL)
-		return;
+	intel_register_write(test->reg, test->init_val);
 
-	igt_assert_eq(err, 0);
+	exec_batch(fd, handle,
+		   lri, sizeof(lri),
+		   I915_EXEC_RENDER,
+		   expected_errno);
 	gem_sync(fd, handle);
 
-	igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
+	igt_assert_eq_u32((intel_register_read(test->reg) &
+			   test->read_mask),
+			  expect);
 }
 
 static void hsw_load_register_reg(void)
@@ -441,43 +450,38 @@ igt_main
 	igt_subtest_group {
 		igt_fixture {
 			intel_register_access_init(intel_get_pci_device(), 0);
-
-			intel_register_write(OACONTROL, 0xdeadbeef);
-			igt_assert_eq_u32(intel_register_read(OACONTROL), 0xdeadbeef);
 		}
 
-		igt_subtest("basic-stray-lri")
-			stray_lri(fd, handle);
+		igt_subtest("registers") {
+			struct test_lri bad_lris[] = {
+				/* dummy head pointer */
+				{ OASTATUS2, 0xffffff80, 0xdeadf000, 0xbeeff000 }
+			};
+			struct test_lri ok_lris[] = {
+				/* NB: [1:0] MBZ */
+				{ SO_WRITE_OFFSET_0, 0xfffffffc,
+				  0xabcdabc0, 0xbeefbee0 }
+			};
+			int bad_lri_errno = parser_version >= 8 ? 0 : -EINVAL;
+
+			for (int i = 0; i < ARRAY_LEN(ok_lris); i++) {
+				test_lri(fd, handle,
+					 ok_lris + i, 0,
+					 ok_lris[i].test_val);
+			}
+
+			for (int i = 0; i < ARRAY_LEN(bad_lris); i++) {
+				test_lri(fd, handle,
+					 bad_lris + i, bad_lri_errno,
+					 bad_lris[i].init_val);
+			}
+		}
 
 		igt_fixture {
-			intel_register_write(OACONTROL, 0);
 			intel_register_access_fini();
 		}
 	}
 
-	igt_subtest("registers") {
-		uint32_t lri_bad[] = {
-			MI_LOAD_REGISTER_IMM,
-			0, /* disallowed register address */
-			0x12000000,
-			MI_BATCH_BUFFER_END,
-		};
-		uint32_t lri_ok[] = {
-			MI_LOAD_REGISTER_IMM,
-			0x5280, /* allowed register address (SO_WRITE_OFFSET[0]) */
-			0x1,
-			MI_BATCH_BUFFER_END,
-		};
-		exec_batch(fd, handle,
-			   lri_bad, sizeof(lri_bad),
-			   I915_EXEC_RENDER,
-			   -EINVAL);
-		exec_batch(fd, handle,
-			   lri_ok, sizeof(lri_ok),
-			   I915_EXEC_RENDER,
-			   0);
-	}
-
 	igt_subtest("bitmasks") {
 		uint32_t pc[] = {
 			GFX_OP_PIPE_CONTROL,
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH igt v3 11/11] igt/gem_exec_parse: check oacontrol lri bad for >= v9
  2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
                   ` (9 preceding siblings ...)
  2016-11-09 16:16 ` [PATCH igt v3 10/11] igt/gem_exec_parse: update registers test " Robert Bragg
@ 2016-11-09 16:16 ` Robert Bragg
  2016-11-11 22:36   ` Matthew Auld
  10 siblings, 1 reply; 25+ messages in thread
From: Robert Bragg @ 2016-11-09 16:16 UTC (permalink / raw)
  To: intel-gfx

OACONTROL is no longer whitelisted in the command parser, so this checks
that an attempted LRI will be disallowed and (more importantly) checks that
userspace doesn't get an EINVAL error for an attempted OACONTROL LRI.
This is important because Mesa applications attempt OACONTROL LRIs while
initializing and will abort for any execbuf error.
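
The version-dependent expectations in the registers subtest can be
summarised as below. This is only a sketch: both function names are
hypothetical, and the thresholds are taken from the checks in this
series (parser_version >= 8 for LRIs that are accepted but turned into
NOOPs, parser_version >= 9 for the OACONTROL case).

```c
#include <errno.h>
#include <stdbool.h>

/* Expected execbuf result for an LRI to a register that is not
 * whitelisted: from version 8 the batch is accepted (the LRI is expected
 * to become a NOOP), before that it is rejected with -EINVAL.
 */
int expected_bad_lri_errno(int parser_version)
{
	return parser_version >= 8 ? 0 : -EINVAL;
}

/* The OACONTROL LRI check is only meaningful from version 9 onwards. */
bool should_test_oacontrol_lri(int parser_version)
{
	return parser_version >= 9;
}
```

In both NOOP cases the test then expects the masked register readback to
still equal init_val rather than test_val.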

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 tests/gem_exec_parse.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index 0fa3af8..702b6f4 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -457,6 +457,22 @@ igt_main
 				/* dummy head pointer */
 				{ OASTATUS2, 0xffffff80, 0xdeadf000, 0xbeeff000 }
 			};
+			struct test_lri v9_bad_lris[] = {
+				/* It's really important for us to check that
+				 * an LRI to OACONTROL doesn't result in an
+				 * EINVAL error because Mesa attempts writing
+				 * to OACONTROL to determine what extensions to
+				 * expose and will abort() for execbuffer()
+				 * errors.
+				 *
+				 * Mesa can gracefully recognise and handle the
+				 * LRI becoming a NOOP.
+				 *
+				 * The test values represent dummy context IDs
+				 * while leaving the OA unit disabled
+				 */
+				{ OACONTROL, 0xfffff000, 0xfeed0000, 0x31337000 }
+			};
 			struct test_lri ok_lris[] = {
 				/* NB: [1:0] MBZ */
 				{ SO_WRITE_OFFSET_0, 0xfffffffc,
@@ -475,6 +491,15 @@ igt_main
 					 bad_lris + i, bad_lri_errno,
 					 bad_lris[i].init_val);
 			}
+
+			if (parser_version >= 9) {
+				for (int i = 0; i < ARRAY_LEN(v9_bad_lris); i++) {
+					test_lri(fd, handle,
+						 v9_bad_lris + i,
+						 0,
+						 v9_bad_lris[i].init_val);
+				}
+			}
 		}
 
 		igt_fixture {
-- 
2.10.1


* Re: [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell
  2016-11-09 16:15 ` [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
@ 2016-11-09 16:33   ` Chris Wilson
  2016-11-10 23:03   ` Matthew Auld
  1 sibling, 0 replies; 25+ messages in thread
From: Chris Wilson @ 2016-11-09 16:33 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

On Wed, Nov 09, 2016 at 04:15:52PM +0000, Robert Bragg wrote:
> +static void
> +test_i915_ref_count(void)
> +{
> +        int oa_exponent = 13; /* 1 millisecond */
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        unsigned baseline, ref_count0, ref_count1;
> +        int stream_fd;
> +        uint32_t oa_report0[64];
> +        uint32_t oa_report1[64];
> +
> +        close(drm_fd);
> +        baseline = read_i915_module_ref();
> +        igt_debug("baseline ref count (drm fd closed) = %u\n", baseline);
> +
> +        drm_fd = drm_open_driver_render(DRIVER_INTEL);

Write this as a standalone (or first before igt_fixture) test and use
__drm_open_driver() (or export __drm_open_driver_render() if you insist)
so that we know that the extra driver ref taken by igt is not fouling up
your results.

> +        ref_count0 = read_i915_module_ref();
> +        igt_debug("initial ref count with drm_fd open = %u\n", ref_count0);
> +        igt_assert(ref_count0 > baseline);
> +
> +        stream_fd = __perf_open(drm_fd, &param);
> +        ref_count1 = read_i915_module_ref();
> +        igt_debug("ref count after opening i915 perf stream = %u\n", ref_count1);
> +        igt_assert(ref_count1 > ref_count0);
> +
> +        close(drm_fd);
> +        ref_count0 = read_i915_module_ref();
> +        igt_debug("ref count after closing drm fd = %u\n", ref_count0);
> +
> +        igt_assert(ref_count0 > baseline);
> +
> +        read_2_oa_reports(stream_fd,
> +                          I915_OA_FORMAT_A45_B8_C8, 256,
> +                          oa_exponent,
> +                          oa_report0,
> +                          oa_report1,
> +                          false); /* not just timer reports */
> +
> +        close(stream_fd);
> +        ref_count0 = read_i915_module_ref();
> +        igt_debug("ref count after closing i915 perf stream fd = %u\n", ref_count0);
> +        igt_assert_eq(ref_count0, baseline);
> +
> +        drm_fd = drm_open_driver_render(DRIVER_INTEL);
> +}

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell
  2016-11-09 16:15 ` [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
  2016-11-09 16:33   ` Chris Wilson
@ 2016-11-10 23:03   ` Matthew Auld
  2016-11-14 15:52     ` Robert Bragg
  1 sibling, 1 reply; 25+ messages in thread
From: Matthew Auld @ 2016-11-10 23:03 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

On 11/09, Robert Bragg wrote:
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> ---
>  tests/Makefile.sources |    1 +
>  tests/perf.c           | 2220 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 2221 insertions(+)
>  create mode 100644 tests/perf.c
> 
> diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> index 6d081c3..7c6de2f 100644
> --- a/tests/Makefile.sources
> +++ b/tests/Makefile.sources
> @@ -211,6 +211,7 @@ TESTS_progs = \
>  	kms_pwrite_crc \
>  	kms_sink_crc_basic \
>  	prime_udl \
> +	perf \
>  	$(NULL)
>  
>  # IMPORTANT: The ZZ_ tests need to be run last!
> diff --git a/tests/perf.c b/tests/perf.c
> new file mode 100644
> index 0000000..4762e36
> --- /dev/null
> +++ b/tests/perf.c
> @@ -0,0 +1,2220 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <fcntl.h>
> +#include <inttypes.h>
> +#include <errno.h>
> +#include <sys/stat.h>
> +#include <sys/time.h>
> +#include <sys/times.h>
> +#include <sys/types.h>
> +#include <dirent.h>
> +#include <time.h>
> +#include <poll.h>
> +#include <math.h>
> +
> +#include "igt.h"
> +#include "drm.h"
> +
> +IGT_TEST_DESCRIPTION("Test the i915 perf metrics streaming interface");
> +
> +#define GEN6_MI_REPORT_PERF_COUNT ((0x28 << 23) | (3 - 2))
> +
> +#define GFX_OP_PIPE_CONTROL     ((3 << 29) | (3 << 27) | (2 << 24))
> +#define PIPE_CONTROL_CS_STALL           (1 << 20)
> +#define PIPE_CONTROL_GLOBAL_SNAPSHOT_COUNT_RESET        (1 << 19)
> +#define PIPE_CONTROL_TLB_INVALIDATE     (1 << 18)
> +#define PIPE_CONTROL_SYNC_GFDT          (1 << 17)
> +#define PIPE_CONTROL_MEDIA_STATE_CLEAR  (1 << 16)
> +#define PIPE_CONTROL_NO_WRITE           (0 << 14)
> +#define PIPE_CONTROL_WRITE_IMMEDIATE    (1 << 14)
> +#define PIPE_CONTROL_WRITE_DEPTH_COUNT  (2 << 14)
> +#define PIPE_CONTROL_WRITE_TIMESTAMP    (3 << 14)
> +#define PIPE_CONTROL_DEPTH_STALL        (1 << 13)
> +#define PIPE_CONTROL_RENDER_TARGET_FLUSH (1 << 12)
> +#define PIPE_CONTROL_INSTRUCTION_INVALIDATE (1 << 11)
> +#define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE   (1 << 10) /* GM45+ only */
> +#define PIPE_CONTROL_ISP_DIS            (1 << 9)
> +#define PIPE_CONTROL_INTERRUPT_ENABLE   (1 << 8)
> +#define PIPE_CONTROL_FLUSH_ENABLE       (1 << 7) /* Gen7+ only */
> +/* GT */
> +#define PIPE_CONTROL_DATA_CACHE_INVALIDATE      (1 << 5)
> +#define PIPE_CONTROL_VF_CACHE_INVALIDATE        (1 << 4)
> +#define PIPE_CONTROL_CONST_CACHE_INVALIDATE     (1 << 3)
> +#define PIPE_CONTROL_STATE_CACHE_INVALIDATE     (1 << 2)
> +#define PIPE_CONTROL_STALL_AT_SCOREBOARD        (1 << 1)
> +#define PIPE_CONTROL_DEPTH_CACHE_FLUSH          (1 << 0)
> +#define PIPE_CONTROL_PPGTT_WRITE        (0 << 2)
> +#define PIPE_CONTROL_GLOBAL_GTT_WRITE   (1 << 2)
> +
> +#define NSEC_PER_SEC 1000000000ull
> +
> +static struct {
> +        const char *name;
> +        uint64_t id;
> +        size_t size;
> +        int a_off; /* bytes */
> +        int n_a;
> +        int first_a;
> +        int b_off;
> +        int n_b;
> +        int c_off;
> +        int n_c;
> +} hsw_oa_formats[] = {
> +        { "A13", I915_OA_FORMAT_A13, .size = 64,
> +                .a_off = 12, .n_a = 13 },
> +        { "A29", I915_OA_FORMAT_A29, .size = 128,
> +                .a_off = 12, .n_a = 29 },
> +        { "A13_B8_C8", I915_OA_FORMAT_A13_B8_C8, .size = 128,
> +                .a_off = 12, .n_a = 13,
> +                .b_off = 64, .n_b = 8,
> +                .c_off = 96, .n_c = 8 },
> +        { "A45_B8_C8", I915_OA_FORMAT_A45_B8_C8, .size = 256,
> +                .a_off = 12,  .n_a = 45,
> +                .b_off = 192, .n_b = 8,
> +                .c_off = 224, .n_c = 8 },
> +        { "B4_C8", I915_OA_FORMAT_B4_C8, .size = 64,
> +                .b_off = 16, .n_b = 4,
> +                .c_off = 32, .n_c = 8 },
> +        { "B4_C8_A16", I915_OA_FORMAT_B4_C8_A16, .size = 128,
> +                .b_off = 16, .n_b = 4,
> +                .c_off = 32, .n_c = 8,
> +                .a_off = 60, .n_a = 16, .first_a = 29 },
> +        { "C4_B8", I915_OA_FORMAT_C4_B8, .size = 64,
> +                .c_off = 16, .n_c = 4,
> +                .b_off = 28, .n_b = 8 },
> +};
> +
> +static bool hsw_undefined_a_counters[45] = {
> +        [4] = true,
> +        [6] = true,
> +        [9] = true,
> +        [11] = true,
> +        [14] = true,
> +        [16] = true,
> +        [19] = true,
> +        [21] = true,
> +        [24] = true,
> +        [26] = true,
> +        [29] = true,
> +        [31] = true,
> +        [34] = true,
> +        [43] = true,
> +        [44] = true,
> +};
> +
> +static int drm_fd;
> +static uint32_t devid;
> +static int device;
> +
> +static uint64_t hsw_render_basic_id = UINT64_MAX;
> +static uint64_t gt_min_freq_mhz_saved = 0;
> +static uint64_t gt_max_freq_mhz_saved = 0;
> +static uint64_t gt_min_freq_mhz = 0;
> +static uint64_t gt_max_freq_mhz = 0;
> +
> +static uint64_t timestamp_frequency = 12500000;
> +
> +static igt_render_copyfunc_t render_copy = NULL;
> +
> +static int
> +__perf_open(int fd, struct drm_i915_perf_open_param *param)
> +{
> +        int ret = igt_ioctl(fd, DRM_IOCTL_I915_PERF_OPEN, param);
> +
> +        igt_assert(ret >= 0);
> +        errno = 0;
> +
> +        return ret;
> +}
> +
> +static int
> +lookup_format(int i915_perf_fmt_id)
> +{
> +        for (int i = 0; i < ARRAY_SIZE(hsw_oa_formats); i++)
> +                if (hsw_oa_formats[i].id == i915_perf_fmt_id)
> +                        return i;
> +
> +        igt_assert(!"reached");
> +}
> +
> +static bool
> +try_read_u64_file(const char *file, uint64_t *val)
> +{
> +        char buf[32];
> +        int fd, n;
> +
> +        fd = open(file, O_RDONLY);
> +        if (fd < 0)
> +                return false;
> +
> +        while ((n = read(fd, buf, sizeof(buf) - 1)) < 0 && errno == EINTR)
> +                ;
> +        igt_assert(n >= 0);
> +
> +        close(fd);
> +
> +        buf[n] = '\0';
> +        *val = strtoull(buf, NULL, 0);
> +
> +        return true;
> +}
> +
> +static uint64_t
> +read_u64_file(const char *file)
> +{
> +        uint64_t val;
> +
> +        igt_assert_eq(try_read_u64_file(file, &val), true);
> +
> +        return val;
> +}
> +
> +static void
> +write_u64_file(const char *file, uint64_t val)
> +{
> +        char buf[32];
> +        int fd, len, ret;
> +
> +        fd = open(file, O_WRONLY);
> +        igt_assert(fd >= 0);
> +
> +        len = snprintf(buf, sizeof(buf), "%"PRIu64, val);
> +        igt_assert(len > 0);
> +
> +        while ((ret = write(fd, buf, len)) < 0 && errno == EINTR)
> +                ;
> +        igt_assert_eq(ret, len);
> +
> +        close(fd);
> +}
> +
> +static uint64_t
> +sysfs_read(const char *file)
> +{
> +        char buf[512];
> +
> +        snprintf(buf, sizeof(buf), "/sys/class/drm/card%d/%s", device, file);
> +
> +        return read_u64_file(buf);
> +}
> +
> +static void
> +sysfs_write(const char *file, uint64_t val)
> +{
> +        char buf[512];
> +
> +        snprintf(buf, sizeof(buf), "/sys/class/drm/card%d/%s", device, file);
> +
> +        write_u64_file(buf, val);
> +}
> +
> +static char *
> +read_debugfs_record(const char *file, const char *key)
> +{
> +        FILE *fp;
> +        char *line = NULL;
> +        size_t line_buf_size = 0;
> +        int len = 0;
> +        int key_len = strlen(key);
> +        char *value = NULL;
> +
> +        fp = igt_debugfs_fopen(file, "r");
> +        igt_require(fp);
> +
> +        while ((len = getline(&line, &line_buf_size, fp)) > 0) {
> +
> +                if (line[len - 1] == '\n')
> +                        line[len - 1] = '\0';
> +
> +                if (strncmp(key, line, key_len) == 0 &&
> +                    line[key_len] == ':' &&
> +                    line[key_len + 1] == ' ')
> +                {
> +                        value = strdup(line + key_len + 2);
> +                        goto done;
Why not just break here? Then we don't need the label. Or perhaps you
meant to whack an igt_assert(!"reached") in between?
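
For illustration, a minimal sketch of the break variant. Assumptions:
find_record() is a hypothetical stand-in for read_debugfs_record() that
takes an already opened stream, and it uses fgets() with a fixed buffer
instead of getline(), so lines longer than the buffer would be split.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for read_debugfs_record(): scans "key: value"
 * lines and returns a malloc'd copy of the value, or NULL if the key is
 * not found. The caller frees the result.
 */
char *find_record(FILE *fp, const char *key)
{
	char line[256];
	size_t key_len = strlen(key);
	char *value = NULL;

	while (fgets(line, sizeof(line), fp)) {
		size_t len = strlen(line);

		if (len > 0 && line[len - 1] == '\n')
			line[len - 1] = '\0';

		if (strncmp(key, line, key_len) == 0 &&
		    line[key_len] == ':' &&
		    line[key_len + 1] == ' ') {
			const char *v = line + key_len + 2;
			size_t n = strlen(v) + 1;

			value = malloc(n);
			if (value)
				memcpy(value, v, n);
			break; /* no label needed: cleanup follows the loop */
		}
	}

	return value;
}
```

Any cleanup that currently sits under the done: label would simply follow
the loop, since both the match and the end-of-file case end up there.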

> +                }
> +        }
> +
> +done:
> +        free(line);
> +        if (fp)
> +                fclose(fp);
> +        return value;
> +}
> +
> +static uint64_t
> +read_debugfs_u64_record(const char *file, const char *key)
> +{
> +        char *str_val = read_debugfs_record(file, key);
> +        uint64_t val;
> +
> +        igt_require(str_val);
> +
> +        val = strtoull(str_val, NULL, 0);
> +        free(str_val);
> +
> +        return val;
> +}
> +
> +static bool
> +lookup_hsw_render_basic_id(void)
> +{
> +        char buf[256];
> +
> +        snprintf(buf, sizeof(buf),
> +                 "/sys/class/drm/card%d/metrics/403d8832-1a27-4aa6-a64e-f5389ce7b212/id",
> +                 device);
> +
> +        return try_read_u64_file(buf, &hsw_render_basic_id);
> +}
> +
> +static void
> +gt_frequency_range_save(void)
> +{
> +        gt_min_freq_mhz_saved = sysfs_read("gt_min_freq_mhz");
> +        gt_max_freq_mhz_saved = sysfs_read("gt_max_freq_mhz");
> +
> +        gt_min_freq_mhz = gt_min_freq_mhz_saved;
> +        gt_max_freq_mhz = gt_max_freq_mhz_saved;
> +}
> +
> +static void
> +gt_frequency_pin(int gt_freq_mhz)
> +{
> +        igt_debug("requesting pinned GT freq = %dmhz\n", gt_freq_mhz);
> +
> +        if (gt_freq_mhz > gt_max_freq_mhz) {
> +                sysfs_write("gt_max_freq_mhz", gt_freq_mhz);
> +                sysfs_write("gt_min_freq_mhz", gt_freq_mhz);
> +        } else {
> +                sysfs_write("gt_min_freq_mhz", gt_freq_mhz);
> +                sysfs_write("gt_max_freq_mhz", gt_freq_mhz);
> +        }
> +        gt_min_freq_mhz = gt_freq_mhz;
> +        gt_max_freq_mhz = gt_freq_mhz;
> +}
> +
> +static void
> +gt_frequency_range_restore(void)
> +{
> +        igt_debug("restoring GT frequency range: min = %dmhz, max =%dmhz, current: min=%dmhz, max=%dmhz\n",
> +                  (int)gt_min_freq_mhz_saved,
> +                  (int)gt_max_freq_mhz_saved,
> +                  (int)gt_min_freq_mhz,
> +                  (int)gt_max_freq_mhz);
> +
> +        /* Assume current min/max are the same */
> +        if (gt_min_freq_mhz_saved > gt_max_freq_mhz) {
> +                sysfs_write("gt_max_freq_mhz", gt_max_freq_mhz_saved);
> +                sysfs_write("gt_min_freq_mhz", gt_min_freq_mhz_saved);
> +        } else {
> +                sysfs_write("gt_min_freq_mhz", gt_min_freq_mhz_saved);
> +                sysfs_write("gt_max_freq_mhz", gt_max_freq_mhz_saved);
> +        }
> +
> +        gt_min_freq_mhz = gt_min_freq_mhz_saved;
> +        gt_max_freq_mhz = gt_max_freq_mhz_saved;
> +}
> +
> +static uint64_t
> +timebase_scale(uint32_t u32_delta)
> +{
> +        return ((uint64_t)u32_delta * NSEC_PER_SEC) / timestamp_frequency;
> +}
> +
> +/* CAP_SYS_ADMIN is required to open system wide metrics, unless the system
> + * control parameter dev.i915.perf_stream_paranoid == 0 */
> +static void
> +test_system_wide_paranoid(void)
> +{
> +        igt_fork(child, 1) {
> +                uint64_t properties[] = {
> +                        /* Include OA reports in samples */
> +                        DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                        /* OA unit configuration */
> +                        DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                        DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                        DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +                };
> +                struct drm_i915_perf_open_param param = {
> +                        .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                                I915_PERF_FLAG_FD_NONBLOCK,
> +                        .num_properties = sizeof(properties) / 16,
> +                        .properties_ptr = (uint64_t)properties,
> +                };
> +
> +                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> +
> +                igt_drop_root();
> +
> +                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
> +        }
> +
> +        igt_waitchildren();
> +
> +        igt_fork(child, 1) {
> +                uint64_t properties[] = {
> +                        /* Include OA reports in samples */
> +                        DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                        /* OA unit configuration */
> +                        DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                        DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                        DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +                };
> +                struct drm_i915_perf_open_param param = {
> +                        .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                                I915_PERF_FLAG_FD_NONBLOCK,
> +                        .num_properties = sizeof(properties) / 16,
> +                        .properties_ptr = (uint64_t)properties,
> +                };
> +                int stream_fd;
> +
> +                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 0);
> +
> +                igt_drop_root();
> +
> +                stream_fd = __perf_open(drm_fd, &param);
> +                close(stream_fd);
> +        }
> +
> +        igt_waitchildren();
> +
> +        /* leave in paranoid state */
> +        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> +}
> +
> +static void
> +test_invalid_open_flags(void)
> +{
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = ~0, /* Undefined flag bits set! */
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +}
> +
> +static void
> +test_invalid_oa_metric_set_id(void)
> +{
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, UINT64_MAX,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                        I915_PERF_FLAG_FD_NONBLOCK,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd;
> +
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +
> +        properties[ARRAY_SIZE(properties) - 1] = 0; /* ID 0 is also reserved as invalid */
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +
> +        /* Check that we aren't just seeing false positives... */
> +        properties[ARRAY_SIZE(properties) - 1] = hsw_render_basic_id;
> +        stream_fd = __perf_open(drm_fd, &param);
> +        close(stream_fd);
> +
> +        /* There's no valid default OA metric set ID... */
> +        param.num_properties--;
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +}
> +
> +static void
> +test_invalid_oa_format_id(void)
> +{
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +                DRM_I915_PERF_PROP_OA_FORMAT, UINT64_MAX,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                        I915_PERF_FLAG_FD_NONBLOCK,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd;
> +
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +
> +        properties[ARRAY_SIZE(properties) - 1] = 0; /* ID 0 is also reserved as invalid */
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +
> +        /* Check that we aren't just seeing false positives... */
> +        properties[ARRAY_SIZE(properties) - 1] = I915_OA_FORMAT_A45_B8_C8;
> +        stream_fd = __perf_open(drm_fd, &param);
> +        close(stream_fd);
> +
> +        /* There's no valid default OA format... */
> +        param.num_properties--;
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +}
> +
> +static void
> +test_missing_sample_flags(void)
> +{
> +        uint64_t properties[] = {
> +                /* No _PROP_SAMPLE_xyz flags */
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +
> +        do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +}
> +
> +static void
> +read_2_oa_reports(int stream_fd,
> +                  uint64_t format_id,
> +                  size_t format_size,
> +                  int exponent,
> +                  uint32_t *oa_report0,
> +                  uint32_t *oa_report1,
> +                  bool timer_only)
> +{
> +        size_t sample_size = (sizeof(struct drm_i915_perf_record_header) +
> +                              format_size);
> +        const struct drm_i915_perf_record_header *header;
> +        uint32_t exponent_mask = (1 << (exponent + 1)) - 1;
> +
> +        /* Note: we allocate a large buffer so that each read() iteration
> +         * should scrape *all* pending records.
> +         *
> +         * The largest buffer the OA unit supports is 16MB and the smallest
> +         * OA report format is 64bytes allowing up to 262144 reports to
> +         * be buffered.
> +         *
> +         * Being sure we are fetching all buffered reports allows us to
> +         * potentially throw away / skip all reports whenever we see
> +         * a _REPORT_LOST notification as a way of being sure our
> +         * measurements aren't skewed by a lost report.
> +         *
> +         * Note: this is useful for some tests but also not something
> +         * applications would be expected to resort to. Lost reports are
> +         * somewhat unpredictable but typically don't pose a problem - except
> +         * to indicate that the OA unit may be overtaxed if lots of reports
> +         * are being lost.
> +         */
> +        int buf_size = 262144 * (64 + sizeof(struct drm_i915_perf_record_header));
> +        uint8_t *buf = malloc(buf_size);
> +        int n = 0;
> +
> +        for (int i = 0; i < 1000; i++) {
> +                ssize_t len;
> +
> +                while ((len = read(stream_fd, buf, buf_size)) < 0 &&
> +                       errno == EINTR)
> +                        ;
> +
> +                igt_assert(len > 0);
> +
> +                for (size_t offset = 0; offset < len; offset += header->size) {
> +                        const uint32_t *report;
> +
> +                        header = (void *)(buf + offset);
> +
> +                        igt_assert_eq(header->pad, 0); /* Reserved */
> +
> +                        /* Currently the only test that should ever expect to
> +                         * see a _BUFFER_LOST error is the buffer_fill test,
> +                         * otherwise something bad has probably happened...
> +                         */
> +                        igt_assert_neq(header->type, DRM_I915_PERF_RECORD_OA_BUFFER_LOST);
> +
> +                        /* At high sampling frequencies the OA HW might not be
> +                         * able to cope with all write requests and will notify
> +                         * us that a report was lost. We restart our read of
> +                         * two sequential reports due to the timeline blip this
> +                         * implies
> +                         */
> +                        if (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST) {
> +                                igt_debug("read restart: OA trigger collision / report lost\n");
> +                                n = 0;
> +
> +                                /* XXX: break, because we don't know where
> +                                 * within the series of already read reports
> +                                 * there could be a blip from the lost report.
> +                                 */
> +                                break;
> +                        }
> +
> +                        /* Currently the only other record type expected is a
> +                         * _SAMPLE. Notably this test will need updating if
> +                         * i915-perf is extended in the future with additional
> +                         * record types.
> +                         */
> +                        igt_assert_eq(header->type, DRM_I915_PERF_RECORD_SAMPLE);
> +
> +                        igt_assert_eq(header->size, sample_size);
> +
> +                        report = (const void *)(header + 1);
> +
> +                        igt_debug("read report: reason = %x, timestamp = %x, exponent mask=%x\n",
> +                                  report[0], report[1], exponent_mask);
> +
> +                        /* Don't expect zero for timestamps */
> +                        igt_assert_neq(report[1], 0);
> +
> +                        if (timer_only) {
> +                                /* For Haswell we don't have a documented
> +                                 * report reason field (though empirically
> +                                 * report[0] bit 10 does seem to correlate with
> +                                 * a timer trigger reason) so we instead infer
> +                                 * which reports are timer triggered by
> +                                 * checking if the least significant bits are
> +                                 * zero and the exponent bit is set.
> +                                 */
> +                                if ((report[1] & exponent_mask) != (1 << exponent)) {
> +                                        igt_debug("skipping non timer report reason=%x\n",
> +                                                  report[0]);
> +
> +                                        /* Also assert our hypothesis about the
> +                                         * reason bit...
> +                                         */
> +                                        igt_assert_eq(report[0] & (1 << 10), 0);
> +                                        continue;
> +                                }
> +                        }
> +
> +                        if (n++ == 0)
> +                                memcpy(oa_report0, report, format_size);
> +                        else {
> +                                memcpy(oa_report1, report, format_size);
> +                                free(buf);
> +                                return;
> +                        }
> +                }
> +        }
> +
> +        free(buf);
> +
> +        igt_assert(!"reached");
> +}
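For anyone following along, the timer-report inference above can be sketched in isolation. This is only an illustration: it assumes exponent_mask (defined earlier in the patch, not shown in this hunk) covers the exponent bit plus all lower bits, and the helper names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the mask covers the bit selected by the exponent and every
 * bit below it. Only meaningful for exponents 0..30 with 32bit timestamps.
 */
static uint32_t oa_exponent_mask(int exponent)
{
        assert(exponent >= 0 && exponent < 31);
        return (UINT32_C(1) << (exponent + 1)) - 1;
}

/* A report is treated as timer triggered when the exponent bit of the raw
 * timestamp is set and all less significant bits are zero.
 */
static bool looks_timer_triggered(uint32_t timestamp, int exponent)
{
        return (timestamp & oa_exponent_mask(exponent)) ==
               (UINT32_C(1) << exponent);
}
```

With exponent 13, timestamps 0x2000 and 0x6000 both qualify, and their difference is 0x4000, i.e. the 2 << 13 delta that test_oa_exponents() later expects.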
> +
> +static void
> +open_and_read_2_oa_reports(uint64_t format_id,
> +                           size_t format_size,
> +                           int exponent,
> +                           uint32_t *oa_report0,
> +                           uint32_t *oa_report1,
> +                           bool timer_only)
> +{
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, format_id,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, exponent,
> +
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +
> +        read_2_oa_reports(stream_fd, format_id, format_size, exponent,
> +                          oa_report0, oa_report1, timer_only);
> +
> +        close(stream_fd);
> +}
> +
> +static void
> +print_reports(uint32_t *oa_report0, uint32_t *oa_report1, int fmt)
> +{
> +        uint32_t *a0, *b0, *c0;
> +        uint32_t *a1, *b1, *c1;
> +
> +        /* Not ideal naming here with a0 or a1
> +         * differentiating report0 or 1 not A counter 0 or 1....
> +         */
> +        a0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[fmt].a_off);
> +        b0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[fmt].b_off);
> +        c0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[fmt].c_off);
> +
> +        a1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[fmt].a_off);
> +        b1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[fmt].b_off);
> +        c1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[fmt].c_off);
> +
> +        igt_debug("TIMESTAMP: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
> +                  oa_report0[1], oa_report1[1], oa_report1[1] - oa_report0[1]);
> +
> +        if (hsw_oa_formats[fmt].n_c) {
> +                igt_debug("CLOCK: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
> +                          c0[2], c1[2], c1[2] - c0[2]);
> +        } else
> +                igt_debug("CLOCK = N/A\n");
> +
> +        for (int j = hsw_oa_formats[fmt].first_a;
> +             j < hsw_oa_formats[fmt].n_a;
> +             j++)
> +        {
> +                uint32_t delta = a1[j] - a0[j];
> +
> +                if (hsw_undefined_a_counters[j])
> +                        continue;
> +
> +                igt_debug("A%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
> +                          j, a0[j], a1[j], delta);
> +        }
> +
> +        for (int j = 0; j < hsw_oa_formats[fmt].n_b; j++) {
> +                uint32_t delta = b1[j] - b0[j];
> +                igt_debug("B%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
> +                          j, b0[j], b1[j], delta);
> +        }
> +
> +        for (int j = 0; j < hsw_oa_formats[fmt].n_c; j++) {
> +                uint32_t delta = c1[j] - c0[j];
> +                igt_debug("C%d: 1st = %"PRIu32", 2nd = %"PRIu32", delta = %"PRIu32"\n",
> +                          j, c0[j], c1[j], delta);
> +        }
> +}
> +
> +static void
> +test_oa_formats(void)
> +{
> +        int oa_exponent = 13;
> +
> +        for (int i = 0; i < ARRAY_SIZE(hsw_oa_formats); i++) {
> +                uint32_t oa_report0[64];
> +                uint32_t oa_report1[64];
> +                uint32_t *a0, *b0, *c0;
> +                uint32_t *a1, *b1, *c1;
> +                uint32_t time_delta;
> +                uint32_t clock_delta;
> +                uint32_t max_delta;
> +
> +                igt_debug("Checking OA format %s\n", hsw_oa_formats[i].name);
> +
> +                open_and_read_2_oa_reports(hsw_oa_formats[i].id,
> +                                           hsw_oa_formats[i].size,
> +                                           oa_exponent,
> +                                           oa_report0,
> +                                           oa_report1,
> +                                           false); /* timer_only = false: accept any report */
> +
> +                print_reports(oa_report0, oa_report1, i);
> +
> +                a0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[i].a_off);
> +                b0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[i].b_off);
> +                c0 = (uint32_t *)(((uint8_t *)oa_report0) + hsw_oa_formats[i].c_off);
> +
> +                a1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[i].a_off);
> +                b1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[i].b_off);
> +                c1 = (uint32_t *)(((uint8_t *)oa_report1) + hsw_oa_formats[i].c_off);
> +
> +                time_delta = timebase_scale(oa_report1[1] - oa_report0[1]);
> +                igt_assert_neq(time_delta, 0);
> +
> +                /* C2 corresponds to a clock counter for this metric set but
> +                 * it's not included in all of the formats. */
> +                if (hsw_oa_formats[i].n_c) {
> +                        uint64_t freq;
> +
> +                        /* The first report might have a clock count of zero
> +                         * but we wouldn't expect that in the second report...
> +                         */
> +                        igt_assert_neq(c1[2], 0);
> +
> +                        clock_delta = c1[2] - c0[2];
> +                        igt_assert_neq(clock_delta, 0);
> +
> +                        freq = ((uint64_t)clock_delta * 1000) / time_delta;
> +                        igt_debug("freq = %"PRIu64"\n", freq);
> +
> +                        igt_assert(freq <= gt_max_freq_mhz);
> +                } else {
> +                        /* Assume running at max freq for sake of
> +                         * below sanity check on counters... */
> +                        clock_delta = (gt_max_freq_mhz *
> +                                       (uint64_t)time_delta) / 1000;
> +                }
> +
> +                igt_debug("clock delta = %"PRIu32"\n", clock_delta);
> +
> +                /* The maximum rate for any HSW counter =
> +                 *   clock_delta * 40 EUs
> +                 *
> +                 * Sanity check that no counters exceed this delta.
> +                 */
> +                max_delta = clock_delta * 40;
> +
> +                for (int j = hsw_oa_formats[i].first_a;
> +                     j < hsw_oa_formats[i].n_a;
> +                     j++)
> +                {
> +                        uint32_t delta = a1[j] - a0[j];
> +
> +                        if (hsw_undefined_a_counters[j])
> +                                continue;
> +
> +                        igt_debug("A%d: delta = %"PRIu32"\n", j, delta);
> +                        igt_assert(delta <= max_delta);
> +                }
> +
> +                for (int j = 0; j < hsw_oa_formats[i].n_b; j++) {
> +                        uint32_t delta = b1[j] - b0[j];
> +                        igt_debug("B%d: delta = %"PRIu32"\n", j, delta);
> +                        igt_assert(delta <= max_delta);
> +                }
> +
> +                for (int j = 0; j < hsw_oa_formats[i].n_c; j++) {
> +                        uint32_t delta = c1[j] - c0[j];
> +                        igt_debug("C%d: delta = %"PRIu32"\n", j, delta);
> +                        igt_assert(delta <= max_delta);
> +                }
> +        }
> +}
> +
> +static void
> +test_oa_exponents(int gt_freq_mhz)
> +{
> +        /* XXX: Note that gt_min/max_freq_mhz don't seem to be a reliable
> +         * mechanism for fixing the gpu frequency and since these unit tests
> +         * are focused on the OA unit not the ability to pin the frequency we
> +         * read back the current frequency for each iteration of this test to
> +         * take this into account.
> +         */
> +        gt_frequency_pin(gt_freq_mhz);
> +
> +        igt_debug("Testing OA timer exponents with requested GT frequency = %dmhz\n",
> +                  gt_freq_mhz);
> +
> +        /* It's asking a lot to sample with a 160 nanosecond period and the
> +         * test can fail due to buffer overflows if it wasn't possible to
> +         * keep up, so we don't start from an exponent of zero...
> +         */
> +        for (int i = 2; i < 20; i++) {
> +                uint32_t expected_timestamp_delta;
> +                uint32_t timestamp_delta;
> +                uint32_t oa_report0[64];
> +                uint32_t oa_report1[64];
> +                uint32_t *c0, *c1;
> +                uint32_t time_delta;
> +                uint32_t clock_delta;
> +                uint32_t freq;
> +                int n_freq_matches = 0;
> +
> +                /* The exponent is effectively selecting a bit in the timestamp
> +                 * to trigger reports on and so in practice we expect the raw
> +                 * timestamp deltas for periodic reports to exactly match the
> +                 * value of the next bit.
> +                 */
> +                expected_timestamp_delta = 2 << i;
> +
> +                for (int j = 0; j < 10; j++) {
> +                        gt_freq_mhz = sysfs_read("gt_act_freq_mhz");
> +
> +                        igt_debug("ITER %d: testing OA exponent %d with GT freq = %dmhz\n",
> +                                  j, i, gt_freq_mhz);
> +
> +                        open_and_read_2_oa_reports(I915_OA_FORMAT_A45_B8_C8, 256,
> +                                                   i, /* exponent */
> +                                                   oa_report0,
> +                                                   oa_report1,
> +                                                   true); /* timer triggered
> +                                                             reports only */
> +
> +                        timestamp_delta = oa_report1[1] - oa_report0[1];
> +                        igt_assert_neq(timestamp_delta, 0);
> +
> +                        if (timestamp_delta != expected_timestamp_delta) {
> +                                igt_debug("timestamp0 = %u/0x%x\n",
> +                                          oa_report0[1], oa_report0[1]);
> +                                igt_debug("timestamp1 = %u/0x%x\n",
> +                                          oa_report1[1], oa_report1[1]);
> +                        }
> +
> +                        igt_assert_eq(timestamp_delta, expected_timestamp_delta);
> +
> +                        c0 = (uint32_t *)(((uint8_t *)oa_report0) + 224 /* C offset */);
> +                        c1 = (uint32_t *)(((uint8_t *)oa_report1) + 224 /* C offset */);
Please add a comment here saying what the C2 custom counter actually is:
"The total number of GPU core clocks elapsed during the measurement."
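
To make the units explicit, here is a rough standalone sketch of the frequency calculation that follows. It assumes an 80ns OA timestamp tick on Haswell (consistent with the 160ns period quoted for exponent 0); ticks_to_ns() is a stand-in for timebase_scale(), whose definition is not part of this hunk.

```c
#include <assert.h>
#include <stdint.h>

#define OA_TICK_NS 80 /* assumption: ~80ns per OA timestamp tick on HSW */

/* Stand-in for timebase_scale(): raw timestamp ticks to nanoseconds. */
static uint64_t ticks_to_ns(uint32_t ticks)
{
        return (uint64_t)ticks * OA_TICK_NS;
}

/* clock_delta is the number of GPU core clocks elapsed between the two
 * reports (the C2 delta). Clocks per nanosecond times 1000 gives MHz.
 */
static uint32_t gpu_freq_mhz(uint32_t clock_delta, uint32_t timestamp_delta)
{
        uint64_t ns = ticks_to_ns(timestamp_delta);

        assert(ns != 0);
        return (uint32_t)(((uint64_t)clock_delta * 1000) / ns);
}
```

For exponent 13 the timestamp delta is 16384 ticks (about 1.31ms), so 589824 elapsed clocks works out to 450MHz.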

> +                        clock_delta = c1[2] - c0[2];
> +
> +                        time_delta = timebase_scale(timestamp_delta);
> +
> +                        freq = ((uint64_t)clock_delta * 1000) / time_delta;
> +                        igt_debug("ITER %d: time delta = %"PRIu32"(ns) clock delta = %"PRIu32" freq = %"PRIu32"(mhz)\n",
> +                                  j, time_delta, clock_delta, freq);
> +
> +                        if (freq == gt_freq_mhz)
> +                                n_freq_matches++;
> +                }
> +
> +                igt_debug("number of iterations with expected clock frequency = %d\n",
> +                          n_freq_matches);
> +
> +                /* Don't assert the calculated frequency for extremely short
> +                 * durations... */
> +                if (i > 3)
> +                        igt_assert(n_freq_matches >= 7);
> +        }
> +
> +        gt_frequency_range_restore();
> +}
> +
> +/* The OA exponent selects a timestamp counter bit to trigger reports on.
> + *
> + * With a 64bit timestamp and least significant bit approx == 80ns then the MSB
> + * equates to > 40 thousand years and isn't exposed via the i915 perf interface.
> + *
> + * The max exponent exposed is expected to be 31, which is still a fairly
> + * ridiculous period (>5min) but is the maximum exponent where it's still
> + * possible to use periodic sampling as a means for tracking the overflow of
> + * 32bit OA report timestamps.
> + */
> +static void
> +test_invalid_oa_exponent(void)
> +{
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, 31, /* maximum exponent expected
> +                                                       to be accepted */
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +
> +        close(stream_fd);
> +
> +        for (int i = 32; i < 65; i++) {
> +                properties[7] = i;
> +                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EINVAL);
> +        }
> +}
> +
> +/* Return the largest OA exponent that will still result in a sampling
> + * frequency higher than the given frequency.
> + */
> +static int
> +max_oa_exponent_for_higher_freq(uint64_t freq)
> +{
> +        /* NB: timebase_scale() takes a uint32_t and an exponent of 30
> +         * would already represent a period of ~3 minutes so there's
> +         * really no need to consider higher exponents.
> +         */
> +        for (int i = 0; i < 30; i++) {
> +                uint64_t oa_period = timebase_scale(2 << i);
> +                uint32_t oa_freq = NSEC_PER_SEC / oa_period;
> +
> +                if (oa_freq <= freq)
> +                        return max(0, i - 1);
> +        }
> +
> +        igt_assert(!"reached");
> +        return -1;
> +}
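A worked example may help readers check the numbers used by the permission test below. This is a sketch under the same 80ns tick assumption, with hypothetical helper names that mirror, but are not, the patch's functions.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull
#define OA_TICK_NS   80ull /* assumption: ~80ns per timestamp tick */

/* Period between periodic reports for a given OA exponent. */
static uint64_t oa_period_ns(int exponent)
{
        assert(exponent >= 0 && exponent < 31);
        return (UINT64_C(2) << exponent) * OA_TICK_NS;
}

/* Largest exponent whose sampling frequency is still above freq_hz,
 * clamped to 0 when even exponent 0 is not fast enough.
 */
static int max_exponent_above(uint64_t freq_hz)
{
        for (int i = 0; i < 30; i++) {
                if (NSEC_PER_SEC / oa_period_ns(i) <= freq_hz)
                        return i > 0 ? i - 1 : 0;
        }
        return -1;
}
```

With the default 100KHz limit, exponent 5 (5.12us, about 195KHz) is the last one that exceeds the limit and exponent 6 (10.24us, about 97.6KHz) is the first that an unprivileged process should be allowed to request.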
> +
> +/* The lowest periodic sampling exponent equates to a period of 160 nanoseconds
> + * or a frequency of 6.25MHz which is only possible to request as root by
> + * default. By default the maximum OA sampling rate is 100KHz
> + */
> +static void
> +test_low_oa_exponent_permissions(void)
> +{
> +        int max_freq = read_u64_file("/proc/sys/dev/i915/oa_max_sample_rate");
> +        int bad_exponent = max_oa_exponent_for_higher_freq(max_freq);
> +        int ok_exponent = bad_exponent + 1;
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, bad_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        uint64_t oa_period, oa_freq;
> +
> +        igt_assert_eq(max_freq, 100000);
> +
> +        /* Avoid EACCES errors opening a stream without CAP_SYS_ADMIN */
> +        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 0);
> +
> +        igt_fork(child, 1) {
> +                igt_drop_root();
> +
> +                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
> +        }
> +
> +        igt_waitchildren();
> +
> +        properties[7] = ok_exponent;
> +
> +        igt_fork(child, 1) {
> +                int stream_fd;
> +
> +                igt_drop_root();
> +
> +                stream_fd = __perf_open(drm_fd, &param);
> +                close(stream_fd);
> +        }
> +
> +        igt_waitchildren();
> +
> +        oa_period = timebase_scale(2 << ok_exponent);
> +        oa_freq = NSEC_PER_SEC / oa_period;
> +        write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", oa_freq - 100);
> +
> +        igt_fork(child, 1) {
> +                igt_drop_root();
> +
> +                do_ioctl_err(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param, EACCES);
> +        }
> +
> +        igt_waitchildren();
> +
> +        /* restore the defaults */
> +        write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
> +        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> +}
> +
> +static void
> +test_per_context_mode_unprivileged(void)
> +{
> +        uint64_t properties[] = {
> +                /* Single context sampling */
> +                DRM_I915_PERF_PROP_CTX_HANDLE, UINT64_MAX, /* updated below */
> +
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, 13, /* 1 millisecond */
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +
> +        /* should be default, but just to be sure... */
> +        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> +
> +        igt_fork(child, 1) {
> +                drm_intel_context *context;
> +                drm_intel_bufmgr *bufmgr;
> +                int stream_fd;
> +
> +                igt_drop_root();
> +
> +                bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
> +                context = drm_intel_gem_context_create(bufmgr);
> +
> +                igt_assert(context);
> +
> +                properties[1] = drm_intel_gem_context_get_context_id(context);
> +
> +                stream_fd = __perf_open(drm_fd, &param);
> +                close(stream_fd);
Missing drm_intel_gem_context_destroy.
Missing drm_intel_bufmgr_destroy.

> +        }
> +
> +        igt_waitchildren();
> +}
> +
> +static int64_t
> +get_time(void)
> +{
> +        struct timespec ts;
> +
> +        clock_gettime(CLOCK_MONOTONIC, &ts);
> +
> +        return ts.tv_sec * 1000000000 + ts.tv_nsec;
> +}
> +
> +/* Note: The interface doesn't currently provide strict guarantees or control
> + * over the upper bound for how long it might take for a POLLIN event after
> + * some OA report is written by the OA unit.
> + *
> + * The plan is to add a property later that gives some control over the maximum
> + * latency, but for now we expect it is tuned for a fairly low latency
> + * suitable for applications wanting to provide live feedback for captured
> + * metrics.
> + *
> + * At the time of writing this test the driver was using a fixed 200Hz hrtimer
> + * regardless of the OA sampling exponent.
> + *
> + * There is no lower bound since a stream configured for periodic sampling may
> + * still contain other automatically triggered reports.
> + *
> + * What we try and check for here is that blocking reads don't return EAGAIN
> + * and that we aren't spending any significant time burning the cpu in
> + * kernelspace.
> + */
> +static void
> +test_blocking(void)
> +{
> +        /* 40 milliseconds
> +         *
> +         * Having a period somewhat > sysconf(_SC_CLK_TCK) helps to stop
> +         * scheduling (liable to kick in when we make blocking poll()s/reads)
> +         * from interfering with the test.
> +         */
> +        int oa_exponent = 18;
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        uint8_t buf[1024 * 1024];
> +        struct tms start_times;
> +        struct tms end_times;
> +        int64_t user_ns, kernel_ns;
> +        int64_t tick_ns = 1000000000 / sysconf(_SC_CLK_TCK);
> +        int64_t start;
> +        int n = 0;
> +
> +        times(&start_times);
> +
> +        /* Loop for 600ms performing blocking reads while the HW is sampling at
> +         * ~25Hz, with the expectation that we spend most of our time blocked
> +         * in the kernel, and shouldn't be burning cpu cycles in the kernel in
> +         * association with this process (verified by looking at stime before
> +         * and after loop).
> +         */
> +        for (start = get_time(); (get_time() - start) < 600000000; /* nop */) {
> +                int ret;
> +
> +                while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
> +                       errno == EINTR)
> +                        ;
> +
> +                igt_assert(ret > 0);
> +
> +                n++;
> +        }
> +
> +        times(&end_times);
> +
> +        /* Using nanosecond units is fairly silly here, given the tick
> +         * imprecision - ah well, it's consistent with the get_time() units.
> +         */
> +        user_ns = (end_times.tms_utime - start_times.tms_utime) * tick_ns;
> +        kernel_ns = (end_times.tms_stime - start_times.tms_stime) * tick_ns;
> +
> +        igt_debug("%d blocking reads in 600 milliseconds, with 25Hz OA sampling\n", n);
> +        igt_debug("time in userspace = %"PRIu64"ns (start utime = %d, end = %d, ns ticks per sec = %d)\n",
> +                  user_ns, (int)start_times.tms_utime, (int)end_times.tms_utime, (int)tick_ns);
> +        igt_debug("time in kernelspace = %"PRIu64"ns (start stime = %d, end = %d, ns ticks per sec = %d)\n",
> +                  kernel_ns, (int)start_times.tms_stime, (int)end_times.tms_stime, (int)tick_ns);
> +
> +        /* With completely broken blocking (but also not returning an error) we
> +         * could end up with an open loop, hopefully recognisable with > 15
> +         * (600/40) iterations.
> +         */
> +        igt_assert(n <= 15);
> +
> +        /* It's a bit tricky to put a lower limit here, but we expect a
> +         * relatively low latency for seeing reports, while we don't currently
> +         * give any control over this in the api.
> +         *
> +         * Limited to a 5 millisecond latency and 45ms (worst case)
> +         * per-iteration that could give 13.3 iterations. Rounding gives a tiny
> +         * bit more latency slack (6ms)...
> +         */
> +        igt_assert(n > 13);
> +
> +        /* A bit tricky to put a number on this, but we don't expect the kernel
> +         * to use any significant cpu while waiting and given the imprecision
> +         * of stime (multiple of CLK_TCK) we expect this to round to zero.
> +         */
> +        igt_assert_eq(kernel_ns, 0);
> +
> +        close(stream_fd);
> +}
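The two bounds on n are easier to review with the arithmetic written out. A small sketch, using only the figures from the comments above (600ms window, 40ms nominal period, 5ms latency budget); the helper name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of complete iterations that fit in the window when each one
 * takes one sampling period plus some wakeup latency.
 */
static int64_t reads_in_window(int64_t window_ns, int64_t period_ns,
                               int64_t latency_ns)
{
        assert(period_ns + latency_ns > 0);
        return window_ns / (period_ns + latency_ns);
}
```

Zero latency gives 600/40 = 15 and a 5ms latency gives 600/45 = 13 (13.3 rounded down), so the pair of asserts only accepts n == 14 or n == 15. Assuming an 80ns tick, exponent 18 is really a period of about 41.9ms, which lands on 14 even with no latency at all, so there may be less slack here than the comment suggests.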
> +
> +static void
> +test_polling(void)
> +{
> +        /* 40 milliseconds
> +         *
> +         * Having a period somewhat > sysconf(_SC_CLK_TCK) helps to stop
> +         * scheduling (liable to kick in when we make blocking poll()s/reads)
> +         * from interfering with the test.
> +         */
> +        int oa_exponent = 18;
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                        I915_PERF_FLAG_FD_NONBLOCK,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        uint8_t buf[1024 * 1024];
> +        struct tms start_times;
> +        struct tms end_times;
> +        int64_t user_ns, kernel_ns;
> +        int64_t tick_ns = 1000000000 / sysconf(_SC_CLK_TCK);
> +        int64_t start;
> +        int n = 0;
> +
> +        times(&start_times);
> +
> +        /* Loop for 600ms performing blocking polls while the HW is sampling at
> +         * ~25Hz, with the expectation that we spend most of our time blocked
> +         * in the kernel, and shouldn't be burning cpu cycles in the kernel in
> +         * association with this process (verified by looking at stime before
> +         * and after loop).
> +         */
> +        for (start = get_time(); (get_time() - start) < 600000000; /* nop */) {
> +                struct pollfd pollfd = { .fd = stream_fd, .events = POLLIN };
> +                int ret;
> +
> +                while ((ret = poll(&pollfd, 1, -1)) < 0 &&
> +                       errno == EINTR)
> +                        ;
> +                igt_assert_eq(ret, 1);
> +                igt_assert(pollfd.revents & POLLIN);
> +
> +                while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
> +                       errno == EINTR)
> +                        ;
> +
> +                /* Don't expect to see EAGAIN if we've had a POLLIN event
> +                 *
> +                 * XXX: actually this is technically overly strict since we do
> +                 * knowingly allow false positive POLLIN events. At least in
> +                 * the future when supporting context filtering of metrics for
> +                 * Gen8+ handled in the kernel then POLLIN events may be
> +                 * delivered when we know there are pending reports to process
> +                 * but before we've done any filtering to know for certain that
> +                 * any reports are destined to be copied to userspace.
> +                 *
> +                 * Still, for now it's a reasonable sanity check.
> +                 */
> +                if (ret < 0)
> +                        igt_debug("Unexpected error when reading after poll = %d\n", errno);
> +                igt_assert_neq(ret, -1);
> +
> +                /* At this point, after consuming pending reports (and hoping
> +                 * the scheduler hasn't stopped us for too long) we now
> +                 * expect EAGAIN on read.
> +                 */
> +                while ((ret = read(stream_fd, buf, sizeof(buf))) < 0 &&
> +                       errno == EINTR)
> +                        ;
> +                igt_assert_eq(ret, -1);
> +                igt_assert_eq(errno, EAGAIN);
> +
> +                n++;
> +        }
> +
> +        times(&end_times);
> +
> +        /* Using nanosecond units is fairly silly here, given the tick
> +         * imprecision - ah well, it's consistent with the get_time() units.
> +         */
> +        user_ns = (end_times.tms_utime - start_times.tms_utime) * tick_ns;
> +        kernel_ns = (end_times.tms_stime - start_times.tms_stime) * tick_ns;
> +
> +        igt_debug("%d blocking poll()s in 600 milliseconds, with 25Hz OA sampling\n", n);
> +        igt_debug("time in userspace = %"PRIu64"ns (start utime = %d, end = %d, ns ticks per sec = %d)\n",
> +                  user_ns, (int)start_times.tms_utime, (int)end_times.tms_utime, (int)tick_ns);
> +        igt_debug("time in kernelspace = %"PRIu64"ns (start stime = %d, end = %d, ns ticks per sec = %d)\n",
> +                  kernel_ns, (int)start_times.tms_stime, (int)end_times.tms_stime, (int)tick_ns);
> +
> +        /* With completely broken blocking while polling (but still somehow
> +         * reporting a POLLIN event) we could end up with an open loop,
> +         * hopefully recognisable with > 15 (600/40) iterations.
> +         */
> +        igt_assert(n <= 15);
> +
> +        /* It's a bit tricky to put a lower limit here, but we expect a
> +         * relatively low latency for seeing reports, while we don't currently
> +         * give any control over this in the api.
> +         *
> +         * Limited to a 5 millisecond latency and 45ms (worst case)
> +         * per-iteration that could give 13.3 iterations. Rounding gives a tiny
> +         * bit more latency slack (6ms)...
> +         */
> +        igt_assert(n > 13);
> +
> +        /* A bit tricky to put a number on this, but we don't expect the kernel
> +         * to use any significant cpu while waiting and given the imprecision
> +         * of stime (multiple of CLK_TCK) we expect this to round to zero.
> +         */
> +        igt_assert_eq(kernel_ns, 0);
> +
> +        close(stream_fd);
> +}
> +
> +static void
> +test_buffer_fill(void)
> +{
> +        int oa_exponent = 5; /* 5 micro seconds */
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        int buf_size = 65536 * (256 + sizeof(struct drm_i915_perf_record_header));
> +        uint8_t *buf = malloc(buf_size);
> +
> +
> +        for (int i = 0; i < 5; i++) {
> +                struct drm_i915_perf_record_header *header;
> +                bool overflow_seen;
> +                int offset = 0;
> +                int len;
> +
> +                /* It should take ~330 milliseconds to fill a 16MB OA buffer with a
> +                 * 5 microsecond sampling period and 256 byte reports. */
> +                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
> +
> +                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
> +                        ;
> +
> +                igt_assert_neq(len, -1);
> +
> +                overflow_seen = false;
> +                for (offset = 0; offset < len; offset += header->size) {
> +                        header = (void *)(buf + offset);
> +
> +                        if (header->type == DRM_I915_PERF_RECORD_OA_BUFFER_LOST)
> +                                overflow_seen = true;
> +                }
> +
> +                igt_assert_eq(overflow_seen, true);
> +
> +                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
> +
> +                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
> +                        ;
> +
> +                igt_assert_neq(len, -1);
> +
> +                /* expect ~ 200 records in 1 millisecond */
> +                igt_assert(len > 256 * 150);
> +
> +                overflow_seen = false;
> +                for (offset = 0; offset < len; offset += header->size) {
> +                        header = (void *)(buf + offset);
> +
> +                        if (header->type == DRM_I915_PERF_RECORD_OA_BUFFER_LOST)
> +                                overflow_seen = true;
> +                }
> +
> +                igt_assert_eq(overflow_seen, false);
> +        }
> +
> +        free(buf);
> +
> +        close(stream_fd);
> +}
> +
> +static void
> +test_enable_disable(void)
> +{
> +        int oa_exponent = 5; /* 5 micro seconds */
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                         I915_PERF_FLAG_DISABLED, /* Verify we start disabled */
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        int buf_size = 65536 * (256 + sizeof(struct drm_i915_perf_record_header));
> +        uint8_t *buf = malloc(buf_size);
> +
> +
> +        for (int i = 0; i < 5; i++) {
> +                int len;
> +
> +                /* If the stream were enabled then it would take ~330
> +                 * milliseconds to fill a 16MB OA buffer with a 5 microsecond
> +                 * sampling period and 256 byte reports.
> +                 *
> +                 * Giving enough time for an overflow might help catch whether
> +                 * the OA unit has been enabled even if the driver might at
> +                 * least avoid copying reports while disabled.
> +                 */
> +                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
> +
> +                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
> +                        ;
> +
> +                igt_assert_eq(len, -1);
> +                igt_assert_eq(errno, EIO);
> +
> +                do_ioctl(stream_fd, I915_PERF_IOCTL_ENABLE, 0);
> +
> +                nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
> +
> +                while ((len = read(stream_fd, buf, buf_size)) == -1 && errno == EINTR)
> +                        ;
> +
> +                igt_assert_neq(len, -1);
> +
> +                /* expect ~ 200 records in 1 millisecond */
> +                igt_assert(len > 256 * 150 && len < 256 * 2000);
> +
> +                do_ioctl(stream_fd, I915_PERF_IOCTL_DISABLE, 0);
> +
> +                /* It's considered an error to read a stream while it's disabled
> +                 * since it would block indefinitely...
> +                 */
> +                len = read(stream_fd, buf, buf_size);
> +
> +                igt_assert_eq(len, -1);
> +                igt_assert_eq(errno, EIO);
> +        }
> +
> +        free(buf);
> +
> +        close(stream_fd);
> +}
> +
> +static void
> +test_short_reads(void)
> +{
> +        int oa_exponent = 5; /* 5 micro seconds */
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        size_t record_size = 256 + sizeof(struct drm_i915_perf_record_header);
> +        size_t page_size = sysconf(_SC_PAGE_SIZE);
> +        int zero_fd = open("/dev/zero", O_RDWR|O_CLOEXEC);
> +        uint8_t *pages = mmap(NULL, page_size * 2,
> +                              PROT_READ|PROT_WRITE, MAP_PRIVATE, zero_fd, 0);
I don't see a munmap() or a close(zero_fd) anywhere?
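
For illustration, the paired setup and teardown could look roughly like the
standalone sketch below. The helper name guard_page_roundtrip() is
hypothetical and not part of the patch, and the real test would use
igt_assert() rather than return codes. The sketch also compares the mmap()
result against MAP_FAILED, since that is how mmap() reports failure.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): map two pages backed by
 * /dev/zero, make the second page inaccessible, then release the mapping
 * and the file descriptor again. Returns 0 on success, -1 on any failure.
 */
int guard_page_roundtrip(void)
{
        long page_size = sysconf(_SC_PAGE_SIZE);
        uint8_t *pages;
        int zero_fd;
        int ret = -1;

        if (page_size <= 0)
                return -1;

        zero_fd = open("/dev/zero", O_RDWR | O_CLOEXEC);
        if (zero_fd < 0)
                return -1;

        pages = mmap(NULL, (size_t)page_size * 2,
                     PROT_READ | PROT_WRITE, MAP_PRIVATE, zero_fd, 0);
        if (pages == MAP_FAILED) /* mmap() never signals failure with NULL */
                goto out_close;

        if (mprotect(pages + page_size, (size_t)page_size, PROT_NONE) != 0)
                goto out_unmap;

        /* The short-read checks would run here; touch the first
         * (still accessible) page as a stand-in.
         */
        pages[0] = 1;
        ret = 0;

out_unmap:
        if (munmap(pages, (size_t)page_size * 2) != 0)
                ret = -1;
out_close:
        if (close(zero_fd) != 0)
                ret = -1;
        return ret;
}
```

In test_short_reads() the equivalent would be a munmap(pages, page_size * 2)
and a close(zero_fd) next to the existing close(stream_fd).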

> +        struct drm_i915_perf_record_header *header;
> +        int stream_fd;
> +        int ret;
> +
> +        igt_assert(pages);
> +
> +        ret = mprotect(pages + page_size, page_size, PROT_NONE);
> +        igt_assert_eq(ret, 0);
> +
> +        stream_fd = __perf_open(drm_fd, &param);
> +
> +        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 5000000 }, NULL);
> +
> +        /* At this point there should be lots of pending reports to read */
> +
> +        /* A read that can return at least one record should result in a short
> +         * read not an EFAULT if the buffer is smaller than the requested read
> +         * size...
> +         *
> +         * Expect to see a sample record here, but at least skip over any
> +         * _RECORD_LOST notifications.
> +         */
> +        do {
> +                header = (void *)(pages + page_size - record_size);
> +                ret = read(stream_fd,
> +                           header,
> +                           page_size);
> +                igt_assert(ret > 0);
> +        } while (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
> +
> +        igt_assert_eq(ret, record_size);
> +
> +        /* A read that can't return a single record because it would result
> +         * in a fault on buffer overrun should result in an EFAULT error...
> +         */
> +        ret = read(stream_fd, pages + page_size - 16, page_size);
> +        igt_assert_eq(ret, -1);
> +        igt_assert_eq(errno, EFAULT);
> +
> +        /* A read that can't return a single record because the buffer is too
> +         * small should result in an ENOSPC error..
> +         *
> +         * Again, skip over _RECORD_LOST records (smaller than record_size/2)
> +         */
> +        do {
> +                header = (void *)(pages + page_size - record_size / 2);
> +                ret = read(stream_fd,
> +                           header,
> +                           record_size / 2);
> +        } while (ret > 0 && header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
> +
> +        igt_assert_eq(ret, -1);
> +        igt_assert_eq(errno, ENOSPC);
> +
> +        close(stream_fd);
> +}
> +
> +static void
> +test_non_sampling_read_error(void)
> +{
> +        uint64_t properties[] = {
> +                /* XXX: even without periodic sampling we have to
> +                 * specify at least one sample layout property...
> +                 */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +
> +                /* XXX: no sampling exponent */
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        uint8_t buf[1024];
> +
> +        int ret = read(stream_fd, buf, sizeof(buf));
> +        igt_assert_eq(ret, -1);
> +        igt_assert_eq(errno, EIO);
> +
> +        close(stream_fd);
> +}
> +
> +/* Check that attempts to read from a stream while it is disabled will return
> + * EIO instead of blocking indefinitely.
> + */
> +static void
> +test_disabled_read_error(void)
> +{
> +        int oa_exponent = 5; /* 5 micro seconds */
> +        uint64_t properties[] = {
> +                /* XXX: even without periodic sampling we have to
> +                 * specify at least one sample layout property...
> +                 */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC |
> +                         I915_PERF_FLAG_DISABLED, /* XXX: open disabled */
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        uint32_t oa_report0[64];
> +        uint32_t oa_report1[64];
> +        uint32_t buf[128] = { 0 };
> +        int ret;
> +
> +
> +        ret = read(stream_fd, buf, sizeof(buf));
> +        igt_assert_eq(ret, -1);
> +        igt_assert_eq(errno, EIO);
> +
> +        close(stream_fd);
> +
> +
> +        param.flags &= ~I915_PERF_FLAG_DISABLED;
> +        stream_fd = __perf_open(drm_fd, &param);
> +
> +        read_2_oa_reports(stream_fd,
> +                          I915_OA_FORMAT_A45_B8_C8, 256,
> +                          oa_exponent,
> +                          oa_report0,
> +                          oa_report1,
> +                          false); /* not just timer reports */
> +
> +        do_ioctl(stream_fd, I915_PERF_IOCTL_DISABLE, 0);
> +
> +        ret = read(stream_fd, buf, sizeof(buf));
> +        igt_assert_eq(ret, -1);
> +        igt_assert_eq(errno, EIO);
> +
> +        do_ioctl(stream_fd, I915_PERF_IOCTL_ENABLE, 0);
> +
> +        read_2_oa_reports(stream_fd,
> +                          I915_OA_FORMAT_A45_B8_C8, 256,
> +                          oa_exponent,
> +                          oa_report0,
> +                          oa_report1,
> +                          false); /* not just timer reports */
> +
> +        close(stream_fd);
> +}
> +
> +static void
> +test_mi_rpc(void)
> +{
> +        uint64_t properties[] = {
> +                /* Note: we have to specify at least one sample property even
> +                 * though we aren't interested in samples in this case.
> +                 */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +
> +                /* Note: no OA exponent specified in this case */
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        drm_intel_bufmgr *bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
> +        drm_intel_context *context;
> +        struct intel_batchbuffer *batch;
> +        drm_intel_bo *bo;
> +        uint32_t *report32;
> +        int ret;
> +
> +        drm_intel_bufmgr_gem_enable_reuse(bufmgr);
> +
> +        context = drm_intel_gem_context_create(bufmgr);
> +        igt_assert(context);
> +
> +        batch = intel_batchbuffer_alloc(bufmgr, devid);
> +
> +        bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
> +
> +        ret = drm_intel_bo_map(bo, true);
> +        igt_assert_eq(ret, 0);
> +
> +        memset(bo->virtual, 0x80, 4096);
> +        drm_intel_bo_unmap(bo);
> +
> +        BEGIN_BATCH(3, 1);
> +        OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
> +        OUT_RELOC(bo, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
> +                  0); /* offset in bytes */
> +        OUT_BATCH(0xdeadbeef); /* report ID */
> +        ADVANCE_BATCH();
> +
> +        intel_batchbuffer_flush_with_context(batch, context);
> +
> +        ret = drm_intel_bo_map(bo, false /* write enable */);
> +        igt_assert_eq(ret, 0);
> +
> +        report32 = bo->virtual;
> +        igt_assert_eq(report32[0], 0xdeadbeef); /* report ID */
> +        igt_assert_neq(report32[1], 0); /* timestamp */
> +
> +        igt_assert_neq(report32[63], 0x80808080); /* end of report */
> +        igt_assert_eq(report32[64], 0x80808080); /* after 256 byte report */
The final drm_intel_bo_map(bo, false) here is never paired with a
drm_intel_bo_unmap().

> +
> +        intel_batchbuffer_free(batch);
> +        drm_intel_gem_context_destroy(context);
> +        drm_intel_bufmgr_destroy(bufmgr);
> +        close(stream_fd);
> +}
> +
> +static void
> +scratch_buf_init(drm_intel_bufmgr *bufmgr,
> +                 struct igt_buf *buf,
> +                 int width, int height,
> +                 uint32_t color)
> +{
> +        size_t stride = width * 4;
> +        size_t size = stride * height;
> +	drm_intel_bo *bo = drm_intel_bo_alloc(bufmgr, "", size, 4096);
> +        int ret;
> +
> +        ret = drm_intel_bo_map(bo, true /* writable */);
> +        igt_assert_eq(ret, 0);
> +
> +        for (int i = 0; i < width * height; i++)
> +                ((uint32_t *)bo->virtual)[i] = color;
> +
> +        drm_intel_bo_unmap(bo);
> +
> +	buf->bo = bo;
> +	buf->stride = stride;
> +	buf->tiling = I915_TILING_NONE;
> +	buf->size = size;
> +}
The whitespace here is inconsistent: this function mixes tab and space
indentation.

> +
> +static void
> +emit_stall_timestamp_and_rpc(struct intel_batchbuffer *batch,
> +                             drm_intel_bo *dst,
> +                             int timestamp_offset,
> +                             int report_dst_offset,
> +                             uint32_t report_id)
> +{
> +        uint32_t pipe_ctl_flags = (PIPE_CONTROL_CS_STALL |
> +                                   PIPE_CONTROL_RENDER_TARGET_FLUSH |
> +                                   PIPE_CONTROL_WRITE_TIMESTAMP);
> +
> +        BEGIN_BATCH(5, 1);
> +        OUT_BATCH(GFX_OP_PIPE_CONTROL | (5 - 2));
> +        OUT_BATCH(pipe_ctl_flags);
> +        OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
> +                  timestamp_offset);
> +        OUT_BATCH(0); /* imm lower */
> +        OUT_BATCH(0); /* imm upper */
> +        ADVANCE_BATCH();
> +
> +        BEGIN_BATCH(3, 1);
> +        OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
> +        OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
> +                  report_dst_offset);
> +        OUT_BATCH(report_id);
> +        ADVANCE_BATCH();
> +}
> +
> +/* Tests the INTEL_performance_query use case where an unprivileged process
> + * should be able to configure the OA unit for per-context metrics (for a
> + * context associated with that process' drm file descriptor) and the counters
> + * should only relate to that specific context.
> + */
> +static void
> +test_per_ctx_mi_rpc(void)
> +{
> +        uint64_t properties[] = {
> +                DRM_I915_PERF_PROP_CTX_HANDLE, UINT64_MAX, /* updated below */
> +
> +                /* Note: we have to specify at least one sample property even
> +                 * though we aren't interested in samples in this case
> +                 */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +
> +                /* Note: no OA exponent specified in this case */
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +
> +        /* should be default, but just to be sure... */
> +        write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> +
> +        igt_fork(child, 1) {
> +                drm_intel_bufmgr *bufmgr;
> +                drm_intel_context *context0, *context1;
> +                int stream_fd;
> +                struct intel_batchbuffer *batch;
> +                struct igt_buf src, dst;
> +                drm_intel_bo *bo;
> +                uint32_t *report0_32, *report1_32;
> +                uint64_t timestamp0_64, timestamp1_64;
> +                uint32_t delta_ts64, delta_oa32;
> +                uint64_t delta_ts64_ns, delta_oa32_ns;
> +                uint32_t delta_delta;
> +                int n_samples_written;
> +                int width = 800;
> +                int height = 600;
> +                int ret;
> +
> +                igt_drop_root();
> +
> +                bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
> +                drm_intel_bufmgr_gem_enable_reuse(bufmgr);
> +
> +                scratch_buf_init(bufmgr, &src, width, height, 0xff0000ff);
> +                scratch_buf_init(bufmgr, &dst, width, height, 0x00ff00ff);
> +
> +                batch = intel_batchbuffer_alloc(bufmgr, devid);
> +
> +                context0 = drm_intel_gem_context_create(bufmgr);
> +                igt_assert(context0);
> +
> +                context1 = drm_intel_gem_context_create(bufmgr);
> +                igt_assert(context1);
> +
> +                igt_debug("submitting warm up render_copy\n");
> +
> +                /* Submit some early, unmeasured, work to the context we want
> +                 * to measure to try and catch issues with i915-perf
> +                 * initializing the HW context ID for filtering.
> +                 *
> +                 * We do this because i915-perf single context filtering had
> +                 * previously only relied on a hook into context pinning to
> +                 * initialize the HW context ID, instead of also trying to
> +                 * determine the HW ID while opening the stream, in case it
> +                 * has already been pinned.
> +                 *
> +                 * This wasn't noticed by the previous unit test because we
> +                 * were opening the stream while the context hadn't been
> +                 * touched or pinned yet and so it worked out correctly to wait
> +                 * for the pinning hook.
> +                 *
> +                 * Now a buggy version of i915-perf will fail to measure
> +                 * anything for context0 once this initial render_copy() ends
> +                 * up pinning the context since there won't ever be a pinning
> +                 * hook callback.
> +                 */
> +                render_copy(batch,
> +                            context0,
> +                            &src, 0, 0, width, height,
> +                            &dst, 0, 0);
> +
> +                properties[1] = drm_intel_gem_context_get_context_id(context0);
> +
> +                igt_debug("opening i915-perf stream\n");
> +                stream_fd = __perf_open(drm_fd, &param);
> +
> +                bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
> +
> +                ret = drm_intel_bo_map(bo, true /* write enable */);
> +                igt_assert_eq(ret, 0);
> +
> +                memset(bo->virtual, 0x80, 4096);
> +                drm_intel_bo_unmap(bo);
> +
> +                emit_stall_timestamp_and_rpc(batch,
> +                                             bo,
> +                                             512 /* timestamp offset */,
> +                                             0, /* report dst offset */
> +                                             0xdeadbeef); /* report id */
> +
> +                /* Explicitly flush here (even though the render_copy() call
> +                 * will itself flush before/after the copy) to clarify that
> +                 * the PIPE_CONTROL + MI_RPC commands will be in a
> +                 * separate batch from the copy.
> +                 */
> +                intel_batchbuffer_flush_with_context(batch, context0);
> +
> +                render_copy(batch,
> +                            context0,
> +                            &src, 0, 0, width, height,
> +                            &dst, 0, 0);
> +
> +                /* Another redundant flush to clarify batch bo is free to reuse */
> +                intel_batchbuffer_flush_with_context(batch, context0);
> +
> +                /* submit two copies on the other context to avoid a false
> +                 * positive in case the driver somehow ended up filtering for
> +                 * context1
> +                 */
> +                render_copy(batch,
> +                            context1,
> +                            &src, 0, 0, width, height,
> +                            &dst, 0, 0);
> +
> +                render_copy(batch,
> +                            context1,
> +                            &src, 0, 0, width, height,
> +                            &dst, 0, 0);
> +
> +                /* And another */
> +                intel_batchbuffer_flush_with_context(batch, context1);
> +
> +                emit_stall_timestamp_and_rpc(batch,
> +                                             bo,
> +                                             520 /* timestamp offset */,
> +                                             256, /* report dst offset */
> +                                             0xbeefbeef); /* report id */
> +
> +                intel_batchbuffer_flush_with_context(batch, context0);
> +
> +                ret = drm_intel_bo_map(bo, false /* write enable */);
> +                igt_assert_eq(ret, 0);
> +
> +                report0_32 = bo->virtual;
> +                igt_assert_eq(report0_32[0], 0xdeadbeef); /* report ID */
> +                igt_assert_neq(report0_32[1], 0); /* timestamp */
> +
> +                report1_32 = report0_32 + 64;
> +                igt_assert_eq(report1_32[0], 0xbeefbeef); /* report ID */
> +                igt_assert_neq(report1_32[1], 0); /* timestamp */
> +
> +                print_reports(report0_32, report1_32,
> +                              lookup_format(I915_OA_FORMAT_A45_B8_C8));
> +
> +                /* A40 == N samples written to all render targets */
> +                n_samples_written = report1_32[43] - report0_32[43];
> +                igt_debug("n samples written = %d\n", n_samples_written);
> +                igt_assert_eq(n_samples_written, width * height);
> +
> +                igt_debug("timestamp32 0 = %u\n", report0_32[1]);
> +                igt_debug("timestamp32 1 = %u\n", report1_32[1]);
> +
> +                timestamp0_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 512);
> +                timestamp1_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 520);
> +
> +                igt_debug("timestamp64 0 = %"PRIu64"\n", timestamp0_64);
> +                igt_debug("timestamp64 1 = %"PRIu64"\n", timestamp1_64);
> +
> +                delta_ts64 = timestamp1_64 - timestamp0_64;
> +                delta_oa32 = report1_32[1] - report0_32[1];
> +
> +                /* sanity check that we can pass the delta to timebase_scale */
> +                igt_assert(delta_ts64 < UINT32_MAX);
> +                delta_oa32_ns = timebase_scale(delta_oa32);
> +                delta_ts64_ns = timebase_scale(delta_ts64);
> +
> +                igt_debug("ts32 delta = %u, = %uns\n",
> +                          delta_oa32, (unsigned)delta_oa32_ns);
> +                igt_debug("ts64 delta = %u, = %uns\n",
> +                          delta_ts64, (unsigned)delta_ts64_ns);
> +
> +                /* The delta as calculated via the PIPE_CONTROL timestamp or
> +                 * the OA report timestamps should be almost identical but
> +                 * allow a 320 nanoseconds margin.
> +                 */
> +                delta_delta = delta_ts64_ns > delta_oa32_ns ?
> +                        (delta_ts64_ns - delta_oa32_ns) :
> +                        (delta_oa32_ns - delta_ts64_ns);
> +                igt_assert(delta_delta <= 320);
> +
The final drm_intel_bo_map(bo, false) here is never paired with a
drm_intel_bo_unmap().

> +                intel_batchbuffer_free(batch);
> +                drm_intel_gem_context_destroy(context0);
> +                drm_intel_gem_context_destroy(context1);
> +                drm_intel_bufmgr_destroy(bufmgr);
> +                close(stream_fd);
> +        }
> +
> +        igt_waitchildren();
> +}
> +
> +static void
> +test_rc6_disable(void)
> +{
> +        int oa_exponent = 13; /* 1 millisecond */
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        int stream_fd = __perf_open(drm_fd, &param);
> +        uint64_t n_events_start = read_debugfs_u64_record("i915_drpc_info",
> +                                                          "RC6 residency since boot");
> +        uint64_t n_events_end;
> +
> +        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
> +
> +        n_events_end = read_debugfs_u64_record("i915_drpc_info",
> +                                               "RC6 residency since boot");
> +
> +        igt_assert_eq(n_events_end - n_events_start, 0);
> +
> +        close(stream_fd);
> +
> +        n_events_start = read_debugfs_u64_record("i915_drpc_info",
> +                                                 "RC6 residency since boot");
> +
> +        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000 }, NULL);
> +
> +        n_events_end = read_debugfs_u64_record("i915_drpc_info",
> +                                               "RC6 residency since boot");
> +
> +        igt_assert_neq(n_events_end - n_events_start, 0);
> +}
> +
> +static unsigned
> +read_i915_module_ref(void)
> +{
> +        FILE *fp = fopen("/proc/modules", "r");
> +        char *line = NULL;
> +        size_t line_buf_size = 0;
> +        int len = 0;
> +        unsigned ref_count;
> +
> +        igt_assert(fp);
> +
> +        while ((len = getline(&line, &line_buf_size, fp)) > 0) {
> +                if (strncmp(line, "i915 ", 5) == 0) {
> +                        unsigned long mem;
> +                        int ret = sscanf(line + 5, "%lu %u", &mem, &ref_count);
> +                        igt_assert(ret == 2);
> +                        goto done;
> +                }
> +        }
> +
> +        igt_assert(!"reached");
> +
> +done:
> +        free(line);
> +        fclose(fp);
> +        return ref_count;
> +}
> +
> +/* check that an open i915 perf stream holds a reference on the drm i915 module
> + * including in the corner case where the original drm fd has been closed.
> + */
> +static void
> +test_i915_ref_count(void)
> +{
> +        int oa_exponent = 13; /* 1 millisecond */
> +        uint64_t properties[] = {
> +                /* Include OA reports in samples */
> +                DRM_I915_PERF_PROP_SAMPLE_OA, true,
> +
> +                /* OA unit configuration */
> +                DRM_I915_PERF_PROP_OA_METRICS_SET, hsw_render_basic_id,
> +                DRM_I915_PERF_PROP_OA_FORMAT, I915_OA_FORMAT_A45_B8_C8,
> +                DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
> +        };
> +        struct drm_i915_perf_open_param param = {
> +                .flags = I915_PERF_FLAG_FD_CLOEXEC,
> +                .num_properties = sizeof(properties) / 16,
> +                .properties_ptr = (uint64_t)properties,
> +        };
> +        unsigned baseline, ref_count0, ref_count1;
> +        int stream_fd;
> +        uint32_t oa_report0[64];
> +        uint32_t oa_report1[64];
> +
> +        close(drm_fd);
> +        baseline = read_i915_module_ref();
> +        igt_debug("baseline ref count (drm fd closed) = %u\n", baseline);
> +
> +        drm_fd = drm_open_driver_render(DRIVER_INTEL);
> +        ref_count0 = read_i915_module_ref();
> +        igt_debug("initial ref count with drm_fd open = %u\n", ref_count0);
> +        igt_assert(ref_count0 > baseline);
> +
> +        stream_fd = __perf_open(drm_fd, &param);
> +        ref_count1 = read_i915_module_ref();
> +        igt_debug("ref count after opening i915 perf stream = %u\n", ref_count1);
> +        igt_assert(ref_count1 > ref_count0);
> +
> +        close(drm_fd);
> +        ref_count0 = read_i915_module_ref();
> +        igt_debug("ref count after closing drm fd = %u\n", ref_count0);
> +
> +        igt_assert(ref_count0 > baseline);
> +
> +        read_2_oa_reports(stream_fd,
> +                          I915_OA_FORMAT_A45_B8_C8, 256,
> +                          oa_exponent,
> +                          oa_report0,
> +                          oa_report1,
> +                          false); /* not just timer reports */
> +
> +        close(stream_fd);
> +        ref_count0 = read_i915_module_ref();
> +        igt_debug("ref count after closing i915 perf stream fd = %u\n", ref_count0);
> +        igt_assert_eq(ref_count0, baseline);
> +
> +        drm_fd = drm_open_driver_render(DRIVER_INTEL);
> +}
> +
> +igt_main
> +{
> +        igt_skip_on_simulation();
> +
> +        igt_fixture {
> +                struct stat sb;
> +                int ret;
> +
> +                drm_fd = drm_open_driver_render(DRIVER_INTEL);
> +                devid = intel_get_drm_devid(drm_fd);
> +                device = drm_get_card();
> +
> +                igt_require(IS_HASWELL(devid));
> +                igt_require(lookup_hsw_render_basic_id());
> +
> +                ret = stat("/proc/sys/dev/i915/perf_stream_paranoid", &sb);
> +                igt_require(ret == 0);
> +                ret = stat("/proc/sys/dev/i915/oa_max_sample_rate", &sb);
> +                igt_require(ret == 0);
The absence of either file would indicate a failure in the kernel, so
would it not be more apt to assert rather than skip?

> +
> +                gt_frequency_range_save();
> +
> +                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
Don't we also want to ensure that oa_max_sample_rate is in a
"good" starting state before we begin, especially since we ensure that
we leave it in its default state when cleaning up?

Anyway, I think it all looks pretty reasonable to me and it looks like
we have a good amount of coverage, so you can have my r-b with Chris'
comment addressed.

> +
> +                render_copy = igt_get_render_copyfunc(devid);
> +                igt_require_f(render_copy, "no render-copy function\n");
> +        }
> +
> +        igt_subtest("non-system-wide-paranoid")
> +                test_system_wide_paranoid();
> +
> +        igt_subtest("invalid-open-flags")
> +                test_invalid_open_flags();
> +
> +        igt_subtest("invalid-oa-metric-set-id")
> +                test_invalid_oa_metric_set_id();
> +
> +        igt_subtest("invalid-oa-format-id")
> +                test_invalid_oa_format_id();
> +
> +        igt_subtest("missing-sample-flags")
> +                test_missing_sample_flags();
> +
> +        igt_subtest("oa-formats")
> +                test_oa_formats();
> +
> +        igt_subtest("invalid-oa-exponent")
> +                test_invalid_oa_exponent();
> +        igt_subtest("low-oa-exponent-permissions")
> +                test_low_oa_exponent_permissions();
> +        igt_subtest("oa-exponents") {
> +                test_oa_exponents(450);
> +                test_oa_exponents(550);
> +        }
> +
> +        igt_subtest("per-context-mode-unprivileged")
> +                test_per_context_mode_unprivileged();
> +
> +        igt_subtest("buffer-fill")
> +                test_buffer_fill();
> +
> +        igt_subtest("disabled-read-error")
> +                test_disabled_read_error();
> +        igt_subtest("non-sampling-read-error")
> +                test_non_sampling_read_error();
> +
> +        igt_subtest("enable-disable")
> +                test_enable_disable();
> +
> +        igt_subtest("blocking")
> +                test_blocking();
> +
> +        igt_subtest("polling")
> +                test_polling();
> +
> +        igt_subtest("short-reads")
> +                test_short_reads();
> +
> +        igt_subtest("mi-rpc")
> +                test_mi_rpc();
> +
> +        igt_subtest("mi-rpc-per-ctx")
> +                test_per_ctx_mi_rpc();
> +
> +        igt_subtest("i915-ref-count")
> +                test_i915_ref_count();
> +
> +        igt_subtest("rc6-disable")
> +                test_rc6_disable();
> +
> +        igt_fixture {
> +                /* leave sysctl options in their default state... */
> +                write_u64_file("/proc/sys/dev/i915/oa_max_sample_rate", 100000);
> +                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> +
> +                gt_frequency_range_restore();
> +
> +                close(drm_fd);
> +        }
> +}
> -- 
> 2.10.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [PATCH igt v3 02/11] igt/gem_exec_parse: some minor cleanups
  2016-11-09 16:15 ` [PATCH igt v3 02/11] igt/gem_exec_parse: some minor cleanups Robert Bragg
@ 2016-11-11 21:49   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 21:49 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> This normalizes the execbuf utilities in this file to all use memset to
> clear obj, reloc and execbuf structures and set them up in the same
> order. As I was debugging some unpredictable test failures I was getting
> unsure that all these structures were being fully initialized.
>
> The same I915_GEM_DOMAIN_COMMAND domain is now used with all relocs.
>
> The register/command defines have been moved to the top of the file to
> be available to all tests/utilities.
>
> The handle + fd variables are now static.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Okay seems reasonable.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 03/11] igt/gem_exec_parse: move hsw_load_register_reg down
  2016-11-09 16:15 ` [PATCH igt v3 03/11] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
@ 2016-11-11 21:51   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 21:51 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> No functional change, just moving hsw_load_regster_reg test code down
s/regster/register
> below the execbuf utilities in preparation for updating to use them.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 04/11] igt/gem_exec_parse: update hsw_load_register_reg
  2016-11-09 16:15 ` [PATCH igt v3 04/11] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
@ 2016-11-11 22:01   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:01 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> This generalises hsw_load_register_reg to loop through an array of
> allowed and disallowed registers and to use the exec_batch[_patched]
> utilities.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> ---
>  tests/gem_exec_parse.c | 139 +++++++++++++++++++++++--------------------------
>  1 file changed, 66 insertions(+), 73 deletions(-)
>
> diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> index c530bb6..2fea060 100644
> --- a/tests/gem_exec_parse.c
> +++ b/tests/gem_exec_parse.c
> @@ -34,8 +34,11 @@
>  #define I915_PARAM_CMD_PARSER_VERSION       28
>  #endif
>
> -#define OACONTROL 0x2360
> +#define ARRAY_LEN(A) (sizeof(A) / sizeof(A[0]))
We have ARRAY_SIZE.

> +
>  #define DERRMR 0x44050
> +#define OACONTROL 0x2360
> +#define SO_WRITE_OFFSET_0 0x5280
>
>  #define HSW_CS_GPR(n) (0x2600 + 8*(n))
>  #define HSW_CS_GPR0 HSW_CS_GPR(0)
> @@ -65,8 +68,8 @@ static int command_parser_version(int fd)
>         return -1;
>  }
>
> -static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
> -                              int size, int patch_offset, uint64_t expected_value)
> +static uint64_t __exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
> +                                    int size, int patch_offset)
>  {
>         struct drm_i915_gem_execbuffer2 execbuf;
>         struct drm_i915_gem_exec_object2 obj[2];
> @@ -100,9 +103,19 @@ static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
>         gem_sync(fd, cmd_bo);
>
>         gem_read(fd,target_bo, 0, &actual_value, sizeof(actual_value));
> -       igt_assert_eq(expected_value, actual_value);
>
>         gem_close(fd, target_bo);
> +
> +       return actual_value;
> +}
> +
> +static void exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
> +                              int size, int patch_offset,
> +                              uint64_t expected_value)
> +{
> +       igt_assert_eq(__exec_batch_patched(fd, cmd_bo, cmds,
> +                                          size, patch_offset),
> +                     expected_value);
Formatting seems off.

Seems reasonable so:
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 05/11] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test
  2016-11-09 16:15 ` [PATCH igt v3 05/11] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
@ 2016-11-11 22:07   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:07 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> This limits testing the oacontrol tracking (required pairing of oa
> enable/disable per batch buffer) to version <= 8 of the command parser.
>
> Version 9 of the command parser removes all special handling for
> OACONTROL which is now going to be managed by i915-perf and not
> programmed from userspace.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Fancy cleaning up the open-coded parser_version getparam business in
the fixture, and also making it so we only do it once?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 07/11] igt/gem_exec_parse: update bitmasks test for v >=8
  2016-11-09 16:15 ` [PATCH igt v3 07/11] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
@ 2016-11-11 22:08   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:08 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> With v8 of the command parser (where we won't get an EINVAL for an
> access violation) this updates the bitmasks test to explicitly confirm
> that the command became a NOOP by reading back from where the QW_WRITE
> would have otherwise landed.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 08/11] igt/gem_exec_parse: update cmd-crossing-page for >= v8
  2016-11-09 16:15 ` [PATCH igt v3 08/11] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
@ 2016-11-11 22:10   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:10 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> Since an access violation won't return an error to userspace for v >= 8
> of the command parser this updates the cmd-crossing-page test to
> explicitly read back from SO_WRITE_OFFSET[0] to see that the command
> wasn't squashed to a NOOP.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 09/11] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8
  2016-11-09 16:16 ` [PATCH igt v3 09/11] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
@ 2016-11-11 22:14   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:14 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:16, Robert Bragg <robert@sixbynine.org> wrote:
> This updates the checking of disallowed loads to set a distinguishable
> value before the load and explicitly check the load was a NOOP by
> reading back the final value.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> ---
>  tests/gem_exec_parse.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> index d6a2885..f241ab7 100644
> --- a/tests/gem_exec_parse.c
> +++ b/tests/gem_exec_parse.c
> @@ -305,12 +305,17 @@ static void hsw_load_register_reg(void)
>         };
>         int fd;
>         uint32_t handle;
> +       int parser_version;
> +       int bad_lrr_errno;
>
>         /* Open again to get a non-master file descriptor */
>         fd = drm_open_driver(DRIVER_INTEL);
>
> +       parser_version = command_parser_version(fd);
Ah, I was reading ahead; so move this to the fixture and reuse it?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 10/11] igt/gem_exec_parse: update registers test for v >= 8
  2016-11-09 16:16 ` [PATCH igt v3 10/11] igt/gem_exec_parse: update registers test " Robert Bragg
@ 2016-11-11 22:28   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:28 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:16, Robert Bragg <robert@sixbynine.org> wrote:
> This combines some parts of the recently added store_lri test with the
> registers test to be able to first load a distinguishable value before
> the LRI and explicitly read back the register to determine if the
> command succeeded or was a NOOP.
>
> For now though we won't look at OACONTROL without checking for version 9
> of the command parser.
>
> This updates the 'bad' test to check the OASTATUS2 register so that we
> can explicitly read back from the register to check it becomes a NOOP.
>
> This adds a struct test_lri for associating a mask with the init/test
> values so we ignore things like hw status bits that might interfere
> with the result.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
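
The masked comparison described in the commit message can be sketched roughly as follows. This is only an illustration: the struct layout, field names and helper names are assumptions (the patch body is not quoted here), and the readback value is passed in directly instead of being fetched from hardware. The register offset 0x5280 is SO_WRITE_OFFSET_0 as defined earlier in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout; the real struct test_lri may differ. */
struct sketch_test_lri {
	uint32_t reg;       /* register offset targeted by the LRI */
	uint32_t read_mask; /* bits that are meaningful on readback */
	uint32_t init_val;  /* distinguishable value loaded before the LRI */
	uint32_t test_val;  /* value the LRI under test tries to load */
};

/* True if the masked readback matches the value the LRI tried to load. */
static bool sketch_lri_applied(const struct sketch_test_lri *t,
			       uint32_t readback)
{
	return (readback & t->read_mask) == (t->test_val & t->read_mask);
}

/* True if the masked readback still holds the initial value (NOOP). */
static bool sketch_lri_was_noop(const struct sketch_test_lri *t,
				uint32_t readback)
{
	return (readback & t->read_mask) == (t->init_val & t->read_mask);
}
```

Masking both sides means that bits outside read_mask (for example status bits set by the hardware) cannot turn a NOOP into an apparent write or the other way round.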

* Re: [PATCH igt v3 11/11] igt/gem_exec_parse: check oacontrol lri bad for >= v9
  2016-11-09 16:16 ` [PATCH igt v3 11/11] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg
@ 2016-11-11 22:36   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-11 22:36 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:16, Robert Bragg <robert@sixbynine.org> wrote:
> OACONTROL is no longer whitelisted in the command parser so this checks
> that an attempted LRI will be disallowed and (more importantly) checks that
> userspace doesn't get an EINVAL error for an attempted OACONTROL LRI.
> This is important because Mesa applications attempt OACONTROL LRIs while
> initializing and will abort for any execbuf error.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell
  2016-11-10 23:03   ` Matthew Auld
@ 2016-11-14 15:52     ` Robert Bragg
  0 siblings, 0 replies; 25+ messages in thread
From: Robert Bragg @ 2016-11-14 15:52 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development


On Thu, Nov 10, 2016 at 11:03 PM, Matthew Auld <matthew.william.auld@gmail.com> wrote:

> On 11/09, Robert Bragg wrote:
> > +
> > +igt_main
> > +{
> > +        igt_skip_on_simulation();
> > +
> > +        igt_fixture {
> > +                struct stat sb;
> > +                int ret;
> > +
> > +                drm_fd = drm_open_driver_render(DRIVER_INTEL);
> > +                devid = intel_get_drm_devid(drm_fd);
> > +                device = drm_get_card();
> > +
> > +                igt_require(IS_HASWELL(devid));
> > +                igt_require(lookup_hsw_render_basic_id());
> > +
> > +                ret = stat("/proc/sys/dev/i915/perf_stream_paranoid", &sb);
> > +                igt_require(ret == 0);
> > +                ret = stat("/proc/sys/dev/i915/oa_max_sample_rate", &sb);
> > +                igt_require(ret == 0);
> The absence of the above files would indicate a failure in the kernel,
> so would it not be more apt to assert, rather than skip?
>

The test could be running against an older kernel which won't have these
files and we should skip all the tests in that case.

E.g. Chris asked me to maintain compatibility within the gem_exec_parse
test for older versions of the command parser so I suppose i-g-t tries to
maintain backwards compatibility.

We could potentially require one and assert the other.
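
A minimal sketch of the "require one, assert the other" idea, written in plain C so it stands alone. The enum, the helper names and the way the result is reported are all hypothetical; in i-g-t the skip and fail outcomes would map onto igt_require() and igt_assert() respectively.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical outcome of a fixture precondition check. */
enum sketch_precheck {
	SKETCH_RUN,  /* both files present, run the tests */
	SKETCH_SKIP, /* interface absent (e.g. older kernel), skip */
	SKETCH_FAIL, /* interface half present, treat as a kernel bug */
};

/* Returns true if stat() succeeds on the given path. */
static bool sketch_path_exists(const char *path)
{
	struct stat sb;

	return stat(path, &sb) == 0;
}

/*
 * A missing first file is taken to mean the interface does not exist,
 * so the tests are skipped. If the first file exists but the second
 * does not, that is reported as a failure rather than a skip.
 */
static enum sketch_precheck sketch_check_sysctls(const char *required_path,
						 const char *asserted_path)
{
	if (!sketch_path_exists(required_path))
		return SKETCH_SKIP;
	if (!sketch_path_exists(asserted_path))
		return SKETCH_FAIL;
	return SKETCH_RUN;
}
```

With this split, an older kernel still produces a clean skip, while a kernel that exposes only one of the two sysctl files is flagged as broken.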


> > +
> > +                gt_frequency_range_save();
> > +
> > +                write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
> Don't we also want to ensure that oa_max_sample_rate is in a
> "good" starting state before we begin, especially since we ensure that
> we leave it in its default state when cleaning up?
>

Explicitly setting the max rate interferes with being able to assert what
the default is, but that's already a problem with the cleanup fixture
explicitly setting the rate.

What I'm doing now is initializing oa_max_sample_rate before tests, as
suggested here, and I've also added a test_sysctl_defaults() test that's
run very early, just after the i915-ref-count test.
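
For illustration, putting a tunable into a known state and confirming it by reading it back could look roughly like this. The helper names are hypothetical and this is not the write_u64_file() implementation from the patch, just a standalone sketch using stdio; a defaults test would use only the read helper, before anything has written to the file.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Writes val as a decimal string to path; returns true on success. */
static bool sketch_write_u64(const char *path, uint64_t val)
{
	FILE *f = fopen(path, "w");
	bool ok;

	if (!f)
		return false;
	ok = fprintf(f, "%" PRIu64, val) > 0;
	/* fclose() flushes, so its result decides whether the write landed */
	return fclose(f) == 0 && ok;
}

/* Reads a decimal u64 from path into *val; returns true on success. */
static bool sketch_read_u64(const char *path, uint64_t *val)
{
	FILE *f = fopen(path, "r");
	bool ok;

	if (!f)
		return false;
	ok = fscanf(f, "%" SCNu64, val) == 1;
	fclose(f);
	return ok;
}

/* Sets a tunable to a known value and verifies it by reading it back. */
static bool sketch_init_tunable(const char *path, uint64_t val)
{
	uint64_t readback;

	return sketch_write_u64(path, val) &&
	       sketch_read_u64(path, &readback) &&
	       readback == val;
}
```

Called from the setup fixture with the oa_max_sample_rate path and 100000 (the value the cleanup fixture writes), this would give every subtest the same starting rate.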



>
> Anyway, I think it all looks pretty reasonable to me and it looks like
> we have a good amount of coverage, so you can have my r-b with Chris'
> comment addressed.
>

Thanks, I've moved the ref counting test in front of the first fixture as
suggested by Chris (with just the requirement check that the i915-perf
interface exists placed in a separate fixture before it).


* Re: [PATCH igt v3 06/11] igt/gem_exec_parse: make basic-rejected version agnostic
  2016-11-09 16:15 ` [PATCH igt v3 06/11] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
@ 2016-11-14 18:57   ` Matthew Auld
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Auld @ 2016-11-14 18:57 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Intel Graphics Development

On 9 November 2016 at 16:15, Robert Bragg <robert@sixbynine.org> wrote:
> This adapts the basic-rejected test to focus on invalid commands that
> will result in an EINVAL errno being returned to userspace even with the
> upcoming version 8 parser change to stop reporting access violations as
> EINVAL errors.
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> ---
>  tests/gem_exec_parse.c | 28 +++++++++++++++++-----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
> index 368f30b..63f4efe 100644
> --- a/tests/gem_exec_parse.c
> +++ b/tests/gem_exec_parse.c
> @@ -386,33 +386,39 @@ igt_main
>         }
>
>         igt_subtest("basic-rejected") {
> -               uint32_t arb_on_off[] = {
> -                       MI_ARB_ON_OFF,
> +               uint32_t invalid_cmd[] = {
> +                       (0x7<<29), /* Reserved command type,
> +                                     across all engines */
Maybe make this more clear with something like:

   (INSTR_INVALID_CLIENT << INSTR_CLIENT_SHIFT)

Anyway:
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
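
One way the suggested spelling could look is sketched below. The macro names are the ones proposed in the review and are defined locally here as assumptions; the shift of 29 and the value 0x7 are taken from the literal (0x7<<29) in the patch. The helper function is hypothetical.

```c
#include <stdint.h>

/* Assumed local definitions, mirroring the literal (0x7<<29). */
#define INSTR_CLIENT_SHIFT   29
#define INSTR_INVALID_CLIENT 0x7u /* reserved command type */

/* Builds the command dword with the reserved client in bits 31:29. */
static inline uint32_t sketch_invalid_cmd(void)
{
	return INSTR_INVALID_CLIENT << INSTR_CLIENT_SHIFT;
}
```

The unsigned suffix keeps the shift from reaching the sign bit of a signed int, which the plain (0x7<<29) literal does.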

end of thread, other threads:[~2016-11-14 18:57 UTC | newest]

Thread overview: 25+ messages
2016-11-09 16:15 [PATCH igt v3 00/11] corresponding changes for i915-perf interface Robert Bragg
2016-11-09 16:15 ` [PATCH igt v3 01/11] igt/perf: add i915 perf stream tests for Haswell Robert Bragg
2016-11-09 16:33   ` Chris Wilson
2016-11-10 23:03   ` Matthew Auld
2016-11-14 15:52     ` Robert Bragg
2016-11-09 16:15 ` [PATCH igt v3 02/11] igt/gem_exec_parse: some minor cleanups Robert Bragg
2016-11-11 21:49   ` Matthew Auld
2016-11-09 16:15 ` [PATCH igt v3 03/11] igt/gem_exec_parse: move hsw_load_register_reg down Robert Bragg
2016-11-11 21:51   ` Matthew Auld
2016-11-09 16:15 ` [PATCH igt v3 04/11] igt/gem_exec_parse: update hsw_load_register_reg Robert Bragg
2016-11-11 22:01   ` Matthew Auld
2016-11-09 16:15 ` [PATCH igt v3 05/11] igt/gem_exec_parse: req. v < 9 for oacontrol tracking test Robert Bragg
2016-11-11 22:07   ` Matthew Auld
2016-11-09 16:15 ` [PATCH igt v3 06/11] igt/gem_exec_parse: make basic-rejected version agnostic Robert Bragg
2016-11-14 18:57   ` Matthew Auld
2016-11-09 16:15 ` [PATCH igt v3 07/11] igt/gem_exec_parse: update bitmasks test for v >=8 Robert Bragg
2016-11-11 22:08   ` Matthew Auld
2016-11-09 16:15 ` [PATCH igt v3 08/11] igt/gem_exec_parse: update cmd-crossing-page for >= v8 Robert Bragg
2016-11-11 22:10   ` Matthew Auld
2016-11-09 16:16 ` [PATCH igt v3 09/11] igt/gem_exec_parse: update hsw_load_register_reg for v >= 8 Robert Bragg
2016-11-11 22:14   ` Matthew Auld
2016-11-09 16:16 ` [PATCH igt v3 10/11] igt/gem_exec_parse: update registers test " Robert Bragg
2016-11-11 22:28   ` Matthew Auld
2016-11-09 16:16 ` [PATCH igt v3 11/11] igt/gem_exec_parse: check oacontrol lri bad for >= v9 Robert Bragg
2016-11-11 22:36   ` Matthew Auld
