All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support
@ 2020-03-03 22:18 Umesh Nerlige Ramappa
  2020-03-03 22:18 ` [Intel-gfx] [PATCH 1/7] drm/i915/perf: rework aging tail workaround Umesh Nerlige Ramappa
                   ` (11 more replies)
  0 siblings, 12 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:18 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

Hi all,

This is a revival of an earlier patch series submitted by Lionel
Landwerlin - https://patchwork.freedesktop.org/series/54280/

The patches enable interrupt support for the perf OA unit in
i915, further details can be found in the orignal series linked
above.

Regards,
Umesh

Lionel Landwerlin (7):
  drm/i915/perf: rework aging tail workaround
  drm/i915/perf: move pollin setup to non hw specific code
  drm/i915/perf: only append status when data is available
  drm/i915/perf: add new open param to configure polling of OA buffer
  drm/i915: handle interrupts from the OA unit
  drm/i915/perf: add interrupt enabling parameter
  drm/i915/perf: add flushing ioctl

 drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  29 +-
 drivers/gpu/drm/i915/gt/intel_gt_irq.h        |   1 +
 drivers/gpu/drm/i915/gt/intel_gt_pm_irq.c     |  34 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   2 +
 drivers/gpu/drm/i915/i915_perf.c              | 421 +++++++++++-------
 drivers/gpu/drm/i915/i915_perf_types.h        |  54 ++-
 drivers/gpu/drm/i915/i915_reg.h               |   9 +
 include/uapi/drm/i915_drm.h                   |  41 ++
 8 files changed, 401 insertions(+), 190 deletions(-)

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 1/7] drm/i915/perf: rework aging tail workaround
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
@ 2020-03-03 22:18 ` Umesh Nerlige Ramappa
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 2/7] drm/i915/perf: move pollin setup to non hw specific code Umesh Nerlige Ramappa
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:18 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

We're about to introduce an options to open the perf stream, giving
the user ability to configure how often it wants the kernel to poll
the OA registers for available data.

Right now the workaround against the OA tail pointer race condition
requires at least twice the internal kernel polling timer to make any
data available.

This changes introduce checks on the OA data written into the circular
buffer to make as much data as possible available on the first
iteration of the polling timer.

v2: Use OA_TAKEN macro without the gtt_offset (Lionel)
v3: (Umesh)
- Rebase
- Change report to report32 from below review
  https://patchwork.freedesktop.org/patch/330704/?series=66697&rev=1

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c       | 207 +++++++++++--------------
 drivers/gpu/drm/i915/i915_perf_types.h |  29 ++--
 2 files changed, 106 insertions(+), 130 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 1b074bb4a7fe..92c5c75e2a2b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -241,23 +241,14 @@
  * for this earlier, as part of the oa_buffer_check to avoid lots of redundant
  * read() attempts.
  *
- * In effect we define a tail pointer for reading that lags the real tail
- * pointer by at least %OA_TAIL_MARGIN_NSEC nanoseconds, which gives enough
- * time for the corresponding reports to become visible to the CPU.
- *
- * To manage this we actually track two tail pointers:
- *  1) An 'aging' tail with an associated timestamp that is tracked until we
- *     can trust the corresponding data is visible to the CPU; at which point
- *     it is considered 'aged'.
- *  2) An 'aged' tail that can be used for read()ing.
- *
- * The two separate pointers let us decouple read()s from tail pointer aging.
- *
- * The tail pointers are checked and updated at a limited rate within a hrtimer
- * callback (the same callback that is used for delivering EPOLLIN events)
- *
- * Initially the tails are marked invalid with %INVALID_TAIL_PTR which
- * indicates that an updated tail pointer is needed.
+ * We workaround this issue in oa_buffer_check() by reading the reports in the
+ * OA buffer, starting from the tail reported by the HW until we find 2
+ * consecutive reports with their first 2 dwords of not at 0. Those dwords are
+ * also set to 0 once read and the whole buffer is cleared upon OA buffer
+ * initialization. The first dword is the reason for this report while the
+ * second is the timestamp, making the chances of having those 2 fields at 0
+ * fairly unlikely. A more detailed explanation is available in
+ * oa_buffer_check().
  *
  * Most of the implementation details for this workaround are in
  * oa_buffer_check_unlocked() and _append_oa_reports()
@@ -270,7 +261,6 @@
  * enabled without any periodic sampling.
  */
 #define OA_TAIL_MARGIN_NSEC	100000ULL
-#define INVALID_TAIL_PTR	0xffffffff
 
 /* frequency for checking whether the OA unit has written new reports to the
  * circular OA buffer...
@@ -476,10 +466,10 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
  */
 static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 {
+	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
 	int report_size = stream->oa_buffer.format_size;
 	unsigned long flags;
-	unsigned int aged_idx;
-	u32 head, hw_tail, aged_tail, aging_tail;
+	u32 hw_tail;
 	u64 now;
 
 	/* We have to consider the (unlikely) possibility that read() errors
@@ -488,16 +478,6 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 	 */
 	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
 
-	/* NB: The head we observe here might effectively be a little out of
-	 * date (between head and tails[aged_idx].offset if there is currently
-	 * a read() in progress.
-	 */
-	head = stream->oa_buffer.head;
-
-	aged_idx = stream->oa_buffer.aged_tail_idx;
-	aged_tail = stream->oa_buffer.tails[aged_idx].offset;
-	aging_tail = stream->oa_buffer.tails[!aged_idx].offset;
-
 	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
 
 	/* The tail pointer increases in 64 byte increments,
@@ -507,64 +487,76 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 
 	now = ktime_get_mono_fast_ns();
 
-	/* Update the aged tail
-	 *
-	 * Flip the tail pointer available for read()s once the aging tail is
-	 * old enough to trust that the corresponding data will be visible to
-	 * the CPU...
-	 *
-	 * Do this before updating the aging pointer in case we may be able to
-	 * immediately start aging a new pointer too (if new data has become
-	 * available) without needing to wait for a later hrtimer callback.
-	 */
-	if (aging_tail != INVALID_TAIL_PTR &&
-	    ((now - stream->oa_buffer.aging_timestamp) >
-	     OA_TAIL_MARGIN_NSEC)) {
-
-		aged_idx ^= 1;
-		stream->oa_buffer.aged_tail_idx = aged_idx;
+	if (hw_tail == stream->oa_buffer.aging_tail) {
+		/* If the HW tail hasn't move since the last check and the HW
+		 * tail has been aging for long enough, declare it the new
+		 * tail.
+		 */
+		if ((now - stream->oa_buffer.aging_timestamp) >
+		    OA_TAIL_MARGIN_NSEC) {
+			stream->oa_buffer.tail =
+				stream->oa_buffer.aging_tail;
+		}
+	} else {
+		u32 head, tail, landed_report_heads;
 
-		aged_tail = aging_tail;
+		/* NB: The head we observe here might effectively be a little out of
+		 * date (between head and tails[aged_idx].offset if there is currently
+		 * a read() in progress.
+		 */
+		head = stream->oa_buffer.head - gtt_offset;
 
-		/* Mark that we need a new pointer to start aging... */
-		stream->oa_buffer.tails[!aged_idx].offset = INVALID_TAIL_PTR;
-		aging_tail = INVALID_TAIL_PTR;
-	}
+		hw_tail -= gtt_offset;
+		tail = hw_tail;
 
-	/* Update the aging tail
-	 *
-	 * We throttle aging tail updates until we have a new tail that
-	 * represents >= one report more data than is already available for
-	 * reading. This ensures there will be enough data for a successful
-	 * read once this new pointer has aged and ensures we will give the new
-	 * pointer time to age.
-	 */
-	if (aging_tail == INVALID_TAIL_PTR &&
-	    (aged_tail == INVALID_TAIL_PTR ||
-	     OA_TAKEN(hw_tail, aged_tail) >= report_size)) {
-		struct i915_vma *vma = stream->oa_buffer.vma;
-		u32 gtt_offset = i915_ggtt_offset(vma);
-
-		/* Be paranoid and do a bounds check on the pointer read back
-		 * from hardware, just in case some spurious hardware condition
-		 * could put the tail out of bounds...
+		/* Walk the stream backward until we find at least 2 reports
+		 * with dword 0 & 1 not at 0. Since the circular buffer
+		 * pointers progress by increments of 64 bytes and that
+		 * reports can be up to 256 bytes long, we can't tell whether
+		 * a report has fully landed in memory before the first 2
+		 * dwords of the following report have effectively landed.
+		 *
+		 * This is assuming that the writes of the OA unit land in
+		 * memory in the order they were written to.
+		 * If not : (╯°□°)╯︵ ┻━┻
 		 */
-		if (hw_tail >= gtt_offset &&
-		    hw_tail < (gtt_offset + OA_BUFFER_SIZE)) {
-			stream->oa_buffer.tails[!aged_idx].offset =
-				aging_tail = hw_tail;
-			stream->oa_buffer.aging_timestamp = now;
-		} else {
-			drm_err(&stream->perf->i915->drm,
-				"Ignoring spurious out of range OA buffer tail pointer = %x\n",
-				hw_tail);
+		landed_report_heads = 0;
+		while (OA_TAKEN(tail, head) >= report_size) {
+			u32 previous_tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
+			u8 *report = stream->oa_buffer.vaddr + previous_tail;
+			u32 *report32 = (void *) report;
+
+			/* Head of the report indicated by the HW tail register has
+			 * indeed landed into memory.
+			 */
+			if (report32[0] != 0 || report32[1] != 0) {
+				landed_report_heads++;
+
+				if (landed_report_heads >= 2)
+					break;
+			}
+
+			tail = previous_tail;
 		}
+
+		if (abs(tail - hw_tail) >= (2 * report_size)) {
+			if (__ratelimit(&stream->perf->tail_pointer_race)) {
+				DRM_NOTE("unlanded report(s) head=0x%x "
+					 "tail=0x%x hw_tail=0x%x\n",
+					 head, tail, hw_tail);
+			}
+		}
+
+		stream->oa_buffer.tail = gtt_offset + tail;
+		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
+		stream->oa_buffer.aging_timestamp = now;
 	}
 
 	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
 
-	return aged_tail == INVALID_TAIL_PTR ?
-		false : OA_TAKEN(aged_tail, head) >= report_size;
+	return OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
+			stream->oa_buffer.head - gtt_offset) >= report_size;
+
 }
 
 /**
@@ -682,7 +674,6 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 	u32 mask = (OA_BUFFER_SIZE - 1);
 	size_t start_offset = *offset;
 	unsigned long flags;
-	unsigned int aged_tail_idx;
 	u32 head, tail;
 	u32 taken;
 	int ret = 0;
@@ -693,18 +684,10 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
 
 	head = stream->oa_buffer.head;
-	aged_tail_idx = stream->oa_buffer.aged_tail_idx;
-	tail = stream->oa_buffer.tails[aged_tail_idx].offset;
+	tail = stream->oa_buffer.tail;
 
 	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
 
-	/*
-	 * An invalid tail pointer here means we're still waiting for the poll
-	 * hrtimer callback to give us a pointer
-	 */
-	if (tail == INVALID_TAIL_PTR)
-		return -EAGAIN;
-
 	/*
 	 * NB: oa_buffer.head/tail include the gtt_offset which we don't want
 	 * while indexing relative to oa_buf_base.
@@ -838,13 +821,10 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		}
 
 		/*
-		 * The above reason field sanity check is based on
-		 * the assumption that the OA buffer is initially
-		 * zeroed and we reset the field after copying so the
-		 * check is still meaningful once old reports start
-		 * being overwritten.
+		 * Clear out the first 2 dword as a mean to detect unlanded
+		 * reports.
 		 */
-		report32[0] = 0;
+		report32[0] = report32[1] = 0;
 	}
 
 	if (start_offset != *offset) {
@@ -985,7 +965,6 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 	u32 mask = (OA_BUFFER_SIZE - 1);
 	size_t start_offset = *offset;
 	unsigned long flags;
-	unsigned int aged_tail_idx;
 	u32 head, tail;
 	u32 taken;
 	int ret = 0;
@@ -996,17 +975,10 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
 
 	head = stream->oa_buffer.head;
-	aged_tail_idx = stream->oa_buffer.aged_tail_idx;
-	tail = stream->oa_buffer.tails[aged_tail_idx].offset;
+	tail = stream->oa_buffer.tail;
 
 	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
 
-	/* An invalid tail pointer here means we're still waiting for the poll
-	 * hrtimer callback to give us a pointer
-	 */
-	if (tail == INVALID_TAIL_PTR)
-		return -EAGAIN;
-
 	/* NB: oa_buffer.head/tail include the gtt_offset which we don't want
 	 * while indexing relative to oa_buf_base.
 	 */
@@ -1064,13 +1036,10 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 		if (ret)
 			break;
 
-		/* The above report-id field sanity check is based on
-		 * the assumption that the OA buffer is initially
-		 * zeroed and we reset the field after copying so the
-		 * check is still meaningful once old reports start
-		 * being overwritten.
+		/* Clear out the first 2 dwords as a mean to detect unlanded
+		 * reports.
 		 */
-		report32[0] = 0;
+		report32[0] = report32[1] = 0;
 	}
 
 	if (start_offset != *offset) {
@@ -1449,8 +1418,8 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
 			   gtt_offset | OABUFFER_SIZE_16M);
 
 	/* Mark that we need updated tail pointers to read from... */
-	stream->oa_buffer.tails[0].offset = INVALID_TAIL_PTR;
-	stream->oa_buffer.tails[1].offset = INVALID_TAIL_PTR;
+	stream->oa_buffer.aging_tail =
+		stream->oa_buffer.tail = gtt_offset;
 
 	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
 
@@ -1503,8 +1472,8 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 	intel_uncore_write(uncore, GEN8_OATAILPTR, gtt_offset & GEN8_OATAILPTR_MASK);
 
 	/* Mark that we need updated tail pointers to read from... */
-	stream->oa_buffer.tails[0].offset = INVALID_TAIL_PTR;
-	stream->oa_buffer.tails[1].offset = INVALID_TAIL_PTR;
+	stream->oa_buffer.aging_tail =
+		stream->oa_buffer.tail = gtt_offset;
 
 	/*
 	 * Reset state used to recognise context switches, affecting which
@@ -1559,8 +1528,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
 			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
 
 	/* Mark that we need updated tail pointers to read from... */
-	stream->oa_buffer.tails[0].offset = INVALID_TAIL_PTR;
-	stream->oa_buffer.tails[1].offset = INVALID_TAIL_PTR;
+	stream->oa_buffer.aging_tail =
+		stream->oa_buffer.tail = gtt_offset;
 
 	/*
 	 * Reset state used to recognise context switches, affecting which
@@ -4408,6 +4377,12 @@ void i915_perf_init(struct drm_i915_private *i915)
 		ratelimit_set_flags(&perf->spurious_report_rs,
 				    RATELIMIT_MSG_ON_RELEASE);
 
+		ratelimit_state_init(&perf->tail_pointer_race,
+				     5 * HZ, 10);
+		ratelimit_set_flags(&perf->tail_pointer_race,
+				    RATELIMIT_MSG_ON_RELEASE);
+
+
 		atomic64_set(&perf->noa_programming_delay,
 			     500 * 1000 /* 500us */);
 
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index a0e22f00f6cf..9ee7c58e70d5 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -272,21 +272,10 @@ struct i915_perf_stream {
 		spinlock_t ptr_lock;
 
 		/**
-		 * @tails: One 'aging' tail pointer and one 'aged' tail pointer ready to
-		 * used for reading.
-		 *
-		 * Initial values of 0xffffffff are invalid and imply that an
-		 * update is required (and should be ignored by an attempted
-		 * read)
+		 * @aging_tail: The last HW tail reported by HW. The data
+		 * might not have made it to memory yet though.
 		 */
-		struct {
-			u32 offset;
-		} tails[2];
-
-		/**
-		 * @aged_tail_idx: Index for the aged tail ready to read() data up to.
-		 */
-		unsigned int aged_tail_idx;
+		u32 aging_tail;
 
 		/**
 		 * @aging_timestamp: A monotonic timestamp for when the current aging tail pointer
@@ -302,6 +291,12 @@ struct i915_perf_stream {
 		 * OA buffer data to userspace.
 		 */
 		u32 head;
+
+		/**
+		 * @tail: The last tail verified tail that can be read by
+		 * userspace.
+		 */
+		u32 tail;
 	} oa_buffer;
 
 	/**
@@ -413,6 +408,12 @@ struct i915_perf {
 	 */
 	struct ratelimit_state spurious_report_rs;
 
+	/**
+	 * For rate limiting any notifications of tail pointer
+	 * race.
+	 */
+	struct ratelimit_state tail_pointer_race;
+
 	struct i915_oa_config test_config;
 
 	u32 gen7_latched_oastatus1;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 2/7] drm/i915/perf: move pollin setup to non hw specific code
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
  2020-03-03 22:18 ` [Intel-gfx] [PATCH 1/7] drm/i915/perf: rework aging tail workaround Umesh Nerlige Ramappa
@ 2020-03-03 22:19 ` Umesh Nerlige Ramappa
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 3/7] drm/i915/perf: only append status when data is available Umesh Nerlige Ramappa
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:19 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

This isn't really gen specific stuff, so just move it to the common
code.

v2: Rebase (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 92c5c75e2a2b..0ec4546f1330 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1441,8 +1441,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
 	 * memory...
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
-
-	stream->pollin = false;
 }
 
 static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
@@ -1497,8 +1495,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 	 * memory...
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
-
-	stream->pollin = false;
 }
 
 static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
@@ -1554,8 +1550,6 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0,
 	       stream->oa_buffer.vma->size);
-
-	stream->pollin = false;
 }
 
 static int alloc_oa_buffer(struct i915_perf_stream *stream)
@@ -2626,6 +2620,12 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
  */
 static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 {
+	/*
+	 * Maybe make ->pollin per-stream state if we support multiple
+	 * concurrent streams in the future.
+	 */
+	stream->pollin = false;
+
 	stream->perf->ops.oa_enable(stream);
 
 	if (stream->periodic)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 3/7] drm/i915/perf: only append status when data is available
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
  2020-03-03 22:18 ` [Intel-gfx] [PATCH 1/7] drm/i915/perf: rework aging tail workaround Umesh Nerlige Ramappa
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 2/7] drm/i915/perf: move pollin setup to non hw specific code Umesh Nerlige Ramappa
@ 2020-03-03 22:19 ` Umesh Nerlige Ramappa
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer Umesh Nerlige Ramappa
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:19 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

The only bit of the status register we currently report in the
i915-perf stream is the "report loss" bit. Only report this when we
have some data to report with it. There was a kind of inconsistency
here in that we could report report loss without appending the reports
associated with the loss.

v2: Rebase (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 54 ++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0ec4546f1330..21a63644846f 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -663,6 +663,7 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * Returns: 0 on success, negative error code on failure.
  */
 static int gen8_append_oa_reports(struct i915_perf_stream *stream,
+				  u32 oastatus,
 				  char __user *buf,
 				  size_t count,
 				  size_t *offset)
@@ -709,6 +710,21 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 			  head, tail))
 		return -EIO;
 
+	/*
+	 * If there is nothing to read, don't append the status report yet,
+	 * wait until we have some data available.
+	 */
+	if (!OA_TAKEN(tail, head))
+		return 0;
+
+	if (oastatus & GEN8_OASTATUS_REPORT_LOST) {
+		ret = append_oa_status(stream, buf, count, offset,
+				       DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+		if (ret)
+			return ret;
+		intel_uncore_write(uncore, GEN8_OASTATUS,
+				   oastatus & ~GEN8_OASTATUS_REPORT_LOST);
+	}
 
 	for (/* none */;
 	     (taken = OA_TAKEN(tail, head));
@@ -921,16 +937,7 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
 		oastatus = intel_uncore_read(uncore, oastatus_reg);
 	}
 
-	if (oastatus & GEN8_OASTATUS_REPORT_LOST) {
-		ret = append_oa_status(stream, buf, count, offset,
-				       DRM_I915_PERF_RECORD_OA_REPORT_LOST);
-		if (ret)
-			return ret;
-		intel_uncore_write(uncore, oastatus_reg,
-				   oastatus & ~GEN8_OASTATUS_REPORT_LOST);
-	}
-
-	return gen8_append_oa_reports(stream, buf, count, offset);
+	return gen8_append_oa_reports(stream, oastatus, buf, count, offset);
 }
 
 /**
@@ -954,6 +961,7 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
  * Returns: 0 on success, negative error code on failure.
  */
 static int gen7_append_oa_reports(struct i915_perf_stream *stream,
+				  u32 oastatus1,
 				  char __user *buf,
 				  size_t count,
 				  size_t *offset)
@@ -998,6 +1006,21 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 			  head, tail))
 		return -EIO;
 
+	/*
+	 * If there is nothing to read, don't append the status report yet,
+	 * wait until we have some data available.
+	 */
+	if (!OA_TAKEN(tail, head))
+		return 0;
+
+	if (unlikely(oastatus1 & GEN7_OASTATUS1_REPORT_LOST)) {
+		ret = append_oa_status(stream, buf, count, offset,
+				       DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+		if (ret)
+			return ret;
+		stream->perf->gen7_latched_oastatus1 |=
+			GEN7_OASTATUS1_REPORT_LOST;
+	}
 
 	for (/* none */;
 	     (taken = OA_TAKEN(tail, head));
@@ -1133,16 +1156,7 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
 		oastatus1 = intel_uncore_read(uncore, GEN7_OASTATUS1);
 	}
 
-	if (unlikely(oastatus1 & GEN7_OASTATUS1_REPORT_LOST)) {
-		ret = append_oa_status(stream, buf, count, offset,
-				       DRM_I915_PERF_RECORD_OA_REPORT_LOST);
-		if (ret)
-			return ret;
-		stream->perf->gen7_latched_oastatus1 |=
-			GEN7_OASTATUS1_REPORT_LOST;
-	}
-
-	return gen7_append_oa_reports(stream, buf, count, offset);
+	return gen7_append_oa_reports(stream, oastatus1, buf, count, offset);
 }
 
 /**
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (2 preceding siblings ...)
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 3/7] drm/i915/perf: only append status when data is available Umesh Nerlige Ramappa
@ 2020-03-03 22:19 ` Umesh Nerlige Ramappa
  2020-03-12 19:27   ` Dixit, Ashutosh
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 5/7] drm/i915: handle interrupts from the OA unit Umesh Nerlige Ramappa
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:19 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

This new parameter let's the application choose how often the OA
buffer should be checked on the CPU side for data availability. Longer
polling period tend to reduce CPU overhead if the application does not
care about somewhat real time data collection.

v2: Allow disabling polling completely with 0 value (Lionel)
v3: Version the new parameter (Joonas)
v4: Rebase (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c       | 48 +++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_perf_types.h |  6 ++++
 include/uapi/drm/i915_drm.h            | 10 ++++++
 3 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 21a63644846f..cf4176f4d11d 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -262,11 +262,11 @@
  */
 #define OA_TAIL_MARGIN_NSEC	100000ULL
 
-/* frequency for checking whether the OA unit has written new reports to the
- * circular OA buffer...
+/* The default frequency for checking whether the OA unit has written new
+ * reports to the circular OA buffer...
  */
-#define POLL_FREQUENCY 200
-#define POLL_PERIOD (NSEC_PER_SEC / POLL_FREQUENCY)
+#define DEFAULT_POLL_FREQUENCY 200
+#define DEFAULT_POLL_PERIOD (NSEC_PER_SEC / DEFAULT_POLL_FREQUENCY)
 
 /* for sysctl proc_dointvec_minmax of dev.i915.perf_stream_paranoid */
 static u32 i915_perf_stream_paranoid = true;
@@ -349,6 +349,8 @@ static const struct i915_oa_format gen12_oa_formats[I915_OA_FORMAT_MAX] = {
  * @oa_periodic: Whether to enable periodic OA unit sampling
  * @oa_period_exponent: The OA unit sampling period is derived from this
  * @engine: The engine (typically rcs0) being monitored by the OA unit
+ * @poll_oa_period: The period at which the CPU will check for OA data
+ * availability
  *
  * As read_properties_unlocked() enumerates and validates the properties given
  * to open a stream of metrics the configuration is built up in the structure
@@ -368,6 +370,7 @@ struct perf_open_properties {
 	int oa_period_exponent;
 
 	struct intel_engine_cs *engine;
+	u64 poll_oa_period;
 };
 
 struct i915_oa_config_bo {
@@ -2642,9 +2645,9 @@ static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 
 	stream->perf->ops.oa_enable(stream);
 
-	if (stream->periodic)
+	if (stream->periodic && stream->poll_oa_period)
 		hrtimer_start(&stream->poll_check_timer,
-			      ns_to_ktime(POLL_PERIOD),
+			      ns_to_ktime(stream->poll_oa_period),
 			      HRTIMER_MODE_REL_PINNED);
 }
 
@@ -3044,7 +3047,8 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
 		wake_up(&stream->poll_wq);
 	}
 
-	hrtimer_forward_now(hrtimer, ns_to_ktime(POLL_PERIOD));
+	hrtimer_forward_now(hrtimer,
+			    ns_to_ktime(stream->poll_oa_period));
 
 	return HRTIMER_RESTART;
 }
@@ -3425,6 +3429,7 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
 
 	stream->perf = perf;
 	stream->ctx = specific_ctx;
+	stream->poll_oa_period = props->poll_oa_period;
 
 	ret = i915_oa_stream_init(stream, param, props);
 	if (ret)
@@ -3481,6 +3486,7 @@ static u64 oa_exponent_to_ns(struct i915_perf *perf, int exponent)
 /**
  * read_properties_unlocked - validate + copy userspace stream open properties
  * @perf: i915 perf instance
+ * @open_flags: Flags set by userspace for the opening of the stream
  * @uprops: The array of u64 key value pairs given by userspace
  * @n_props: The number of key value pairs expected in @uprops
  * @props: The stream configuration built up while validating properties
@@ -3494,6 +3500,7 @@ static u64 oa_exponent_to_ns(struct i915_perf *perf, int exponent)
  * rule out defining new properties with ordering requirements in the future.
  */
 static int read_properties_unlocked(struct i915_perf *perf,
+				    u32 open_flags,
 				    u64 __user *uprops,
 				    u32 n_props,
 				    struct perf_open_properties *props)
@@ -3502,6 +3509,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
 	u32 i;
 
 	memset(props, 0, sizeof(struct perf_open_properties));
+	props->poll_oa_period = DEFAULT_POLL_PERIOD;
 
 	if (!n_props) {
 		DRM_DEBUG("No i915 perf properties given\n");
@@ -3617,6 +3625,14 @@ static int read_properties_unlocked(struct i915_perf *perf,
 		case DRM_I915_PERF_PROP_HOLD_PREEMPTION:
 			props->hold_preemption = !!value;
 			break;
+		case DRM_I915_PERF_PROP_POLL_OA_DELAY:
+			if (value > 0 && value < 100000 /* 100us */) {
+				DRM_DEBUG("OA availability timer too small (%lluns < 100us)\n",
+					  value);
+				return -EINVAL;
+			}
+			props->poll_oa_period = value;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			MISSING_CASE(id);
 			return -EINVAL;
@@ -3625,6 +3641,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
 		uprop += 2;
 	}
 
+	/*
+	 * Blocking read need to be waken up by some mechanism. If no polling
+	 * of the HEAD/TAIL register is done by the kernel, we'll never be
+	 * able to wake up.
+	 */
+	if ((open_flags & I915_PERF_FLAG_FD_NONBLOCK) == 0 &&
+	    !props->poll_oa_period) {
+		DRM_DEBUG("Requesting a blocking stream with no polling period.\n");
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
@@ -3675,6 +3702,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 	}
 
 	ret = read_properties_unlocked(perf,
+				       param->flags,
 				       u64_to_user_ptr(param->properties_ptr),
 				       param->num_properties,
 				       &props);
@@ -4457,8 +4485,12 @@ int i915_perf_ioctl_version(void)
 	 *    preemption on a particular context so that performance data is
 	 *    accessible from a delta of MI_RPC reports without looking at the
 	 *    OA buffer.
+	 *
+	 * 4: Add DRM_I915_PERF_PROP_POLL_OA_DELAY parameter that controls
+	 *    enable/disable as well as the interval for the hrtimer used to
+	 *    check for OA data.
 	 */
-	return 3;
+	return 4;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index 9ee7c58e70d5..01559ead22e2 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -304,6 +304,12 @@ struct i915_perf_stream {
 	 * reprogrammed.
 	 */
 	struct i915_vma *noa_wait;
+
+	/**
+	 * @poll_oa_period: The period in nanoseconds at which the OA
+	 * buffer should be checked for available data.
+	 */
+	u64 poll_oa_period;
 };
 
 /**
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 2813e579b480..131cb237d19c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1969,6 +1969,16 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_HOLD_PREEMPTION,
 
+	/**
+	 * Specifying this property sets up a hrtimer in nanoseconds at which
+	 * the i915 driver will check the OA buffer for available data. A
+	 * value of 0 means no hrtimer will be started. Values below 100
+	 * microseconds are not allowed.
+	 *
+	 * This property is available in perf revision 4.
+	 */
+	DRM_I915_PERF_PROP_POLL_OA_DELAY,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 5/7] drm/i915: handle interrupts from the OA unit
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (3 preceding siblings ...)
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer Umesh Nerlige Ramappa
@ 2020-03-03 22:19 ` Umesh Nerlige Ramappa
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter Umesh Nerlige Ramappa
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:19 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

The OA unit can notify that its circular buffer is half full through
an interrupt and we would like to give the application the ability to
make use of this interrupt to get rid of CPU checks on the OA buffer.

This change wires up the interrupt to the i915-perf stream and leaves
it ignored for now.

v2: Use spin_lock_irq() to access the IMR register on Haswell (Chris)
v3: (Umesh)
- Rebase
- Enable OA IMR for gen12
- Add OA interrupt to the pm enable/disable irq for gen11/12

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        | 29 ++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_irq.h        |  1 +
 drivers/gpu/drm/i915/gt/intel_gt_pm_irq.c     | 34 +++++++++++--------
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  2 ++
 drivers/gpu/drm/i915/i915_perf.c              | 34 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_perf_types.h        | 19 +++++++++++
 drivers/gpu/drm/i915/i915_reg.h               |  9 +++++
 7 files changed, 112 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index f0e7fd95165a..708cfb12befb 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -85,6 +85,18 @@ gen11_gt_engine_identity(struct intel_gt *gt,
 	return ident;
 }
 
+static void notify_perfmon_buffer_half_full(struct drm_i915_private *i915)
+{
+	atomic64_inc(&i915->perf.exclusive_stream->half_full_count);
+	wake_up_all(&i915->perf.exclusive_stream->poll_wq);
+}
+
+static void gen8_perfmon_handler(struct drm_i915_private *i915, u32 iir)
+{
+	if (iir & GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT)
+		notify_perfmon_buffer_half_full(i915);
+}
+
 static void
 gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
 			const u16 iir)
@@ -95,6 +107,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
 	if (instance == OTHER_GTPM_INSTANCE)
 		return gen11_rps_irq_handler(&gt->rps, iir);
 
+ 	if (instance == OTHER_WDOAPERF_INSTANCE)
+ 		return gen8_perfmon_handler(gt->i915, iir);
+
 	WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
 		  instance, iir);
 }
@@ -300,6 +315,9 @@ void gen6_gt_irq_handler(struct intel_gt *gt, u32 gt_iir)
 		      GT_CS_MASTER_ERROR_INTERRUPT))
 		DRM_DEBUG("Command parser error, gt_iir 0x%08x\n", gt_iir);
 
+	if (gt_iir & GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT)
+		notify_perfmon_buffer_half_full(gt->i915);
+
 	if (gt_iir & GT_PARITY_ERROR(gt->i915))
 		gen7_parity_error_irq_handler(gt, gt_iir);
 }
@@ -331,11 +349,13 @@ void gen8_gt_irq_handler(struct intel_gt *gt, u32 master_ctl)
 		}
 	}
 
-	if (master_ctl & GEN8_GT_VECS_IRQ) {
+	if (master_ctl & (GEN8_GT_VECS_IRQ | GEN8_GT_WDBOX_OACS_IRQ)) {
 		iir = raw_reg_read(regs, GEN8_GT_IIR(3));
 		if (likely(iir)) {
 			cs_irq_handler(gt->engine_class[VIDEO_ENHANCEMENT_CLASS][0],
 				       iir >> GEN8_VECS_IRQ_SHIFT);
+			gen8_perfmon_handler(gt->i915,
+					     iir >> GEN8_WD_IRQ_SHIFT);
 			raw_reg_write(regs, GEN8_GT_IIR(3), iir);
 		}
 	}
@@ -371,7 +391,8 @@ void gen8_gt_irq_postinstall(struct intel_gt *gt)
 		irqs << GEN8_RCS_IRQ_SHIFT | irqs << GEN8_BCS_IRQ_SHIFT,
 		irqs << GEN8_VCS0_IRQ_SHIFT | irqs << GEN8_VCS1_IRQ_SHIFT,
 		0,
-		irqs << GEN8_VECS_IRQ_SHIFT,
+		irqs << GEN8_VECS_IRQ_SHIFT |
+		GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT << GEN8_WD_IRQ_SHIFT,
 	};
 	struct intel_uncore *uncore = gt->uncore;
 
@@ -439,6 +460,10 @@ void gen5_gt_irq_postinstall(struct intel_gt *gt)
 	else
 		gt_irqs |= GT_BLT_USER_INTERRUPT | GT_BSD_USER_INTERRUPT;
 
+ 	/* We only expose the i915/perf interface on HSW+. */
+ 	if (IS_HASWELL(gt->i915))
+ 		gt_irqs |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
+
 	GEN3_IRQ_INIT(uncore, GT, gt->gt_imr, gt_irqs);
 
 	if (INTEL_GEN(gt->i915) >= 6) {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.h b/drivers/gpu/drm/i915/gt/intel_gt_irq.h
index 886c5cf408a2..427011e30c39 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.h
@@ -15,6 +15,7 @@ struct intel_gt;
 		      GEN8_GT_BCS_IRQ | \
 		      GEN8_GT_VCS0_IRQ | \
 		      GEN8_GT_VCS1_IRQ | \
+ 		      GEN8_GT_WDBOX_OACS_IRQ | \
 		      GEN8_GT_VECS_IRQ | \
 		      GEN8_GT_PM_IRQ | \
 		      GEN8_GT_GUC_IRQ)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_irq.c
index babe866126d7..6eecf9912a67 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_irq.c
@@ -14,19 +14,16 @@ static void write_pm_imr(struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = gt->i915;
 	struct intel_uncore *uncore = gt->uncore;
-	u32 mask = gt->pm_imr;
 	i915_reg_t reg;
 
-	if (INTEL_GEN(i915) >= 11) {
+	if (INTEL_GEN(i915) >= 11)
 		reg = GEN11_GPM_WGBOXPERF_INTR_MASK;
-		mask <<= 16; /* pm is in upper half */
-	} else if (INTEL_GEN(i915) >= 8) {
+	else if (INTEL_GEN(i915) >= 8)
 		reg = GEN8_GT_IMR(2);
-	} else {
+	else
 		reg = GEN6_PMIMR;
-	}
 
-	intel_uncore_write(uncore, reg, mask);
+	intel_uncore_write(uncore, reg, gt->pm_imr);
 }
 
 static void gen6_gt_pm_update_irq(struct intel_gt *gt,
@@ -75,25 +72,28 @@ static void write_pm_ier(struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = gt->i915;
 	struct intel_uncore *uncore = gt->uncore;
-	u32 mask = gt->pm_ier;
 	i915_reg_t reg;
 
-	if (INTEL_GEN(i915) >= 11) {
+	if (INTEL_GEN(i915) >= 11)
 		reg = GEN11_GPM_WGBOXPERF_INTR_ENABLE;
-		mask <<= 16; /* pm is in upper half */
-	} else if (INTEL_GEN(i915) >= 8) {
+	else if (INTEL_GEN(i915) >= 8)
 		reg = GEN8_GT_IER(2);
-	} else {
+	else
 		reg = GEN6_PMIER;
-	}
 
-	intel_uncore_write(uncore, reg, mask);
+	intel_uncore_write(uncore, reg, gt->pm_ier);
 }
 
 void gen6_gt_pm_enable_irq(struct intel_gt *gt, u32 enable_mask)
 {
+	struct drm_i915_private *i915 = gt->i915;
+
 	lockdep_assert_held(&gt->irq_lock);
 
+	if (INTEL_GEN(i915) >= 11)
+		enable_mask = (enable_mask << 16) | /* pm is in upper half */
+			GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
+
 	gt->pm_ier |= enable_mask;
 	write_pm_ier(gt);
 	gen6_gt_pm_unmask_irq(gt, enable_mask);
@@ -101,8 +101,14 @@ void gen6_gt_pm_enable_irq(struct intel_gt *gt, u32 enable_mask)
 
 void gen6_gt_pm_disable_irq(struct intel_gt *gt, u32 disable_mask)
 {
+	struct drm_i915_private *i915 = gt->i915;
+
 	lockdep_assert_held(&gt->irq_lock);
 
+	if (INTEL_GEN(i915) >= 11)
+		disable_mask = (disable_mask << 16) | /* pm is in upper half */
+			GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
+
 	gt->pm_ier &= ~disable_mask;
 	gen6_gt_pm_mask_irq(gt, disable_mask);
 	write_pm_ier(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index fee743626060..d792711aea13 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1838,6 +1838,8 @@ static void setup_rcs(struct intel_engine_cs *engine)
 
 	if (HAS_L3_DPF(i915))
 		engine->irq_keep_mask = GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
+	if (IS_HASWELL(i915))
+		engine->irq_keep_mask |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
 
 	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index cf4176f4d11d..502961da840d 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -201,6 +201,7 @@
 #include "gt/intel_gt.h"
 #include "gt/intel_lrc_reg.h"
 #include "gt/intel_ring.h"
+#include "gt/intel_gt_irq.h"
 
 #include "i915_drv.h"
 #include "i915_perf.h"
@@ -351,6 +352,7 @@ static const struct i915_oa_format gen12_oa_formats[I915_OA_FORMAT_MAX] = {
  * @engine: The engine (typically rcs0) being monitored by the OA unit
  * @poll_oa_period: The period at which the CPU will check for OA data
  * availability
+ * @oa_interrupt_monitor: Whether we should monitor the OA interrupt.
  *
  * As read_properties_unlocked() enumerates and validates the properties given
  * to open a stream of metrics the configuration is built up in the structure
@@ -371,6 +373,7 @@ struct perf_open_properties {
 
 	struct intel_engine_cs *engine;
 	u64 poll_oa_period;
+	bool oa_interrupt_monitor;
 };
 
 struct i915_oa_config_bo {
@@ -2571,6 +2574,13 @@ static void gen7_oa_enable(struct i915_perf_stream *stream)
 	 */
 	gen7_init_oa_buffer(stream);
 
+	if (stream->oa_interrupt_monitor) {
+		spin_lock_irq(&stream->perf->i915->irq_lock);
+		gen5_gt_enable_irq(&stream->perf->i915->gt,
+				   GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);
+		spin_unlock_irq(&stream->perf->i915->irq_lock);
+	}
+
 	intel_uncore_write(uncore, GEN7_OACONTROL,
 			   (ctx_id & GEN7_OACONTROL_CTX_MASK) |
 			   (period_exponent <<
@@ -2597,6 +2607,10 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
 	 */
 	gen8_init_oa_buffer(stream);
 
+	if (stream->oa_interrupt_monitor)
+		intel_uncore_write(uncore, GEN8_OA_IMR,
+				   ~GEN8_OA_IMR_MASK_INTR);
+
 	/*
 	 * Note: we don't rely on the hardware to perform single context
 	 * filtering and instead filter on the cpu based on the context-id
@@ -2621,6 +2635,10 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
 
 	gen12_init_oa_buffer(stream);
 
+	if (stream->oa_interrupt_monitor)
+		intel_uncore_write(uncore, GEN12_OA_IMR,
+				   ~GEN8_OA_IMR_MASK_INTR);
+
 	intel_uncore_write(uncore, GEN12_OAG_OACONTROL,
 			   (report_format << GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) |
 			   GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE);
@@ -2643,6 +2661,10 @@ static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 	 */
 	stream->pollin = false;
 
+	stream->half_full_count_last = 0;
+	atomic64_set(&stream->half_full_count,
+		     stream->half_full_count_last);
+
 	stream->perf->ops.oa_enable(stream);
 
 	if (stream->periodic && stream->poll_oa_period)
@@ -2655,6 +2677,13 @@ static void gen7_oa_disable(struct i915_perf_stream *stream)
 {
 	struct intel_uncore *uncore = stream->uncore;
 
+ 	if (stream->oa_interrupt_monitor) {
+ 		spin_lock_irq(&stream->perf->i915->irq_lock);
+ 		gen5_gt_disable_irq(&stream->perf->i915->gt,
+ 				    GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);
+ 		spin_unlock_irq(&stream->perf->i915->irq_lock);
+ 	}
+
 	intel_uncore_write(uncore, GEN7_OACONTROL, 0);
 	if (intel_wait_for_register(uncore,
 				    GEN7_OACONTROL, GEN7_OACONTROL_ENABLE, 0,
@@ -2667,6 +2696,8 @@ static void gen8_oa_disable(struct i915_perf_stream *stream)
 {
 	struct intel_uncore *uncore = stream->uncore;
 
+ 	intel_uncore_write(uncore, GEN8_OA_IMR, 0xffffffff);
+
 	intel_uncore_write(uncore, GEN8_OACONTROL, 0);
 	if (intel_wait_for_register(uncore,
 				    GEN8_OACONTROL, GEN8_OA_COUNTER_ENABLE, 0,
@@ -2679,6 +2710,8 @@ static void gen12_oa_disable(struct i915_perf_stream *stream)
 {
 	struct intel_uncore *uncore = stream->uncore;
 
+ 	intel_uncore_write(uncore, GEN12_OA_IMR, 0xffffffff);
+
 	intel_uncore_write(uncore, GEN12_OAG_OACONTROL, 0);
 	if (intel_wait_for_register(uncore,
 				    GEN12_OAG_OACONTROL,
@@ -3430,6 +3463,7 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
 	stream->perf = perf;
 	stream->ctx = specific_ctx;
 	stream->poll_oa_period = props->poll_oa_period;
+	stream->oa_interrupt_monitor = props->oa_interrupt_monitor;
 
 	ret = i915_oa_stream_init(stream, param, props);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index 01559ead22e2..56b04a4970bd 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -228,6 +228,19 @@ struct i915_perf_stream {
 	 */
 	bool pollin;
 
+	/**
+	 * @half_full_count: Atomic counter incremented by the interrupt
+	 * handling code for each OA half full interrupt received.
+	 */
+	atomic64_t half_full_count;
+
+	/**
+	 * half_full_count_last: Copy of the atomic half_full_count that was
+	 * last processed in the i915-perf driver. If both counters differ,
+	 * there is data available to read in the OA buffer.
+	 */
+	u64 half_full_count_last;
+
 	/**
 	 * @periodic: Whether periodic sampling is currently enabled.
 	 */
@@ -310,6 +323,12 @@ struct i915_perf_stream {
 	 * buffer should be checked for available data.
 	 */
 	u64 poll_oa_period;
+
+	/**
+	 * @oa_interrupt_monitor: Whether the stream will be notified by OA
+	 * interrupts.
+	 */
+	bool oa_interrupt_monitor;
 };
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 80cf02a6eec1..f176ec023f2b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -693,6 +693,11 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OABUFFER_SIZE_8M    (6 << 3)
 #define OABUFFER_SIZE_16M   (7 << 3)
 
+#define GEN8_OA_IMR _MMIO(0x2b20)
+#define  GEN8_OA_IMR_MASK_INTR (1 << 28)
+
+#define GEN12_OA_IMR _MMIO(0xdb14)
+
 /* Gen12 OAR unit */
 #define GEN12_OAR_OACONTROL _MMIO(0x2960)
 #define  GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT 1
@@ -3091,7 +3096,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GT_BLT_USER_INTERRUPT			(1 << 22)
 #define GT_BSD_CS_ERROR_INTERRUPT		(1 << 15)
 #define GT_BSD_USER_INTERRUPT			(1 << 12)
+#define GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT (1 << 12) /* bdw+ */
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
+#define GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT   (1 <<  9) /* ivb+ but only used on hsw+ */
 #define GT_CONTEXT_SWITCH_INTERRUPT		(1 <<  8)
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
@@ -7438,6 +7445,7 @@ enum {
 #define  GEN8_DE_PIPE_B_IRQ		(1 << 17)
 #define  GEN8_DE_PIPE_A_IRQ		(1 << 16)
 #define  GEN8_DE_PIPE_IRQ(pipe)		(1 << (16 + (pipe)))
+#define  GEN8_GT_WDBOX_OACS_IRQ         (1 << 7)
 #define  GEN8_GT_VECS_IRQ		(1 << 6)
 #define  GEN8_GT_GUC_IRQ		(1 << 5)
 #define  GEN8_GT_PM_IRQ			(1 << 4)
@@ -7635,6 +7643,7 @@ enum {
 /* irq instances for OTHER_CLASS */
 #define OTHER_GUC_INSTANCE	0
 #define OTHER_GTPM_INSTANCE	1
+#define OTHER_WDOAPERF_INSTANCE	2
 
 #define GEN11_INTR_IDENTITY_REG(x)	_MMIO(0x190060 + ((x) * 4))
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (4 preceding siblings ...)
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 5/7] drm/i915: handle interrupts from the OA unit Umesh Nerlige Ramappa
@ 2020-03-03 22:19 ` Umesh Nerlige Ramappa
  2020-03-04  5:47   ` Dixit, Ashutosh
  2020-03-10 20:08   ` Umesh Nerlige Ramappa
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl Umesh Nerlige Ramappa
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:19 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

This let's the application choose to be driven by the interrupt
mechanism of the HW. In conjuction with long periods for checks for
the availability of data on the CPU, this can reduce the CPU load when
doing capture of OA data.

v2: Version the new parameter (Joonas)
v3: Rebase (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 58 +++++++++++++++++++++++---------
 include/uapi/drm/i915_drm.h      | 10 ++++++
 2 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 502961da840d..ab41cba85b40 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -252,7 +252,7 @@
  * oa_buffer_check().
  *
  * Most of the implementation details for this workaround are in
- * oa_buffer_check_unlocked() and _append_oa_reports()
+ * oa_buffer_check() and _append_oa_reports()
  *
  * Note for posterity: previously the driver used to define an effective tail
  * pointer that lagged the real pointer by a 'tail margin' measured in bytes
@@ -447,8 +447,9 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
 }
 
 /**
- * oa_buffer_check_unlocked - check for data and update tail ptr state
+ * oa_buffer_check - check for data and update tail ptr state
  * @stream: i915 stream instance
+ * @lock: whether to take the oa_buffer spin lock
  *
  * This is either called via fops (for blocking reads in user ctx) or the poll
  * check hrtimer (atomic ctx) to check the OA buffer tail pointer and check
@@ -470,8 +471,9 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
  *
  * Returns: %true if the OA buffer contains data, else %false
  */
-static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
+static bool oa_buffer_check(struct i915_perf_stream *stream, bool lock)
 {
+	u64 half_full_count = atomic64_read(&stream->half_full_count);
 	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
 	int report_size = stream->oa_buffer.format_size;
 	unsigned long flags;
@@ -482,7 +484,8 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 	 * could result in an OA buffer reset which might reset the head,
 	 * tails[] and aged_tail state.
 	 */
-	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
+	if (lock)
+		spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
 
 	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
 
@@ -558,7 +561,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 		stream->oa_buffer.aging_timestamp = now;
 	}
 
-	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
+	stream->half_full_count_last = half_full_count;
+
+	if (lock)
+		spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
 
 	return OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
 			stream->oa_buffer.head - gtt_offset) >= report_size;
@@ -1169,9 +1175,9 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
  * i915_oa_wait_unlocked - handles blocking IO until OA data available
  * @stream: An i915-perf stream opened for OA metrics
  *
- * Called when userspace tries to read() from a blocking stream FD opened
- * for OA metrics. It waits until the hrtimer callback finds a non-empty
- * OA buffer and wakes us.
+ * Called when userspace tries to read() from a blocking stream FD opened for
+ * OA metrics. It waits until either the hrtimer callback finds a non-empty OA
+ * buffer or the OA interrupt kicks in and wakes us.
  *
  * Note: it's acceptable to have this return with some false positives
  * since any subsequent read handling will return -EAGAIN if there isn't
@@ -1186,7 +1192,7 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 		return -EIO;
 
 	return wait_event_interruptible(stream->poll_wq,
-					oa_buffer_check_unlocked(stream));
+					oa_buffer_check(stream, true));
 }
 
 /**
@@ -2733,6 +2739,10 @@ static void i915_oa_stream_disable(struct i915_perf_stream *stream)
 {
 	stream->perf->ops.oa_disable(stream);
 
+	stream->half_full_count_last = 0;
+	atomic64_set(&stream->half_full_count,
+		     stream->half_full_count_last);
+
 	if (stream->periodic)
 		hrtimer_cancel(&stream->poll_check_timer);
 }
@@ -3075,7 +3085,7 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
 	struct i915_perf_stream *stream =
 		container_of(hrtimer, typeof(*stream), poll_check_timer);
 
-	if (oa_buffer_check_unlocked(stream)) {
+	if (oa_buffer_check(stream, true)) {
 		stream->pollin = true;
 		wake_up(&stream->poll_wq);
 	}
@@ -3109,6 +3119,16 @@ static __poll_t i915_perf_poll_locked(struct i915_perf_stream *stream,
 
 	stream->ops->poll_wait(stream, file, wait);
 
+	/*
+	 * Only check the half buffer full notifications if requested by the
+	 * user.
+	 */
+	if (stream->oa_interrupt_monitor &&
+	    (stream->half_full_count_last !=
+	     atomic64_read(&stream->half_full_count))) {
+		stream->pollin = oa_buffer_check(stream, true);
+	}
+
 	/* Note: we don't explicitly check whether there's something to read
 	 * here since this path may be very hot depending on what else
 	 * userspace is polling, or on the timeout in use. We rely solely on
@@ -3667,6 +3687,9 @@ static int read_properties_unlocked(struct i915_perf *perf,
 			}
 			props->poll_oa_period = value;
 			break;
+		case DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT:
+			props->oa_interrupt_monitor = value != 0;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			MISSING_CASE(id);
 			return -EINVAL;
@@ -3677,12 +3700,14 @@ static int read_properties_unlocked(struct i915_perf *perf,
 
 	/*
 	 * Blocking read need to be waken up by some mechanism. If no polling
-	 * of the HEAD/TAIL register is done by the kernel, we'll never be
-	 * able to wake up.
+	 * of the HEAD/TAIL register is done by the kernel and no interrupt is
+	 * enabled, we'll never be able to wake up.
 	 */
 	if ((open_flags & I915_PERF_FLAG_FD_NONBLOCK) == 0 &&
-	    !props->poll_oa_period) {
-		DRM_DEBUG("Requesting a blocking stream with no polling period.\n");
+	    !props->poll_oa_period &&
+	    !props->oa_interrupt_monitor) {
+		DRM_DEBUG("Requesting a blocking stream with no polling period "
+			  "& no interrupt.\n");
 		return -EINVAL;
 	}
 
@@ -4523,8 +4548,11 @@ int i915_perf_ioctl_version(void)
 	 * 4: Add DRM_I915_PERF_PROP_POLL_OA_DELAY parameter that controls
 	 *    enable/disable as well as the interval for the hrtimer used to
 	 *    check for OA data.
+	 *
+	 * 5: Add DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT paramter to
+	 *    enable/disable interrupts in OA.
 	 */
-	return 4;
+	return 5;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 131cb237d19c..f609ff4ceccb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1979,6 +1979,16 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_POLL_OA_DELAY,
 
+	/**
+	 * Specifying this property sets up the interrupt mechanism for the OA
+	 * buffer in i915. This option in conjuction with a long polling delay
+	 * for avaibility of OA data can reduce CPU load significantly if you
+	 * do not care about OA data being read as soon as it's available.
+	 *
+	 * This property is available in perf revision 5.
+	 */
+	DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (5 preceding siblings ...)
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter Umesh Nerlige Ramappa
@ 2020-03-03 22:19 ` Umesh Nerlige Ramappa
  2020-03-04  5:48   ` Dixit, Ashutosh
  2020-03-04  2:53 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/perf: add OA interrupt support (rev5) Patchwork
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-03 22:19 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

With the currently available parameters for the i915-perf stream,
there are still situations that are not well covered :

If an application opens the stream with polling disable or at very low
frequency and OA interrupt enabled, no data will be available even
though somewhere between nothing and half of the OA buffer worth of
data might have landed in memory.

To solve this issue we have a new flush ioctl on the perf stream that
forces the i915-perf driver to look at the state of the buffer when
called and makes any data available through both poll() & read() type
syscalls.

v2: Version the ioctl (Joonas)
v3: Rebase (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 18 ++++++++++++++++++
 include/uapi/drm/i915_drm.h      | 21 +++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index ab41cba85b40..b6cb47e80b86 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3221,6 +3221,18 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 		stream->ops->disable(stream);
 }
 
+/**
+ * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
+ * @stream: An enabled i915 perf stream
+ *
+ * The intention is to flush all the data available for reading from the OA
+ * buffer
+ */
+static void i915_perf_flush_data(struct i915_perf_stream *stream)
+{
+	stream->pollin = oa_buffer_check(stream, true);
+}
+
 static long i915_perf_config_locked(struct i915_perf_stream *stream,
 				    unsigned long metrics_set)
 {
@@ -3282,6 +3294,9 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 		return 0;
 	case I915_PERF_IOCTL_CONFIG:
 		return i915_perf_config_locked(stream, arg);
+	case I915_PERF_IOCTL_FLUSH_DATA:
+		i915_perf_flush_data(stream);
+		return 0;
 	}
 
 	return -EINVAL;
@@ -4551,6 +4566,9 @@ int i915_perf_ioctl_version(void)
 	 *
 	 * 5: Add DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT paramter to
 	 *    enable/disable interrupts in OA.
+	 *
+	 * 6: Add ioctl to flush OA data before reading.
+	 *    I915_PERF_IOCTL_FLUSH_DATA
 	 */
 	return 5;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index f609ff4ceccb..3fd6bb189248 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2044,6 +2044,27 @@ struct drm_i915_perf_open_param {
  */
 #define I915_PERF_IOCTL_CONFIG	_IO('i', 0x2)
 
+/**
+ * Actively check the availability of data from a stream.
+ *
+ * A stream data availability can be driven by two types of events :
+ *
+ *   - if enabled, the kernel's hrtimer checking the amount of available data
+ *     in the OA buffer through head/tail registers.
+ *
+ *   - if enabled, the OA unit's interrupt mechanism
+ *
+ * The kernel hrtimer incur a cost of running callback at fixed time
+ * intervals, while the OA interrupt might only happen rarely. In the
+ * situation where the application has disabled the kernel's hrtimer and only
+ * uses the OA interrupt to know about available data, the application can
+ * request an active check of the available OA data through this ioctl. This
+ * will make any data in the OA buffer available with either poll() or read().
+ *
+ * This ioctl is available in perf revision 6.
+ */
+#define I915_PERF_IOCTL_FLUSH_DATA _IO('i', 0x3)
+
 /**
  * Common to all i915 perf records
  */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/perf: add OA interrupt support (rev5)
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (6 preceding siblings ...)
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl Umesh Nerlige Ramappa
@ 2020-03-04  2:53 ` Patchwork
  2020-03-04  2:58 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2020-03-04  2:53 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: add OA interrupt support (rev5)
URL   : https://patchwork.freedesktop.org/series/54280/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e9632eb2cc8a drm/i915/perf: rework aging tail workaround
-:189: CHECK:SPACING: No space is necessary after a cast
#189: FILE: drivers/gpu/drm/i915/i915_perf.c:527:
+			u32 *report32 = (void *) report;

-:268: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#268: FILE: drivers/gpu/drm/i915/i915_perf.c:827:
+		report32[0] = report32[1] = 0;

-:312: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#312: FILE: drivers/gpu/drm/i915/i915_perf.c:1042:
+		report32[0] = report32[1] = 0;

-:358: CHECK:LINE_SPACING: Please don't use multiple blank lines
#358: FILE: drivers/gpu/drm/i915/i915_perf.c:4385:
+
+

total: 0 errors, 0 warnings, 4 checks, 364 lines checked
b97afa567f13 drm/i915/perf: move pollin setup to non hw specific code
249c37e99eb0 drm/i915/perf: only append status when data is available
b69637a72e34 drm/i915/perf: add new open param to configure polling of OA buffer
974daa81481c drm/i915: handle interrupts from the OA unit
-:49: ERROR:CODE_INDENT: code indent should use tabs where possible
#49: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:110:
+ ^Iif (instance == OTHER_WDOAPERF_INSTANCE)$

-:49: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#49: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:110:
+ ^Iif (instance == OTHER_WDOAPERF_INSTANCE)$

-:49: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#49: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:110:
+ ^Iif (instance == OTHER_WDOAPERF_INSTANCE)$

-:50: ERROR:CODE_INDENT: code indent should use tabs where possible
#50: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:111:
+ ^I^Ireturn gen8_perfmon_handler(gt->i915, iir);$

-:50: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#50: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:111:
+ ^I^Ireturn gen8_perfmon_handler(gt->i915, iir);$

-:50: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#50: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:111:
+ ^I^Ireturn gen8_perfmon_handler(gt->i915, iir);$

-:94: ERROR:CODE_INDENT: code indent should use tabs where possible
#94: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:463:
+ ^I/* We only expose the i915/perf interface on HSW+. */$

-:94: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#94: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:463:
+ ^I/* We only expose the i915/perf interface on HSW+. */$

-:94: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#94: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:463:
+ ^I/* We only expose the i915/perf interface on HSW+. */$

-:95: ERROR:CODE_INDENT: code indent should use tabs where possible
#95: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:464:
+ ^Iif (IS_HASWELL(gt->i915))$

-:95: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#95: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:464:
+ ^Iif (IS_HASWELL(gt->i915))$

-:95: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#95: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:464:
+ ^Iif (IS_HASWELL(gt->i915))$

-:96: ERROR:CODE_INDENT: code indent should use tabs where possible
#96: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:465:
+ ^I^Igt_irqs |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;$

-:96: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#96: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:465:
+ ^I^Igt_irqs |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;$

-:96: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#96: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.c:465:
+ ^I^Igt_irqs |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;$

-:109: ERROR:CODE_INDENT: code indent should use tabs where possible
#109: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.h:18:
+ ^I^I      GEN8_GT_WDBOX_OACS_IRQ | \$

-:109: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#109: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.h:18:
+ ^I^I      GEN8_GT_WDBOX_OACS_IRQ | \$

-:109: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#109: FILE: drivers/gpu/drm/i915/gt/intel_gt_irq.h:18:
+ ^I^I      GEN8_GT_WDBOX_OACS_IRQ | \$

-:284: ERROR:CODE_INDENT: code indent should use tabs where possible
#284: FILE: drivers/gpu/drm/i915/i915_perf.c:2680:
+ ^Iif (stream->oa_interrupt_monitor) {$

-:284: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#284: FILE: drivers/gpu/drm/i915/i915_perf.c:2680:
+ ^Iif (stream->oa_interrupt_monitor) {$

-:284: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#284: FILE: drivers/gpu/drm/i915/i915_perf.c:2680:
+ ^Iif (stream->oa_interrupt_monitor) {$

-:285: ERROR:CODE_INDENT: code indent should use tabs where possible
#285: FILE: drivers/gpu/drm/i915/i915_perf.c:2681:
+ ^I^Ispin_lock_irq(&stream->perf->i915->irq_lock);$

-:285: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#285: FILE: drivers/gpu/drm/i915/i915_perf.c:2681:
+ ^I^Ispin_lock_irq(&stream->perf->i915->irq_lock);$

-:285: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#285: FILE: drivers/gpu/drm/i915/i915_perf.c:2681:
+ ^I^Ispin_lock_irq(&stream->perf->i915->irq_lock);$

-:286: ERROR:CODE_INDENT: code indent should use tabs where possible
#286: FILE: drivers/gpu/drm/i915/i915_perf.c:2682:
+ ^I^Igen5_gt_disable_irq(&stream->perf->i915->gt,$

-:286: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#286: FILE: drivers/gpu/drm/i915/i915_perf.c:2682:
+ ^I^Igen5_gt_disable_irq(&stream->perf->i915->gt,$

-:286: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#286: FILE: drivers/gpu/drm/i915/i915_perf.c:2682:
+ ^I^Igen5_gt_disable_irq(&stream->perf->i915->gt,$

-:287: ERROR:CODE_INDENT: code indent should use tabs where possible
#287: FILE: drivers/gpu/drm/i915/i915_perf.c:2683:
+ ^I^I^I^I    GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);$

-:287: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#287: FILE: drivers/gpu/drm/i915/i915_perf.c:2683:
+ ^I^I^I^I    GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);$

-:287: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#287: FILE: drivers/gpu/drm/i915/i915_perf.c:2683:
+ 		gen5_gt_disable_irq(&stream->perf->i915->gt,
+ 				    GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);

-:287: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#287: FILE: drivers/gpu/drm/i915/i915_perf.c:2683:
+ ^I^I^I^I    GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);$

-:288: ERROR:CODE_INDENT: code indent should use tabs where possible
#288: FILE: drivers/gpu/drm/i915/i915_perf.c:2684:
+ ^I^Ispin_unlock_irq(&stream->perf->i915->irq_lock);$

-:288: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#288: FILE: drivers/gpu/drm/i915/i915_perf.c:2684:
+ ^I^Ispin_unlock_irq(&stream->perf->i915->irq_lock);$

-:288: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#288: FILE: drivers/gpu/drm/i915/i915_perf.c:2684:
+ ^I^Ispin_unlock_irq(&stream->perf->i915->irq_lock);$

-:289: ERROR:CODE_INDENT: code indent should use tabs where possible
#289: FILE: drivers/gpu/drm/i915/i915_perf.c:2685:
+ ^I}$

-:289: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#289: FILE: drivers/gpu/drm/i915/i915_perf.c:2685:
+ ^I}$

-:289: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#289: FILE: drivers/gpu/drm/i915/i915_perf.c:2685:
+ ^I}$

-:298: ERROR:CODE_INDENT: code indent should use tabs where possible
#298: FILE: drivers/gpu/drm/i915/i915_perf.c:2699:
+ ^Iintel_uncore_write(uncore, GEN8_OA_IMR, 0xffffffff);$

-:298: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#298: FILE: drivers/gpu/drm/i915/i915_perf.c:2699:
+ ^Iintel_uncore_write(uncore, GEN8_OA_IMR, 0xffffffff);$

-:298: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#298: FILE: drivers/gpu/drm/i915/i915_perf.c:2699:
+ ^Iintel_uncore_write(uncore, GEN8_OA_IMR, 0xffffffff);$

-:307: ERROR:CODE_INDENT: code indent should use tabs where possible
#307: FILE: drivers/gpu/drm/i915/i915_perf.c:2713:
+ ^Iintel_uncore_write(uncore, GEN12_OA_IMR, 0xffffffff);$

-:307: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#307: FILE: drivers/gpu/drm/i915/i915_perf.c:2713:
+ ^Iintel_uncore_write(uncore, GEN12_OA_IMR, 0xffffffff);$

-:307: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#307: FILE: drivers/gpu/drm/i915/i915_perf.c:2713:
+ ^Iintel_uncore_write(uncore, GEN12_OA_IMR, 0xffffffff);$

total: 14 errors, 28 warnings, 1 checks, 321 lines checked
4f90346ada25 drm/i915/perf: add interrupt enabling parameter
-:7: WARNING:TYPO_SPELLING: 'conjuction' may be misspelled - perhaps 'conjunction'?
#7: 
mechanism of the HW. In conjuction with long periods for checks for

-:124: CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'stream->half_full_count_last !=
 	     atomic64_read(&stream->half_full_count)'
#124: FILE: drivers/gpu/drm/i915/i915_perf.c:3126:
+	if (stream->oa_interrupt_monitor &&
+	    (stream->half_full_count_last !=
+	     atomic64_read(&stream->half_full_count))) {

-:167: WARNING:TYPO_SPELLING: 'paramter' may be misspelled - perhaps 'parameter'?
#167: FILE: drivers/gpu/drm/i915/i915_perf.c:4552:
+	 * 5: Add DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT paramter to

-:185: WARNING:TYPO_SPELLING: 'conjuction' may be misspelled - perhaps 'conjunction'?
#185: FILE: include/uapi/drm/i915_drm.h:1984:
+	 * buffer in i915. This option in conjuction with a long polling delay

total: 0 errors, 3 warnings, 1 checks, 157 lines checked
cd7ce8c94fd0 drm/i915/perf: add flushing ioctl

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/perf: add OA interrupt support (rev5)
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (7 preceding siblings ...)
  2020-03-04  2:53 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/perf: add OA interrupt support (rev5) Patchwork
@ 2020-03-04  2:58 ` Patchwork
  2020-03-04  3:13 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2020-03-04  2:58 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: add OA interrupt support (rev5)
URL   : https://patchwork.freedesktop.org/series/54280/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.6.0
Commit: drm/i915/perf: rework aging tail workaround
Okay!

Commit: drm/i915/perf: move pollin setup to non hw specific code
-O:drivers/gpu/drm/i915/i915_perf.c:1443:15: warning: memset with byte count of 16777216
-O:drivers/gpu/drm/i915/i915_perf.c:1499:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1443:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1497:15: warning: memset with byte count of 16777216

Commit: drm/i915/perf: only append status when data is available
Okay!

Commit: drm/i915/perf: add new open param to configure polling of OA buffer
Okay!

Commit: drm/i915: handle interrupts from the OA unit
Okay!

Commit: drm/i915/perf: add interrupt enabling parameter
+drivers/gpu/drm/i915/i915_perf.c:569:9: warning: context imbalance in 'oa_buffer_check' - different lock contexts for basic block

Commit: drm/i915/perf: add flushing ioctl
Okay!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Intel-gfx] ✗ Fi.CI.DOCS: warning for drm/i915/perf: add OA interrupt support (rev5)
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (8 preceding siblings ...)
  2020-03-04  2:58 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-03-04  3:13 ` Patchwork
  2020-03-04  3:18 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2020-03-04 19:31 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  11 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2020-03-04  3:13 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: add OA interrupt support (rev5)
URL   : https://patchwork.freedesktop.org/series/54280/
State : warning

== Summary ==

$ make htmldocs 2>&1 > /dev/null | grep i915
./drivers/gpu/drm/i915/display/intel_dpll_mgr.h:285: warning: Function parameter or member 'get_freq' not described in 'intel_shared_dpll_funcs'
./drivers/gpu/drm/i915/i915_perf_types.h:238: warning: Incorrect use of kernel-doc format:          * half_full_count_last: Copy of the atomic half_full_count that was
./drivers/gpu/drm/i915/i915_perf_types.h:332: warning: Function parameter or member 'half_full_count_last' not described in 'i915_perf_stream'
./drivers/gpu/drm/i915/i915_perf_types.h:238: warning: Incorrect use of kernel-doc format:          * half_full_count_last: Copy of the atomic half_full_count that was
./drivers/gpu/drm/i915/i915_perf_types.h:238: warning: Incorrect use of kernel-doc format:          * half_full_count_last: Copy of the atomic half_full_count that was
./drivers/gpu/drm/i915/i915_perf.c:683: warning: Function parameter or member 'oastatus' not described in 'gen8_append_oa_reports'
./drivers/gpu/drm/i915/i915_perf.c:981: warning: Function parameter or member 'oastatus1' not described in 'gen7_append_oa_reports'

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/perf: add OA interrupt support (rev5)
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (9 preceding siblings ...)
  2020-03-04  3:13 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
@ 2020-03-04  3:18 ` Patchwork
  2020-03-04 19:31 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  11 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2020-03-04  3:18 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: add OA interrupt support (rev5)
URL   : https://patchwork.freedesktop.org/series/54280/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8061 -> Patchwork_16808
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/index.html

Known issues
------------

  Here are the changes found in Patchwork_16808 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_addfb_basic@bo-too-small:
    - fi-tgl-y:           [PASS][1] -> [DMESG-WARN][2] ([CI#94] / [i915#402]) +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/fi-tgl-y/igt@kms_addfb_basic@bo-too-small.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/fi-tgl-y/igt@kms_addfb_basic@bo-too-small.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-bxt-dsi:         [DMESG-FAIL][3] ([fdo#112406]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/fi-bxt-dsi/igt@i915_selftest@live@gt_heartbeat.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/fi-bxt-dsi/igt@i915_selftest@live@gt_heartbeat.html

  * igt@kms_addfb_basic@addfb25-bad-modifier:
    - fi-tgl-y:           [DMESG-WARN][5] ([CI#94] / [i915#402]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/fi-tgl-y/igt@kms_addfb_basic@addfb25-bad-modifier.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/fi-tgl-y/igt@kms_addfb_basic@addfb25-bad-modifier.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-icl-u2:          [FAIL][7] ([i915#217]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
    - fi-kbl-7500u:       [FAIL][9] ([fdo#111096] / [i915#323]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Warnings ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-kbl-guc:         [FAIL][11] ([i915#704]) -> [SKIP][12] ([fdo#109271])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html

  
  [CI#94]: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/94
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#112406]: https://bugs.freedesktop.org/show_bug.cgi?id=112406
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#323]: https://gitlab.freedesktop.org/drm/intel/issues/323
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#704]: https://gitlab.freedesktop.org/drm/intel/issues/704


Participating hosts (47 -> 45)
------------------------------

  Additional (1): fi-byt-n2820 
  Missing    (3): fi-byt-squawks fi-bsw-cyan fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_8061 -> Patchwork_16808

  CI-20190529: 20190529
  CI_DRM_8061: e6398b7ae826a02206308a7e164bcf354bbf4993 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5490: f98b33cbd5b57bae52b8e2fae2039e89ac877822 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16808: cd7ce8c94fd002813b25c573ab7db8a7b7ce7f14 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

cd7ce8c94fd0 drm/i915/perf: add flushing ioctl
4f90346ada25 drm/i915/perf: add interrupt enabling parameter
974daa81481c drm/i915: handle interrupts from the OA unit
b69637a72e34 drm/i915/perf: add new open param to configure polling of OA buffer
249c37e99eb0 drm/i915/perf: only append status when data is available
b97afa567f13 drm/i915/perf: move pollin setup to non hw specific code
e9632eb2cc8a drm/i915/perf: rework aging tail workaround

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter Umesh Nerlige Ramappa
@ 2020-03-04  5:47   ` Dixit, Ashutosh
  2020-03-04  8:55     ` Lionel Landwerlin
  2020-03-10 20:08   ` Umesh Nerlige Ramappa
  1 sibling, 1 reply; 28+ messages in thread
From: Dixit, Ashutosh @ 2020-03-04  5:47 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Tue, 03 Mar 2020 14:19:04 -0800, Umesh Nerlige Ramappa wrote:
>
> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>
> This let's the application choose to be driven by the interrupt
> mechanism of the HW. In conjuction with long periods for checks for
> the availability of data on the CPU, this can reduce the CPU load when
> doing capture of OA data.
>
> v2: Version the new parameter (Joonas)
> v3: Rebase (Umesh)
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

[snip]

> +	/**
> +	 * Specifying this property sets up the interrupt mechanism for the OA
> +	 * buffer in i915. This option in conjuction with a long polling delay
> +	 * for avaibility of OA data can reduce CPU load significantly if you
> +	 * do not care about OA data being read as soon as it's available.
> +	 *
> +	 * This property is available in perf revision 5.
> +	 */
> +	DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT,

What if we do not expose this parameter in the uapi at all and internally
decide in i915 whether to leave the interrupt either always enabled or
always disabled (and in that case always use the hrtimer)? This way we
retain flexibility in i915 if hardware evolves in the future e.g. to use
watermarks for the interrupt, without yielding control to userspace.

Overall I feel we should avoid exposing unnecessary details of the internal
implemenation to userspace, they would be neither interested in knowing
internal details nor know how to properly use these parameters. Shouldn't
the driver be able to make these kinds of decisions internally?

At this point the only parameter which implicitly exposed to userspace is
the hrtimer poll period, so perhaps all we need to do is to expose that in
the uapi? Thoughts?
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl Umesh Nerlige Ramappa
@ 2020-03-04  5:48   ` Dixit, Ashutosh
  2020-03-04  8:52     ` Lionel Landwerlin
  0 siblings, 1 reply; 28+ messages in thread
From: Dixit, Ashutosh @ 2020-03-04  5:48 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
>
> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>
> With the currently available parameters for the i915-perf stream,
> there are still situations that are not well covered :
>
> If an application opens the stream with polling disable or at very low
> frequency and OA interrupt enabled, no data will be available even
> though somewhere between nothing and half of the OA buffer worth of
> data might have landed in memory.
>
> To solve this issue we have a new flush ioctl on the perf stream that
> forces the i915-perf driver to look at the state of the buffer when
> called and makes any data available through both poll() & read() type
> syscalls.
>
> v2: Version the ioctl (Joonas)
> v3: Rebase (Umesh)
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

[snip]

> +/**
> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
> + * @stream: An enabled i915 perf stream
> + *
> + * The intention is to flush all the data available for reading from the OA
> + * buffer
> + */
> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
> +{
> +	stream->pollin = oa_buffer_check(stream, true);
> +}

Since this function doesn't actually wake up any thread (which anyway can
be done by sending a signal to the blocked thread), is the only purpose of
this function to update OA buffer head/tail? But in that it is not clear
why a separate ioctl should be created for this, can't the read() call
itself call oa_buffer_check() to update the OA buffer head/tail?

Again just trying to minimize uapi changes if possible.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-04  5:48   ` Dixit, Ashutosh
@ 2020-03-04  8:52     ` Lionel Landwerlin
  2020-03-05  5:56       ` Dixit, Ashutosh
  0 siblings, 1 reply; 28+ messages in thread
From: Lionel Landwerlin @ 2020-03-04  8:52 UTC (permalink / raw)
  To: Dixit, Ashutosh, Umesh Nerlige Ramappa; +Cc: intel-gfx

On 04/03/2020 07:48, Dixit, Ashutosh wrote:
> On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>
>> With the currently available parameters for the i915-perf stream,
>> there are still situations that are not well covered :
>>
>> If an application opens the stream with polling disable or at very low
>> frequency and OA interrupt enabled, no data will be available even
>> though somewhere between nothing and half of the OA buffer worth of
>> data might have landed in memory.
>>
>> To solve this issue we have a new flush ioctl on the perf stream that
>> forces the i915-perf driver to look at the state of the buffer when
>> called and makes any data available through both poll() & read() type
>> syscalls.
>>
>> v2: Version the ioctl (Joonas)
>> v3: Rebase (Umesh)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> [snip]
>
>> +/**
>> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
>> + * @stream: An enabled i915 perf stream
>> + *
>> + * The intention is to flush all the data available for reading from the OA
>> + * buffer
>> + */
>> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
>> +{
>> +	stream->pollin = oa_buffer_check(stream, true);
>> +}
> Since this function doesn't actually wake up any thread (which anyway can
> be done by sending a signal to the blocked thread), is the only purpose of
> this function to update OA buffer head/tail? But in that it is not clear
> why a separate ioctl should be created for this, can't the read() call
> itself call oa_buffer_check() to update the OA buffer head/tail?
>
> Again just trying to minimize uapi changes if possible.

Most applications will call read() after being notified by 
poll()/select() that some data is available.

Changing that behavior will break some of the existing perf tests .


If any data is available, this new ioctl will wake up existing waiters 
on poll()/select().


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter
  2020-03-04  5:47   ` Dixit, Ashutosh
@ 2020-03-04  8:55     ` Lionel Landwerlin
  0 siblings, 0 replies; 28+ messages in thread
From: Lionel Landwerlin @ 2020-03-04  8:55 UTC (permalink / raw)
  To: Dixit, Ashutosh, Umesh Nerlige Ramappa; +Cc: intel-gfx

On 04/03/2020 07:47, Dixit, Ashutosh wrote:
> On Tue, 03 Mar 2020 14:19:04 -0800, Umesh Nerlige Ramappa wrote:
>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>
>> This let's the application choose to be driven by the interrupt
>> mechanism of the HW. In conjuction with long periods for checks for
>> the availability of data on the CPU, this can reduce the CPU load when
>> doing capture of OA data.
>>
>> v2: Version the new parameter (Joonas)
>> v3: Rebase (Umesh)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> [snip]
>
>> +	/**
>> +	 * Specifying this property sets up the interrupt mechanism for the OA
>> +	 * buffer in i915. This option in conjuction with a long polling delay
>> +	 * for avaibility of OA data can reduce CPU load significantly if you
>> +	 * do not care about OA data being read as soon as it's available.
>> +	 *
>> +	 * This property is available in perf revision 5.
>> +	 */
>> +	DRM_I915_PERF_PROP_OA_ENABLE_INTERRUPT,
> What if we do not expose this parameter in the uapi at all and internally
> decide in i915 whether to leave the interrupt either always enabled or
> always disabled (and in that case always use the hrtimer)? This way we
> retain flexibility in i915 if hardware evolves in the future e.g. to use
> watermarks for the interrupt, without yielding control to userspace.
>
> Overall I feel we should avoid exposing unnecessary details of the internal
> implemenation to userspace, they would be neither interested in knowing
> internal details nor know how to properly use these parameters. Shouldn't
> the driver be able to make these kinds of decisions internally?
>
> At this point the only parameter which implicitly exposed to userspace is
> the hrtimer poll period, so perhaps all we need to do is to expose that in
> the uapi? Thoughts?


I guess I agree with you. I can't remember why I exposed it to userspace.

There might be one test that checks the stream reports LOST_BUFFER with 
no poll() wakeup, but I guess we could update it.


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/perf: add OA interrupt support (rev5)
  2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
                   ` (10 preceding siblings ...)
  2020-03-04  3:18 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-03-04 19:31 ` Patchwork
  11 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2020-03-04 19:31 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: add OA interrupt support (rev5)
URL   : https://patchwork.freedesktop.org/series/54280/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8061_full -> Patchwork_16808_full
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_16808_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_16808_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_16808_full:

### IGT changes ###

#### Warnings ####

  * igt@gem_exec_schedule@pi-userfault-bsd2:
    - shard-iclb:         [SKIP][1] ([fdo#109276]) -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb7/igt@gem_exec_schedule@pi-userfault-bsd2.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@gem_exec_schedule@pi-userfault-bsd2.html

  
Known issues
------------

  Here are the changes found in Patchwork_16808_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@bcs0-s3:
    - shard-apl:          [PASS][3] -> [DMESG-WARN][4] ([i915#180]) +4 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-apl4/igt@gem_ctx_isolation@bcs0-s3.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-apl6/igt@gem_ctx_isolation@bcs0-s3.html

  * igt@gem_ctx_persistence@engines-mixed-process@vcs0:
    - shard-kbl:          [PASS][5] -> [FAIL][6] ([i915#679])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-kbl2/igt@gem_ctx_persistence@engines-mixed-process@vcs0.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-kbl7/igt@gem_ctx_persistence@engines-mixed-process@vcs0.html

  * igt@gem_ctx_persistence@engines-mixed-process@vcs1:
    - shard-kbl:          [PASS][7] -> [INCOMPLETE][8] ([fdo#103665])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-kbl2/igt@gem_ctx_persistence@engines-mixed-process@vcs1.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-kbl7/igt@gem_ctx_persistence@engines-mixed-process@vcs1.html

  * igt@gem_exec_balancer@smoke:
    - shard-iclb:         [PASS][9] -> [SKIP][10] ([fdo#110854])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb4/igt@gem_exec_balancer@smoke.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb3/igt@gem_exec_balancer@smoke.html

  * igt@gem_exec_parallel@vcs1-fds:
    - shard-iclb:         [PASS][11] -> [SKIP][12] ([fdo#112080]) +14 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb4/igt@gem_exec_parallel@vcs1-fds.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb3/igt@gem_exec_parallel@vcs1-fds.html

  * igt@gem_exec_schedule@implicit-both-bsd1:
    - shard-iclb:         [PASS][13] -> [SKIP][14] ([fdo#109276] / [i915#677]) +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb1/igt@gem_exec_schedule@implicit-both-bsd1.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb3/igt@gem_exec_schedule@implicit-both-bsd1.html

  * igt@gem_exec_schedule@independent-bsd2:
    - shard-iclb:         [PASS][15] -> [SKIP][16] ([fdo#109276]) +23 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb1/igt@gem_exec_schedule@independent-bsd2.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb3/igt@gem_exec_schedule@independent-bsd2.html

  * igt@gem_exec_schedule@pi-common-bsd:
    - shard-iclb:         [PASS][17] -> [SKIP][18] ([i915#677])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb6/igt@gem_exec_schedule@pi-common-bsd.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@gem_exec_schedule@pi-common-bsd.html

  * igt@gem_exec_schedule@preempt-other-chain-bsd:
    - shard-iclb:         [PASS][19] -> [SKIP][20] ([fdo#112146]) +8 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb8/igt@gem_exec_schedule@preempt-other-chain-bsd.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@gem_exec_schedule@preempt-other-chain-bsd.html

  * igt@gem_exec_whisper@basic-queues-forked:
    - shard-apl:          [PASS][21] -> [INCOMPLETE][22] ([fdo#103927])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-apl6/igt@gem_exec_whisper@basic-queues-forked.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-apl2/igt@gem_exec_whisper@basic-queues-forked.html

  * igt@i915_pm_rps@reset:
    - shard-iclb:         [PASS][23] -> [FAIL][24] ([i915#413])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb4/igt@i915_pm_rps@reset.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb3/igt@i915_pm_rps@reset.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-offscreen:
    - shard-skl:          [PASS][25] -> [FAIL][26] ([i915#54])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl8/igt@kms_cursor_crc@pipe-a-cursor-256x256-offscreen.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl9/igt@kms_cursor_crc@pipe-a-cursor-256x256-offscreen.html

  * igt@kms_cursor_crc@pipe-c-cursor-suspend:
    - shard-kbl:          [PASS][27] -> [DMESG-WARN][28] ([i915#180]) +4 similar issues
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-kbl3/igt@kms_cursor_crc@pipe-c-cursor-suspend.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-kbl3/igt@kms_cursor_crc@pipe-c-cursor-suspend.html

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
    - shard-glk:          [PASS][29] -> [FAIL][30] ([i915#79])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-glk6/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-glk2/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          [PASS][31] -> [FAIL][32] ([i915#79])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl3/igt@kms_flip@flip-vs-expired-vblank.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl6/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_flip@plain-flip-fb-recreate-interruptible:
    - shard-skl:          [PASS][33] -> [FAIL][34] ([i915#34])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl6/igt@kms_flip@plain-flip-fb-recreate-interruptible.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl6/igt@kms_flip@plain-flip-fb-recreate-interruptible.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-skl:          [PASS][35] -> [FAIL][36] ([i915#1188])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl7/igt@kms_hdr@bpc-switch-suspend.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl3/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
    - shard-skl:          [PASS][37] -> [FAIL][38] ([fdo#108145])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl6/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl3/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [PASS][39] -> [SKIP][40] ([fdo#109642] / [fdo#111068])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb2/igt@kms_psr2_su@page_flip.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb7/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_primary_page_flip:
    - shard-iclb:         [PASS][41] -> [SKIP][42] ([fdo#109441]) +1 similar issue
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb2/igt@kms_psr@psr2_primary_page_flip.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb5/igt@kms_psr@psr2_primary_page_flip.html

  * igt@kms_vblank@pipe-c-accuracy-idle:
    - shard-glk:          [PASS][43] -> [FAIL][44] ([i915#43])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-glk5/igt@kms_vblank@pipe-c-accuracy-idle.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-glk4/igt@kms_vblank@pipe-c-accuracy-idle.html

  
#### Possible fixes ####

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
    - shard-iclb:         [SKIP][45] ([fdo#110841]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb2/igt@gem_ctx_shared@exec-single-timeline-bsd.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb5/igt@gem_ctx_shared@exec-single-timeline-bsd.html

  * igt@gem_exec_schedule@implicit-read-write-bsd1:
    - shard-iclb:         [SKIP][47] ([fdo#109276] / [i915#677]) -> [PASS][48] +1 similar issue
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb7/igt@gem_exec_schedule@implicit-read-write-bsd1.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@gem_exec_schedule@implicit-read-write-bsd1.html

  * igt@gem_exec_schedule@preempt-bsd:
    - shard-iclb:         [SKIP][49] ([fdo#112146]) -> [PASS][50] +2 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb1/igt@gem_exec_schedule@preempt-bsd.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb3/igt@gem_exec_schedule@preempt-bsd.html

  * igt@gem_ppgtt@flink-and-close-vma-leak:
    - shard-apl:          [FAIL][51] ([i915#644]) -> [PASS][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-apl8/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-apl8/igt@gem_ppgtt@flink-and-close-vma-leak.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-iclb:         [FAIL][53] -> [PASS][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb3/igt@i915_pm_dc@dc6-dpms.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb8/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_pm_rpm@system-suspend:
    - shard-skl:          [INCOMPLETE][55] ([i915#151] / [i915#69]) -> [PASS][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl6/igt@i915_pm_rpm@system-suspend.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl10/igt@i915_pm_rpm@system-suspend.html

  * igt@i915_suspend@fence-restore-untiled:
    - shard-apl:          [DMESG-WARN][57] ([i915#180]) -> [PASS][58] +1 similar issue
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-apl6/igt@i915_suspend@fence-restore-untiled.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-apl7/igt@i915_suspend@fence-restore-untiled.html

  * igt@kms_big_fb@y-tiled-16bpp-rotate-0:
    - shard-glk:          [FAIL][59] ([i915#1119]) -> [PASS][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-glk2/igt@kms_big_fb@y-tiled-16bpp-rotate-0.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-glk9/igt@kms_big_fb@y-tiled-16bpp-rotate-0.html

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-ytiled:
    - shard-skl:          [FAIL][61] ([i915#52] / [i915#54]) -> [PASS][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl3/igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-ytiled.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl8/igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-ytiled.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - shard-kbl:          [DMESG-WARN][63] ([i915#180]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-kbl3/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-kbl3/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
    - shard-skl:          [FAIL][65] ([fdo#108145]) -> [PASS][66] +1 similar issue
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-skl10/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
    - shard-glk:          [FAIL][67] ([i915#899]) -> [PASS][68] +1 similar issue
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-glk1/igt@kms_plane_lowres@pipe-a-tiling-x.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-glk3/igt@kms_plane_lowres@pipe-a-tiling-x.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][69] ([fdo#109441]) -> [PASS][70] +1 similar issue
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb8/igt@kms_psr@psr2_no_drrs.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  * igt@kms_setmode@basic:
    - shard-glk:          [FAIL][71] ([i915#31]) -> [PASS][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-glk5/igt@kms_setmode@basic.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-glk4/igt@kms_setmode@basic.html

  * igt@perf_pmu@busy-vcs1:
    - shard-iclb:         [SKIP][73] ([fdo#112080]) -> [PASS][74] +4 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb6/igt@perf_pmu@busy-vcs1.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@perf_pmu@busy-vcs1.html

  * igt@prime_busy@hang-bsd2:
    - shard-iclb:         [SKIP][75] ([fdo#109276]) -> [PASS][76] +15 similar issues
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8061/shard-iclb8/igt@prime_busy@hang-bsd2.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/shard-iclb2/igt@prime_busy@hang-bsd2.html

  
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110841]: https://bugs.freedesktop.org/show_bug.cgi?id=110841
  [fdo#110854]: https://bugs.freedesktop.org/show_bug.cgi?id=110854
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [fdo#112146]: https://bugs.freedesktop.org/show_bug.cgi?id=112146
  [i915#1119]: https://gitlab.freedesktop.org/drm/intel/issues/1119
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#151]: https://gitlab.freedesktop.org/drm/intel/issues/151
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#31]: https://gitlab.freedesktop.org/drm/intel/issues/31
  [i915#34]: https://gitlab.freedesktop.org/drm/intel/issues/34
  [i915#413]: https://gitlab.freedesktop.org/drm/intel/issues/413
  [i915#43]: https://gitlab.freedesktop.org/drm/intel/issues/43
  [i915#52]: https://gitlab.freedesktop.org/drm/intel/issues/52
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#644]: https://gitlab.freedesktop.org/drm/intel/issues/644
  [i915#677]: https://gitlab.freedesktop.org/drm/intel/issues/677
  [i915#679]: https://gitlab.freedesktop.org/drm/intel/issues/679
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#899]: https://gitlab.freedesktop.org/drm/intel/issues/899


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_8061 -> Patchwork_16808

  CI-20190529: 20190529
  CI_DRM_8061: e6398b7ae826a02206308a7e164bcf354bbf4993 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5490: f98b33cbd5b57bae52b8e2fae2039e89ac877822 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16808: cd7ce8c94fd002813b25c573ab7db8a7b7ce7f14 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16808/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-04  8:52     ` Lionel Landwerlin
@ 2020-03-05  5:56       ` Dixit, Ashutosh
  2020-03-09 19:51         ` Umesh Nerlige Ramappa
  2020-03-09 21:15         ` Umesh Nerlige Ramappa
  0 siblings, 2 replies; 28+ messages in thread
From: Dixit, Ashutosh @ 2020-03-05  5:56 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>
> On 04/03/2020 07:48, Dixit, Ashutosh wrote:
> > On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
> >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> >>
> >> With the currently available parameters for the i915-perf stream,
> >> there are still situations that are not well covered :
> >>
> >> If an application opens the stream with polling disable or at very low
> >> frequency and OA interrupt enabled, no data will be available even
> >> though somewhere between nothing and half of the OA buffer worth of
> >> data might have landed in memory.
> >>
> >> To solve this issue we have a new flush ioctl on the perf stream that
> >> forces the i915-perf driver to look at the state of the buffer when
> >> called and makes any data available through both poll() & read() type
> >> syscalls.
> >>
> >> v2: Version the ioctl (Joonas)
> >> v3: Rebase (Umesh)
> >>
> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> > [snip]
> >
> >> +/**
> >> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
> >> + * @stream: An enabled i915 perf stream
> >> + *
> >> + * The intention is to flush all the data available for reading from the OA
> >> + * buffer
> >> + */
> >> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
> >> +{
> >> +	stream->pollin = oa_buffer_check(stream, true);
> >> +}
> > Since this function doesn't actually wake up any thread (which anyway can
> > be done by sending a signal to the blocked thread), is the only purpose of
> > this function to update OA buffer head/tail? But in that it is not clear
> > why a separate ioctl should be created for this, can't the read() call
> > itself call oa_buffer_check() to update the OA buffer head/tail?
> >
> > Again just trying to minimize uapi changes if possible.
>
> Most applications will call read() after being notified by poll()/select()
> that some data is available.

Correct this is the standard non blocking read behavior.

> Changing that behavior will break some of the existing perf tests .

I am not suggesting changing that (that standard non blocking read
behavior).

> If any data is available, this new ioctl will wake up existing waiters on
> poll()/select().

The issue is we are not calling wake_up() in the above function to wake up
any blocked waiters. The ioctl will just update the OA buffer head/tail so
that (a) a subsequent blocking read will not block, or (b) a subsequent non
blocking read will return valid data (not -EAGAIN), or (c) a poll/select
will not block but return immediately saying data is available.

That is why it seems to me the ioctl is not required, updating the OA
buffer head/tail can be done as part of the read() (and the poll/select)
calls themselves.

We will investigate if this can be done and update the patches in the next
revision accordingly. Thanks!
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-05  5:56       ` Dixit, Ashutosh
@ 2020-03-09 19:51         ` Umesh Nerlige Ramappa
  2020-03-10 20:44           ` Lionel Landwerlin
  2020-03-09 21:15         ` Umesh Nerlige Ramappa
  1 sibling, 1 reply; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-09 19:51 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
>On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>>
>> On 04/03/2020 07:48, Dixit, Ashutosh wrote:
>> > On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
>> >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> >>
>> >> With the currently available parameters for the i915-perf stream,
>> >> there are still situations that are not well covered :
>> >>
>> >> If an application opens the stream with polling disable or at very low
>> >> frequency and OA interrupt enabled, no data will be available even
>> >> though somewhere between nothing and half of the OA buffer worth of
>> >> data might have landed in memory.
>> >>
>> >> To solve this issue we have a new flush ioctl on the perf stream that
>> >> forces the i915-perf driver to look at the state of the buffer when
>> >> called and makes any data available through both poll() & read() type
>> >> syscalls.
>> >>
>> >> v2: Version the ioctl (Joonas)
>> >> v3: Rebase (Umesh)
>> >>
>> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> > [snip]
>> >
>> >> +/**
>> >> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
>> >> + * @stream: An enabled i915 perf stream
>> >> + *
>> >> + * The intention is to flush all the data available for reading from the OA
>> >> + * buffer
>> >> + */
>> >> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
>> >> +{
>> >> +	stream->pollin = oa_buffer_check(stream, true);
>> >> +}
>> > Since this function doesn't actually wake up any thread (which anyway can
>> > be done by sending a signal to the blocked thread), is the only purpose of
>> > this function to update OA buffer head/tail? But in that it is not clear
>> > why a separate ioctl should be created for this, can't the read() call
>> > itself call oa_buffer_check() to update the OA buffer head/tail?
>> >
>> > Again just trying to minimize uapi changes if possible.
>>
>> Most applications will call read() after being notified by poll()/select()
>> that some data is available.
>
>Correct this is the standard non blocking read behavior.
>
>> Changing that behavior will break some of the existing perf tests .
>
>I am not suggesting changing that (that standard non blocking read
>behavior).
>
>> If any data is available, this new ioctl will wake up existing waiters on
>> poll()/select().
>
>The issue is we are not calling wake_up() in the above function to wake up
>any blocked waiters. The ioctl will just update the OA buffer head/tail so
>that (a) a subsequent blocking read will not block, or (b) a subsequent non
>blocking read will return valid data (not -EAGAIN), or (c) a poll/select
>will not block but return immediately saying data is available.
>
>That is why it seems to me the ioctl is not required, updating the OA
>buffer head/tail can be done as part of the read() (and the poll/select)
>calls themselves.
>
>We will investigate if this can be done and update the patches in the next
>revision accordingly. Thanks!

In this case, where we are trying to determine if there is any data in 
the oa buffer before the next interrupt has fired, user could call poll 
with a reasonable timeout to determine if data is available or not.  
That would eliminate the need for the flush ioctl. Thoughts?

Thanks,
Umesh
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-05  5:56       ` Dixit, Ashutosh
  2020-03-09 19:51         ` Umesh Nerlige Ramappa
@ 2020-03-09 21:15         ` Umesh Nerlige Ramappa
  1 sibling, 0 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-09 21:15 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
>On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>>
>> On 04/03/2020 07:48, Dixit, Ashutosh wrote:
>> > On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
>> >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> >>
>> >> With the currently available parameters for the i915-perf stream,
>> >> there are still situations that are not well covered :
>> >>
>> >> If an application opens the stream with polling disable or at very low
>> >> frequency and OA interrupt enabled, no data will be available even
>> >> though somewhere between nothing and half of the OA buffer worth of
>> >> data might have landed in memory.
>> >>
>> >> To solve this issue we have a new flush ioctl on the perf stream that
>> >> forces the i915-perf driver to look at the state of the buffer when
>> >> called and makes any data available through both poll() & read() type
>> >> syscalls.
>> >>
>> >> v2: Version the ioctl (Joonas)
>> >> v3: Rebase (Umesh)
>> >>
>> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> > [snip]
>> >
>> >> +/**
>> >> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
>> >> + * @stream: An enabled i915 perf stream
>> >> + *
>> >> + * The intention is to flush all the data available for reading from the OA
>> >> + * buffer
>> >> + */
>> >> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
>> >> +{
>> >> +	stream->pollin = oa_buffer_check(stream, true);
>> >> +}
>> > Since this function doesn't actually wake up any thread (which anyway can
>> > be done by sending a signal to the blocked thread), is the only purpose of
>> > this function to update OA buffer head/tail? But in that it is not clear
>> > why a separate ioctl should be created for this, can't the read() call
>> > itself call oa_buffer_check() to update the OA buffer head/tail?
>> >
>> > Again just trying to minimize uapi changes if possible.
>>
>> Most applications will call read() after being notified by poll()/select()
>> that some data is available.
>
>Correct this is the standard non blocking read behavior.
>
>> Changing that behavior will break some of the existing perf tests .
>
>I am not suggesting changing that (that standard non blocking read
>behavior).
>
>> If any data is available, this new ioctl will wake up existing waiters on
>> poll()/select().
>
>The issue is we are not calling wake_up() in the above function to wake up
>any blocked waiters. The ioctl will just update the OA buffer head/tail so
>that (a) a subsequent blocking read will not block, or (b) a subsequent non
>blocking read will return valid data (not -EAGAIN), or (c) a poll/select
>will not block but return immediately saying data is available.
>
>That is why it seems to me the ioctl is not required, updating the OA
>buffer head/tail can be done as part of the read() (and the poll/select)
>calls themselves.
>
>We will investigate if this can be done and update the patches in the next
>revision accordingly. Thanks!

resending (cc: Lionel)..

In this case, where we are trying to determine if there is any data in 
the oa buffer before the next interrupt has fired, user could call poll 
with a reasonable timeout to determine if data is available or not.  
That would eliminate the need for the flush ioctl.  Thoughts?

Thanks,
Umesh
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter Umesh Nerlige Ramappa
  2020-03-04  5:47   ` Dixit, Ashutosh
@ 2020-03-10 20:08   ` Umesh Nerlige Ramappa
  2020-03-10 20:57     ` Lionel Landwerlin
  1 sibling, 1 reply; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-10 20:08 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin

On Tue, Mar 03, 2020 at 02:19:04PM -0800, Umesh Nerlige Ramappa wrote:
>From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>
>This let's the application choose to be driven by the interrupt
>mechanism of the HW. In conjuction with long periods for checks for
>the availability of data on the CPU, this can reduce the CPU load when
>doing capture of OA data.
>
>v2: Version the new parameter (Joonas)
>v3: Rebase (Umesh)
>
>Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
> drivers/gpu/drm/i915/i915_perf.c | 58 +++++++++++++++++++++++---------
> include/uapi/drm/i915_drm.h      | 10 ++++++
> 2 files changed, 53 insertions(+), 15 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>index 502961da840d..ab41cba85b40 100644
>--- a/drivers/gpu/drm/i915/i915_perf.c
>+++ b/drivers/gpu/drm/i915/i915_perf.c
>@@ -252,7 +252,7 @@
>  * oa_buffer_check().
>  *
>  * Most of the implementation details for this workaround are in
>- * oa_buffer_check_unlocked() and _append_oa_reports()
>+ * oa_buffer_check() and _append_oa_reports()
>  *
>  * Note for posterity: previously the driver used to define an effective tail
>  * pointer that lagged the real pointer by a 'tail margin' measured in bytes
>@@ -447,8 +447,9 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
> }
>
> /**
>- * oa_buffer_check_unlocked - check for data and update tail ptr state
>+ * oa_buffer_check - check for data and update tail ptr state
>  * @stream: i915 stream instance
>+ * @lock: whether to take the oa_buffer spin lock
>  *
>  * This is either called via fops (for blocking reads in user ctx) or the poll
>  * check hrtimer (atomic ctx) to check the OA buffer tail pointer and check
>@@ -470,8 +471,9 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
>  *
>  * Returns: %true if the OA buffer contains data, else %false
>  */
>-static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>+static bool oa_buffer_check(struct i915_perf_stream *stream, bool lock)

Hi Lionel,

All callers seem to set the lock to true when calling oa_buffer_check().  
Do you recall why the parameter was introduced?  If not, we probably 
want to remove this change.

Thanks,
Umesh
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-09 19:51         ` Umesh Nerlige Ramappa
@ 2020-03-10 20:44           ` Lionel Landwerlin
  2020-03-11  3:05             ` Dixit, Ashutosh
  0 siblings, 1 reply; 28+ messages in thread
From: Lionel Landwerlin @ 2020-03-10 20:44 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa, Dixit, Ashutosh; +Cc: intel-gfx

On 09/03/2020 21:51, Umesh Nerlige Ramappa wrote:
> On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
>> On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>>>
>>> On 04/03/2020 07:48, Dixit, Ashutosh wrote:
>>> > On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
>>> >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>> >>
>>> >> With the currently available parameters for the i915-perf stream,
>>> >> there are still situations that are not well covered :
>>> >>
>>> >> If an application opens the stream with polling disable or at 
>>> very low
>>> >> frequency and OA interrupt enabled, no data will be available even
>>> >> though somewhere between nothing and half of the OA buffer worth of
>>> >> data might have landed in memory.
>>> >>
>>> >> To solve this issue we have a new flush ioctl on the perf stream 
>>> that
>>> >> forces the i915-perf driver to look at the state of the buffer when
>>> >> called and makes any data available through both poll() & read() 
>>> type
>>> >> syscalls.
>>> >>
>>> >> v2: Version the ioctl (Joonas)
>>> >> v3: Rebase (Umesh)
>>> >>
>>> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>> >> Signed-off-by: Umesh Nerlige Ramappa 
>>> <umesh.nerlige.ramappa@intel.com>
>>> > [snip]
>>> >
>>> >> +/**
>>> >> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
>>> >> + * @stream: An enabled i915 perf stream
>>> >> + *
>>> >> + * The intention is to flush all the data available for reading 
>>> from the OA
>>> >> + * buffer
>>> >> + */
>>> >> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
>>> >> +{
>>> >> +    stream->pollin = oa_buffer_check(stream, true);
>>> >> +}
>>> > Since this function doesn't actually wake up any thread (which 
>>> anyway can
>>> > be done by sending a signal to the blocked thread), is the only 
>>> purpose of
>>> > this function to update OA buffer head/tail? But in that it is not 
>>> clear
>>> > why a separate ioctl should be created for this, can't the read() 
>>> call
>>> > itself call oa_buffer_check() to update the OA buffer head/tail?
>>> >
>>> > Again just trying to minimize uapi changes if possible.
>>>
>>> Most applications will call read() after being notified by 
>>> poll()/select()
>>> that some data is available.
>>
>> Correct this is the standard non blocking read behavior.
>>
>>> Changing that behavior will break some of the existing perf tests .
>>
>> I am not suggesting changing that (that standard non blocking read
>> behavior).
>>
>>> If any data is available, this new ioctl will wake up existing 
>>> waiters on
>>> poll()/select().
>>
>> The issue is we are not calling wake_up() in the above function to 
>> wake up
>> any blocked waiters. The ioctl will just update the OA buffer 
>> head/tail so
>> that (a) a subsequent blocking read will not block, or (b) a 
>> subsequent non
>> blocking read will return valid data (not -EAGAIN), or (c) a poll/select
>> will not block but return immediately saying data is available.
>>
>> That is why it seems to me the ioctl is not required, updating the OA
>> buffer head/tail can be done as part of the read() (and the poll/select)
>> calls themselves.
>>
>> We will investigate if this can be done and update the patches in the 
>> next
>> revision accordingly. Thanks!
>
> In this case, where we are trying to determine if there is any data in 
> the oa buffer before the next interrupt has fired, user could call 
> poll with a reasonable timeout to determine if data is available or 
> not.  That would eliminate the need for the flush ioctl. Thoughts?
>
> Thanks,
> Umesh


I almost forgot why this would cause problem.

Checking the state of the buffer every time you call poll() will pretty 
much guarantee you have at least one report to read every time.

So that would lead to lot more wakeups :(


The whole system has to stay "unidirectional" with either interrupts or 
timeout driving the wakeups.

This additional ioctl is the only solution I could find to add one more 
input to the wakeup mechanism.


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter
  2020-03-10 20:08   ` Umesh Nerlige Ramappa
@ 2020-03-10 20:57     ` Lionel Landwerlin
  0 siblings, 0 replies; 28+ messages in thread
From: Lionel Landwerlin @ 2020-03-10 20:57 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa, intel-gfx

On 10/03/2020 22:08, Umesh Nerlige Ramappa wrote:
> On Tue, Mar 03, 2020 at 02:19:04PM -0800, Umesh Nerlige Ramappa wrote:
>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>
>> This let's the application choose to be driven by the interrupt
>> mechanism of the HW. In conjuction with long periods for checks for
>> the availability of data on the CPU, this can reduce the CPU load when
>> doing capture of OA data.
>>
>> v2: Version the new parameter (Joonas)
>> v3: Rebase (Umesh)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_perf.c | 58 +++++++++++++++++++++++---------
>> include/uapi/drm/i915_drm.h      | 10 ++++++
>> 2 files changed, 53 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c 
>> b/drivers/gpu/drm/i915/i915_perf.c
>> index 502961da840d..ab41cba85b40 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -252,7 +252,7 @@
>>  * oa_buffer_check().
>>  *
>>  * Most of the implementation details for this workaround are in
>> - * oa_buffer_check_unlocked() and _append_oa_reports()
>> + * oa_buffer_check() and _append_oa_reports()
>>  *
>>  * Note for posterity: previously the driver used to define an 
>> effective tail
>>  * pointer that lagged the real pointer by a 'tail margin' measured 
>> in bytes
>> @@ -447,8 +447,9 @@ static u32 gen7_oa_hw_tail_read(struct 
>> i915_perf_stream *stream)
>> }
>>
>> /**
>> - * oa_buffer_check_unlocked - check for data and update tail ptr state
>> + * oa_buffer_check - check for data and update tail ptr state
>>  * @stream: i915 stream instance
>> + * @lock: whether to take the oa_buffer spin lock
>>  *
>>  * This is either called via fops (for blocking reads in user ctx) or 
>> the poll
>>  * check hrtimer (atomic ctx) to check the OA buffer tail pointer and 
>> check
>> @@ -470,8 +471,9 @@ static u32 gen7_oa_hw_tail_read(struct 
>> i915_perf_stream *stream)
>>  *
>>  * Returns: %true if the OA buffer contains data, else %false
>>  */
>> -static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>> +static bool oa_buffer_check(struct i915_perf_stream *stream, bool lock)
>
> Hi Lionel,
>
> All callers seem to set the lock to true when calling 
> oa_buffer_check().  Do you recall why the parameter was introduced?  
> If not, we probably want to remove this change.
>
> Thanks,
> Umesh


Err... Sorry, I don't remember.

It's probably a leftover the initial iteration where I was trying to get 
the OA head/tail register from the interrupt.

I guess you can drop that param and leave the function with the 
_unlocked prefix.


Thanks,


-Lionel


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
  2020-03-10 20:44           ` Lionel Landwerlin
@ 2020-03-11  3:05             ` Dixit, Ashutosh
  0 siblings, 0 replies; 28+ messages in thread
From: Dixit, Ashutosh @ 2020-03-11  3:05 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

On Tue, 10 Mar 2020 13:44:30 -0700, Lionel Landwerlin wrote:
>
> On 09/03/2020 21:51, Umesh Nerlige Ramappa wrote:
> > On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
> >> On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
> >>>
> >>> On 04/03/2020 07:48, Dixit, Ashutosh wrote:
> >>> > On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
> >>> >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> >>> >>
> >>> >> With the currently available parameters for the i915-perf stream,
> >>> >> there are still situations that are not well covered :
> >>> >>
> >>> >> If an application opens the stream with polling disable or at very
> >>> low
> >>> >> frequency and OA interrupt enabled, no data will be available even
> >>> >> though somewhere between nothing and half of the OA buffer worth of
> >>> >> data might have landed in memory.
> >>> >>
> >>> >> To solve this issue we have a new flush ioctl on the perf stream
> >>> that
> >>> >> forces the i915-perf driver to look at the state of the buffer when
> >>> >> called and makes any data available through both poll() & read()
> >>> type
> >>> >> syscalls.
> >>> >>
> >>> >> v2: Version the ioctl (Joonas)
> >>> >> v3: Rebase (Umesh)
> >>> >>
> >>> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> >>> >> Signed-off-by: Umesh Nerlige Ramappa
> >>> <umesh.nerlige.ramappa@intel.com>
> >>> > [snip]
> >>> >
> >>> >> +/**
> >>> >> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
> >>> >> + * @stream: An enabled i915 perf stream
> >>> >> + *
> >>> >> + * The intention is to flush all the data available for reading
> >>> from the OA
> >>> >> + * buffer
> >>> >> + */
> >>> >> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
> >>> >> +{
> >>> >> +    stream->pollin = oa_buffer_check(stream, true);
> >>> >> +}
> >>> > Since this function doesn't actually wake up any thread (which anyway
> >>> can
> >>> > be done by sending a signal to the blocked thread), is the only
> >>> purpose of
> >>> > this function to update OA buffer head/tail? But in that it is not
> >>> clear
> >>> > why a separate ioctl should be created for this, can't the read()
> >>> call
> >>> > itself call oa_buffer_check() to update the OA buffer head/tail?
> >>> >
> >>> > Again just trying to minimize uapi changes if possible.
> >>>
> >>> Most applications will call read() after being notified by
> >>> poll()/select()
> >>> that some data is available.
> >>
> >> Correct this is the standard non blocking read behavior.
> >>
> >>> Changing that behavior will break some of the existing perf tests .
> >>
> >> I am not suggesting changing that (that standard non blocking read
> >> behavior).
> >>
> >>> If any data is available, this new ioctl will wake up existing waiters
> >>> on
> >>> poll()/select().
> >>
> >> The issue is we are not calling wake_up() in the above function to wake
> >> up
> >> any blocked waiters. The ioctl will just update the OA buffer head/tail
> >> so
> >> that (a) a subsequent blocking read will not block, or (b) a subsequent
> >> non
> >> blocking read will return valid data (not -EAGAIN), or (c) a poll/select
> >> will not block but return immediately saying data is available.
> >>
> >> That is why it seems to me the ioctl is not required, updating the OA
> >> buffer head/tail can be done as part of the read() (and the poll/select)
> >> calls themselves.
> >>
> >> We will investigate if this can be done and update the patches in the
> >> next
> >> revision accordingly. Thanks!
> >
> > In this case, where we are trying to determine if there is any data in
> > the oa buffer before the next interrupt has fired, user could call poll
> > with a reasonable timeout to determine if data is available or not.  That
> > would eliminate the need for the flush ioctl. Thoughts?
> >
> > Thanks,
> > Umesh
>
>
> I almost forgot why this would cause problem.
>
> Checking the state of the buffer every time you call poll() will pretty
> much guarantee you have at least one report to read every time.
>
> So that would lead to lot more wakeups :(
>
> The whole system has to stay "unidirectional" with either interrupts or
> timeout driving the wakeups.
>
> This additional ioctl is the only solution I could find to add one more
> input to the wakeup mechanism.

Well, aren't we asking the app to sleep for time T and then call flush
(followed by read)? Then we might as well ask them to sleep for time T and
call poll? Or we can ask them set the hrtimer to T, skip the sleep and call
poll (followed by read)? Aren't these 3 mechanisms equivalent? To me the
last option seems to be the cleanest. Thanks!
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer
  2020-03-03 22:19 ` [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer Umesh Nerlige Ramappa
@ 2020-03-12 19:27   ` Dixit, Ashutosh
  2020-03-12 20:37     ` Lionel Landwerlin
  0 siblings, 1 reply; 28+ messages in thread
From: Dixit, Ashutosh @ 2020-03-12 19:27 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Tue, 03 Mar 2020 14:19:02 -0800, Umesh Nerlige Ramappa wrote:
>
> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>
> This new parameter let's the application choose how often the OA
> buffer should be checked on the CPU side for data availability. Longer
> polling period tend to reduce CPU overhead if the application does not
> care about somewhat real time data collection.
>
> v2: Allow disabling polling completely with 0 value (Lionel)
> v3: Version the new parameter (Joonas)
> v4: Rebase (Umesh)

Hi Lionel, I was thinking that one way to keep things simple for now (and
fix the high cpu usage issue) would be to expose _ONLY_ this OA poll period
parameter to user space. That is we don't expose the interrupt or the flush
ioctl to user space at this time. This way the user will be able to
configure the hrtimer frequency to a lower value to bring down the cpu
usage.

Also we would disallow disabling the timer (and internally also not use the
interrupt). So everything will happen in exactly the same way as it used to
(no other changes needed) but at a lower rate if the user so chooses.

What do you think about this?

Thanks!
--
Ashutosh
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer
  2020-03-12 19:27   ` Dixit, Ashutosh
@ 2020-03-12 20:37     ` Lionel Landwerlin
  2020-03-12 22:20       ` Dixit, Ashutosh
  2020-03-12 23:10       ` Umesh Nerlige Ramappa
  0 siblings, 2 replies; 28+ messages in thread
From: Lionel Landwerlin @ 2020-03-12 20:37 UTC (permalink / raw)
  To: Dixit, Ashutosh, Umesh Nerlige Ramappa; +Cc: intel-gfx

On 12/03/2020 21:27, Dixit, Ashutosh wrote:
> On Tue, 03 Mar 2020 14:19:02 -0800, Umesh Nerlige Ramappa wrote:
>> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>
>> This new parameter let's the application choose how often the OA
>> buffer should be checked on the CPU side for data availability. Longer
>> polling period tend to reduce CPU overhead if the application does not
>> care about somewhat real time data collection.
>>
>> v2: Allow disabling polling completely with 0 value (Lionel)
>> v3: Version the new parameter (Joonas)
>> v4: Rebase (Umesh)
> Hi Lionel, I was thinking that one way to keep things simple for now (and
> fix the high cpu usage issue) would be to expose _ONLY_ this OA poll period
> parameter to user space. That is we don't expose the interrupt or the flush
> ioctl to user space at this time. This way the user will be able to
> configure the hrtimer frequency to a lower value to bring down the cpu
> usage.
>
> Also we would disallow disabling the timer (and internally also not use the
> interrupt). So everything will happen in exactly the same way as it used to
> (no other changes needed) but at a lower rate if the user so chooses.
>
> What do you think about this?
>
> Thanks!
> --
> Ashutosh


Sure, just make sure the users know about this.

The fact that they can now select timer values that will potentially 
lead to the loss of the buffer's data because it was overridden.


Thanks,

-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer
  2020-03-12 20:37     ` Lionel Landwerlin
@ 2020-03-12 22:20       ` Dixit, Ashutosh
  2020-03-12 23:10       ` Umesh Nerlige Ramappa
  1 sibling, 0 replies; 28+ messages in thread
From: Dixit, Ashutosh @ 2020-03-12 22:20 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

On Thu, 12 Mar 2020 13:37:12 -0700, Lionel Landwerlin wrote:
> On 12/03/2020 21:27, Dixit, Ashutosh wrote:
> > On Tue, 03 Mar 2020 14:19:02 -0800, Umesh Nerlige Ramappa wrote:
> >> From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> >>
> >> This new parameter let's the application choose how often the OA
> >> buffer should be checked on the CPU side for data availability. Longer
> >> polling period tend to reduce CPU overhead if the application does not
> >> care about somewhat real time data collection.
> >>
> >> v2: Allow disabling polling completely with 0 value (Lionel)
> >> v3: Version the new parameter (Joonas)
> >> v4: Rebase (Umesh)
> > Hi Lionel, I was thinking that one way to keep things simple for now (and
> > fix the high cpu usage issue) would be to expose _ONLY_ this OA poll period
> > parameter to user space. That is we don't expose the interrupt or the flush
> > ioctl to user space at this time. This way the user will be able to
> > configure the hrtimer frequency to a lower value to bring down the cpu
> > usage.
> >
> > Also we would disallow disabling the timer (and internally also not use the
> > interrupt). So everything will happen in exactly the same way as it used to
> > (no other changes needed) but at a lower rate if the user so chooses.
> >
> > What do you think about this?
> >
> > Thanks!
> > --
> > Ashutosh
>
> Sure, just make sure the users know about this.

Ok, so the plan now is to just post and review/merge the first 4 patches
mostly as is, except that the poll timer cannot be disabled. IMO this
should solve the cpu usage issue. Then we can take up the remaining 3
interrupt and flush patches and see if they are really needed and move them
forward if they are.

> The fact that they can now select timer values that will potentially lead
> to the loss of the buffer's data because it was overridden.

I think you mean over-written. You are right but I think there is way
around this and we can post that patch soon which I think will avoid this
issue too.

Thanks!
--
Ashutosh
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer
  2020-03-12 20:37     ` Lionel Landwerlin
  2020-03-12 22:20       ` Dixit, Ashutosh
@ 2020-03-12 23:10       ` Umesh Nerlige Ramappa
  1 sibling, 0 replies; 28+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-03-12 23:10 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

On Thu, Mar 12, 2020 at 10:37:12PM +0200, Lionel Landwerlin wrote:
>On 12/03/2020 21:27, Dixit, Ashutosh wrote:
>>On Tue, 03 Mar 2020 14:19:02 -0800, Umesh Nerlige Ramappa wrote:
>>>From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>>
>>>This new parameter let's the application choose how often the OA
>>>buffer should be checked on the CPU side for data availability. Longer
>>>polling period tend to reduce CPU overhead if the application does not
>>>care about somewhat real time data collection.
>>>
>>>v2: Allow disabling polling completely with 0 value (Lionel)
>>>v3: Version the new parameter (Joonas)
>>>v4: Rebase (Umesh)
>>Hi Lionel, I was thinking that one way to keep things simple for now (and
>>fix the high cpu usage issue) would be to expose _ONLY_ this OA poll period
>>parameter to user space. That is we don't expose the interrupt or the flush
>>ioctl to user space at this time. This way the user will be able to
>>configure the hrtimer frequency to a lower value to bring down the cpu
>>usage.
>>
>>Also we would disallow disabling the timer (and internally also not use the
>>interrupt). So everything will happen in exactly the same way as it used to
>>(no other changes needed) but at a lower rate if the user so chooses.
>>
>>What do you think about this?
>>
>>Thanks!
>>--
>>Ashutosh
>
>
>Sure, just make sure the users know about this.
>
>The fact that they can now select timer values that will potentially 
>lead to the loss of the buffer's data because it was overridden.
>

posted v2:
Kernel patches - https://patchwork.freedesktop.org/series/54280/#rev6
IGT - https://patchwork.freedesktop.org/series/74655/

Thanks,
Umesh

>
>Thanks,
>
>-Lionel
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-03-12 23:11 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-03 22:18 [Intel-gfx] [PATCH 0/7] drm/i915/perf: add OA interrupt support Umesh Nerlige Ramappa
2020-03-03 22:18 ` [Intel-gfx] [PATCH 1/7] drm/i915/perf: rework aging tail workaround Umesh Nerlige Ramappa
2020-03-03 22:19 ` [Intel-gfx] [PATCH 2/7] drm/i915/perf: move pollin setup to non hw specific code Umesh Nerlige Ramappa
2020-03-03 22:19 ` [Intel-gfx] [PATCH 3/7] drm/i915/perf: only append status when data is available Umesh Nerlige Ramappa
2020-03-03 22:19 ` [Intel-gfx] [PATCH 4/7] drm/i915/perf: add new open param to configure polling of OA buffer Umesh Nerlige Ramappa
2020-03-12 19:27   ` Dixit, Ashutosh
2020-03-12 20:37     ` Lionel Landwerlin
2020-03-12 22:20       ` Dixit, Ashutosh
2020-03-12 23:10       ` Umesh Nerlige Ramappa
2020-03-03 22:19 ` [Intel-gfx] [PATCH 5/7] drm/i915: handle interrupts from the OA unit Umesh Nerlige Ramappa
2020-03-03 22:19 ` [Intel-gfx] [PATCH 6/7] drm/i915/perf: add interrupt enabling parameter Umesh Nerlige Ramappa
2020-03-04  5:47   ` Dixit, Ashutosh
2020-03-04  8:55     ` Lionel Landwerlin
2020-03-10 20:08   ` Umesh Nerlige Ramappa
2020-03-10 20:57     ` Lionel Landwerlin
2020-03-03 22:19 ` [Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl Umesh Nerlige Ramappa
2020-03-04  5:48   ` Dixit, Ashutosh
2020-03-04  8:52     ` Lionel Landwerlin
2020-03-05  5:56       ` Dixit, Ashutosh
2020-03-09 19:51         ` Umesh Nerlige Ramappa
2020-03-10 20:44           ` Lionel Landwerlin
2020-03-11  3:05             ` Dixit, Ashutosh
2020-03-09 21:15         ` Umesh Nerlige Ramappa
2020-03-04  2:53 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/perf: add OA interrupt support (rev5) Patchwork
2020-03-04  2:58 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-03-04  3:13 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
2020-03-04  3:18 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-03-04 19:31 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.