* [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs
@ 2016-08-12  6:25 akash.goel
  2016-08-12  6:25 ` [PATCH 01/20] drm/i915: Decouple GuC log setup from verbosity parameter akash.goel
                   ` (20 more replies)
  0 siblings, 21 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

GuC firmware logs its debug messages into a Host-GuC shared memory buffer
and sends a Flush interrupt to the Host when the buffer is half full.
GuC firmware follows a half-full draining protocol: while it is writing to
the 2nd half of the buffer, it expects the Host to consume the first half
and then send a flush-completed acknowledgement, so that it does not
overwrite unread data and lose logs.
So far the flush interrupt wasn't enabled on the Host side, and the User
could only capture a snapshot of the log buffer through the
'i915_guc_log_dump' debugfs interface. This couldn't meet a couple of key
requirements, especially of Validation: first, capturing all boot time logs
even at a high verbosity level, and second, capturing logs in a sustained
manner, e.g. for the entire duration of a workload.
Now the Driver enables the flush interrupt and, on receiving it, copies the
contents of the log buffer into its local buffer. The local buffer is big
enough to hold multiple snapshots of the log buffer, giving the User ample
time to pull boot time messages.
A debugfs interface '/sys/kernel/debug/dri/guc_log' is added for the User to
collect the logs. The relay framework is used to implement this interface,
so the Driver just uses a relay API to store snapshots of the GuC log buffer
in a buffer managed by relay. The relay buffer can be operated in a mode
equivalent to 'dmesg -c', where old data not yet collected by the User is
overwritten if the buffer becomes full, or in a no-overwrite mode, where
relay stops accepting new data once all sub buffers are full.
The latter mode is used to avoid the possibility of getting garbled data.
Besides the mmap method, through which the User can directly access the
relay buffer contents, relay also supports the 'poll' method. Through a
'poll' call on the log file, the User learns whenever the Driver takes a new
snapshot of the log buffer, and so can run in tandem with the Driver and
capture logs in a sustained/streaming manner, without any loss of data.
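
As an illustration of the no-overwrite behaviour described above, here is a
minimal user-space model in C. It is only a sketch and does not use the
kernel relay API: the names chan_model, chan_write and chan_consume, and the
sub-buffer count, are hypothetical and chosen just for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, NOT the kernel relay API. */
#define N_SUBBUFS 4

struct chan_model {
	size_t produced;	/* snapshots stored by the driver so far */
	size_t consumed;	/* snapshots already read by the user */
	size_t dropped;		/* snapshots refused because the channel was full */
};

/* Driver side: try to store one snapshot of the log buffer. */
static bool chan_write(struct chan_model *c)
{
	assert(c->produced >= c->consumed);

	if (c->produced - c->consumed == N_SUBBUFS) {
		/* no-overwrite mode: refuse new data, keep the unread data */
		c->dropped++;
		return false;
	}
	c->produced++;
	return true;
}

/* User side: consume one pending snapshot, if there is any. */
static bool chan_consume(struct chan_model *c)
{
	if (c->produced == c->consumed)
		return false;
	c->consumed++;
	return true;
}
```

In this model a slow reader loses the newest snapshots (counted in dropped)
but never sees a sub buffer that was overwritten while being read, which is
the trade-off the text above describes.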

v2: Rebased to the latest drm-intel-nightly.

v3: Aligned with the modification of late debugfs registration, at the end of
    i915 Driver load. Did cleanup as per Tvrtko's review comments, added 3
    new patches to optimize the log-buffer flush interrupt handling, gather
    and report the logging related stats.

v4: Added 2 new patches to further optimize the log-buffer flush interrupt
    handling. Did cleanup as per Chris's review comments, fixed couple of
    issues related to clearing of Guc2Host message register. Switched to
    no-overwrite mode for the relay.

v5: Added a new patch to avail MOVNTDQA instruction based fast memcpy provided
    by a patch from Chris. Dropped the rt priority kthread patch, after
    evaluating all the optimizations with certain benchmarks like
    synmark_oglmultithread and synmark_oglbatch5, which generate flush
    interrupts almost every ms or less. Updated the older patches as per the review
    comments from Tvrtko and Chris W. Added a new patch to augment i915 error
    state with the GuC log buffer contents. Fixed the issue of User interrupt
    getting disabled for VEBOX ring, causing failure for certain IGTs.
    Also included 2 patches to support early logging for capturing boot
    time logs and use per CPU constructs on the relay side so as to address
    a WARNING issue with the call to relay_reserve(), without disabling
    preemption.

Akash Goel (12):
  drm/i915: New structure to contain GuC logging related fields
  drm/i915: Add low level set of routines for programming PM IER/IIR/IMR
    register set
  relay: Use per CPU constructs for the relay channel buffer pointers
  drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  drm/i915: New lock to serialize the Host2GuC actions
  drm/i915: Add stats for GuC log buffer flush interrupts
  drm/i915: Optimization to reduce the sampling time of GuC log buffer
  drm/i915: Increase GuC log buffer size to reduce flush interrupts
  drm/i915: Augment i915 error state to include the dump of GuC log
    buffer
  drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer
  drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
  drm/i915: Early creation of relay channel for capturing boot time logs

Chris Wilson (2):
  drm/i915: Support to create write combined type vmaps
  drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory

Sagar Arun Kamble (6):
  drm/i915: Decouple GuC log setup from verbosity parameter
  drm/i915: Add GuC ukernel logging related fields to fw interface file
  drm/i915: Support for GuC interrupts
  drm/i915: Handle log buffer flush interrupt event from GuC
  drm/i915: Forcefully flush GuC log buffer on reset
  drm/i915: Debugfs support for GuC logging control

 drivers/gpu/drm/i915/Kconfig               |   1 +
 drivers/gpu/drm/i915/Makefile              |   3 +
 drivers/gpu/drm/i915/i915_debugfs.c        |  74 +++-
 drivers/gpu/drm/i915/i915_drv.c            |  18 +
 drivers/gpu/drm/i915/i915_drv.h            |  17 +-
 drivers/gpu/drm/i915/i915_gem.c            |  58 ++-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  29 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 547 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_irq.c            | 178 ++++++++--
 drivers/gpu/drm/i915/i915_memcpy.c         | 101 ++++++
 drivers/gpu/drm/i915/i915_reg.h            |  11 +
 drivers/gpu/drm/i915/intel_drv.h           |   6 +
 drivers/gpu/drm/i915/intel_guc.h           |  30 +-
 drivers/gpu/drm/i915/intel_guc_fwif.h      |  82 ++++-
 drivers/gpu/drm/i915/intel_guc_loader.c    |  12 +-
 drivers/gpu/drm/i915/intel_lrc.c           |   8 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |   6 +-
 include/linux/relay.h                      |  17 +-
 kernel/relay.c                             |  74 ++--
 20 files changed, 1166 insertions(+), 108 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_memcpy.c

-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/20] drm/i915: Decouple GuC log setup from verbosity parameter
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:25 ` [PATCH 02/20] drm/i915: Add GuC ukernel logging related fields to fw interface file akash.goel
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Sagar Arun Kamble <sagar.a.kamble@intel.com>

GuC log buffer allocation was tied to the verbosity level module param
i915.guc_log_level. The User will be given a provision to enable firmware
logging at runtime, through a host2guc action, and not necessarily at
Driver load time. But the address of the log buffer can be passed only in
init params, at firmware load time, so GuC would have to be reset and the
firmware reloaded to pass the log buffer address at runtime.
To avoid a reset of GuC & reload of firmware, the log buffer will now always
be allocated, but logging will initially be enabled on the GuC side based on
the value of the module parameter guc_log_level.
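
The resulting decision can be sketched as a small standalone C function.
This is only an illustration of the logic described above: the SKETCH_*
constants are placeholder values assumed for this example (the real
definitions live in intel_guc_fwif.h), and sketch_debug_param is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values, assumed only for this sketch. */
#define SKETCH_LOG_VERBOSITY_SHIFT	4
#define SKETCH_LOG_VERBOSITY_MAX	3
#define SKETCH_LOG_DISABLED		(1u << 6)

/*
 * The log buffer is allocated unconditionally; only the value passed in
 * the debug init param depends on the requested log level.
 */
static uint32_t sketch_debug_param(int log_level)
{
	if (log_level < 0)
		return SKETCH_LOG_DISABLED;	/* logging starts disabled */

	if (log_level > SKETCH_LOG_VERBOSITY_MAX)
		log_level = SKETCH_LOG_VERBOSITY_MAX;	/* clamp to the maximum */

	assert(log_level >= 0 && log_level <= SKETCH_LOG_VERBOSITY_MAX);
	return (uint32_t)log_level << SKETCH_LOG_VERBOSITY_SHIFT;
}
```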

v2: Update commit message to describe the constraint with allocation of
    log buffer at runtime. (Tvrtko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
 drivers/gpu/drm/i915/intel_guc_loader.c    | 8 +++++---
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6831321..6a1ced0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -847,9 +847,6 @@ static void guc_create_log(struct intel_guc *guc)
 	unsigned long offset;
 	uint32_t size, flags;
 
-	if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-		return;
-
 	if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
 		i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index bfcf6b5..ec24bab 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -187,11 +187,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
 	params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
 			GUC_CTL_VCS2_ENABLED;
 
-	if (i915.guc_log_level >= 0) {
-		params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+	params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
+	if (i915.guc_log_level >= 0)
 		params[GUC_CTL_DEBUG] =
 			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-	}
+	else
+		params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;
 
 	if (guc->ads_obj) {
 		u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)
-- 
1.9.2


* [PATCH 02/20] drm/i915: Add GuC ukernel logging related fields to fw interface file
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
  2016-08-12  6:25 ` [PATCH 01/20] drm/i915: Decouple GuC log setup from verbosity parameter akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:25 ` [PATCH 03/20] drm/i915: New structure to contain GuC logging related fields akash.goel
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Sagar Arun Kamble <sagar.a.kamble@intel.com>

The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to interface file
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be sent
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types, to inform GuC when Host has
flushed the log buffer and to forcefully cause the GuC to send a new
log buffer flush interrupt.
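
A rough user-space sketch of how Host could use this state info when
servicing a flush interrupt is given below. It is not taken from the series:
sketch_log_state is a cut-down stand-in for the real state structure,
sketch_drain is a hypothetical helper, and it assumes that read_ptr and
sampled_write_ptr are byte offsets in [0, size) and that equal pointers mean
there is nothing to copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Cut-down stand-in for the state header; only the fields used here. */
struct sketch_log_state {
	uint32_t read_ptr;
	uint32_t write_ptr;
	uint32_t size;
	uint32_t sampled_write_ptr;
	uint32_t flush_to_file;
};

/*
 * Copy the bytes between read_ptr and sampled_write_ptr out of the log
 * region into dst, handling wrap-around, then update the state the way
 * Host is expected to before acknowledging the flush.
 * dst must have room for st->size bytes. Returns the number of bytes copied.
 */
static uint32_t sketch_drain(struct sketch_log_state *st,
			     const uint8_t *log, uint8_t *dst)
{
	uint32_t rd = st->read_ptr;
	uint32_t wr = st->sampled_write_ptr;
	uint32_t n;

	assert(rd < st->size && wr < st->size);	/* assumption of this sketch */

	if (wr >= rd) {
		n = wr - rd;
		memcpy(dst, log + rd, n);
	} else {
		uint32_t tail = st->size - rd;	/* bytes up to the end */

		memcpy(dst, log + rd, tail);
		memcpy(dst + tail, log, wr);	/* wrapped part from the start */
		n = tail + wr;
	}

	st->flush_to_file = 0;
	st->read_ptr = wr;
	return n;
}
```

Only after these updates would Host send the flush complete action to GuC.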

v2: Make documentation of log buffer state structure more elaborate. (Tvrtko)
    Rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency. (Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 78 +++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 944786d..1de6928 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -418,15 +418,87 @@ struct guc_ads {
 	u32 reserved2[4];
 } __packed;
 
+/* GuC logging structures */
+
+enum guc_log_buffer_type {
+	GUC_ISR_LOG_BUFFER,
+	GUC_DPC_LOG_BUFFER,
+	GUC_CRASH_DUMP_LOG_BUFFER,
+	GUC_MAX_LOG_BUFFER
+};
+
+/**
+ * DOC: GuC Log buffer Layout
+ *
+ * Page0  +-------------------------------+
+ *        |   ISR state header (32 bytes) |
+ *        |      DPC state header         |
+ *        |   Crash dump state header     |
+ * Page1  +-------------------------------+
+ *        |           ISR logs            |
+ * Page5  +-------------------------------+
+ *        |           DPC logs            |
+ * Page9  +-------------------------------+
+ *        |         Crash Dump logs       |
+ *        +-------------------------------+
+ *
+ * Below state structure is used for coordination of retrieval of GuC firmware
+ * logs. Separate state is maintained for each log buffer type.
+ * read_ptr points to the location where i915 read last in log buffer and
+ * is read only for GuC firmware. write_ptr is incremented by GuC with number
+ * of bytes written for each log entry and is read only for i915.
+ * When any type of log buffer becomes half full, GuC sends a flush interrupt.
+ * GuC firmware expects that while it is writing to 2nd half of the buffer,
+ * first half would get consumed by Host and then get a flush completed
+ * acknowledgement from Host, so that it does not end up doing any overwrite
+ * causing loss of logs. So when buffer gets half filled & i915 has requested
+ * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
+ * to the value of write_ptr and raise the interrupt.
+ * On receiving the interrupt i915 should read the buffer, clear flush_to_file
+ * field and also update read_ptr with the value of sampled_write_ptr, before
+ * sending an acknowledgement to GuC. marker & version fields are for internal
+ * usage of GuC and opaque to i915. buffer_full_cnt field is incremented every
+ * time GuC detects the log buffer overflow.
+ */
+struct guc_log_buffer_state {
+	u32 marker[2];
+	u32 read_ptr;
+	u32 write_ptr;
+	u32 size;
+	u32 sampled_write_ptr;
+	union {
+		struct {
+			u32 flush_to_file:1;
+			u32 buffer_full_cnt:4;
+			u32 reserved:27;
+		};
+		u32 flags;
+	};
+	u32 version;
+} __packed;
+
+union guc_log_control {
+	struct {
+		u32 logging_enabled:1;
+		u32 reserved1:3;
+		u32 verbosity:4;
+		u32 reserved2:24;
+	};
+	u32 value;
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum host2guc_action {
 	HOST2GUC_ACTION_DEFAULT = 0x0,
 	HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
 	HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
 	HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+	HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
+	HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
 	HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
 	HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
 	HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+	HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
 	HOST2GUC_ACTION_LIMIT
 };
 
@@ -448,4 +520,10 @@ enum guc2host_status {
 	GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0x0000F000)
 };
 
+/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
+enum guc2host_message {
+	GUC2HOST_MSG_CRASH_DUMP_POSTED = (1 << 1),
+	GUC2HOST_MSG_FLUSH_LOG_BUFFER = (1 << 3)
+};
+
 #endif
-- 
1.9.2


* [PATCH 03/20] drm/i915: New structure to contain GuC logging related fields
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
  2016-08-12  6:25 ` [PATCH 01/20] drm/i915: Decouple GuC log setup from verbosity parameter akash.goel
  2016-08-12  6:25 ` [PATCH 02/20] drm/i915: Add GuC ukernel logging related fields to fw interface file akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:25 ` [PATCH 04/20] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set akash.goel
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

So far there were 2 fields related to GuC logs in 'intel_guc' structure.
For the support of capturing GuC logs & storing them in a local buffer,
multiple new fields would have to be added. This warrants a separate
structure to contain the fields related to GuC logging state.
Added a new structure 'intel_guc_log' and an instance of it inside
'intel_guc' structure.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  2 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 10 +++++-----
 drivers/gpu/drm/i915/intel_guc.h           |  8 ++++++--
 drivers/gpu/drm/i915/intel_guc_loader.c    |  2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c461072..51b59d5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2623,7 +2623,7 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
 	struct drm_info_node *node = m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj;
+	struct drm_i915_gem_object *log_obj = dev_priv->guc.log.obj;
 	u32 *log;
 	int i = 0, pg;
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6a1ced0..ad3b55f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -856,7 +856,7 @@ static void guc_create_log(struct intel_guc *guc)
 		GUC_LOG_ISR_PAGES + 1 +
 		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
 
-	obj = guc->log_obj;
+	obj = guc->log.obj;
 	if (!obj) {
 		obj = gem_allocate_guc_obj(dev_priv, size);
 		if (!obj) {
@@ -865,7 +865,7 @@ static void guc_create_log(struct intel_guc *guc)
 			return;
 		}
 
-		guc->log_obj = obj;
+		guc->log.obj = obj;
 	}
 
 	/* each allocated unit is a page */
@@ -875,7 +875,7 @@ static void guc_create_log(struct intel_guc *guc)
 		(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
 
 	offset = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; /* in pages */
-	guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+	guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
 static void init_guc_policies(struct guc_policies *policies)
@@ -1048,8 +1048,8 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 	gem_release_guc_obj(dev_priv->guc.ads_obj);
 	guc->ads_obj = NULL;
 
-	gem_release_guc_obj(dev_priv->guc.log_obj);
-	guc->log_obj = NULL;
+	gem_release_guc_obj(dev_priv->guc.log.obj);
+	guc->log.obj = NULL;
 
 	if (guc->ctx_pool_obj)
 		ida_destroy(&guc->ctx_ids);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 26f3d9c..7e22803 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -121,10 +121,14 @@ struct intel_guc_fw {
 	uint32_t ucode_offset;
 };
 
+struct intel_guc_log {
+	uint32_t flags;
+	struct drm_i915_gem_object *obj;
+};
+
 struct intel_guc {
 	struct intel_guc_fw guc_fw;
-	uint32_t log_flags;
-	struct drm_i915_gem_object *log_obj;
+	struct intel_guc_log log;
 
 	struct drm_i915_gem_object *ads_obj;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index ec24bab..f23bb33 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -187,7 +187,7 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
 	params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
 			GUC_CTL_VCS2_ENABLED;
 
-	params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+	params[GUC_CTL_LOG_PARAMS] = guc->log.flags;
 
 	if (i915.guc_log_level >= 0)
 		params[GUC_CTL_DEBUG] =
-- 
1.9.2


* [PATCH 04/20] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (2 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 03/20] drm/i915: New structure to contain GuC logging related fields akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 11:15   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 05/20] drm/i915: Support for GuC interrupts akash.goel
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow
easy sharing of it between Turbo & GuC without involving a rmw operation.
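
The idea of keeping a software copy of IER can be modelled in plain C as
below. This is only a sketch: sketch_dev and its helpers are hypothetical
names, and the hw_ier field stands in for the real register, which the
driver accesses through I915_WRITE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state relevant to this sketch. */
struct sketch_dev {
	uint32_t hw_ier;	/* models the hardware IER register */
	uint32_t ier_shadow;	/* software copy, plays the role of pm_ier */
};

static void sketch_write_ier(struct sketch_dev *d, uint32_t val)
{
	d->hw_ier = val;
}

/* Enable some interrupt bits without reading the register back. */
static void sketch_enable(struct sketch_dev *d, uint32_t mask)
{
	assert(d->ier_shadow == d->hw_ier);	/* shadow must track hardware */
	d->ier_shadow |= mask;
	sketch_write_ier(d, d->ier_shadow);
}

/* Disable some interrupt bits, leaving the other user's bits untouched. */
static void sketch_disable(struct sketch_dev *d, uint32_t mask)
{
	assert(d->ier_shadow == d->hw_ier);
	d->ier_shadow &= ~mask;
	sketch_write_ier(d, d->ier_shadow);
}
```

Since every update goes through the shadow copy, two users of the same
register (here Turbo & GuC) can enable and disable their own bits without a
register read, as long as all writers hold the same lock.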

v2:
- For appropriateness & to avoid any ambiguity, rename old functions
  enable/disable pm_irq to mask/unmask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
    failure for certain IGTs.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  3 +-
 drivers/gpu/drm/i915/i915_irq.c         | 75 ++++++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_drv.h        |  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +-
 4 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7971c76..a608a5c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1776,7 +1776,8 @@ struct drm_i915_private {
 		u32 de_irq_mask[I915_MAX_PIPES];
 	};
 	u32 gt_irq_mask;
-	u32 pm_irq_mask;
+	u32 pm_imr;
+	u32 pm_ier;
 	u32 pm_rps_events;
 	u32 pipestat_irq_mask[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index ebb83d5..5f93309 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -303,18 +303,18 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
 
 	assert_spin_locked(&dev_priv->irq_lock);
 
-	new_val = dev_priv->pm_irq_mask;
+	new_val = dev_priv->pm_imr;
 	new_val &= ~interrupt_mask;
 	new_val |= (~enabled_irq_mask & interrupt_mask);
 
-	if (new_val != dev_priv->pm_irq_mask) {
-		dev_priv->pm_irq_mask = new_val;
-		I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_irq_mask);
+	if (new_val != dev_priv->pm_imr) {
+		dev_priv->pm_imr = new_val;
+		I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_imr);
 		POSTING_READ(gen6_pm_imr(dev_priv));
 	}
 }
 
-void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
 	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
 		return;
@@ -322,28 +322,54 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
 	snb_update_pm_irq(dev_priv, mask, mask);
 }
 
-static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
-				  uint32_t mask)
+static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
 	snb_update_pm_irq(dev_priv, mask, 0);
 }
 
-void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
 	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
 		return;
 
-	__gen6_disable_pm_irq(dev_priv, mask);
+	__gen6_mask_pm_irq(dev_priv, mask);
 }
 
-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 reset_mask)
 {
 	i915_reg_t reg = gen6_pm_iir(dev_priv);
 
-	spin_lock_irq(&dev_priv->irq_lock);
-	I915_WRITE(reg, dev_priv->pm_rps_events);
-	I915_WRITE(reg, dev_priv->pm_rps_events);
+	assert_spin_locked(&dev_priv->irq_lock);
+
+	I915_WRITE(reg, reset_mask);
+	I915_WRITE(reg, reset_mask);
 	POSTING_READ(reg);
+}
+
+void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
+{
+	assert_spin_locked(&dev_priv->irq_lock);
+
+	dev_priv->pm_ier |= enable_mask;
+	I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+	gen6_unmask_pm_irq(dev_priv, enable_mask);
+	/* unmask_pm_irq provides an implicit barrier (POSTING_READ) */
+}
+
+void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
+{
+	assert_spin_locked(&dev_priv->irq_lock);
+
+	dev_priv->pm_ier &= ~disable_mask;
+	__gen6_mask_pm_irq(dev_priv, disable_mask);
+	I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+	/* though a barrier is missing here, but don't really need a one */
+}
+
+void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+{
+	spin_lock_irq(&dev_priv->irq_lock);
+	gen6_reset_pm_iir(dev_priv, dev_priv->pm_rps_events);
 	dev_priv->rps.pm_iir = 0;
 	spin_unlock_irq(&dev_priv->irq_lock);
 }
@@ -354,8 +380,6 @@ void gen6_enable_rps_interrupts(struct drm_i915_private *dev_priv)
 	WARN_ON_ONCE(dev_priv->rps.pm_iir);
 	WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & dev_priv->pm_rps_events);
 	dev_priv->rps.interrupts_enabled = true;
-	I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) |
-				dev_priv->pm_rps_events);
 	gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
 
 	spin_unlock_irq(&dev_priv->irq_lock);
@@ -373,9 +397,7 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
 
 	I915_WRITE(GEN6_PMINTRMSK, gen6_sanitize_rps_pm_mask(dev_priv, ~0));
 
-	__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
-	I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) &
-				~dev_priv->pm_rps_events);
+	gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
 
 	spin_unlock_irq(&dev_priv->irq_lock);
 	synchronize_irq(dev_priv->drm.irq);
@@ -1078,7 +1100,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	pm_iir = dev_priv->rps.pm_iir;
 	dev_priv->rps.pm_iir = 0;
 	/* Make sure not to corrupt PMIMR state used by ringbuffer on GEN6 */
-	gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
+	gen6_unmask_pm_irq(dev_priv, dev_priv->pm_rps_events);
 	client_boost = dev_priv->rps.client_boost;
 	dev_priv->rps.client_boost = false;
 	spin_unlock_irq(&dev_priv->irq_lock);
@@ -1579,7 +1601,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
 {
 	if (pm_iir & dev_priv->pm_rps_events) {
 		spin_lock(&dev_priv->irq_lock);
-		gen6_disable_pm_irq(dev_priv, pm_iir & dev_priv->pm_rps_events);
+		gen6_mask_pm_irq(dev_priv, pm_iir & dev_priv->pm_rps_events);
 		if (dev_priv->rps.interrupts_enabled) {
 			dev_priv->rps.pm_iir |= pm_iir & dev_priv->pm_rps_events;
 			schedule_work(&dev_priv->rps.work);
@@ -3568,11 +3590,13 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 		 * RPS interrupts will get enabled/disabled on demand when RPS
 		 * itself is enabled/disabled.
 		 */
-		if (HAS_VEBOX(dev))
+		if (HAS_VEBOX(dev)) {
 			pm_irqs |= PM_VEBOX_USER_INTERRUPT;
+			dev_priv->pm_ier |= PM_VEBOX_USER_INTERRUPT;
+		}
 
-		dev_priv->pm_irq_mask = 0xffffffff;
-		GEN5_IRQ_INIT(GEN6_PM, dev_priv->pm_irq_mask, pm_irqs);
+		dev_priv->pm_imr = 0xffffffff;
+		GEN5_IRQ_INIT(GEN6_PM, dev_priv->pm_imr, pm_irqs);
 	}
 }
 
@@ -3692,14 +3716,15 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 	if (HAS_L3_DPF(dev_priv))
 		gt_interrupts[0] |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 
-	dev_priv->pm_irq_mask = 0xffffffff;
+	dev_priv->pm_ier = 0x0;
+	dev_priv->pm_imr = ~dev_priv->pm_ier;
 	GEN8_IRQ_INIT_NDX(GT, 0, ~gt_interrupts[0], gt_interrupts[0]);
 	GEN8_IRQ_INIT_NDX(GT, 1, ~gt_interrupts[1], gt_interrupts[1]);
 	/*
 	 * RPS interrupts will get enabled/disabled on demand when RPS itself
 	 * is enabled/disabled.
 	 */
-	GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_irq_mask, 0);
+	GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_imr, dev_priv->pm_ier);
 	GEN8_IRQ_INIT_NDX(GT, 3, ~gt_interrupts[3], gt_interrupts[3]);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9539f0e..80cd05f 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1094,6 +1094,9 @@ void intel_check_pch_fifo_underruns(struct drm_i915_private *dev_priv);
 /* i915_irq.c */
 void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
 void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
+void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 mask);
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
 void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask);
 void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask);
 void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ed19868..e8fa26c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1700,7 +1700,7 @@ hsw_vebox_irq_enable(struct intel_engine_cs *engine)
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	I915_WRITE_IMR(engine, ~engine->irq_enable_mask);
-	gen6_enable_pm_irq(dev_priv, engine->irq_enable_mask);
+	gen6_unmask_pm_irq(dev_priv, engine->irq_enable_mask);
 }
 
 static void
@@ -1709,7 +1709,7 @@ hsw_vebox_irq_disable(struct intel_engine_cs *engine)
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	I915_WRITE_IMR(engine, ~0);
-	gen6_disable_pm_irq(dev_priv, engine->irq_enable_mask);
+	gen6_mask_pm_irq(dev_priv, engine->irq_enable_mask);
 }
 
 static void
-- 
1.9.2


* [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (3 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 04/20] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 11:54   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC akash.goel
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Sagar Arun Kamble <sagar.a.kamble@intel.com>

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, for example
to have Host retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.
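
For illustration, decoding the GuC to Host message bits on such an interrupt
could look like the sketch below. The two bit values are the ones defined in
enum guc2host_message by patch 02 of this series; sketch_guc_event and
sketch_decode are hypothetical names, and this is not the handler added by
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in enum guc2host_message (patch 02). */
#define SKETCH_MSG_CRASH_DUMP_POSTED	(1u << 1)
#define SKETCH_MSG_FLUSH_LOG_BUFFER	(1u << 3)

struct sketch_guc_event {
	bool flush_log;		/* log buffer needs to be sampled */
	bool crash_dump;	/* a crash dump was posted */
	uint32_t handled;	/* bits Host would clear in the message register */
};

/* Decode the value read from the GuC to Host message register. */
static struct sketch_guc_event sketch_decode(uint32_t msg)
{
	struct sketch_guc_event ev;

	ev.flush_log = (msg & SKETCH_MSG_FLUSH_LOG_BUFFER) != 0;
	ev.crash_dump = (msg & SKETCH_MSG_CRASH_DUMP_POSTED) != 0;

	/* only clear the bits that are understood, leave the rest alone */
	ev.handled = msg & (SKETCH_MSG_FLUSH_LOG_BUFFER |
			    SKETCH_MSG_CRASH_DUMP_POSTED);
	assert((ev.handled & ~msg) == 0);
	return ev;
}
```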

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half; it's not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |   1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
 drivers/gpu/drm/i915/i915_irq.c            | 100 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h            |  11 ++++
 drivers/gpu/drm/i915/intel_drv.h           |   3 +
 drivers/gpu/drm/i915/intel_guc.h           |   4 ++
 drivers/gpu/drm/i915/intel_guc_loader.c    |   4 ++
 7 files changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a608a5c..28ffac5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1779,6 +1779,7 @@ struct drm_i915_private {
 	u32 pm_imr;
 	u32 pm_ier;
 	u32 pm_rps_events;
+	u32 pm_guc_events;
 	u32 pipestat_irq_mask[I915_MAX_PIPES];
 
 	struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ad3b55f..c7c679f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
 	if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
 		return 0;
 
+	gen9_disable_guc_interrupts(dev_priv);
+
 	ctx = dev_priv->kernel_context;
 
 	data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
 	if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
 		return 0;
 
+	if (i915.guc_log_level >= 0)
+		gen9_enable_guc_interrupts(dev_priv);
+
 	ctx = dev_priv->kernel_context;
 
 	data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5f93309..5f1974f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv,
 } while (0)
 
 static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
 
 /* For display hotplug interrupt */
 static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
 	gen6_reset_rps_interrupts(dev_priv);
 }
 
+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+	spin_lock_irq(&dev_priv->irq_lock);
+	gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+	spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+	spin_lock_irq(&dev_priv->irq_lock);
+	if (!dev_priv->guc.interrupts_enabled) {
+		WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+						dev_priv->pm_guc_events);
+		dev_priv->guc.interrupts_enabled = true;
+		gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+	}
+	spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+	spin_lock_irq(&dev_priv->irq_lock);
+	dev_priv->guc.interrupts_enabled = false;
+
+	gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+
+	spin_unlock_irq(&dev_priv->irq_lock);
+	synchronize_irq(dev_priv->drm.irq);
+
+	gen9_reset_guc_interrupts(dev_priv);
+}
+
 /**
  * bdw_update_port_irq - update DE port interrupt
  * @dev_priv: driver private
@@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	mutex_unlock(&dev_priv->rps.hw_lock);
 }
 
+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, struct drm_i915_private, guc.events_work);
+
+	spin_lock_irq(&dev_priv->irq_lock);
+	/* Speed up work cancellation during disabling guc interrupts. */
+	if (!dev_priv->guc.interrupts_enabled) {
+		spin_unlock_irq(&dev_priv->irq_lock);
+		return;
+	}
+	spin_unlock_irq(&dev_priv->irq_lock);
+
+	/* TODO: Handle the events for which GuC interrupted host */
+}
 
 /**
  * ivybridge_parity_work - Workqueue called when a parity error interrupt
@@ -1339,11 +1387,13 @@ static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
 			DRM_ERROR("The master control interrupt lied (GT3)!\n");
 	}
 
-	if (master_ctl & GEN8_GT_PM_IRQ) {
+	if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
 		gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
-		if (gt_iir[2] & dev_priv->pm_rps_events) {
+		if (gt_iir[2] & (dev_priv->pm_rps_events |
+				 dev_priv->pm_guc_events)) {
 			I915_WRITE_FW(GEN8_GT_IIR(2),
-				      gt_iir[2] & dev_priv->pm_rps_events);
+				      gt_iir[2] & (dev_priv->pm_rps_events |
+						   dev_priv->pm_guc_events));
 			ret = IRQ_HANDLED;
 		} else
 			DRM_ERROR("The master control interrupt lied (PM)!\n");
@@ -1375,6 +1425,9 @@ static void gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 
 	if (gt_iir[2] & dev_priv->pm_rps_events)
 		gen6_rps_irq_handler(dev_priv, gt_iir[2]);
+
+	if (gt_iir[2] & dev_priv->pm_guc_events)
+		gen9_guc_irq_handler(dev_priv, gt_iir[2]);
 }
 
 static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
@@ -1621,6 +1674,41 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
 	}
 }
 
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
+{
+	bool interrupts_enabled;
+
+	if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+		spin_lock(&dev_priv->irq_lock);
+		interrupts_enabled = dev_priv->guc.interrupts_enabled;
+		spin_unlock(&dev_priv->irq_lock);
+		if (interrupts_enabled) {
+			/* Sample the log buffer flush related bits & clear them
+			 * out now itself from the message identity register to
+			 * minimize the probability of losing a flush interrupt,
+			 * when there are back to back flush interrupts.
+			 * There can be a new flush interrupt, for different log
+			 * buffer type (like for ISR), whilst Host is handling
+			 * one (for DPC). Since same bit is used in message
+			 * register for ISR & DPC, it could happen that GuC
+			 * sets the bit for 2nd interrupt but Host clears out
+			 * the bit on handling the 1st interrupt.
+			 */
+			u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+					(GUC2HOST_MSG_CRASH_DUMP_POSTED |
+					 GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+			if (msg) {
+				/* Clear the message bits that are handled */
+				I915_WRITE(SOFT_SCRATCH(15),
+					I915_READ(SOFT_SCRATCH(15)) & ~msg);
+
+				/* Handle flush interrupt event in bottom half */
+				queue_work(dev_priv->wq, &dev_priv->guc.events_work);
+			}
+		}
+	}
+}
+
 static bool intel_pipe_handle_vblank(struct drm_i915_private *dev_priv,
 				     enum pipe pipe)
 {
@@ -3722,7 +3810,7 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 	GEN8_IRQ_INIT_NDX(GT, 1, ~gt_interrupts[1], gt_interrupts[1]);
 	/*
 	 * RPS interrupts will get enabled/disabled on demand when RPS itself
-	 * is enabled/disabled.
+	 * is enabled/disabled. Same will be the case for GuC interrupts.
 	 */
 	GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_imr, dev_priv->pm_ier);
 	GEN8_IRQ_INIT_NDX(GT, 3, ~gt_interrupts[3], gt_interrupts[3]);
@@ -4507,6 +4595,10 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
 
 	INIT_WORK(&dev_priv->rps.work, gen6_pm_rps_work);
 	INIT_WORK(&dev_priv->l3_parity.error_work, ivybridge_parity_work);
+	INIT_WORK(&dev_priv->guc.events_work, gen9_guc2host_events_work);
+
+	if (HAS_GUC_UCODE(dev))
+		dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
 
 	/* Let's track the enabled rps events */
 	if (IS_VALLEYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index da82744..62046dc 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6011,6 +6011,7 @@ enum {
 #define  GEN8_DE_PIPE_A_IRQ		(1<<16)
 #define  GEN8_DE_PIPE_IRQ(pipe)		(1<<(16+(pipe)))
 #define  GEN8_GT_VECS_IRQ		(1<<6)
+#define  GEN8_GT_GUC_IRQ		(1<<5)
 #define  GEN8_GT_PM_IRQ			(1<<4)
 #define  GEN8_GT_VCS2_IRQ		(1<<3)
 #define  GEN8_GT_VCS1_IRQ		(1<<2)
@@ -6022,6 +6023,16 @@ enum {
 #define GEN8_GT_IIR(which) _MMIO(0x44308 + (0x10 * (which)))
 #define GEN8_GT_IER(which) _MMIO(0x4430c + (0x10 * (which)))
 
+#define GEN9_GUC_TO_HOST_INT_EVENT	(1<<31)
+#define GEN9_GUC_EXEC_ERROR_EVENT	(1<<30)
+#define GEN9_GUC_DISPLAY_EVENT		(1<<29)
+#define GEN9_GUC_SEMA_SIGNAL_EVENT	(1<<28)
+#define GEN9_GUC_IOMMU_MSG_EVENT	(1<<27)
+#define GEN9_GUC_DB_RING_EVENT		(1<<26)
+#define GEN9_GUC_DMA_DONE_EVENT		(1<<25)
+#define GEN9_GUC_FATAL_ERROR_EVENT	(1<<24)
+#define GEN9_GUC_NOTIFICATION_EVENT	(1<<23)
+
 #define GEN8_RCS_IRQ_SHIFT 0
 #define GEN8_BCS_IRQ_SHIFT 16
 #define GEN8_VCS1_IRQ_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 80cd05f..9619ce9 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1119,6 +1119,9 @@ void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
 				     unsigned int pipe_mask);
 void gen8_irq_power_well_pre_disable(struct drm_i915_private *dev_priv,
 				     unsigned int pipe_mask);
+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv);
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv);
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv);
 
 /* intel_crt.c */
 void intel_crt_init(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 7e22803..be1e04d 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -130,6 +130,10 @@ struct intel_guc {
 	struct intel_guc_fw guc_fw;
 	struct intel_guc_log log;
 
+	/* GuC2Host interrupt related state */
+	struct work_struct events_work;
+	bool interrupts_enabled;
+
 	struct drm_i915_gem_object *ads_obj;
 
 	struct drm_i915_gem_object *ctx_pool_obj;
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index f23bb33..b7e97cc 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -464,6 +464,7 @@ int intel_guc_setup(struct drm_device *dev)
 	}
 
 	direct_interrupts_to_host(dev_priv);
+	gen9_reset_guc_interrupts(dev_priv);
 
 	guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
 
@@ -510,6 +511,9 @@ int intel_guc_setup(struct drm_device *dev)
 		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
 
 	if (i915.enable_guc_submission) {
+		if (i915.guc_log_level >= 0)
+			gen9_enable_guc_interrupts(dev_priv);
+
 		err = i915_guc_submission_enable(dev_priv);
 		if (err)
 			goto fail;
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (4 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 05/20] drm/i915: Support for GuC interrupts akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:28   ` Chris Wilson
  2016-08-12 13:17   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 07/20] relay: Use per CPU constructs for the relay channel buffer pointers akash.goel
                   ` (14 subsequent siblings)
  20 siblings, 2 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Sagar Arun Kamble <sagar.a.kamble@intel.com>

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
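
The acknowledgement rule above can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: struct log_state
is a trimmed-down, hypothetical version of the state record (field names
follow the ones used in this patch), and consume_log() is an invented
helper, not a driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down state record; the real layout is defined
 * by the GuC firmware interface. */
struct log_state {
	uint32_t read_ptr;
	uint32_t write_ptr;
	uint32_t sampled_write_ptr;
	uint32_t flush_to_file;
	uint32_t size;
};

/* Consume one log buffer. 'dst' may be NULL when the Host has nowhere
 * to store the data; the read pointer is advanced either way.
 * Returns the number of bytes copied into 'dst'. */
size_t consume_log(struct log_state *shared, const void *src, void *dst)
{
	struct log_state local;
	size_t copied = 0;

	/* Snapshot the shared state once so later reads are consistent. */
	memcpy(&local, shared, sizeof(local));

	if (dst) {
		memcpy(dst, src, local.size);
		copied = local.size;
	}

	/* Acknowledge consumption even if nothing was copied. */
	shared->read_ptr = local.sampled_write_ptr;
	shared->flush_to_file = 0;
	return copied;
}
```

Calling consume_log() with dst set to NULL models the case where Host
could not copy the contents: no data is stored, but the read pointer
still moves to the sampled write pointer.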

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put. (Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_priv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            |  14 +++
 drivers/gpu/drm/i915/i915_guc_submission.c | 150 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_irq.c            |   5 +-
 drivers/gpu/drm/i915/intel_guc.h           |   3 +
 4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
 	if (dev_priv->hotplug.dp_wq == NULL)
 		goto out_free_wq;
 
+	if (HAS_GUC_SCHED(dev_priv)) {
+		/* Need a dedicated wq to process log buffer flush interrupts
+		 * from GuC without much delay so as to avoid any loss of logs.
+		 */
+		dev_priv->guc.log.wq =
+			alloc_ordered_workqueue("i915-guc_log", 0);
+		if (dev_priv->guc.log.wq == NULL)
+			goto out_free_hotplug_dp_wq;
+	}
+
 	return 0;
 
+out_free_hotplug_dp_wq:
+	destroy_workqueue(dev_priv->hotplug.dp_wq);
 out_free_wq:
 	destroy_workqueue(dev_priv->wq);
 out_err:
@@ -782,6 +794,8 @@ out_err:
 
 static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
 {
+	if (HAS_GUC_SCHED(dev_priv))
+		destroy_workqueue(dev_priv->guc.log.wq);
 	destroy_workqueue(dev_priv->hotplug.dp_wq);
 	destroy_workqueue(dev_priv->wq);
 }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7c679f..2635b67 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
 	return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
 
+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+	u32 data[1];
+
+	data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+	return host2guc_action(guc, data, 1);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -840,6 +849,127 @@ err:
 	return NULL;
 }
 
+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+	return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+	return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+	struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
+	struct guc_log_buffer_state log_buffer_state_local;
+	void *src_data_ptr, *dst_data_ptr;
+	u32 i, buffer_size;
+
+	if (!guc->log.buf_addr)
+		return;
+
+	/* Get the pointer to shared GuC log buffer */
+	log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+	/* Get the pointer to local buffer to store the logs */
+	dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc);
+
+	/* Actual logs are present from the 2nd page */
+	src_data_ptr += PAGE_SIZE;
+	dst_data_ptr += PAGE_SIZE;
+
+	for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
+		/* Make a copy of the state structure in GuC log buffer (which
+		 * is uncached mapped) on the stack to avoid reading from it
+		 * multiple times.
+		 */
+		memcpy(&log_buffer_state_local, log_buffer_state,
+				sizeof(struct guc_log_buffer_state));
+		buffer_size = log_buffer_state_local.size;
+
+		if (log_buffer_snapshot_state) {
+			/* First copy the state structure in local buffer */
+			memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
+					sizeof(struct guc_log_buffer_state));
+
+			/* The write pointer could have been updated by the GuC
+			 * firmware, after sending the flush interrupt to Host,
+			 * for consistency set the write pointer value to same
+			 * value of sampled_write_ptr in the snapshot buffer.
+			 */
+			log_buffer_snapshot_state->write_ptr =
+				log_buffer_snapshot_state->sampled_write_ptr;
+
+			log_buffer_snapshot_state++;
+
+			/* Now copy the actual logs */
+			memcpy(dst_data_ptr, src_data_ptr, buffer_size);
+
+			src_data_ptr += buffer_size;
+			dst_data_ptr += buffer_size;
+		}
+
+		/* FIXME: invalidate/flush for log buffer needed */
+
+		/* Update the read pointer in the shared log buffer */
+		log_buffer_state->read_ptr =
+			log_buffer_state_local.sampled_write_ptr;
+
+		/* Clear the 'flush to file' flag */
+		log_buffer_state->flush_to_file = 0;
+		log_buffer_state++;
+	}
+
+	if (log_buffer_snapshot_state)
+		guc_move_to_next_buf(guc);
+}
+
+static void guc_log_cleanup(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
+	if (i915.guc_log_level < 0)
+		return;
+
+	/* First disable the flush interrupt */
+	gen9_disable_guc_interrupts(dev_priv);
+
+	if (guc->log.buf_addr)
+		i915_gem_object_unpin_map(guc->log.obj);
+
+	guc->log.buf_addr = NULL;
+}
+
+static int guc_create_log_extras(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	void *vaddr;
+	int ret;
+
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
+	/* Nothing to do */
+	if (i915.guc_log_level < 0)
+		return 0;
+
+	if (!guc->log.buf_addr) {
+		/* Create a vmalloc mapping of log buffer pages */
+		vaddr = i915_gem_object_pin_map(guc->log.obj);
+		if (IS_ERR(vaddr)) {
+			ret = PTR_ERR(vaddr);
+			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
+			return ret;
+		}
+
+		guc->log.buf_addr = vaddr;
+	}
+
+	return 0;
+}
+
 static void guc_create_log(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -866,6 +996,13 @@ static void guc_create_log(struct intel_guc *guc)
 		}
 
 		guc->log.obj = obj;
+
+		if (guc_create_log_extras(guc)) {
+			gem_release_guc_obj(guc->log.obj);
+			guc->log.obj = NULL;
+			i915.guc_log_level = -1;
+			return;
+		}
 	}
 
 	/* each allocated unit is a page */
@@ -1048,6 +1185,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 	gem_release_guc_obj(dev_priv->guc.ads_obj);
 	guc->ads_obj = NULL;
 
+	guc_log_cleanup(guc);
 	gem_release_guc_obj(dev_priv->guc.log.obj);
 	guc->log.obj = NULL;
 
@@ -1111,3 +1249,15 @@ int intel_guc_resume(struct drm_device *dev)
 
 	return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
+
+void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
+{
+	guc_read_update_log_buffer(&dev_priv->guc);
+
+	/* Generally device is expected to be active only at this
+	 * time, so get/put should be really quick.
+	 */
+	intel_runtime_pm_get(dev_priv);
+	host2guc_logbuffer_flush_complete(&dev_priv->guc);
+	intel_runtime_pm_put(dev_priv);
+}
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5f1974f..d4d6f0a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1213,7 +1213,7 @@ static void gen9_guc2host_events_work(struct work_struct *work)
 	}
 	spin_unlock_irq(&dev_priv->irq_lock);
 
-	/* TODO: Handle the events for which GuC interrupted host */
+	i915_guc_capture_logs(dev_priv);
 }
 
 /**
@@ -1703,7 +1703,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
 					I915_READ(SOFT_SCRATCH(15)) & ~msg);
 
 				/* Handle flush interrupt event in bottom half */
-				queue_work(dev_priv->wq, &dev_priv->guc.events_work);
+				queue_work(dev_priv->guc.log.wq,
+						&dev_priv->guc.events_work);
 			}
 		}
 	}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index be1e04d..7c0bdba 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -124,6 +124,8 @@ struct intel_guc_fw {
 struct intel_guc_log {
 	uint32_t flags;
 	struct drm_i915_gem_object *obj;
+	struct workqueue_struct *wq;
+	void *buf_addr;
 };
 
 struct intel_guc {
@@ -169,5 +171,6 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv);
 int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
+void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
 
 #endif
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 07/20] relay: Use per CPU constructs for the relay channel buffer pointers
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (5 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:25 ` [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs akash.goel
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

relay essentially needs to maintain a per CPU array of channel buffer
pointers, but it currently creates that array manually.
Instead, it's better to use the per CPU constructs provided by the
kernel to allocate & access the array of pointers to channel buffers.
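
To show the access pattern this change moves to, here is a userspace
analogy only; it does not use the kernel percpu API. struct chan,
chan_init(), chan_slot() and chan_fini() are hypothetical names, and a
heap array stands in for per CPU storage, which the kernel does not lay
out as one contiguous array.

```c
#include <assert.h>
#include <stdlib.h>

struct buf { int cpu; };

/* Hypothetical userspace stand-in for a channel: one pointer slot per
 * CPU, sized at runtime instead of a fixed NR_CPUS array. */
struct chan {
	struct buf **slots;
	unsigned int ncpus;
};

/* Allocate zeroed slots; loosely comparable to what alloc_percpu()
 * provides in the kernel. Returns 0 on success, -1 on failure. */
int chan_init(struct chan *c, unsigned int ncpus)
{
	c->slots = calloc(ncpus, sizeof(*c->slots));
	c->ncpus = c->slots ? ncpus : 0;
	return c->slots ? 0 : -1;
}

/* Return the address of a CPU's slot, in the spirit of per_cpu_ptr();
 * callers dereference it to load or store the buffer pointer. */
struct buf **chan_slot(struct chan *c, unsigned int cpu)
{
	return cpu < c->ncpus ? &c->slots[cpu] : NULL;
}

void chan_fini(struct chan *c)
{
	free(c->slots);
	c->slots = NULL;
	c->ncpus = 0;
}
```

The point of the analogy is that every former chan->buf[i] access becomes
a dereference of a slot address, which is the shape the diff below takes
with *per_cpu_ptr(chan->buf, i).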

v2: Include <linux/percpu.h> in relay.h so that it pulls in the percpu
    api explicitly. (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 include/linux/relay.h | 17 +++++++-----
 kernel/relay.c        | 74 +++++++++++++++++++++++++++++----------------------
 2 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index d7c8359..eb295e3 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -19,6 +19,7 @@
 #include <linux/fs.h>
 #include <linux/poll.h>
 #include <linux/kref.h>
+#include <linux/percpu.h>
 
 /*
  * Tracks changes to rchan/rchan_buf structs
@@ -63,7 +64,7 @@ struct rchan
 	struct kref kref;		/* channel refcount */
 	void *private_data;		/* for user-defined data */
 	size_t last_toobig;		/* tried to log event > subbuf size */
-	struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
+	struct rchan_buf ** __percpu buf; /* per-cpu channel buffers */
 	int is_global;			/* One global buffer ? */
 	struct list_head list;		/* for channel list */
 	struct dentry *parent;		/* parent dentry passed to open */
@@ -204,7 +205,7 @@ static inline void relay_write(struct rchan *chan,
 	struct rchan_buf *buf;
 
 	local_irq_save(flags);
-	buf = chan->buf[smp_processor_id()];
+	buf = *this_cpu_ptr(chan->buf);
 	if (unlikely(buf->offset + length > chan->subbuf_size))
 		length = relay_switch_subbuf(buf, length);
 	memcpy(buf->data + buf->offset, data, length);
@@ -230,12 +231,12 @@ static inline void __relay_write(struct rchan *chan,
 {
 	struct rchan_buf *buf;
 
-	buf = chan->buf[get_cpu()];
+	buf = *get_cpu_ptr(chan->buf);
 	if (unlikely(buf->offset + length > buf->chan->subbuf_size))
 		length = relay_switch_subbuf(buf, length);
 	memcpy(buf->data + buf->offset, data, length);
 	buf->offset += length;
-	put_cpu();
+	put_cpu_ptr(chan->buf);
 }
 
 /**
@@ -251,17 +252,19 @@ static inline void __relay_write(struct rchan *chan,
  */
 static inline void *relay_reserve(struct rchan *chan, size_t length)
 {
-	void *reserved;
-	struct rchan_buf *buf = chan->buf[smp_processor_id()];
+	void *reserved = NULL;
+	struct rchan_buf *buf = *get_cpu_ptr(chan->buf);
 
 	if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
 		length = relay_switch_subbuf(buf, length);
 		if (!length)
-			return NULL;
+			goto end;
 	}
 	reserved = buf->data + buf->offset;
 	buf->offset += length;
 
+end:
+	put_cpu_ptr(chan->buf);
 	return reserved;
 }
 
diff --git a/kernel/relay.c b/kernel/relay.c
index d797502..f55ab82 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -214,7 +214,7 @@ static void relay_destroy_buf(struct rchan_buf *buf)
 			__free_page(buf->page_array[i]);
 		relay_free_page_array(buf->page_array);
 	}
-	chan->buf[buf->cpu] = NULL;
+	*per_cpu_ptr(chan->buf, buf->cpu) = NULL;
 	kfree(buf->padding);
 	kfree(buf);
 	kref_put(&chan->kref, relay_destroy_channel);
@@ -382,20 +382,21 @@ static void __relay_reset(struct rchan_buf *buf, unsigned int init)
  */
 void relay_reset(struct rchan *chan)
 {
+	struct rchan_buf *buf;
 	unsigned int i;
 
 	if (!chan)
 		return;
 
-	if (chan->is_global && chan->buf[0]) {
-		__relay_reset(chan->buf[0], 0);
+	if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0))) {
+		__relay_reset(buf, 0);
 		return;
 	}
 
 	mutex_lock(&relay_channels_mutex);
 	for_each_possible_cpu(i)
-		if (chan->buf[i])
-			__relay_reset(chan->buf[i], 0);
+		if ((buf = *per_cpu_ptr(chan->buf, i)))
+			__relay_reset(buf, 0);
 	mutex_unlock(&relay_channels_mutex);
 }
 EXPORT_SYMBOL_GPL(relay_reset);
@@ -440,7 +441,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
 	struct dentry *dentry;
 
  	if (chan->is_global)
-		return chan->buf[0];
+		return *per_cpu_ptr(chan->buf, 0);
 
 	buf = relay_create_buf(chan);
 	if (!buf)
@@ -464,7 +465,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
  	__relay_reset(buf, 1);
 
  	if(chan->is_global) {
- 		chan->buf[0] = buf;
+		*per_cpu_ptr(chan->buf, 0) = buf;
  		buf->cpu = 0;
   	}
 
@@ -526,22 +527,24 @@ static int relay_hotcpu_callback(struct notifier_block *nb,
 {
 	unsigned int hotcpu = (unsigned long)hcpu;
 	struct rchan *chan;
+	struct rchan_buf *buf;
 
 	switch(action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
 		mutex_lock(&relay_channels_mutex);
 		list_for_each_entry(chan, &relay_channels, list) {
-			if (chan->buf[hotcpu])
+			if ((buf = *per_cpu_ptr(chan->buf, hotcpu)))
 				continue;
-			chan->buf[hotcpu] = relay_open_buf(chan, hotcpu);
-			if(!chan->buf[hotcpu]) {
+			buf = relay_open_buf(chan, hotcpu);
+			if(!buf) {
 				printk(KERN_ERR
 					"relay_hotcpu_callback: cpu %d buffer "
 					"creation failed\n", hotcpu);
 				mutex_unlock(&relay_channels_mutex);
 				return notifier_from_errno(-ENOMEM);
 			}
+			*per_cpu_ptr(chan->buf, hotcpu) = buf;
 		}
 		mutex_unlock(&relay_channels_mutex);
 		break;
@@ -583,6 +586,7 @@ struct rchan *relay_open(const char *base_filename,
 {
 	unsigned int i;
 	struct rchan *chan;
+	struct rchan_buf *buf;
 
 	if (!(subbuf_size && n_subbufs))
 		return NULL;
@@ -593,6 +597,7 @@ struct rchan *relay_open(const char *base_filename,
 	if (!chan)
 		return NULL;
 
+	chan->buf = alloc_percpu(struct rchan_buf *);
 	chan->version = RELAYFS_CHANNEL_VERSION;
 	chan->n_subbufs = n_subbufs;
 	chan->subbuf_size = subbuf_size;
@@ -608,9 +613,10 @@ struct rchan *relay_open(const char *base_filename,
 
 	mutex_lock(&relay_channels_mutex);
 	for_each_online_cpu(i) {
-		chan->buf[i] = relay_open_buf(chan, i);
-		if (!chan->buf[i])
+		buf = relay_open_buf(chan, i);
+		if (!buf)
 			goto free_bufs;
+		*per_cpu_ptr(chan->buf, i) = buf;
 	}
 	list_add(&chan->list, &relay_channels);
 	mutex_unlock(&relay_channels_mutex);
@@ -619,8 +625,8 @@ struct rchan *relay_open(const char *base_filename,
 
 free_bufs:
 	for_each_possible_cpu(i) {
-		if (chan->buf[i])
-			relay_close_buf(chan->buf[i]);
+		if ((buf = *per_cpu_ptr(chan->buf, i)))
+			relay_close_buf(buf);
 	}
 
 	kref_put(&chan->kref, relay_destroy_channel);
@@ -666,6 +672,7 @@ int relay_late_setup_files(struct rchan *chan,
 	unsigned int i, curr_cpu;
 	unsigned long flags;
 	struct dentry *dentry;
+	struct rchan_buf *buf;
 	struct rchan_percpu_buf_dispatcher disp;
 
 	if (!chan || !base_filename)
@@ -684,10 +691,11 @@ int relay_late_setup_files(struct rchan *chan,
 
 	if (chan->is_global) {
 		err = -EINVAL;
-		if (!WARN_ON_ONCE(!chan->buf[0])) {
-			dentry = relay_create_buf_file(chan, chan->buf[0], 0);
+		buf = *per_cpu_ptr(chan->buf, 0);
+		if (!WARN_ON_ONCE(!buf)) {
+			dentry = relay_create_buf_file(chan, buf, 0);
 			if (dentry && !WARN_ON_ONCE(!chan->is_global)) {
-				relay_set_buf_dentry(chan->buf[0], dentry);
+				relay_set_buf_dentry(buf, dentry);
 				err = 0;
 			}
 		}
@@ -702,13 +710,14 @@ int relay_late_setup_files(struct rchan *chan,
 	 * on all currently online CPUs.
 	 */
 	for_each_online_cpu(i) {
-		if (unlikely(!chan->buf[i])) {
+		buf = *per_cpu_ptr(chan->buf, i);
+		if (unlikely(!buf)) {
 			WARN_ONCE(1, KERN_ERR "CPU has no buffer!\n");
 			err = -EINVAL;
 			break;
 		}
 
-		dentry = relay_create_buf_file(chan, chan->buf[i], i);
+		dentry = relay_create_buf_file(chan, buf, i);
 		if (unlikely(!dentry)) {
 			err = -EINVAL;
 			break;
@@ -716,10 +725,10 @@ int relay_late_setup_files(struct rchan *chan,
 
 		if (curr_cpu == i) {
 			local_irq_save(flags);
-			relay_set_buf_dentry(chan->buf[i], dentry);
+			relay_set_buf_dentry(buf, dentry);
 			local_irq_restore(flags);
 		} else {
-			disp.buf = chan->buf[i];
+			disp.buf = buf;
 			disp.dentry = dentry;
 			smp_mb();
 			/* relay_channels_mutex must be held, so wait. */
@@ -822,11 +831,10 @@ void relay_subbufs_consumed(struct rchan *chan,
 	if (!chan)
 		return;
 
-	if (cpu >= NR_CPUS || !chan->buf[cpu] ||
-					subbufs_consumed > chan->n_subbufs)
+	buf = *per_cpu_ptr(chan->buf, cpu);
+	if (cpu >= NR_CPUS || !buf || subbufs_consumed > chan->n_subbufs)
 		return;
 
-	buf = chan->buf[cpu];
 	if (subbufs_consumed > buf->subbufs_produced - buf->subbufs_consumed)
 		buf->subbufs_consumed = buf->subbufs_produced;
 	else
@@ -842,18 +850,19 @@ EXPORT_SYMBOL_GPL(relay_subbufs_consumed);
  */
 void relay_close(struct rchan *chan)
 {
+	struct rchan_buf *buf;
 	unsigned int i;
 
 	if (!chan)
 		return;
 
 	mutex_lock(&relay_channels_mutex);
-	if (chan->is_global && chan->buf[0])
-		relay_close_buf(chan->buf[0]);
+	if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0)))
+		relay_close_buf(buf);
 	else
 		for_each_possible_cpu(i)
-			if (chan->buf[i])
-				relay_close_buf(chan->buf[i]);
+			if ((buf = *per_cpu_ptr(chan->buf, i)))
+				relay_close_buf(buf);
 
 	if (chan->last_toobig)
 		printk(KERN_WARNING "relay: one or more items not logged "
@@ -874,20 +883,21 @@ EXPORT_SYMBOL_GPL(relay_close);
  */
 void relay_flush(struct rchan *chan)
 {
+	struct rchan_buf *buf;
 	unsigned int i;
 
 	if (!chan)
 		return;
 
-	if (chan->is_global && chan->buf[0]) {
-		relay_switch_subbuf(chan->buf[0], 0);
+	if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0))) {
+		relay_switch_subbuf(buf, 0);
 		return;
 	}
 
 	mutex_lock(&relay_channels_mutex);
 	for_each_possible_cpu(i)
-		if (chan->buf[i])
-			relay_switch_subbuf(chan->buf[i], 0);
+		if ((buf = *per_cpu_ptr(chan->buf, i)))
+			relay_switch_subbuf(buf, 0);
 	mutex_unlock(&relay_channels_mutex);
 }
 EXPORT_SYMBOL_GPL(relay_flush);
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (6 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 07/20] relay: Use per CPU constructs for the relay channel buffer pointers akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 13:53   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions akash.goel
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

Add a new debugfs interface, '/sys/kernel/debug/dri/0/guc_log', for the
User to capture GuC firmware logs. The interface is built on the relay
framework, so the Driver only has to use a relay API to store snapshots
of the GuC log buffer in the buffer managed by relay.
A snapshot is taken whenever GuC firmware sends a log buffer flush
interrupt, and up to eight snapshots can be stored in the relay buffer.
The relay buffer is operated in no-overwrite mode, where relay stops
accepting new snapshots once all sub buffers are full, instead of
overwriting data not yet collected by User.
Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on the log file, User can learn whenever a new snapshot of the log
buffer is taken by Driver, and so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.
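
The consumer side described above could look roughly like the following
userspace sketch. It is an illustration only, not part of this patch: the
helper name drain_log() is hypothetical, and it assumes the relay file can
be consumed with plain poll() and read() calls on its file descriptor.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy everything that becomes readable on in_fd to
 * out_fd, waiting in poll() between reads. Returns the number of bytes
 * copied, or -1 on error. Stops at end-of-file or when poll() times out.
 */
static long drain_log(int in_fd, int out_fd, int timeout_ms)
{
	char chunk[4096];
	long total = 0;

	for (;;) {
		struct pollfd pfd = { .fd = in_fd, .events = POLLIN };
		int ready = poll(&pfd, 1, timeout_ms);

		if (ready < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (ready == 0)
			break;		/* timed out, nothing new to read */

		ssize_t n = read(in_fd, chunk, sizeof(chunk));
		if (n < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return -1;
		}
		if (n == 0)
			break;		/* end of file */

		/* Write out the whole chunk, handling short writes. */
		ssize_t off = 0;
		while (off < n) {
			ssize_t w = write(out_fd, chunk + off, (size_t)(n - off));
			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		total += n;
	}

	return total;
}
```

A capture tool would open the log file read-only (assumed here to be
/sys/kernel/debug/dri/0/guc_log), open an output file, and call
drain_log() in a loop for the duration of the workload.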

v2: Defer the creation of relay channel & associated debugfs file, as
    debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
  the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
  sufficient buffering for boot time logs
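
The no-overwrite behaviour from v3 and the sub buffer count from v4 can be
modelled in userspace as below. This is only a sketch of the accounting,
under the assumption that "full" means the producer is a whole ring ahead
of the consumer; the struct and function names are hypothetical and do not
exist in relay or i915.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a ring of sub buffers in no-overwrite mode. */
struct subbuf_ring {
	size_t n_subbufs;	/* total sub buffers, 8 in this patch */
	size_t produced;	/* snapshots stored so far */
	size_t consumed;	/* snapshots collected by the consumer */
	size_t missed;		/* snapshots dropped because the ring was full */
};

/* Full when every sub buffer holds a snapshot not yet consumed. */
static bool ring_full(const struct subbuf_ring *r)
{
	return r->produced - r->consumed >= r->n_subbufs;
}

/* Store one snapshot; refuse, and count a miss, rather than overwrite. */
static bool ring_produce(struct subbuf_ring *r)
{
	if (ring_full(r)) {
		r->missed++;
		return false;
	}
	r->produced++;
	return true;
}

/* Consumer releases up to n snapshots, never more than are pending. */
static void ring_consume(struct subbuf_ring *r, size_t n)
{
	size_t ready = r->produced - r->consumed;

	r->consumed += n > ready ? ready : n;
}
```

With 8 sub buffers, the ninth snapshot is refused until the consumer has
released at least one sub buffer, which is the case the new rate limited
error message reports.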

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/Kconfig               |   1 +
 drivers/gpu/drm/i915/i915_drv.c            |   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 206 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_guc.h           |   3 +
 4 files changed, 209 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
 	select DRM_KMS_HELPER
 	select DRM_PANEL
 	select DRM_MIPI_DSI
+	select RELAY
 	# i915 depends on ACPI_VIDEO when ACPI is enabled
 	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
 	select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index fc2da32..cb8c943 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1145,6 +1145,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 	/* Reveal our presence to userspace */
 	if (drm_dev_register(dev, 0) == 0) {
 		i915_debugfs_register(dev_priv);
+		i915_guc_register(dev_priv);
 		i915_setup_sysfs(dev);
 	} else
 		DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	intel_opregion_unregister(dev_priv);
 
 	i915_teardown_sysfs(&dev_priv->drm);
+	i915_guc_unregister(dev_priv);
 	i915_debugfs_unregister(dev_priv);
 	drm_dev_unregister(&dev_priv->drm);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2635b67..1a2d648 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
  */
 #include <linux/firmware.h>
 #include <linux/circ_buf.h>
+#include <linux/debugfs.h>
+#include <linux/relay.h>
 #include "i915_drv.h"
 #include "intel_guc.h"
 
@@ -851,12 +853,33 @@ err:
 
 static void guc_move_to_next_buf(struct intel_guc *guc)
 {
-	return;
+	/* Make sure the updates made in the sub buffer are visible when
+	 * Consumer sees the following update to offset inside the sub buffer.
+	 */
+	smp_wmb();
+
+	/* All data has been written, so now move the offset of sub buffer. */
+	relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
+
+	/* Switch to the next sub buffer */
+	relay_flush(guc->log.relay_chan);
 }
 
 static void* guc_get_write_buffer(struct intel_guc *guc)
 {
-	return NULL;
+	/* FIXME: Cover the check under a lock ? */
+	if (!guc->log.relay_chan)
+		return NULL;
+
+	/* Just get the base address of a new sub buffer and copy data into it
+	 * ourselves. NULL will be returned in no-overwrite mode, if all sub
+	 * buffers are full. Could have used relay_write() to indirectly
+	 * copy the data, but that would have been a bit convoluted, as we need
+	 * to write to only certain locations inside a sub buffer, which cannot
+	 * be done without using relay_reserve() along with relay_write(). So
+	 * it's better to use relay_reserve() alone.
+	 */
+	return relay_reserve(guc->log.relay_chan, 0);
 }
 
 static void guc_read_update_log_buffer(struct intel_guc *guc)
@@ -923,6 +946,130 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 
 	if (log_buffer_snapshot_state)
 		guc_move_to_next_buf(guc);
+	else {
+		/* Rate limited to avoid a deluge of messages, as logs might be
+		 * getting consumed by User at a slow rate.
+		 */
+		DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
+	}
+}
+
+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+				 void *subbuf,
+				 void *prev_subbuf,
+				 size_t prev_padding)
+{
+	/* Use no-overwrite mode by default, where relay will stop accepting
+	 * new data if there are no empty sub buffers left.
+	 * There is no strict synchronization enforced by relay between Consumer
+	 * and Producer. In overwrite mode, there is a possibility of getting
+	 * inconsistent/garbled data, the producer could be writing on to the
+	 * same sub buffer from which Consumer is reading. This can't be avoided
+	 * unless Consumer is fast enough and can always run in tandem with
+	 * Producer.
+	 */
+	if (relay_buf_full(buf))
+		return 0;
+
+	return 1;
+}
+
+/*
+ * file_create() callback. Creates relay file in debugfs.
+ */
+static struct dentry *create_buf_file_callback(const char *filename,
+					       struct dentry *parent,
+					       umode_t mode,
+					       struct rchan_buf *buf,
+					       int *is_global)
+{
+	struct dentry *buf_file = NULL;
+
+	if (parent) {
+		/* Not using the channel filename passed as an argument, since
+		 * for each channel relay appends the corresponding CPU number
+		 * to the filename passed in relay_open(). This should be fine
+		 * as relay just needs a dentry of the file associated with the
+		 * channel buffer and that file's name need not be same as the
+		 * filename passed as an argument.
+		 */
+		buf_file = debugfs_create_file("guc_log", mode,
+				parent, buf, &relay_file_operations);
+	}
+
+	/* This is to enable the use of a single buffer for the relay channel
+	 * and correspondingly have a single file exposed to User, through which
+	 * it can collect the logs in order without any post-processing.
+	 */
+	*is_global = 1;
+
+	return buf_file;
+}
+
+/*
+ * file_remove() default callback. Removes relay file in debugfs.
+ */
+static int remove_buf_file_callback(struct dentry *dentry)
+{
+	debugfs_remove(dentry);
+	return 0;
+}
+
+/* relay channel callbacks */
+static struct rchan_callbacks relay_callbacks = {
+	.subbuf_start = subbuf_start_callback,
+	.create_buf_file = create_buf_file_callback,
+	.remove_buf_file = remove_buf_file_callback,
+};
+
+static void guc_remove_log_relay_file(struct intel_guc *guc)
+{
+	relay_close(guc->log.relay_chan);
+}
+
+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct rchan *guc_log_relay_chan;
+	struct dentry *log_dir;
+	size_t n_subbufs, subbuf_size;
+
+	/* For now create the log file in /sys/kernel/debug/dri/0 dir */
+	log_dir = dev_priv->drm.primary->debugfs_root;
+
+	/* If the /sys/kernel/debug/dri/0 location does not exist, then debugfs
+	 * is not mounted and so the relay file can't be created.
+	 * The relay API seems to fit well with debugfs only.
+	 */
+	if (!log_dir) {
+		DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n");
+		return -ENODEV;
+	}
+
+	/* Keep the size of sub buffers same as shared log buffer */
+	subbuf_size = guc->log.obj->base.size;
+	/* Store up to 8 snapshots, which is large enough to buffer sufficient
+	 * boot time logs and provides enough leeway to User, in terms of
+	 * latency, for consuming the logs from relay. Also doesn't take
+	 * up too much memory.
+	 */
+	n_subbufs = 8;
+
+	guc_log_relay_chan = relay_open("guc_log", log_dir,
+			subbuf_size, n_subbufs, &relay_callbacks, dev_priv);
+
+	if (!guc_log_relay_chan) {
+		DRM_DEBUG_DRIVER("Couldn't create relay chan for guc logs\n");
+		return -ENOMEM;
+	}
+
+	/* FIXME: Cover the update under a lock ? */
+	guc->log.relay_chan = guc_log_relay_chan;
+	return 0;
 }
 
 static void guc_log_cleanup(struct intel_guc *guc)
@@ -937,6 +1084,11 @@ static void guc_log_cleanup(struct intel_guc *guc)
 	/* First disable the flush interrupt */
 	gen9_disable_guc_interrupts(dev_priv);
 
+	if (guc->log.relay_chan)
+		guc_remove_log_relay_file(guc);
+
+	guc->log.relay_chan = NULL;
+
 	if (guc->log.buf_addr)
 		i915_gem_object_unpin_map(guc->log.obj);
 
@@ -1015,6 +1167,35 @@ static void guc_create_log(struct intel_guc *guc)
 	guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
+static int guc_log_late_setup(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	int ret;
+
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
+	if (i915.guc_log_level < 0)
+		return -EINVAL;
+
+	/* If log_level was set as -1 at boot time, then vmalloc mapping would
+	 * not have been created for the log buffer, so create one now.
+	 */
+	ret = guc_create_log_extras(guc);
+	if (ret)
+		goto err;
+
+	ret = guc_create_log_relay_file(guc);
+	if (ret)
+		goto err;
+
+	return 0;
+err:
+	guc_log_cleanup(guc);
+	/* logging will remain off */
+	i915.guc_log_level = -1;
+	return ret;
+}
+
 static void init_guc_policies(struct guc_policies *policies)
 {
 	struct guc_policy *policy;
@@ -1185,7 +1366,6 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
 	gem_release_guc_obj(dev_priv->guc.ads_obj);
 	guc->ads_obj = NULL;
 
-	guc_log_cleanup(guc);
 	gem_release_guc_obj(dev_priv->guc.log.obj);
 	guc->log.obj = NULL;
 
@@ -1261,3 +1441,23 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
 	host2guc_logbuffer_flush_complete(&dev_priv->guc);
 	intel_runtime_pm_put(dev_priv);
 }
+
+void i915_guc_unregister(struct drm_i915_private *dev_priv)
+{
+	if (!i915.enable_guc_submission)
+		return;
+
+	mutex_lock(&dev_priv->drm.struct_mutex);
+	guc_log_cleanup(&dev_priv->guc);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+}
+
+void i915_guc_register(struct drm_i915_private *dev_priv)
+{
+	if (!i915.enable_guc_submission)
+		return;
+
+	mutex_lock(&dev_priv->drm.struct_mutex);
+	guc_log_late_setup(&dev_priv->guc);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 7c0bdba..96ef7dc 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -126,6 +126,7 @@ struct intel_guc_log {
 	struct drm_i915_gem_object *obj;
 	struct workqueue_struct *wq;
 	void *buf_addr;
+	struct rchan *relay_chan;
 };
 
 struct intel_guc {
@@ -172,5 +173,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
 void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_register(struct drm_i915_private *dev_priv);
+void i915_guc_unregister(struct drm_i915_private *dev_priv);
 
 #endif
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (7 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 13:55   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts akash.goel
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

With the addition of new Host2GuC actions related to GuC logging, a lock
is needed to serialize them, as they can execute concurrently with each
other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
    while waiting for the action's response from GuC. (Tvrtko)

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/intel_guc.h           | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1a2d648..cb9672b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 		return -EINVAL;
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+	mutex_lock(&guc->action_lock);
 
 	dev_priv->guc.action_count += 1;
 	dev_priv->guc.action_cmd = data[0];
@@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 	}
 	dev_priv->guc.action_status = status;
 
+	mutex_unlock(&guc->action_lock);
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
 	return ret;
@@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 		return -ENOMEM;
 
 	ida_init(&guc->ctx_ids);
+	mutex_init(&guc->action_lock);
 	guc_create_log(guc);
 	guc_create_ads(guc);
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 96ef7dc..e4ec8d8 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -156,6 +156,9 @@ struct intel_guc {
 
 	uint64_t submissions[I915_NUM_ENGINES];
 	uint32_t last_seqno[I915_NUM_ENGINES];
+
+	/* To serialize the Host2GuC actions */
+	struct mutex action_lock;
 };
 
 /* intel_guc_loader.c */
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (8 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 14:26   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer akash.goel
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augment the i915_guc_info debugfs file to report these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)
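
The wraparound handling mentioned in v2 can be written compactly as a
masked subtraction. The sketch below is an illustration, not the code in
this patch: the helper and macro names are hypothetical, and it relies
only on the counter being 4 bits wide, as the comment in the diff states.

```c
#include <stdint.h>

/* buffer_full_cnt is described in the patch as a 4 bit counter. */
#define OVERFLOW_COUNTER_BITS	4
#define OVERFLOW_COUNTER_MASK	((1u << OVERFLOW_COUNTER_BITS) - 1)

/*
 * Hypothetical helper: number of overflows that happened between two
 * samples of a wrapping 4 bit counter. Unsigned subtraction followed by
 * masking handles the case where the counter wrapped past 15.
 */
static uint32_t overflow_delta(uint32_t prev, uint32_t cur)
{
	return (cur - prev) & OVERFLOW_COUNTER_MASK;
}
```

Going from 14 to 1 yields 3 (14 -> 15 -> 0 -> 1). As with any wrapping
counter, 16 or more overflows between two samples cannot be told apart
from fewer.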

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 28 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_irq.c            |  2 ++
 drivers/gpu/drm/i915/intel_guc.h           |  7 +++++++
 4 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 51b59d5..14e0dcf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static void i915_guc_log_info(struct seq_file *m,
+				 struct drm_i915_private *dev_priv)
+{
+	struct intel_guc *guc = &dev_priv->guc;
+
+	seq_printf(m, "\nGuC logging stats:\n");
+
+	seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+		guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+		guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+	seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+		guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+		guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+	seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+		guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+		guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+	seq_printf(m, "\tTotal flush interrupt count: %u\n",
+		       guc->log.flush_interrupt_count);
+
+	seq_printf(m, "\tCapture miss count: %u\n",
+		       guc->log.capture_miss_count);
+}
+
 static void i915_guc_client_info(struct seq_file *m,
 				 struct drm_i915_private *dev_priv,
 				 struct i915_guc_client *client)
@@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
 	i915_guc_client_info(m, dev_priv, &client);
 
+	i915_guc_log_info(m, dev_priv);
+
 	/* Add more as required ... */
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index cb9672b..1ca1866 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 				sizeof(struct guc_log_buffer_state));
 		buffer_size = log_buffer_state_local.size;
 
+		guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
+		if (log_buffer_state_local.buffer_full_cnt !=
+					guc->log.prev_overflow_count[i]) {
+			guc->log.total_overflow_count[i] +=
+				(log_buffer_state_local.buffer_full_cnt -
+				 guc->log.prev_overflow_count[i]);
+
+			if (log_buffer_state_local.buffer_full_cnt <
+					guc->log.prev_overflow_count[i]) {
+				/* buffer_full_cnt is a 4 bit counter */
+				guc->log.total_overflow_count[i] += 16;
+			}
+
+			guc->log.prev_overflow_count[i] =
+					log_buffer_state_local.buffer_full_cnt;
+			DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+		}
+
 		if (log_buffer_snapshot_state) {
 			/* First copy the state structure in local buffer */
 			memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
@@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 		 * getting consumed by User at a slow rate.
 		 */
 		DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
+		guc->log.capture_miss_count++;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index d4d6f0a..b08d1d2 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
 				/* Handle flush interrupt event in bottom half */
 				queue_work(dev_priv->guc.log.wq,
 						&dev_priv->guc.events_work);
+
+				dev_priv->guc.log.flush_interrupt_count++;
 			}
 		}
 	}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index e4ec8d8..ed87e98 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -127,6 +127,13 @@ struct intel_guc_log {
 	struct workqueue_struct *wq;
 	void *buf_addr;
 	struct rchan *relay_chan;
+
+	/* logging related stats */
+	u32 capture_miss_count;
+	u32 flush_interrupt_count;
+	u32 prev_overflow_count[GUC_MAX_LOG_BUFFER];
+	u32 total_overflow_count[GUC_MAX_LOG_BUFFER];
+	u32 flush_count[GUC_MAX_LOG_BUFFER];
 };
 
 struct intel_guc {
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (9 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 14:42   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 12/20] drm/i915: Increase GuC log buffer size to reduce flush interrupts akash.goel
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can copy only the data newly written by GuC into the local buffer,
i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would have comparatively much less unread data in them.
In case of an overflow reported by GuC, Driver does need to copy the
entire buffer, as the whole buffer would contain unread data.
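
The copy strategy described above can be sketched as a standalone
function. This is an illustration under stated assumptions, not the driver
code: the function name is hypothetical, offsets are byte offsets into a
circular buffer of 'size' bytes, and data is copied to the same offsets in
the destination so the snapshot keeps the layout of the source buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy only the region between read_off and
 * write_off from src to dst. Falls back to a full copy on overflow or
 * when the offsets are out of range. Returns the bytes copied.
 */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
				size_t size, size_t read_off,
				size_t write_off, bool overflow)
{
	if (overflow || read_off > size || write_off > size) {
		/* Whole buffer is unread, or offsets are unreliable. */
		read_off = 0;
		write_off = size;
	}

	if (read_off <= write_off) {
		/* New data is one contiguous region. */
		memcpy(dst + read_off, src + read_off, write_off - read_off);
		return write_off - read_off;
	}

	/* Writer wrapped around: copy the tail, then the head. */
	memcpy(dst + read_off, src + read_off, size - read_off);
	memcpy(dst, src, write_off);
	return (size - read_off) + write_off;
}
```

For an 8 byte buffer with read_off 6 and write_off 2, only bytes 6..7 and
0..1 are copied, 4 bytes in total instead of 8.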

v2: Rebase.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 40 +++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1ca1866..8e0f360 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 	struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
 	struct guc_log_buffer_state log_buffer_state_local;
 	void *src_data_ptr, *dst_data_ptr;
-	u32 i, buffer_size;
+	bool new_overflow;
+	u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;
 
 	if (!guc->log.buf_addr)
 		return;
@@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 		memcpy(&log_buffer_state_local, log_buffer_state,
 				sizeof(struct guc_log_buffer_state));
 		buffer_size = log_buffer_state_local.size;
+		read_offset = log_buffer_state_local.read_ptr;
+		write_offset = log_buffer_state_local.sampled_write_ptr;
 
 		guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
 		if (log_buffer_state_local.buffer_full_cnt !=
 					guc->log.prev_overflow_count[i]) {
+			new_overflow = 1;
 			guc->log.total_overflow_count[i] +=
 				(log_buffer_state_local.buffer_full_cnt -
 				 guc->log.prev_overflow_count[i]);
@@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 			guc->log.prev_overflow_count[i] =
 					log_buffer_state_local.buffer_full_cnt;
 			DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-		}
+		} else
+			new_overflow = 0;
 
 		if (log_buffer_snapshot_state) {
 			/* First copy the state structure in local buffer */
@@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 			 * for consistency set the write pointer value to same
 			 * value of sampled_write_ptr in the snapshot buffer.
 			 */
-			log_buffer_snapshot_state->write_ptr =
-				log_buffer_snapshot_state->sampled_write_ptr;
+			log_buffer_snapshot_state->write_ptr = write_offset;
 
 			log_buffer_snapshot_state++;
 
 			/* Now copy the actual logs */
 			memcpy(dst_data_ptr, src_data_ptr, buffer_size);
+			if (unlikely(new_overflow)) {
+				/* copy the whole buffer in case of overflow */
+				read_offset = 0;
+				write_offset = buffer_size;
+			} else if (unlikely((read_offset > buffer_size) ||
+					    (write_offset > buffer_size))) {
+				DRM_ERROR("invalid log buffer state\n");
+				/* copy whole buffer as offsets are unreliable */
+				read_offset = 0;
+				write_offset = buffer_size;
+			}
+
+			/* Just copy the newly written data */
+			if (read_offset <= write_offset) {
+				bytes_to_copy = write_offset - read_offset;
+				memcpy(dst_data_ptr + read_offset,
+				     src_data_ptr + read_offset, bytes_to_copy);
+			} else {
+				bytes_to_copy = buffer_size - read_offset;
+				memcpy(dst_data_ptr + read_offset,
+				     src_data_ptr + read_offset, bytes_to_copy);
+
+				bytes_to_copy = write_offset;
+				memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
+			}
 
 			src_data_ptr += buffer_size;
 			dst_data_ptr += buffer_size;
@@ -956,8 +985,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 		/* FIXME: invalidate/flush for log buffer needed */
 
 		/* Update the read pointer in the shared log buffer */
-		log_buffer_state->read_ptr =
-			log_buffer_state_local.sampled_write_ptr;
+		log_buffer_state->read_ptr = write_offset;
 
 		/* Clear the 'flush to file' flag */
 		log_buffer_state->flush_to_file = 0;
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 12/20] drm/i915: Increase GuC log buffer size to reduce flush interrupts
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (10 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:25 ` [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer akash.goel
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

In cases where GuC generates logs at a very high rate, the rate of
flush interrupts is correspondingly very high.
So far a total of 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, doubling
the number of pages cuts the frequency of flush interrupts down
to almost half, which then helps in reducing the logging overhead.
So now allocate 8 pages apiece for ISR & DPC logs.
This also helps in reducing the output log file size, apart from
reducing the flush interrupt count. With the original settings,
44 KB was needed for one snapshot. With the modified settings, 76 KB is
needed for a snapshot, which is equivalent to 2 snapshots of the
original setting. So there is a saving of 12 KB for every 88 KB, relative
to the original setting.
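
The sizes quoted above can be reproduced with a small calculation. The
sketch assumes 4 KB pages, one page of state headers, and that each
GUC_LOG_*_PAGES field encodes "pages minus one"; the last point is
inferred from the numbers in this message (field values 3 and 7
corresponding to 4 and 8 pages), and the helper name is hypothetical.

```c
#define PAGE_KB	4u	/* assumed page size in KB */

/*
 * Hypothetical helper: size in KB of one log buffer snapshot, given the
 * raw ISR, DPC and crash dump page fields (each encoding pages - 1).
 */
static unsigned int snapshot_kb(unsigned int isr_field,
				unsigned int dpc_field,
				unsigned int crash_field)
{
	unsigned int pages = 1 +		/* state headers */
			     (isr_field + 1) +	/* ISR logs */
			     (dpc_field + 1) +	/* DPC logs */
			     (crash_field + 1);	/* crash dump logs */

	return pages * PAGE_KB;
}
```

snapshot_kb(3, 3, 1) gives 44 and snapshot_kb(7, 7, 1) gives 76, so two
old snapshots take 88 KB where one new snapshot takes 76 KB, a saving of
12 KB.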

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 1de6928..7521ed5 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
 #define   GUC_LOG_ALLOC_IN_MEGABYTE	(1 << 3)
 #define   GUC_LOG_CRASH_PAGES		1
 #define   GUC_LOG_CRASH_SHIFT		4
-#define   GUC_LOG_DPC_PAGES		3
+#define   GUC_LOG_DPC_PAGES		7
 #define   GUC_LOG_DPC_SHIFT		6
-#define   GUC_LOG_ISR_PAGES		3
+#define   GUC_LOG_ISR_PAGES		7
 #define   GUC_LOG_ISR_SHIFT		9
 #define   GUC_LOG_BUF_ADDR_SHIFT	12
 
@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
  *        |   Crash dump state header     |
  * Page1  +-------------------------------+
  *        |           ISR logs            |
- * Page5  +-------------------------------+
- *        |           DPC logs            |
  * Page9  +-------------------------------+
+ *        |           DPC logs            |
+ * Page17 +-------------------------------+
  *        |         Crash Dump logs       |
  *        +-------------------------------+
  *
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (11 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 12/20] drm/i915: Increase GuC log buffer size to reduce flush interrupts akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 15:20   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset akash.goel
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Add the dump of the GuC log buffer to the i915 error state, as the
contents of the GuC log buffer would also be useful in determining why
the GPU reset was triggered.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h       |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 28ffac5..4bd3790 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -509,6 +509,7 @@ struct drm_i915_error_state {
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
 	struct drm_i915_error_object *semaphore_obj;
+	struct drm_i915_error_object *guc_log_obj;
 
 	struct drm_i915_error_engine {
 		int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb870..561b523 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		}
 	}
 
+	if ((obj = error->guc_log_obj)) {
+		err_printf(m, "GuC log buffer = 0x%08x\n",
+			   lower_32_bits(obj->gtt_offset));
+		for (i = 0; i < obj->page_count; i++) {
+			for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {
+				err_printf(m, "[%08x] %08x %08x %08x %08x\n",
+					   (u32)(i*PAGE_SIZE) + elt*4,
+					   obj->pages[i][elt],
+					   obj->pages[i][elt+1],
+					   obj->pages[i][elt+2],
+					   obj->pages[i][elt+3]);
+			}
+		}
+	}
+
 	if (error->overlay)
 		intel_overlay_print_error_state(m, error->overlay);
 
@@ -625,6 +640,7 @@ static void i915_error_state_free(struct kref *error_ref)
 	}
 
 	i915_error_object_free(error->semaphore_obj);
+	i915_error_object_free(error->guc_log_obj);
 
 	for (i = 0; i < error->vm_count; i++)
 		kfree(error->active_bo[i]);
@@ -1210,6 +1226,16 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 	}
 }
 
+static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
+				     struct drm_i915_error_state *error)
+{
+	if (!dev_priv->guc.log.obj)
+		return;
+
+	error->guc_log_obj = i915_error_ggtt_object_create(dev_priv,
+						dev_priv->guc.log.obj);
+}
+
 /* FIXME: Since pin count/bound list is global, we duplicate what we capture per
  * VM.
  */
@@ -1439,6 +1465,7 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
 	i915_gem_capture_buffers(dev_priv, error);
 	i915_gem_record_fences(dev_priv, error);
 	i915_gem_record_rings(dev_priv, error);
+	i915_gem_capture_guc_log_buffer(dev_priv, error);
 
 	do_gettimeofday(&error->time);
 
-- 
1.9.2


* [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (12 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12  6:33   ` Chris Wilson
  2016-08-12  6:25 ` [PATCH 15/20] drm/i915: Debugfs support for GuC logging control akash.goel
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Sagar Arun Kamble <sagar.a.kamble@intel.com>

Before capturing the GuC logs as part of the error state, a forced log
buffer flush action should be sent to GuC, ahead of the GPU reset and the
re-initialization of GuC. There could be some data in the log buffer which
is yet to be captured, and those logs would be particularly useful for
understanding why the GPU reset was initiated.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c      |  2 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_guc.h           |  1 +
 3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 561b523..5e358e2 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1232,6 +1232,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
 	if (!dev_priv->guc.log.obj)
 		return;
 
+	i915_guc_flush_logs(dev_priv);
+
 	error->guc_log_obj = i915_error_ggtt_object_create(dev_priv,
 						dev_priv->guc.log.obj);
 }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8e0f360..4a75c16 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -185,6 +185,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
 	return host2guc_action(guc, data, 1);
 }
 
+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+	u32 data[2];
+
+	data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+	data[1] = 0;
+
+	return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1492,6 +1502,23 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
 	intel_runtime_pm_put(dev_priv);
 }
 
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv)
+{
+	if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+		return;
+
+	/* First disable the interrupts, will be re-enabled afterwards */
+	gen9_disable_guc_interrupts(dev_priv);
+
+	/* Before initiating the forceful flush wait for the pending/ongoing
+	 * flush to complete.
+	 */
+	flush_work(&dev_priv->guc.events_work);
+
+	/* Ask GuC to update the log buffer state */
+	host2guc_force_logbuffer_flush(&dev_priv->guc);
+}
+
 void i915_guc_unregister(struct drm_i915_private *dev_priv)
 {
 	if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index ed87e98..d3a5447 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -183,6 +183,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
 void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv);
 void i915_guc_register(struct drm_i915_private *dev_priv);
 void i915_guc_unregister(struct drm_i915_private *dev_priv);
 
-- 
1.9.2


* [PATCH 15/20] drm/i915: Debugfs support for GuC logging control
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (13 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 15:57   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 16/20] drm/i915: Support to create write combined type vmaps akash.goel
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Sagar Arun Kamble <sagar.a.kamble@intel.com>

This patch provides a debugfs interface, i915_guc_log_control, for
enabling or disabling logging in the GuC firmware on the fly and for
controlling the verbosity level of the logs.
The value written to the file should have bit 0 set to enable logging, and
bits 4-7 should contain the verbosity level.
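
The bit layout described above can be sketched in user-space C as
follows. Only the layout (bit 0 for enable, bits 4-7 for verbosity) is
taken from this patch; the sketch_* names are hypothetical helpers for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded form of the control value; the names are illustrative only. */
struct sketch_log_control {
	bool logging_enabled;   /* bit 0 */
	unsigned int verbosity; /* bits 4-7 */
};

/* Build the value a user would write to the control file. */
static uint64_t sketch_encode(bool enable, unsigned int verbosity)
{
	assert(verbosity <= 0xF); /* only four bits are available */
	return (uint64_t)(enable ? 1 : 0) | ((uint64_t)verbosity << 4);
}

/* Split a written value back into its two fields. */
static struct sketch_log_control sketch_decode(uint64_t val)
{
	struct sketch_log_control c;

	c.logging_enabled = (val & 0x1) != 0;
	c.verbosity = (unsigned int)((val >> 4) & 0xF);
	return c;
}
```

For example, enabling logging at verbosity 2 corresponds to the value
0x21, while any value with bit 0 clear requests that logging be disabled.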

v2: Add a forceful flush, to collect left over logs, on disabling logging.
    Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
    set the guc_log_level to -1 when logging is disabled. (Tvrtko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 44 ++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_guc.h           |  1 +
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 14e0dcf..f472fbcd3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+	struct drm_device *dev = data;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+
+	if (!dev_priv->guc.log.obj)
+		return -EINVAL;
+
+	*val = i915.guc_log_level;
+
+	return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+	struct drm_device *dev = data;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	int ret;
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	if (!dev_priv->guc.log.obj) {
+		ret = -EINVAL;
+		goto end;
+	}
+
+	intel_runtime_pm_get(dev_priv);
+	ret = i915_guc_log_control(dev_priv, val);
+	intel_runtime_pm_put(dev_priv);
+
+end:
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+			i915_guc_log_control_get, i915_guc_log_control_set,
+			"%lld\n");
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = m->private;
@@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files {
 	{"i915_fbc_false_color", &i915_fbc_fc_fops},
 	{"i915_dp_test_data", &i915_displayport_test_data_fops},
 	{"i915_dp_test_type", &i915_displayport_test_type_fops},
-	{"i915_dp_test_active", &i915_displayport_test_active_fops}
+	{"i915_dp_test_active", &i915_displayport_test_active_fops},
+	{"i915_guc_log_control", &i915_guc_log_control_fops}
 };
 
 void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4a75c16..041cf68 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
 	return host2guc_action(guc, data, 2);
 }
 
+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+	u32 data[2];
+
+	data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+	data[1] = control_val;
+
+	return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
 	guc_log_late_setup(&dev_priv->guc);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+	union guc_log_control log_param;
+	int ret;
+
+	log_param.logging_enabled = control_val & 0x1;
+	log_param.verbosity = (control_val >> 4) & 0xF;
+
+	if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+	    log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+		return -EINVAL;
+
+	/* This combination doesn't make sense & won't have any effect */
+	if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+		return 0;
+
+	ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+	if (ret < 0) {
+		DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+		return ret;
+	}
+
+	i915.guc_log_level = log_param.verbosity;
+
+	/* If log_level was set as -1 at boot time, then the relay channel file
+	 * wouldn't have been created by now and interrupts also would not have
+	 * been enabled.
+	 */
+	if (!dev_priv->guc.log.relay_chan) {
+		ret = guc_log_late_setup(&dev_priv->guc);
+		if (!ret)
+			gen9_enable_guc_interrupts(dev_priv);
+	} else if (!log_param.logging_enabled) {
+		/* Once logging is disabled, GuC won't generate logs & send an
+		 * interrupt. But there could be some data in the log buffer
+		 * which is yet to be captured. So request GuC to update the log
+		 * buffer state and then collect the left over logs.
+		 */
+		i915_guc_flush_logs(dev_priv);
+
+		/* GuC would have updated the log buffer by now, so capture it */
+		i915_guc_capture_logs(dev_priv);
+
+		/* As logging is disabled, update the log level to reflect that */
+		i915.guc_log_level = -1;
+	} else {
+		/* In case interrupts were disabled, enable them now */
+		gen9_enable_guc_interrupts(dev_priv);
+	}
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index d3a5447..2f8c408 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -186,5 +186,6 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
 void i915_guc_flush_logs(struct drm_i915_private *dev_priv);
 void i915_guc_register(struct drm_i915_private *dev_priv);
 void i915_guc_unregister(struct drm_i915_private *dev_priv);
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val);
 
 #endif
-- 
1.9.2


* [PATCH 16/20] drm/i915: Support to create write combined type vmaps
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (14 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 15/20] drm/i915: Debugfs support for GuC logging control akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 10:49   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 17/20] drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer akash.goel
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Chris Wilson <chris@chris-wilson.co.uk>

vmap has a provision for controlling the page protection bits, which we
can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose its mapping type, we add a parameter to
i915_gem_object_pin_map, but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map to a
single mapping type for the lifetime of the object. Not usually a problem,
but something to be aware of when setting up the object's vmap.
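
One way to picture how a single cached mapping can also remember its
type is to store the type in the low bits of the page-aligned pointer
and separate the two with a page mask. The user-space sketch below
illustrates that idea under stated assumptions: the sketch_* names and
the 4 KiB page size are invented for the example and are not the
driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define SKETCH_PAGE_MASK (~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

enum sketch_map_type {
	SKETCH_MAP_WB = 0,
	SKETCH_MAP_WC,
};

/* Store the mapping type in the unused low bits of an aligned pointer. */
static void *sketch_pack(void *ptr, enum sketch_map_type type)
{
	/* A page-aligned address has all of its low bits clear. */
	assert(((uintptr_t)ptr & ~SKETCH_PAGE_MASK) == 0);
	return (void *)((uintptr_t)ptr | (uintptr_t)type);
}

/* Recover the usable address from the packed value. */
static void *sketch_unpack_ptr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the mapping type from the packed value. */
static enum sketch_map_type sketch_unpack_type(void *packed)
{
	return (enum sketch_map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

Because the packed value is no longer a valid address by itself, every
consumer has to mask it before dereferencing or unmapping it.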

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT.

v2: Remove the redundant braces around pin count check and fix the marker
    in documentation (Chris)

v3:
- Add a new enum for the vmalloc mapping type & pass that as an argument to
  i915_gem_object_pin_map. (Tvrtko)
- Use PAGE_MASK to extract or filter the mapping type info and remove a
  superfluous BUG_ON.(Tvrtko)

v4:
- Rename the enums and clean up the pin_map function. (Chris)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  9 ++++-
 drivers/gpu/drm/i915/i915_gem.c            | 58 +++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |  2 +-
 drivers/gpu/drm/i915/i915_guc_submission.c |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  8 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  2 +-
 6 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4bd3790..6603812 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -834,6 +834,11 @@ enum i915_cache_level {
 	I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
 };
 
+enum i915_map_type {
+	I915_MAP_WB = 0,
+	I915_MAP_WC,
+};
+
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
@@ -3150,6 +3155,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 /**
  * i915_gem_object_pin_map - return a contiguous mapping of the entire object
  * @obj - the object to map into kernel address space
+ * @map_type - whether the vmalloc mapping should be using WC or WB pgprot_t
  *
  * Calls i915_gem_object_pin_pages() to prevent reaping of the object's
  * pages and then returns a contiguous mapping of the backing storage into
@@ -3161,7 +3167,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
  * Returns the pointer through which to access the mapped object, or an
  * ERR_PTR() on error.
  */
-void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj);
+void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
+					enum i915_map_type map_type);
 
 /**
  * i915_gem_object_unpin_map - releases an earlier mapping
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 03548db..7dabbc3f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	list_del(&obj->global_list);
 
 	if (obj->mapping) {
-		if (is_vmalloc_addr(obj->mapping))
-			vunmap(obj->mapping);
+		void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
+		if (is_vmalloc_addr(ptr))
+			vunmap(ptr);
 		else
-			kunmap(kmap_to_page(obj->mapping));
+			kunmap(kmap_to_page(ptr));
 		obj->mapping = NULL;
 	}
 
@@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 }
 
 /* The 'mapping' part of i915_gem_object_pin_map() below */
-static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
+static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
+				 enum i915_map_type type)
 {
 	unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
 	struct sg_table *sgt = obj->pages;
@@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
 	struct page **pages = stack_pages;
 	unsigned long i = 0;
 	void *addr;
+	bool use_wc = (type == I915_MAP_WC);
 
 	/* A single page can always be kmapped */
-	if (n_pages == 1)
+	if (n_pages == 1 && !use_wc)
 		return kmap(sg_page(sgt->sgl));
 
 	if (n_pages > ARRAY_SIZE(stack_pages)) {
@@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
 	/* Check that we have the expected number of pages */
 	GEM_BUG_ON(i != n_pages);
 
-	addr = vmap(pages, n_pages, 0, PAGE_KERNEL);
+	addr = vmap(pages, n_pages, VM_NO_GUARD,
+		    use_wc ? pgprot_writecombine(PAGE_KERNEL_IO) : PAGE_KERNEL);
 
 	if (pages != stack_pages)
 		drm_free_large(pages);
@@ -2290,11 +2294,16 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
 }
 
 /* get, pin, and map the pages of the object into kernel space */
-void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
+void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
+			      enum i915_map_type type)
 {
+	enum i915_map_type has_type;
+	bool pinned;
+	void *ptr;
 	int ret;
 
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
+	GEM_BUG_ON((obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE) == 0);
 
 	ret = i915_gem_object_get_pages(obj);
 	if (ret)
@@ -2302,15 +2311,38 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
 
 	i915_gem_object_pin_pages(obj);
 
-	if (!obj->mapping) {
-		obj->mapping = i915_gem_object_map(obj);
-		if (!obj->mapping) {
-			i915_gem_object_unpin_pages(obj);
-			return ERR_PTR(-ENOMEM);
+	pinned = obj->pages_pin_count > 1;
+	ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
+	has_type = (uintptr_t)obj->mapping & ~PAGE_MASK;
+
+	if (ptr && has_type != type) {
+		if (pinned) {
+			ret = -EBUSY;
+			goto err;
+		}
+
+		if (is_vmalloc_addr(ptr))
+			vunmap(ptr);
+		else
+			kunmap(kmap_to_page(ptr));
+		ptr = obj->mapping = NULL;
+	}
+
+	if (!ptr) {
+		ptr = i915_gem_object_map(obj, type);
+		if (!ptr) {
+			ret = -ENOMEM;
+			goto err;
 		}
+
+		obj->mapping = (void *)((uintptr_t)ptr | type);
 	}
 
-	return obj->mapping;
+	return ptr;
+
+err:
+	i915_gem_object_unpin_pages(obj);
+	return ERR_PTR(ret);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index c60a8d5b..10265bb 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -119,7 +119,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 	if (ret)
 		return ERR_PTR(ret);
 
-	addr = i915_gem_object_pin_map(obj);
+	addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
 	mutex_unlock(&dev->struct_mutex);
 
 	return addr;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 041cf68..1d58d36 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1178,7 +1178,7 @@ static int guc_create_log_extras(struct intel_guc *guc)
 
 	if (!guc->log.buf_addr) {
 		/* Create a vmalloc mapping of log buffer pages */
-		vaddr = i915_gem_object_pin_map(guc->log.obj);
+		vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB);
 		if (IS_ERR(vaddr)) {
 			ret = PTR_ERR(vaddr);
 			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c7f4b64..c24ac39 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -780,7 +780,7 @@ static int intel_lr_context_pin(struct i915_gem_context *ctx,
 	if (ret)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(ce->state);
+	vaddr = i915_gem_object_pin_map(ce->state, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		goto unpin_ctx_obj;
@@ -1755,7 +1755,7 @@ lrc_setup_hws(struct intel_engine_cs *engine,
 	/* The HWSP is part of the default context object in LRC mode. */
 	engine->status_page.gfx_addr = i915_gem_obj_ggtt_offset(dctx_obj) +
 				       LRC_PPHWSP_PN * PAGE_SIZE;
-	hws = i915_gem_object_pin_map(dctx_obj);
+	hws = i915_gem_object_pin_map(dctx_obj, I915_MAP_WB);
 	if (IS_ERR(hws))
 		return PTR_ERR(hws);
 	engine->status_page.page_addr = hws + LRC_PPHWSP_PN * PAGE_SIZE;
@@ -1968,7 +1968,7 @@ populate_lr_context(struct i915_gem_context *ctx,
 		return ret;
 	}
 
-	vaddr = i915_gem_object_pin_map(ctx_obj);
+	vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		DRM_DEBUG_DRIVER("Could not map object pages! (%d)\n", ret);
@@ -2189,7 +2189,7 @@ void intel_lr_context_reset(struct drm_i915_private *dev_priv,
 		if (!ctx_obj)
 			continue;
 
-		vaddr = i915_gem_object_pin_map(ctx_obj);
+		vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
 		if (WARN_ON(IS_ERR(vaddr)))
 			continue;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e8fa26c..69ec5da 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1951,7 +1951,7 @@ int intel_ring_pin(struct intel_ring *ring)
 		if (ret)
 			goto err_unpin;
 
-		addr = i915_gem_object_pin_map(obj);
+		addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
 		if (IS_ERR(addr)) {
 			ret = PTR_ERR(addr);
 			goto err_unpin;
-- 
1.9.2


* [PATCH 17/20] drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (15 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 16/20] drm/i915: Support to create write combined type vmaps akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 16:05   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory akash.goel
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Host needs to sample the GuC log buffer on every flush interrupt from GuC.
To ensure that we always get up-to-date data from the log buffer, it is
better to access the buffer through an uncached CPU mapping. Also, given
the way the buffer is accessed from the GuC and Host sides, manually
flushing the cache may not always be effective if a cached CPU mapping is
used. Uncached reads may carry some performance cost, but the reliability
of the data will be ensured.

v2: Rebase.

v3: Rebase.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1d58d36..1818343 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1002,8 +1002,6 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 			dst_data_ptr += buffer_size;
 		}
 
-		/* FIXME: invalidate/flush for log buffer needed */
-
 		/* Update the read pointer in the shared log buffer */
 		log_buffer_state->read_ptr = write_offset;
 
@@ -1177,8 +1175,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
 		return 0;
 
 	if (!guc->log.buf_addr) {
-		/* Create a vmalloc mapping of log buffer pages */
-		vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB);
+		/* Create a WC (Uncached for read) vmalloc mapping of log
+		 * buffer pages, so that we can directly get the data
+		 * (up-to-date) from memory.
+		 */
+		vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WC);
 		if (IS_ERR(vaddr)) {
 			ret = PTR_ERR(vaddr);
 			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
-- 
1.9.2


* [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (16 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 17/20] drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 10:54   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer akash.goel
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Mika Kuoppala

From: Chris Wilson <chris@chris-wilson.co.uk>

This patch provides the infrastructure for performing a 16-byte aligned
read from WC memory using the non-temporal instructions introduced with
SSE4.1. Using movntdqa we can bypass the CPU caches and read directly from
memory, ignoring the page attributes set on the CPU PTE, i.e. negating the
impact of an otherwise UC access. Copying using movntdqa from WC is almost
as fast as reading from WB memory, modulo the possibility of both hitting
the CPU cache or leaving the data in the CPU cache for the next consumer.
(The CPU cache itself may be flushed for the region of the movntdqa, and on
later access the movntdqa reads from a separate internal buffer for the
cacheline.) The write back to the memory is, however, cached.
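
The 16-byte alignment contract can be illustrated with a portable
user-space sketch. It only models the precondition check and the
fallback pattern a caller might use; plain memcpy stands in for the
movntdqa loop, and the sketch_* names are hypothetical. Unlike the
function in this patch, the sketch never reports the accelerated path
as unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes in 16-byte chunks, but only if dst, src and len are all
 * multiples of 16. Returns false (and copies nothing) otherwise.
 */
static bool sketch_copy_aligned16(void *dst, const void *src,
				  unsigned long len)
{
	const unsigned char *s = src;
	unsigned char *d = dst;

	/* OR the three values together so one test covers all of them. */
	if (((uintptr_t)dst | (uintptr_t)src | len) & 15)
		return false;

	/* Stand-in for the non-temporal load / aligned store loop. */
	for (; len; len -= 16, s += 16, d += 16)
		memcpy(d, s, 16);

	return true;
}

/* Assumed caller pattern: use the fast path, else fall back to memcpy. */
static void sketch_copy(void *dst, const void *src, unsigned long len)
{
	assert(len == 0 || (dst && src));

	if (!sketch_copy_aligned16(dst, src, len))
		memcpy(dst, src, len);
}
```

A zero-length call with NULL pointers passes the alignment test and
copies nothing, which is what makes a capability probe of that form
possible.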

This will be used in later patches to accelerate accessing WC memory.

v2: Report whether the accelerated copy is successful/possible.
v3: Function alignment override was only necessary when using the
function target("sse4.1") - which is not necessary for emitting movntdqa
from __asm__.
v4: Improve notes on CPU cache behaviour vs non-temporal stores.
v5: Fix byte offsets for unrolled moves.
v6: Find all remaining typos of movntqda, use kernel_fpu_begin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Makefile      |   3 ++
 drivers/gpu/drm/i915/i915_drv.c    |   2 +
 drivers/gpu/drm/i915/i915_drv.h    |   3 ++
 drivers/gpu/drm/i915/i915_memcpy.c | 101 +++++++++++++++++++++++++++++++++++++
 4 files changed, 109 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_memcpy.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index dda724f..3412413 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -3,12 +3,15 @@
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
 subdir-ccflags-$(CONFIG_DRM_I915_WERROR) := -Werror
+subdir-ccflags-y += \
+	$(call as-instr,movntdqa (%eax)$(comma)%xmm0,-DCONFIG_AS_MOVNTDQA)
 
 # Please keep these build lists sorted!
 
 # core driver code
 i915-y := i915_drv.o \
 	  i915_irq.o \
+	  i915_memcpy.o \
 	  i915_params.o \
 	  i915_pci.o \
           i915_suspend.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index cb8c943..4bbf0af 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -841,6 +841,8 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
 	mutex_init(&dev_priv->wm.wm_mutex);
 	mutex_init(&dev_priv->pps_mutex);
 
+	i915_memcpy_init_early(dev_priv);
+
 	ret = i915_workqueues_init(dev_priv);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6603812..fca09ea 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3909,4 +3909,7 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
 	return false;
 }
 
+void i915_memcpy_init_early(struct drm_i915_private *dev_priv);
+bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
new file mode 100644
index 0000000..50fc579
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_memcpy.c
@@ -0,0 +1,101 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <asm/fpu/api.h>
+
+#include "i915_drv.h"
+
+DEFINE_STATIC_KEY_FALSE(has_movntdqa);
+
+#ifdef CONFIG_AS_MOVNTDQA
+static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
+{
+	kernel_fpu_begin();
+
+	len >>= 4;
+	while (len >= 4) {
+		asm("movntdqa   (%0), %%xmm0\n"
+		    "movntdqa 16(%0), %%xmm1\n"
+		    "movntdqa 32(%0), %%xmm2\n"
+		    "movntdqa 48(%0), %%xmm3\n"
+		    "movaps %%xmm0,   (%1)\n"
+		    "movaps %%xmm1, 16(%1)\n"
+		    "movaps %%xmm2, 32(%1)\n"
+		    "movaps %%xmm3, 48(%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 64;
+		dst += 64;
+		len -= 4;
+	}
+	while (len--) {
+		asm("movntdqa (%0), %%xmm0\n"
+		    "movaps %%xmm0, (%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 16;
+		dst += 16;
+	}
+
+	kernel_fpu_end();
+}
+#endif
+
+/**
+ * i915_memcpy_from_wc: perform an accelerated *aligned* read from WC
+ * @dst: destination pointer
+ * @src: source pointer
+ * @len: how many bytes to copy
+ *
+ * i915_memcpy_from_wc copies @len bytes from @src to @dst using
+ * non-temporal instructions where available. Note that all arguments
+ * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
+ * of 16.
+ *
+ * To test whether accelerated reads from WC are supported, use
+ * i915_memcpy_from_wc(NULL, NULL, 0);
+ *
+ * Returns true if the copy was successful, false if the preconditions
+ * are not met.
+ */
+bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len)
+{
+	if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
+		return false;
+
+#ifdef CONFIG_AS_MOVNTDQA
+	if (static_branch_likely(&has_movntdqa)) {
+		if (len)
+			__memcpy_ntdqa(dst, src, len);
+		return true;
+	}
+#endif
+
+	return false;
+}
+
+void i915_memcpy_init_early(struct drm_i915_private *dev_priv)
+{
+	if (static_cpu_has(X86_FEATURE_XMM4_1))
+		static_branch_enable(&has_movntdqa);
+}
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread
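
A minimal userspace sketch of the contract documented for
i915_memcpy_from_wc() in the patch above, for readers who want to try the
precondition check outside the kernel. It is not the driver code: the names
wc_copy() and wc_copy_init() are hypothetical, a plain bool stands in for
the has_movntdqa static key, and memcpy() stands in for the movntdqa loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the has_movntdqa static key; set once at init time. */
static bool wc_copy_supported;

/* Hypothetical init hook: the patch keys this off X86_FEATURE_XMM4_1. */
static void wc_copy_init(bool cpu_has_sse4_1)
{
	wc_copy_supported = cpu_has_sse4_1;
}

/*
 * Mirrors the documented contract: dst, src and len must all be
 * multiples of 16, and a call with (NULL, NULL, 0) acts as a probe
 * for whether the accelerated path is available.
 */
static bool wc_copy(void *dst, const void *src, unsigned long len)
{
	/* OR-ing the three values lets one mask test all alignments. */
	if (((uintptr_t)dst | (uintptr_t)src | len) & 15)
		return false;

	if (!wc_copy_supported)
		return false;

	if (len) {
		assert(dst && src);
		memcpy(dst, src, len);	/* assumption: replaces movntdqa */
	}
	return true;
}
```

The probe form works because NULL and a zero length pass the alignment
test, so the return value then reflects only whether the feature is enabled.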

* [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (17 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 16:06   ` Tvrtko Ursulin
  2016-08-12  6:25 ` [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs akash.goel
  2016-08-12  6:58 ` ✗ Ro.CI.BAT: warning for Support for sustained capturing of GuC firmware logs (rev6) Patchwork
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

In order to have fast reads from the GuC log buffer, use the SSE4.1 movntdqa
based memcpy function i915_memcpy_from_wc.
The GuC log buffer has a WC type vmalloc mapping, and copying from WC type
memory using movntdqa is almost as fast as reading from WB memory.
This further reduces the log buffer sampling time, which is badly needed
to deal with the flush interrupt storm when GuC is generating logs at a very
high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submissions, but if not then logging will not be enabled.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1818343..af48f62 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -987,15 +987,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 			/* Just copy the newly written data */
 			if (read_offset <= write_offset) {
 				bytes_to_copy = write_offset - read_offset;
-				memcpy(dst_data_ptr + read_offset,
+				i915_memcpy_from_wc(dst_data_ptr + read_offset,
 				     src_data_ptr + read_offset, bytes_to_copy);
 			} else {
 				bytes_to_copy = buffer_size - read_offset;
-				memcpy(dst_data_ptr + read_offset,
+				i915_memcpy_from_wc(dst_data_ptr + read_offset,
 				     src_data_ptr + read_offset, bytes_to_copy);
 
 				bytes_to_copy = write_offset;
-				memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
+				i915_memcpy_from_wc(dst_data_ptr, src_data_ptr,
+				     bytes_to_copy);
 			}
 
 			src_data_ptr += buffer_size;
@@ -1210,6 +1211,16 @@ static void guc_create_log(struct intel_guc *guc)
 
 	obj = guc->log.obj;
 	if (!obj) {
+		/* We require SSE 4.1 for fast reads from the GuC log buffer and
+		 * it should be present on the chipsets supporting GuC based
+		 * submissions.
+		 */
+		if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+			/* logging will not be enabled */
+			i915.guc_log_level = -1;
+			return;
+		}
+
 		obj = gem_allocate_guc_obj(dev_priv, size);
 		if (!obj) {
 			/* logging will be off */
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread
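
The diff above calls i915_memcpy_from_wc() without checking its return
value, and the excerpt does not say whether the read and write offsets are
always 16-byte aligned. The userspace sketch below shows the same
wrap-around copy with a fallback to memcpy() when the accelerated routine
refuses the request. The fallback is an assumption made for illustration,
and copy_from_wc(), copy_chunk() and copy_new_data() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for i915_memcpy_from_wc(): refuses unaligned requests. */
static bool copy_from_wc(void *dst, const void *src, size_t len)
{
	if (((uintptr_t)dst | (uintptr_t)src | len) & 15)
		return false;
	memcpy(dst, src, len);	/* the real routine uses movntdqa here */
	return true;
}

/* Try the accelerated path, fall back to memcpy() if it refuses. */
static void copy_chunk(void *dst, const void *src, size_t len)
{
	if (!copy_from_wc(dst, src, len))
		memcpy(dst, src, len);
}

/*
 * Copy only the bytes written since the last read from a circular
 * buffer of buffer_size bytes, keeping each byte at the same offset
 * in dst as in src. Returns the number of bytes copied.
 */
static size_t copy_new_data(uint8_t *dst, const uint8_t *src,
			    size_t buffer_size,
			    size_t read_offset, size_t write_offset)
{
	size_t copied;

	assert(read_offset <= buffer_size && write_offset <= buffer_size);

	if (read_offset <= write_offset) {
		/* No wrap: one contiguous span. */
		copied = write_offset - read_offset;
		copy_chunk(dst + read_offset, src + read_offset, copied);
	} else {
		/* Wrapped: tail of the buffer first, then the head. */
		copied = buffer_size - read_offset;
		copy_chunk(dst + read_offset, src + read_offset, copied);
		copy_chunk(dst, src, write_offset);
		copied += write_offset;
	}
	return copied;
}
```

Without some fallback or alignment guarantee, a refused request would leave
the destination span untouched while the caller still advances its offsets.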

* [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (18 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer akash.goel
@ 2016-08-12  6:25 ` akash.goel
  2016-08-12 16:22   ` Tvrtko Ursulin
  2016-08-12  6:58 ` ✗ Ro.CI.BAT: warning for Support for sustained capturing of GuC firmware logs (rev6) Patchwork
  20 siblings, 1 reply; 68+ messages in thread
From: akash.goel @ 2016-08-12  6:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

As per the current i915 Driver load sequence, debugfs registration is done
at the end, so the relay channel debugfs file is also created only then,
but the GuC firmware is loaded much earlier in the sequence.
As a result the Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging, where initially only the relay
channel is created, to have buffers for storing logs, and later on the
channel is associated with a debugfs file at an appropriate time.
This patch uses that provision, which allows the Driver to capture boot time
logs as well; these can be collected once Userspace comes up.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 61 +++++++++++++++++++++---------
 1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc)
 	relay_close(guc->log.relay_chan);
 }
 
-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 	struct rchan *guc_log_relay_chan;
-	struct dentry *log_dir;
 	size_t n_subbufs, subbuf_size;
 
-	/* For now create the log file in /sys/kernel/debug/dri/0 dir */
-	log_dir = dev_priv->drm.primary->debugfs_root;
-
-	/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
-	 * not mounted and so can't create the relay file.
-	 * The relay API seems to fit well with debugfs only.
-	 */
-	if (!log_dir) {
-		DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n");
-		return -ENODEV;
-	}
-
 	/* Keep the size of sub buffers same as shared log buffer */
 	subbuf_size = guc->log.obj->base.size;
 	/* Store up to 8 snaphosts, which is large enough to buffer sufficient
@@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
          */
 	n_subbufs = 8;
 
-	guc_log_relay_chan = relay_open("guc_log", log_dir,
+	guc_log_relay_chan = relay_open(NULL, NULL,
 			subbuf_size, n_subbufs, &relay_callbacks, dev_priv);
 
 	if (!guc_log_relay_chan) {
@@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
 	return 0;
 }
 
+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct dentry *log_dir;
+	int ret;
+
+	/* For now create the log file in /sys/kernel/debug/dri/0 dir */
+	log_dir = dev_priv->drm.primary->debugfs_root;
+
+	/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
+	 * not mounted and so can't create the relay file.
+	 * The relay API seems to fit well with debugfs only.
+	 */
+	if (!log_dir) {
+		DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n");
+		return -ENODEV;
+	}
+
+	ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Couldn't associate the channel with file %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
 static void guc_log_cleanup(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 	void *vaddr;
-	int ret;
+	int ret = 0;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
@@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct intel_guc *guc)
 		guc->log.buf_addr = vaddr;
 	}
 
-	return 0;
+	if (!guc->log.relay_chan) {
+		/* Create a relay channel, so that we have buffers for storing
+		 * the GuC firmware logs, the channel will be linked with a file
+		 * later on when debugfs is registered.
+		 */
+		ret = guc_create_relay_channel(guc);
+	}
+
+	return ret;
 }
 
 static void guc_create_log(struct intel_guc *guc)
@@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
 		guc->log.obj = obj;
 
 		if (guc_create_log_extras(guc)) {
+			guc_log_cleanup(guc);
 			gem_release_guc_obj(guc->log.obj);
 			guc->log.obj = NULL;
 			i915.guc_log_level = -1;
@@ -1265,6 +1288,10 @@ static int guc_log_late_setup(struct intel_guc *guc)
 	if (ret)
 		goto err;
 
+	/* Parent debugfs dir should be available by now, associate the already
+	 * opened relay channel with a debugfs file, which will then allow User
+	 * to pull the logs.
+	 */
 	ret = guc_create_log_relay_file(guc);
 	if (ret)
 		goto err;
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 68+ messages in thread
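
The two-phase setup described in the patch above (open the channel with
relay_open(NULL, NULL, ...) first, attach the file later with
relay_late_setup_files()) can be pictured with a small userspace model.
This is only a sketch of the idea, not the relay implementation: struct
channel and the channel_*() helpers are hypothetical, and the sizes are
arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SUBBUF_SIZE 16
#define N_SUBBUFS   8

/* Hypothetical stand-in for a relay channel. */
struct channel {
	char data[N_SUBBUFS][SUBBUF_SIZE];
	size_t produced;	/* number of writes so far */
	const char *file_name;	/* NULL until a file is attached */
};

/* Phase 1: buffers only, comparable to relay_open(NULL, NULL, ...). */
static void channel_open(struct channel *ch)
{
	memset(ch, 0, sizeof(*ch));
}

/*
 * The producer can log as soon as the channel exists. Slots are reused
 * round-robin, so old entries get overwritten after N_SUBBUFS writes.
 */
static void channel_write(struct channel *ch, const char *msg)
{
	char *slot;

	assert(ch && msg);
	slot = ch->data[ch->produced % N_SUBBUFS];
	strncpy(slot, msg, SUBBUF_SIZE - 1);
	slot[SUBBUF_SIZE - 1] = '\0';
	ch->produced++;
}

/* Phase 2: publish the channel, comparable to relay_late_setup_files(). */
static int channel_late_setup_file(struct channel *ch, const char *name)
{
	if (!name || ch->file_name)
		return -1;
	ch->file_name = name;
	return 0;
}

/* Reads slot idx; only possible once a file has been attached. */
static const char *channel_read(const struct channel *ch, size_t idx)
{
	if (!ch->file_name || idx >= ch->produced || idx >= N_SUBBUFS)
		return NULL;
	return ch->data[idx];
}
```

Data written before the file exists stays in the channel buffers, which is
what lets boot-time logs be collected once the debugfs file appears.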

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12  6:25 ` [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC akash.goel
@ 2016-08-12  6:28   ` Chris Wilson
  2016-08-12  6:44     ` Goel, Akash
  2016-08-12 13:17   ` Tvrtko Ursulin
  1 sibling, 1 reply; 68+ messages in thread
From: Chris Wilson @ 2016-08-12  6:28 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.goel@intel.com wrote:
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 0fcd1c0..fc2da32 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
>  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>  {
> +	if (HAS_GUC_SCHED(dev_priv))
> +		destroy_workqueue(dev_priv->guc.log.wq);

if (dev_priv->guc.log.wq)
	destroy_workqueue(dev_priv->guc.log.wq);

This shouldn't be here, but in guc teardown.

Likewise this is

> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>  	if (dev_priv->hotplug.dp_wq == NULL)
>  		goto out_free_wq;
>  
> +	if (HAS_GUC_SCHED(dev_priv)) {
> +		/* Need a dedicated wq to process log buffer flush interrupts
> +		 * from GuC without much delay so as to avoid any loss of logs.
> +		 */
> +		dev_priv->guc.log.wq =

creating guc specific wq, not drm_i915_private's. They can even be
managed by guc.log?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset
  2016-08-12  6:25 ` [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset akash.goel
@ 2016-08-12  6:33   ` Chris Wilson
  2016-08-12  7:02     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Chris Wilson @ 2016-08-12  6:33 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Fri, Aug 12, 2016 at 11:55:17AM +0530, akash.goel@intel.com wrote:
> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> 
> Before capturing the GuC logs as a part of error state, there should be a
> force log buffer flush action sent to GuC before proceeding with GPU reset
> and re-initializing GUC. There could be some data in the log buffer which is
> yet to be captured and those logs would be particularly useful to understand
> that why the GPU reset was initiated.
> 
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  2 ++
>  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>  3 files changed, 30 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 561b523..5e358e2 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1232,6 +1232,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
>  	if (!dev_priv->guc.log.obj)
>  		return;
>  
> +	i915_guc_flush_logs(dev_priv);

This is an invalid context for this function, flush_work() is illegal
inside error capture.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12  6:28   ` Chris Wilson
@ 2016-08-12  6:44     ` Goel, Akash
  2016-08-12  6:51       ` Chris Wilson
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12  6:44 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: akash.goel



On 8/12/2016 11:58 AM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.goel@intel.com wrote:
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 0fcd1c0..fc2da32 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>  {
>> +	if (HAS_GUC_SCHED(dev_priv))
>> +		destroy_workqueue(dev_priv->guc.log.wq);
>
> if (dev_priv->guc.log.wq)
> 	destroy_workqueue(dev_priv->guc.log.wq);
>
> This shouldn't be here, but in guc teardown.
>
> Likewise this is
>
Fine will move it to GuC teardown.

>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>>  	if (dev_priv->hotplug.dp_wq == NULL)
>>  		goto out_free_wq;
>>
>> +	if (HAS_GUC_SCHED(dev_priv)) {
>> +		/* Need a dedicated wq to process log buffer flush interrupts
>> +		 * from GuC without much delay so as to avoid any loss of logs.
>> +		 */
>> +		dev_priv->guc.log.wq =
>
> creating guc specific wq, not drm_i915_private's. They can even be
> managed by guc.log?
Sorry for the inconsistency here, but didn't get your question.
	dev_priv->guc.log.wq

	dev_priv->guc.events_work

Best regards
Akash

> -Chris
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12  6:44     ` Goel, Akash
@ 2016-08-12  6:51       ` Chris Wilson
  2016-08-12  7:00         ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Chris Wilson @ 2016-08-12  6:51 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Fri, Aug 12, 2016 at 12:14:28PM +0530, Goel, Akash wrote:
> 
> 
> On 8/12/2016 11:58 AM, Chris Wilson wrote:
> >On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.goel@intel.com wrote:
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> >>index 0fcd1c0..fc2da32 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.c
> >>+++ b/drivers/gpu/drm/i915/i915_drv.c
> >> static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
> >> {
> >>+	if (HAS_GUC_SCHED(dev_priv))
> >>+		destroy_workqueue(dev_priv->guc.log.wq);
> >
> >if (dev_priv->guc.log.wq)
> >	destroy_workqueue(dev_priv->guc.log.wq);
> >
> >This shouldn't be here, but in guc teardown.
> >
> >Likewise this is
> >
> Fine will move it to GuC teardown.
> 
> >>@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
> >> 	if (dev_priv->hotplug.dp_wq == NULL)
> >> 		goto out_free_wq;
> >>
> >>+	if (HAS_GUC_SCHED(dev_priv)) {
> >>+		/* Need a dedicated wq to process log buffer flush interrupts
> >>+		 * from GuC without much delay so as to avoid any loss of logs.
> >>+		 */
> >>+		dev_priv->guc.log.wq =
> >
> >creating guc specific wq, not drm_i915_private's. They can even be
> >managed by guc.log?
> Sorry for the inconsistency here, but didn't get your question.
> 	dev_priv->guc.log.wq

Just somewhere inside guc, I was just noting that you probably already
have setup/teardown for dev_priv->guc.log itself.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* ✗ Ro.CI.BAT: warning for Support for sustained capturing of GuC firmware logs (rev6)
  2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
                   ` (19 preceding siblings ...)
  2016-08-12  6:25 ` [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs akash.goel
@ 2016-08-12  6:58 ` Patchwork
  20 siblings, 0 replies; 68+ messages in thread
From: Patchwork @ 2016-08-12  6:58 UTC (permalink / raw)
  To: Akash Goel; +Cc: intel-gfx

== Series Details ==

Series: Support for sustained capturing of GuC firmware logs (rev6)
URL   : https://patchwork.freedesktop.org/series/7910/
State : warning

== Summary ==

Series 7910v6 Support for sustained capturing of GuC firmware logs
http://patchwork.freedesktop.org/api/1.0/series/7910/revisions/6/mbox

Test gem_exec_suspend:
        Subgroup basic-s3:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
Test kms_cursor_legacy:
        Subgroup basic-flip-vs-cursor-legacy:
                fail       -> PASS       (ro-skl3-i5-6260u)
                fail       -> PASS       (ro-bdw-i5-5250u)
        Subgroup basic-flip-vs-cursor-varying-size:
                fail       -> PASS       (ro-skl3-i5-6260u)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                fail       -> PASS       (ro-ivb2-i7-3770)
        Subgroup suspend-read-crc-pipe-c:
                skip       -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr      total:244  pass:185  dwarn:29  dfail:0   fail:3   skip:27 
fi-skl-i5-6260u  total:244  pass:224  dwarn:4   dfail:0   fail:2   skip:14 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:3   dfail:0   fail:1   skip:17 
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32 
ro-bsw-n3050     total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820     total:240  pass:197  dwarn:0   dfail:0   fail:3   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:1   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:224  dwarn:0   dfail:0   fail:2   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1847/

4a26251 drm-intel-nightly: 2016y-08m-11d-16h-12m-42s UTC integration manifest
016c8ce drm/i915: Early creation of relay channel for capturing boot time logs
aabeff5 drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
d674908 drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
cceb875 drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer
9b502e1 drm/i915: Support to create write combined type vmaps
29467fc drm/i915: Debugfs support for GuC logging control
b693a7e drm/i915: Forcefully flush GuC log buffer on reset
413b75d drm/i915: Augment i915 error state to include the dump of GuC log buffer
696fec8 drm/i915: Increase GuC log buffer size to reduce flush interrupts
f442750 drm/i915: Optimization to reduce the sampling time of GuC log buffer
e66bde2 drm/i915: Add stats for GuC log buffer flush interrupts
873f3ed drm/i915: New lock to serialize the Host2GuC actions
04634bf drm/i915: Add a relay backed debugfs interface for capturing GuC logs
6a914c9 relay: Use per CPU constructs for the relay channel buffer pointers
ff7d465 drm/i915: Handle log buffer flush interrupt event from GuC
156ff5a drm/i915: Support for GuC interrupts
2f85bdb drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
971bf84 drm/i915: New structure to contain GuC logging related fields
63363c4 drm/i915: Add GuC ukernel logging related fields to fw interface file
a1008fe drm/i915: Decouple GuC log setup from verbosity parameter

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12  6:51       ` Chris Wilson
@ 2016-08-12  7:00         ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12  7:00 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: akash.goel



On 8/12/2016 12:21 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 12:14:28PM +0530, Goel, Akash wrote:
>>
>>
>> On 8/12/2016 11:58 AM, Chris Wilson wrote:
>>> On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.goel@intel.com wrote:
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>>> index 0fcd1c0..fc2da32 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>> static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>>> {
>>>> +	if (HAS_GUC_SCHED(dev_priv))
>>>> +		destroy_workqueue(dev_priv->guc.log.wq);
>>>
>>> if (dev_priv->guc.log.wq)
>>> 	destroy_workqueue(dev_priv->guc.log.wq);
>>>
>>> This shouldn't be here, but in guc teardown.
>>>
>>> Likewise this is
>>>
>> Fine will move it to GuC teardown.
>>
>>>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>>>> 	if (dev_priv->hotplug.dp_wq == NULL)
>>>> 		goto out_free_wq;
>>>>
>>>> +	if (HAS_GUC_SCHED(dev_priv)) {
>>>> +		/* Need a dedicated wq to process log buffer flush interrupts
>>>> +		 * from GuC without much delay so as to avoid any loss of logs.
>>>> +		 */
>>>> +		dev_priv->guc.log.wq =
>>>
>>> creating guc specific wq, not drm_i915_private's. They can even be
>>> managed by guc.log?
>> Sorry for the inconsistency here, but didn't get your question.
>> 	dev_priv->guc.log.wq
>
> Just somewhere inside guc, I was just noting that you probably already
> have setup/teardown for dev_priv->guc.log itself.

Fine, will move the dedicated wq creation/destruction in the
setup/teardown routines for guc.log.

Best Regards
Akash

> -Chris
>

^ permalink raw reply	[flat|nested] 68+ messages in thread
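
The outcome of this sub-thread (keep the dedicated workqueue with the
guc.log setup/teardown, and test the pointer rather than a feature macro
before destroying it) can be sketched in userspace as follows. The
struct and function names here are hypothetical stand-ins, not the i915
or workqueue API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel workqueue and guc.log state. */
struct workqueue { int unused; };

struct guc_log {
	struct workqueue *wq;	/* NULL until setup succeeds */
};

static int live_workqueues;	/* bookkeeping for the demo only */

static struct workqueue *workqueue_create(void)
{
	struct workqueue *wq = calloc(1, sizeof(*wq));

	if (wq)
		live_workqueues++;
	return wq;
}

static void workqueue_destroy(struct workqueue *wq)
{
	assert(wq);
	free(wq);
	live_workqueues--;
}

/* Setup lives with the log state, not with driver-wide init. */
static int guc_log_setup(struct guc_log *log_state)
{
	log_state->wq = workqueue_create();
	return log_state->wq ? 0 : -1;
}

/* Teardown tests the pointer, so it is safe even if setup never ran. */
static void guc_log_teardown(struct guc_log *log_state)
{
	if (log_state->wq) {
		workqueue_destroy(log_state->wq);
		log_state->wq = NULL;
	}
}
```

Guarding on the pointer keeps teardown correct when setup was skipped or
failed, which a check on a capability macro alone would not cover.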

* Re: [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset
  2016-08-12  6:33   ` Chris Wilson
@ 2016-08-12  7:02     ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12  7:02 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: akash.goel



On 8/12/2016 12:03 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 11:55:17AM +0530, akash.goel@intel.com wrote:
>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>
>> Before capturing the GuC logs as a part of error state, there should be a
>> force log buffer flush action sent to GuC before proceeding with GPU reset
>> and re-initializing GUC. There could be some data in the log buffer which is
>> yet to be captured and those logs would be particularly useful to understand
>> that why the GPU reset was initiated.
>>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_gpu_error.c      |  2 ++
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++++++++++++
>>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>>  3 files changed, 30 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 561b523..5e358e2 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1232,6 +1232,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
>>  	if (!dev_priv->guc.log.obj)
>>  		return;
>>
>> +	i915_guc_flush_logs(dev_priv);
>
> This is an invalid context for this function, flush_work() is illegal
> inside error capture.

Actually the work item in question should not take long to execute, and
it does not acquire any locks on which it could block.

Should there be no waiting whatsoever in error capture?
Will have to drop this patch.

Best regards
Akash
> -Chris
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 16/20] drm/i915: Support to create write combined type vmaps
  2016-08-12  6:25 ` [PATCH 16/20] drm/i915: Support to create write combined type vmaps akash.goel
@ 2016-08-12 10:49   ` Tvrtko Ursulin
  2016-08-12 15:13     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 10:49 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> vmaps has a provision for controlling the page protection bits, which we
> can use to control the mapping type, e.g. WB, WC, UC or even WT.
> To allow the caller to choose their mapping type, we add a parameter to
> i915_gem_object_pin_map - but we still only allow one vmap to be cached
> per object. If the object is currently not pinned, then we recreate the
> previous vmap with the new access type, but if it was pinned we report an
> error. This effectively limits the access via i915_gem_object_pin_map to a
> single mapping type for the lifetime of the object. Not usually a problem,
> but something to be aware of when setting up the object's vmap.
>
> We will want to vary the access type to enable WC mappings of ringbuffer
> and context objects on !llc platforms, as well as other objects where we
> need coherent access to the GPU's pages without going through the GTT
>
> v2: Remove the redundant braces around pin count check and fix the marker
>      in documentation (Chris)
>
> v3:
> - Add a new enum for the vmalloc mapping type & pass that as an argument to
>    i915_object_pin_map. (Tvrtko)
> - Use PAGE_MASK to extract or filter the mapping type info and remove a
>    superfluous BUG_ON.(Tvrtko)
>
> v4:
> - Rename the enums and clean up the pin_map function. (Chris)
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h            |  9 ++++-
>   drivers/gpu/drm/i915/i915_gem.c            | 58 +++++++++++++++++++++++-------
>   drivers/gpu/drm/i915/i915_gem_dmabuf.c     |  2 +-
>   drivers/gpu/drm/i915/i915_guc_submission.c |  2 +-
>   drivers/gpu/drm/i915/intel_lrc.c           |  8 ++---
>   drivers/gpu/drm/i915/intel_ringbuffer.c    |  2 +-
>   6 files changed, 60 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4bd3790..6603812 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -834,6 +834,11 @@ enum i915_cache_level {
>   	I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
>   };
>
> +enum i915_map_type {
> +	I915_MAP_WB = 0,
> +	I915_MAP_WC,
> +};
> +
>   struct i915_ctx_hang_stats {
>   	/* This context had batch pending when hang was declared */
>   	unsigned batch_pending;
> @@ -3150,6 +3155,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>   /**
>    * i915_gem_object_pin_map - return a contiguous mapping of the entire object
>    * @obj - the object to map into kernel address space
> + * @map_type - whether the vmalloc mapping should be using WC or WB pgprot_t
>    *
>    * Calls i915_gem_object_pin_pages() to prevent reaping of the object's
>    * pages and then returns a contiguous mapping of the backing storage into
> @@ -3161,7 +3167,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>    * Returns the pointer through which to access the mapped object, or an
>    * ERR_PTR() on error.
>    */
> -void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj);
> +void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
> +					enum i915_map_type map_type);
>
>   /**
>    * i915_gem_object_unpin_map - releases an earlier mapping
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 03548db..7dabbc3f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>   	list_del(&obj->global_list);
>
>   	if (obj->mapping) {
> -		if (is_vmalloc_addr(obj->mapping))
> -			vunmap(obj->mapping);
> +		void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
> +		if (is_vmalloc_addr(ptr))
> +			vunmap(ptr);
>   		else
> -			kunmap(kmap_to_page(obj->mapping));
> +			kunmap(kmap_to_page(ptr));
>   		obj->mapping = NULL;
>   	}
>
> @@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>   }
>
>   /* The 'mapping' part of i915_gem_object_pin_map() below */
> -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
> +static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
> +				 enum i915_map_type type)
>   {
>   	unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
>   	struct sg_table *sgt = obj->pages;
> @@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
>   	struct page **pages = stack_pages;
>   	unsigned long i = 0;
>   	void *addr;
> +	bool use_wc = (type == I915_MAP_WC);
>
>   	/* A single page can always be kmapped */
> -	if (n_pages == 1)
> +	if (n_pages == 1 && !use_wc)
>   		return kmap(sg_page(sgt->sgl));
>
>   	if (n_pages > ARRAY_SIZE(stack_pages)) {
> @@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
>   	/* Check that we have the expected number of pages */
>   	GEM_BUG_ON(i != n_pages);
>
> -	addr = vmap(pages, n_pages, 0, PAGE_KERNEL);
> +	addr = vmap(pages, n_pages, VM_NO_GUARD,

Unrelated and unmentioned change to VM_NO_GUARD (no guard page). Best to
remove IMHO. You can keep the R-b in that case.

> +		    use_wc ? pgprot_writecombine(PAGE_KERNEL_IO) : PAGE_KERNEL);
>
>   	if (pages != stack_pages)
>   		drm_free_large(pages);
> @@ -2290,11 +2294,16 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
>   }
>
>   /* get, pin, and map the pages of the object into kernel space */
> -void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
> +void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
> +			      enum i915_map_type type)
>   {
> +	enum i915_map_type has_type;
> +	bool pinned;
> +	void *ptr;
>   	int ret;
>
>   	lockdep_assert_held(&obj->base.dev->struct_mutex);
> +	GEM_BUG_ON((obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE) == 0);
>
>   	ret = i915_gem_object_get_pages(obj);
>   	if (ret)
> @@ -2302,15 +2311,38 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
>
>   	i915_gem_object_pin_pages(obj);
>
> -	if (!obj->mapping) {
> -		obj->mapping = i915_gem_object_map(obj);
> -		if (!obj->mapping) {
> -			i915_gem_object_unpin_pages(obj);
> -			return ERR_PTR(-ENOMEM);
> +	pinned = obj->pages_pin_count > 1;
> +	ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
> +	has_type = (uintptr_t)obj->mapping & ~PAGE_MASK;
> +
> +	if (ptr && has_type != type) {
> +		if (pinned) {
> +			ret = -EBUSY;
> +			goto err;
> +		}
> +
> +		if (is_vmalloc_addr(ptr))
> +			vunmap(ptr);
> +		else
> +			kunmap(kmap_to_page(ptr));
> +		ptr = obj->mapping = NULL;
> +	}
> +
> +	if (!ptr) {
> +		ptr = i915_gem_object_map(obj, type);
> +		if (!ptr) {
> +			ret = -ENOMEM;
> +			goto err;
>   		}
> +
> +		obj->mapping = (void *)((uintptr_t)ptr | type);
>   	}
>
> -	return obj->mapping;
> +	return ptr;
> +
> +err:
> +	i915_gem_object_unpin_pages(obj);
> +	return ERR_PTR(ret);
>   }
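
For anyone following the obj->mapping change above: the map type is kept in the low bits of the page-aligned address and masked off again with PAGE_MASK on every use. A minimal userspace sketch of that packing, assuming a 4 KiB page and WB = 0 / WC = 1 (the sketch_* names and the enum values are hypothetical stand-ins, not the driver's definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size; the kernel uses PAGE_SIZE/PAGE_MASK instead. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical values; they only need to fit below the page size. */
enum sketch_map_type { SKETCH_MAP_WB = 0, SKETCH_MAP_WC = 1 };

/* Store the type in the (always zero) low bits of a page-aligned address. */
static void *sketch_pack(void *page_aligned, enum sketch_map_type type)
{
	return (void *)((uintptr_t)page_aligned | (uintptr_t)type);
}

/* Recover the usable address by masking off the low bits. */
static void *sketch_ptr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the type from the low bits. */
static enum sketch_map_type sketch_type(void *packed)
{
	return (enum sketch_map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

The packed value must never be dereferenced or passed to vunmap()/kunmap() directly, which is why both the put_pages and pin_map paths mask it first.
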
>
>   static void
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index c60a8d5b..10265bb 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -119,7 +119,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>   	if (ret)
>   		return ERR_PTR(ret);
>
> -	addr = i915_gem_object_pin_map(obj);
> +	addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>   	mutex_unlock(&dev->struct_mutex);
>
>   	return addr;
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 041cf68..1d58d36 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -1178,7 +1178,7 @@ static int guc_create_log_extras(struct intel_guc *guc)
>
>   	if (!guc->log.buf_addr) {
>   		/* Create a vmalloc mapping of log buffer pages */
> -		vaddr = i915_gem_object_pin_map(guc->log.obj);
> +		vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB);
>   		if (IS_ERR(vaddr)) {
>   			ret = PTR_ERR(vaddr);
>   			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index c7f4b64..c24ac39 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -780,7 +780,7 @@ static int intel_lr_context_pin(struct i915_gem_context *ctx,
>   	if (ret)
>   		goto err;
>
> -	vaddr = i915_gem_object_pin_map(ce->state);
> +	vaddr = i915_gem_object_pin_map(ce->state, I915_MAP_WB);
>   	if (IS_ERR(vaddr)) {
>   		ret = PTR_ERR(vaddr);
>   		goto unpin_ctx_obj;
> @@ -1755,7 +1755,7 @@ lrc_setup_hws(struct intel_engine_cs *engine,
>   	/* The HWSP is part of the default context object in LRC mode. */
>   	engine->status_page.gfx_addr = i915_gem_obj_ggtt_offset(dctx_obj) +
>   				       LRC_PPHWSP_PN * PAGE_SIZE;
> -	hws = i915_gem_object_pin_map(dctx_obj);
> +	hws = i915_gem_object_pin_map(dctx_obj, I915_MAP_WB);
>   	if (IS_ERR(hws))
>   		return PTR_ERR(hws);
>   	engine->status_page.page_addr = hws + LRC_PPHWSP_PN * PAGE_SIZE;
> @@ -1968,7 +1968,7 @@ populate_lr_context(struct i915_gem_context *ctx,
>   		return ret;
>   	}
>
> -	vaddr = i915_gem_object_pin_map(ctx_obj);
> +	vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
>   	if (IS_ERR(vaddr)) {
>   		ret = PTR_ERR(vaddr);
>   		DRM_DEBUG_DRIVER("Could not map object pages! (%d)\n", ret);
> @@ -2189,7 +2189,7 @@ void intel_lr_context_reset(struct drm_i915_private *dev_priv,
>   		if (!ctx_obj)
>   			continue;
>
> -		vaddr = i915_gem_object_pin_map(ctx_obj);
> +		vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
>   		if (WARN_ON(IS_ERR(vaddr)))
>   			continue;
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e8fa26c..69ec5da 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1951,7 +1951,7 @@ int intel_ring_pin(struct intel_ring *ring)
>   		if (ret)
>   			goto err_unpin;
>
> -		addr = i915_gem_object_pin_map(obj);
> +		addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>   		if (IS_ERR(addr)) {
>   			ret = PTR_ERR(addr);
>   			goto err_unpin;
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
  2016-08-12  6:25 ` [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory akash.goel
@ 2016-08-12 10:54   ` Tvrtko Ursulin
  2016-08-12 12:22     ` Chris Wilson
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 10:54 UTC (permalink / raw)
  To: akash.goel, intel-gfx; +Cc: Mika Kuoppala


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> This patch provides the infrastructure for performing a 16-byte aligned
> read from WC memory using non-temporal instructions introduced with SSE4.1.
> Using movntdqa we can bypass the CPU caches and read directly from memory,
> ignoring the page attributes set on the CPU PTE, i.e. negating the
> impact of an otherwise UC access. Copying using movntdqa from WC is almost
> as fast as reading from WB memory, modulo the possibility of both hitting
> the CPU cache or leaving the data in the CPU cache for the next consumer.
> (The CPU cache itself may be flushed for the region of the movntdqa and on
> later access the movntdqa reads from a separate internal buffer for the
> cacheline.) The write back to the memory is however cached.
>
> This will be used in later patches to accelerate accessing WC memory.
>
> v2: Report whether the accelerated copy is successful/possible.
> v3: Function alignment override was only necessary when using the
> function target("sse4.1") - which is not necessary for emitting movntdqa
> from __asm__.
> v4: Improve notes on CPU cache behaviour vs non-temporal stores.
> v5: Fix byte offsets for unrolled moves.
> v6: Find all remaining typos of movntqda, use kernel_fpu_begin.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Akash Goel <akash.goel@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile      |   3 ++
>   drivers/gpu/drm/i915/i915_drv.c    |   2 +
>   drivers/gpu/drm/i915/i915_drv.h    |   3 ++
>   drivers/gpu/drm/i915/i915_memcpy.c | 101 +++++++++++++++++++++++++++++++++++++
>   4 files changed, 109 insertions(+)
>   create mode 100644 drivers/gpu/drm/i915/i915_memcpy.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index dda724f..3412413 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -3,12 +3,15 @@
>   # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
>
>   subdir-ccflags-$(CONFIG_DRM_I915_WERROR) := -Werror
> +subdir-ccflags-y += \
> +	$(call as-instr,movntdqa (%eax)$(comma)%xmm0,-DCONFIG_AS_MOVNTDQA)
>
>   # Please keep these build lists sorted!
>
>   # core driver code
>   i915-y := i915_drv.o \
>   	  i915_irq.o \
> +	  i915_memcpy.o \
>   	  i915_params.o \
>   	  i915_pci.o \
>             i915_suspend.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index cb8c943..4bbf0af 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -841,6 +841,8 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>   	mutex_init(&dev_priv->wm.wm_mutex);
>   	mutex_init(&dev_priv->pps_mutex);
>
> +	i915_memcpy_init_early(dev_priv);
> +
>   	ret = i915_workqueues_init(dev_priv);
>   	if (ret < 0)
>   		return ret;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6603812..fca09ea 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3909,4 +3909,7 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
>   	return false;
>   }
>
> +void i915_memcpy_init_early(struct drm_i915_private *dev_priv);
> +bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
> +
>   #endif
> diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
> new file mode 100644
> index 0000000..50fc579
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_memcpy.c
> @@ -0,0 +1,101 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <linux/kernel.h>
> +#include <asm/fpu/api.h>
> +
> +#include "i915_drv.h"
> +
> +DEFINE_STATIC_KEY_FALSE(has_movntdqa);
> +
> +#ifdef CONFIG_AS_MOVNTDQA
> +static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
> +{
> +	kernel_fpu_begin();
> +
> +	len >>= 4;
> +	while (len >= 4) {
> +		asm("movntdqa   (%0), %%xmm0\n"
> +		    "movntdqa 16(%0), %%xmm1\n"
> +		    "movntdqa 32(%0), %%xmm2\n"
> +		    "movntdqa 48(%0), %%xmm3\n"
> +		    "movaps %%xmm0,   (%1)\n"
> +		    "movaps %%xmm1, 16(%1)\n"
> +		    "movaps %%xmm2, 32(%1)\n"
> +		    "movaps %%xmm3, 48(%1)\n"
> +		    :: "r" (src), "r" (dst) : "memory");
> +		src += 64;
> +		dst += 64;
> +		len -= 4;
> +	}
> +	while (len--) {
> +		asm("movntdqa (%0), %%xmm0\n"
> +		    "movaps %%xmm0, (%1)\n"
> +		    :: "r" (src), "r" (dst) : "memory");
> +		src += 16;
> +		dst += 16;
> +	}
> +
> +	kernel_fpu_end();
> +}
> +#endif
> +
> +/**
> + * i915_memcpy_from_wc: perform an accelerated *aligned* read from WC
> + * @dst: destination pointer
> + * @src: source pointer
> + * @len: how many bytes to copy
> + *
> + * i915_memcpy_from_wc copies @len bytes from @src to @dst using
> + * non-temporal instructions where available. Note that all arguments
> + * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
> + * of 16.
> + *
> + * To test whether accelerated reads from WC are supported, use
> + * i915_memcpy_from_wc(NULL, NULL, 0);
> + *
> + * Returns true if the copy was successful, false if the preconditions
> + * are not met.
> + */
> +bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len)
> +{
> +	if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
> +		return false;
> +
> +#ifdef CONFIG_AS_MOVNTDQA
> +	if (static_branch_likely(&has_movntdqa)) {
> +		if (len)

Potentially could annotate this with another likely.
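
Roughly what I have in mind, as a userspace sketch only: the sketch_* names are made up, __builtin_expect assumes GCC or Clang, and a plain memcpy stands in for the movntdqa loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Userspace equivalents of the kernel's likely()/unlikely() hints. */
#define sketch_likely(x)   __builtin_expect(!!(x), 1)
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

static bool sketch_copy_aligned16(void *dst, const void *src, unsigned long len)
{
	/* OR the three values together: a low bit set in any one fails the check. */
	if (sketch_unlikely(((uintptr_t)dst | (uintptr_t)src | len) & 15))
		return false;

	/* A zero-length call is only a capability probe, so hint that it is rare. */
	if (sketch_likely(len))
		memcpy(dst, src, len); /* stand-in for the non-temporal copy */

	return true;
}
```

The hint only affects code layout, so it is a minor point either way.
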

> +			__memcpy_ntdqa(dst, src, len);
> +		return true;
> +	}
> +#endif
> +
> +	return false;
> +}
> +
> +void i915_memcpy_init_early(struct drm_i915_private *dev_priv)
> +{
> +	if (static_cpu_has(X86_FEATURE_XMM4_1))
> +		static_branch_enable(&has_movntdqa);
> +}
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 04/20] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
  2016-08-12  6:25 ` [PATCH 04/20] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set akash.goel
@ 2016-08-12 11:15   ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 11:15 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> So far PM IER/IIR/IMR registers were being used only for Turbo related
> interrupts. But interrupts coming from GuC also use the same set.
> As a precursor to supporting GuC interrupts, added new low level routines
> so as to allow sharing the programming of PM IER/IIR/IMR registers between
> Turbo & GuC.
> Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow
> easy sharing of it between Turbo & GuC without involving a rmw operation.
>
> v2:
> - For appropriateness & avoid any ambiguity, rename old functions
>    enable/disable pm_irq to mask/unmask pm_irq and rename new functions
>    enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
> - Use u32 in place of uint32_t. (Tvrtko)
>
> v3:
> - Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
> - Rebase.
>
> v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
>      failure for certain IGTs.
>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |  3 +-
>   drivers/gpu/drm/i915/i915_irq.c         | 75 ++++++++++++++++++++++-----------
>   drivers/gpu/drm/i915/intel_drv.h        |  3 ++
>   drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +-
>   4 files changed, 57 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7971c76..a608a5c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1776,7 +1776,8 @@ struct drm_i915_private {
>   		u32 de_irq_mask[I915_MAX_PIPES];
>   	};
>   	u32 gt_irq_mask;
> -	u32 pm_irq_mask;
> +	u32 pm_imr;
> +	u32 pm_ier;
>   	u32 pm_rps_events;
>   	u32 pipestat_irq_mask[I915_MAX_PIPES];
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index ebb83d5..5f93309 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -303,18 +303,18 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
>
>   	assert_spin_locked(&dev_priv->irq_lock);
>
> -	new_val = dev_priv->pm_irq_mask;
> +	new_val = dev_priv->pm_imr;
>   	new_val &= ~interrupt_mask;
>   	new_val |= (~enabled_irq_mask & interrupt_mask);
>
> -	if (new_val != dev_priv->pm_irq_mask) {
> -		dev_priv->pm_irq_mask = new_val;
> -		I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_irq_mask);
> +	if (new_val != dev_priv->pm_imr) {
> +		dev_priv->pm_imr = new_val;
> +		I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_imr);
>   		POSTING_READ(gen6_pm_imr(dev_priv));
>   	}
>   }
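
For reference, the IMR update above boils down to the following, with pm_ier kept as a software copy so that enable/disable never has to read the register back. This is a standalone sketch with no MMIO; the sketch_* names are hypothetical and the struct merely models the two cached fields.

```c
#include <assert.h>
#include <stdint.h>

/* Models the cached copies of the PM IMR and IER registers. */
struct sketch_pm {
	uint32_t imr; /* 1 = interrupt masked */
	uint32_t ier; /* 1 = interrupt enabled */
};

/* Same arithmetic as snb_update_pm_irq(): only bits in interrupt_mask change. */
static uint32_t sketch_update_imr(uint32_t imr, uint32_t interrupt_mask,
				  uint32_t enabled_irq_mask)
{
	uint32_t new_val = imr;

	new_val &= ~interrupt_mask;                    /* clear the affected bits */
	new_val |= ~enabled_irq_mask & interrupt_mask; /* re-mask the ones not enabled */
	return new_val;
}

/* Enable: update the IER copy, then unmask the same bits in IMR. */
static void sketch_enable(struct sketch_pm *pm, uint32_t mask)
{
	pm->ier |= mask;
	pm->imr = sketch_update_imr(pm->imr, mask, mask);
}

/* Disable: drop the bits from the IER copy and mask them in IMR. */
static void sketch_disable(struct sketch_pm *pm, uint32_t mask)
{
	pm->ier &= ~mask;
	pm->imr = sketch_update_imr(pm->imr, mask, 0);
}
```

Two users (Turbo and GuC in this series) can then enable and disable their own bits independently without disturbing each other's.
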
>
> -void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
> +void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
>   {
>   	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
>   		return;
> @@ -322,28 +322,54 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
>   	snb_update_pm_irq(dev_priv, mask, mask);
>   }
>
> -static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
> -				  uint32_t mask)
> +static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
>   {
>   	snb_update_pm_irq(dev_priv, mask, 0);
>   }
>
> -void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
> +void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
>   {
>   	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
>   		return;
>
> -	__gen6_disable_pm_irq(dev_priv, mask);
> +	__gen6_mask_pm_irq(dev_priv, mask);
>   }
>
> -void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
> +void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 reset_mask)
>   {
>   	i915_reg_t reg = gen6_pm_iir(dev_priv);
>
> -	spin_lock_irq(&dev_priv->irq_lock);
> -	I915_WRITE(reg, dev_priv->pm_rps_events);
> -	I915_WRITE(reg, dev_priv->pm_rps_events);
> +	assert_spin_locked(&dev_priv->irq_lock);
> +
> +	I915_WRITE(reg, reset_mask);
> +	I915_WRITE(reg, reset_mask);
>   	POSTING_READ(reg);
> +}
> +
> +void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
> +{
> +	assert_spin_locked(&dev_priv->irq_lock);
> +
> +	dev_priv->pm_ier |= enable_mask;
> +	I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
> +	gen6_unmask_pm_irq(dev_priv, enable_mask);
> +	/* unmask_pm_irq provides an implicit barrier (POSTING_READ) */
> +}
> +
> +void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
> +{
> +	assert_spin_locked(&dev_priv->irq_lock);
> +
> +	dev_priv->pm_ier &= ~disable_mask;
> +	__gen6_mask_pm_irq(dev_priv, disable_mask);
> +	I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
> +	/* though a barrier is missing here, but don't really need a one */
> +}
> +
> +void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
> +{
> +	spin_lock_irq(&dev_priv->irq_lock);
> +	gen6_reset_pm_iir(dev_priv, dev_priv->pm_rps_events);
>   	dev_priv->rps.pm_iir = 0;
>   	spin_unlock_irq(&dev_priv->irq_lock);
>   }
> @@ -354,8 +380,6 @@ void gen6_enable_rps_interrupts(struct drm_i915_private *dev_priv)
>   	WARN_ON_ONCE(dev_priv->rps.pm_iir);
>   	WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & dev_priv->pm_rps_events);
>   	dev_priv->rps.interrupts_enabled = true;
> -	I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) |
> -				dev_priv->pm_rps_events);
>   	gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
>
>   	spin_unlock_irq(&dev_priv->irq_lock);
> @@ -373,9 +397,7 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
>
>   	I915_WRITE(GEN6_PMINTRMSK, gen6_sanitize_rps_pm_mask(dev_priv, ~0));
>
> -	__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
> -	I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) &
> -				~dev_priv->pm_rps_events);
> +	gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
>
>   	spin_unlock_irq(&dev_priv->irq_lock);
>   	synchronize_irq(dev_priv->drm.irq);
> @@ -1078,7 +1100,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
>   	pm_iir = dev_priv->rps.pm_iir;
>   	dev_priv->rps.pm_iir = 0;
>   	/* Make sure not to corrupt PMIMR state used by ringbuffer on GEN6 */
> -	gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
> +	gen6_unmask_pm_irq(dev_priv, dev_priv->pm_rps_events);
>   	client_boost = dev_priv->rps.client_boost;
>   	dev_priv->rps.client_boost = false;
>   	spin_unlock_irq(&dev_priv->irq_lock);
> @@ -1579,7 +1601,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
>   {
>   	if (pm_iir & dev_priv->pm_rps_events) {
>   		spin_lock(&dev_priv->irq_lock);
> -		gen6_disable_pm_irq(dev_priv, pm_iir & dev_priv->pm_rps_events);
> +		gen6_mask_pm_irq(dev_priv, pm_iir & dev_priv->pm_rps_events);
>   		if (dev_priv->rps.interrupts_enabled) {
>   			dev_priv->rps.pm_iir |= pm_iir & dev_priv->pm_rps_events;
>   			schedule_work(&dev_priv->rps.work);
> @@ -3568,11 +3590,13 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
>   		 * RPS interrupts will get enabled/disabled on demand when RPS
>   		 * itself is enabled/disabled.
>   		 */
> -		if (HAS_VEBOX(dev))
> +		if (HAS_VEBOX(dev)) {

dev_priv should be used for all these macros in new code.

>   			pm_irqs |= PM_VEBOX_USER_INTERRUPT;
> +			dev_priv->pm_ier |= PM_VEBOX_USER_INTERRUPT;
> +		}
>
> -		dev_priv->pm_irq_mask = 0xffffffff;
> -		GEN5_IRQ_INIT(GEN6_PM, dev_priv->pm_irq_mask, pm_irqs);
> +		dev_priv->pm_imr = 0xffffffff;
> +		GEN5_IRQ_INIT(GEN6_PM, dev_priv->pm_imr, pm_irqs);
>   	}
>   }
>
> @@ -3692,14 +3716,15 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
>   	if (HAS_L3_DPF(dev_priv))
>   		gt_interrupts[0] |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
>
> -	dev_priv->pm_irq_mask = 0xffffffff;
> +	dev_priv->pm_ier = 0x0;
> +	dev_priv->pm_imr = ~dev_priv->pm_ier;
>   	GEN8_IRQ_INIT_NDX(GT, 0, ~gt_interrupts[0], gt_interrupts[0]);
>   	GEN8_IRQ_INIT_NDX(GT, 1, ~gt_interrupts[1], gt_interrupts[1]);
>   	/*
>   	 * RPS interrupts will get enabled/disabled on demand when RPS itself
>   	 * is enabled/disabled.
>   	 */
> -	GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_irq_mask, 0);
> +	GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_imr, dev_priv->pm_ier);
>   	GEN8_IRQ_INIT_NDX(GT, 3, ~gt_interrupts[3], gt_interrupts[3]);
>   }
>
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 9539f0e..80cd05f 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1094,6 +1094,9 @@ void intel_check_pch_fifo_underruns(struct drm_i915_private *dev_priv);
>   /* i915_irq.c */
>   void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
>   void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
> +void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 mask);
> +void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
> +void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
>   void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask);
>   void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask);
>   void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index ed19868..e8fa26c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1700,7 +1700,7 @@ hsw_vebox_irq_enable(struct intel_engine_cs *engine)
>   	struct drm_i915_private *dev_priv = engine->i915;
>
>   	I915_WRITE_IMR(engine, ~engine->irq_enable_mask);
> -	gen6_enable_pm_irq(dev_priv, engine->irq_enable_mask);
> +	gen6_unmask_pm_irq(dev_priv, engine->irq_enable_mask);
>   }
>
>   static void
> @@ -1709,7 +1709,7 @@ hsw_vebox_irq_disable(struct intel_engine_cs *engine)
>   	struct drm_i915_private *dev_priv = engine->i915;
>
>   	I915_WRITE_IMR(engine, ~0);
> -	gen6_disable_pm_irq(dev_priv, engine->irq_enable_mask);
> +	gen6_mask_pm_irq(dev_priv, engine->irq_enable_mask);
>   }
>
>   static void
>

Looks like only a single line change since v3, so with the above small 
detail fixed;

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12  6:25 ` [PATCH 05/20] drm/i915: Support for GuC interrupts akash.goel
@ 2016-08-12 11:54   ` Tvrtko Ursulin
  2016-08-12 13:10     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 11:54 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>
> There are certain types of interrupts which Host can receive from GuC.
> GuC ukernel sends an interrupt to Host for certain events, like for
> example retrieve/consume the logs generated by ukernel.
> This patch adds support to receive interrupts from GuC but currently
> enables & partially handles only the interrupt sent by GuC ukernel.
> Future patches will add support for handling other interrupt types.
>
> v2:
> - Use common low level routines for PM IER/IIR programming (Chris)
> - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
> - Replace disabling of wake ref asserts with rpm get/put (Chris)
>
> v3:
> - Update comments for more clarity. (Tvrtko)
> - Remove the masking of GuC interrupt, which was kept masked till the
>    start of bottom half, it's not really needed as there is only a
>    single instance of work item & wq is ordered. (Tvrtko)
>
> v4:
> - Rebase.
> - Rename guc_events to pm_guc_events so as to be indicative of the
>    register/control block it is associated with. (Chris)
> - Add handling for back to back log buffer flush interrupts.
>
> v5:
> - Move the read & clearing of register, containing Guc2Host message
>    bits, outside the irq spinlock. (Tvrtko)
>
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h            |   1 +
>   drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
>   drivers/gpu/drm/i915/i915_irq.c            | 100 +++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/i915_reg.h            |  11 ++++
>   drivers/gpu/drm/i915/intel_drv.h           |   3 +
>   drivers/gpu/drm/i915/intel_guc.h           |   4 ++
>   drivers/gpu/drm/i915/intel_guc_loader.c    |   4 ++
>   7 files changed, 124 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a608a5c..28ffac5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1779,6 +1779,7 @@ struct drm_i915_private {
>   	u32 pm_imr;
>   	u32 pm_ier;
>   	u32 pm_rps_events;
> +	u32 pm_guc_events;
>   	u32 pipestat_irq_mask[I915_MAX_PIPES];
>
>   	struct i915_hotplug hotplug;
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index ad3b55f..c7c679f 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
>   	if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>   		return 0;
>
> +	gen9_disable_guc_interrupts(dev_priv);
> +
>   	ctx = dev_priv->kernel_context;
>
>   	data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
> @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
>   	if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>   		return 0;
>
> +	if (i915.guc_log_level >= 0)
> +		gen9_enable_guc_interrupts(dev_priv);
> +
>   	ctx = dev_priv->kernel_context;
>
>   	data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 5f93309..5f1974f 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv,
>   } while (0)
>
>   static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
>
>   /* For display hotplug interrupt */
>   static inline void
> @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
>   	gen6_reset_rps_interrupts(dev_priv);
>   }
>
> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
> +{
> +	spin_lock_irq(&dev_priv->irq_lock);
> +	gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
> +	spin_unlock_irq(&dev_priv->irq_lock);
> +}
> +
> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
> +{
> +	spin_lock_irq(&dev_priv->irq_lock);
> +	if (!dev_priv->guc.interrupts_enabled) {
> +		WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
> +						dev_priv->pm_guc_events);
> +		dev_priv->guc.interrupts_enabled = true;
> +		gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
> +	}
> +	spin_unlock_irq(&dev_priv->irq_lock);
> +}
> +
> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
> +{
> +	spin_lock_irq(&dev_priv->irq_lock);
> +	dev_priv->guc.interrupts_enabled = false;
> +
> +	gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
> +
> +	spin_unlock_irq(&dev_priv->irq_lock);
> +	synchronize_irq(dev_priv->drm.irq);
> +
> +	gen9_reset_guc_interrupts(dev_priv);
> +}
> +
>   /**
>    * bdw_update_port_irq - update DE port interrupt
>    * @dev_priv: driver private
> @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct *work)
>   	mutex_unlock(&dev_priv->rps.hw_lock);
>   }
>
> +static void gen9_guc2host_events_work(struct work_struct *work)
> +{
> +	struct drm_i915_private *dev_priv =
> +		container_of(work, struct drm_i915_private, guc.events_work);
> +
> +	spin_lock_irq(&dev_priv->irq_lock);
> +	/* Speed up work cancellation during disabling guc interrupts. */
> +	if (!dev_priv->guc.interrupts_enabled) {
> +		spin_unlock_irq(&dev_priv->irq_lock);
> +		return;

I suppose locking for early exit is something about ensuring the worker 
sees the update to dev_priv->guc.interrupts_enabled done on another CPU? 
synchronize_irq there is not enough for some reason?

> +	}
> +	spin_unlock_irq(&dev_priv->irq_lock);
> +
> +	/* TODO: Handle the events for which GuC interrupted host */
> +}
>
>   /**
>    * ivybridge_parity_work - Workqueue called when a parity error interrupt
> @@ -1339,11 +1387,13 @@ static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
>   			DRM_ERROR("The master control interrupt lied (GT3)!\n");
>   	}
>
> -	if (master_ctl & GEN8_GT_PM_IRQ) {
> +	if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
>   		gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
> -		if (gt_iir[2] & dev_priv->pm_rps_events) {
> +		if (gt_iir[2] & (dev_priv->pm_rps_events |
> +				 dev_priv->pm_guc_events)) {
>   			I915_WRITE_FW(GEN8_GT_IIR(2),
> -				      gt_iir[2] & dev_priv->pm_rps_events);
> +				      gt_iir[2] & (dev_priv->pm_rps_events |
> +						   dev_priv->pm_guc_events));
>   			ret = IRQ_HANDLED;
>   		} else
>   			DRM_ERROR("The master control interrupt lied (PM)!\n");
> @@ -1375,6 +1425,9 @@ static void gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
>
>   	if (gt_iir[2] & dev_priv->pm_rps_events)
>   		gen6_rps_irq_handler(dev_priv, gt_iir[2]);
> +
> +	if (gt_iir[2] & dev_priv->pm_guc_events)
> +		gen9_guc_irq_handler(dev_priv, gt_iir[2]);
>   }
>
>   static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
> @@ -1621,6 +1674,41 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
>   	}
>   }
>
> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
> +{
> +	bool interrupts_enabled;
> +
> +	if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
> +		spin_lock(&dev_priv->irq_lock);
> +		interrupts_enabled = dev_priv->guc.interrupts_enabled;
> +		spin_unlock(&dev_priv->irq_lock);

Not sure that taking a lock around only this read is needed.

> +		if (interrupts_enabled) {
> +			/* Sample the log buffer flush related bits & clear them
> +			 * out now itself from the message identity register to
> +			 * minimize the probability of losing a flush interrupt,
> +			 * when there are back to back flush interrupts.
> +			 * There can be a new flush interrupt, for different log
> +			 * buffer type (like for ISR), whilst Host is handling
> +			 * one (for DPC). Since same bit is used in message
> +			 * register for ISR & DPC, it could happen that GuC
> +			 * sets the bit for 2nd interrupt but Host clears out
> +			 * the bit on handling the 1st interrupt.
> +			 */
> +			u32 msg = I915_READ(SOFT_SCRATCH(15)) &
> +					(GUC2HOST_MSG_CRASH_DUMP_POSTED |
> +					 GUC2HOST_MSG_FLUSH_LOG_BUFFER);
> +			if (msg) {
> +				/* Clear the message bits that are handled */
> +				I915_WRITE(SOFT_SCRATCH(15),
> +					I915_READ(SOFT_SCRATCH(15)) & ~msg);

Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it 
twice?

Also, is the RMW outside any locks safe?

> +
> +				/* Handle flush interrupt event in bottom half */
> +				queue_work(dev_priv->wq, &dev_priv->guc.events_work);

IMHO it would be nicer if the code started straight away with a final wq 
solution.

Especially since the next patch in the series is called "Handle log 
buffer flush interrupt event from GuC" and the actual handling of the 
log buffer flush interrupt is split between this one 
(GUC2HOST_MSG_FLUSH_LOG_BUFFER above) and that one.

So it would almost be nicer that the above chunk which handles 
GUC2HOST_MSG_FLUSH_LOG_BUFFER and the worker init is only added in the 
next patch and this one only does the generic bits.

I don't know.. I'll leave it on your conscience - if you think the split 
(series) can't be done any nicer or it makes sense to have it in this 
order then ok.

> +			}

Maybe:

	} else

And log that something unexpected has happened in the !msg case?

Since it won't clear the message in that case, would it keep triggering?
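
As a rough illustration of the else branch suggested above, here is a
userspace sketch. It is not the driver code: guc_msg_dispatch() and the
bit values are made up for the example, and fprintf() stands in for
whatever logging macro the driver would use.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit positions, chosen only for this example. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)

/*
 * Decide what to do with a sampled message register value.
 * Returns 1 if the bottom half should be queued, 0 otherwise.
 * *new_val receives the value that would be written back.
 */
static int guc_msg_dispatch(uint32_t reg_val, uint32_t *new_val)
{
	uint32_t msg = reg_val & (MSG_CRASH_DUMP_POSTED |
				  MSG_FLUSH_LOG_BUFFER);

	if (msg) {
		/* Clear only the message bits that are handled */
		*new_val = reg_val & ~msg;
		return 1;
	} else {
		/* Nothing we know how to handle: report it, change nothing */
		fprintf(stderr, "unexpected GuC message 0x%x\n",
			(unsigned int)reg_val);
		*new_val = reg_val;
		return 0;
	}
}
```

With this shape the !msg case is at least visible in the logs, which
would help answer the re-triggering question by observation.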

> +		}
> +	}
> +}
> +
>   static bool intel_pipe_handle_vblank(struct drm_i915_private *dev_priv,
>   				     enum pipe pipe)
>   {
> @@ -3722,7 +3810,7 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
>   	GEN8_IRQ_INIT_NDX(GT, 1, ~gt_interrupts[1], gt_interrupts[1]);
>   	/*
>   	 * RPS interrupts will get enabled/disabled on demand when RPS itself
> -	 * is enabled/disabled.
> +	 * is enabled/disabled. Same wil be the case for GuC interrupts.
>   	 */
>   	GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_imr, dev_priv->pm_ier);
>   	GEN8_IRQ_INIT_NDX(GT, 3, ~gt_interrupts[3], gt_interrupts[3]);
> @@ -4507,6 +4595,10 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
>
>   	INIT_WORK(&dev_priv->rps.work, gen6_pm_rps_work);
>   	INIT_WORK(&dev_priv->l3_parity.error_work, ivybridge_parity_work);
> +	INIT_WORK(&dev_priv->guc.events_work, gen9_guc2host_events_work);
> +
> +	if (HAS_GUC_UCODE(dev))
> +		dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>
>   	/* Let's track the enabled rps events */
>   	if (IS_VALLEYVIEW(dev_priv))
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index da82744..62046dc 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6011,6 +6011,7 @@ enum {
>   #define  GEN8_DE_PIPE_A_IRQ		(1<<16)
>   #define  GEN8_DE_PIPE_IRQ(pipe)		(1<<(16+(pipe)))
>   #define  GEN8_GT_VECS_IRQ		(1<<6)
> +#define  GEN8_GT_GUC_IRQ		(1<<5)
>   #define  GEN8_GT_PM_IRQ			(1<<4)
>   #define  GEN8_GT_VCS2_IRQ		(1<<3)
>   #define  GEN8_GT_VCS1_IRQ		(1<<2)
> @@ -6022,6 +6023,16 @@ enum {
>   #define GEN8_GT_IIR(which) _MMIO(0x44308 + (0x10 * (which)))
>   #define GEN8_GT_IER(which) _MMIO(0x4430c + (0x10 * (which)))
>
> +#define GEN9_GUC_TO_HOST_INT_EVENT	(1<<31)
> +#define GEN9_GUC_EXEC_ERROR_EVENT	(1<<30)
> +#define GEN9_GUC_DISPLAY_EVENT		(1<<29)
> +#define GEN9_GUC_SEMA_SIGNAL_EVENT	(1<<28)
> +#define GEN9_GUC_IOMMU_MSG_EVENT	(1<<27)
> +#define GEN9_GUC_DB_RING_EVENT		(1<<26)
> +#define GEN9_GUC_DMA_DONE_EVENT		(1<<25)
> +#define GEN9_GUC_FATAL_ERROR_EVENT	(1<<24)
> +#define GEN9_GUC_NOTIFICATION_EVENT	(1<<23)
> +
>   #define GEN8_RCS_IRQ_SHIFT 0
>   #define GEN8_BCS_IRQ_SHIFT 16
>   #define GEN8_VCS1_IRQ_SHIFT 0
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 80cd05f..9619ce9 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1119,6 +1119,9 @@ void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
>   				     unsigned int pipe_mask);
>   void gen8_irq_power_well_pre_disable(struct drm_i915_private *dev_priv,
>   				     unsigned int pipe_mask);
> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv);
> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv);
> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv);
>
>   /* intel_crt.c */
>   void intel_crt_init(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 7e22803..be1e04d 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -130,6 +130,10 @@ struct intel_guc {
>   	struct intel_guc_fw guc_fw;
>   	struct intel_guc_log log;
>
> +	/* GuC2Host interrupt related state */
> +	struct work_struct events_work;
> +	bool interrupts_enabled;
> +
>   	struct drm_i915_gem_object *ads_obj;
>
>   	struct drm_i915_gem_object *ctx_pool_obj;
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> index f23bb33..b7e97cc 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -464,6 +464,7 @@ int intel_guc_setup(struct drm_device *dev)
>   	}
>
>   	direct_interrupts_to_host(dev_priv);
> +	gen9_reset_guc_interrupts(dev_priv);
>
>   	guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
>
> @@ -510,6 +511,9 @@ int intel_guc_setup(struct drm_device *dev)
>   		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
>
>   	if (i915.enable_guc_submission) {
> +		if (i915.guc_log_level >= 0)
> +			gen9_enable_guc_interrupts(dev_priv);
> +
>   		err = i915_guc_submission_enable(dev_priv);
>   		if (err)
>   			goto fail;
>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
  2016-08-12 10:54   ` Tvrtko Ursulin
@ 2016-08-12 12:22     ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2016-08-12 12:22 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: akash.goel, intel-gfx, Mika Kuoppala

On Fri, Aug 12, 2016 at 11:54:04AM +0100, Tvrtko Ursulin wrote:
> On 12/08/16 07:25, akash.goel@intel.com wrote:
> >From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> >This patch provides the infrastructure for performing a 16-byte aligned
> >read from WC memory using non-temporal instructions introduced with sse4.1.
> >Using movntdqa we can bypass the CPU caches and read directly from memory
> >and ignoring the page attributes set on the CPU PTE i.e. negating the
> >impact of an otherwise UC access. Copying using movntqda from WC is almost
> >as fast as reading from WB memory, modulo the possibility of both hitting
> >the CPU cache or leaving the data in the CPU cache for the next consumer.
> >(The CPU cache itself my be flushed for the region of the movntdqa and on
> >later access the movntdqa reads from a separate internal buffer for the
> >cacheline.) The write back to the memory is however cached.
> >
> >This will be used in later patches to accelerate accessing WC memory.
> >
> >v2: Report whether the accelerated copy is successful/possible.
> >v3: Function alignment override was only necessary when using the
> >function target("sse4.1") - which is not necessary for emitting movntdqa
> >from __asm__.
> >v4: Improve notes on CPU cache behaviour vs non-temporal stores.
> >v5: Fix byte offsets for unrolled moves.
> >v6: Find all remaining typos of movntqda, use kernel_fpu_begin.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: Akash Goel <akash.goel@intel.com>
> >Cc: Damien Lespiau <damien.lespiau@intel.com>
> >Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Picked up the 2 WC prep patches. Thanks for the review, testing and
improvements,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12 11:54   ` Tvrtko Ursulin
@ 2016-08-12 13:10     ` Goel, Akash
  2016-08-12 13:31       ` Tvrtko Ursulin
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 13:10 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>
>> There are certain types of interrupts which Host can recieve from GuC.
>> GuC ukernel sends an interrupt to Host for certain events, like for
>> example retrieve/consume the logs generated by ukernel.
>> This patch adds support to receive interrupts from GuC but currently
>> enables & partially handles only the interrupt sent by GuC ukernel.
>> Future patches will add support for handling other interrupt types.
>>
>> v2:
>> - Use common low level routines for PM IER/IIR programming (Chris)
>> - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
>> - Replace disabling of wake ref asserts with rpm get/put (Chris)
>>
>> v3:
>> - Update comments for more clarity. (Tvrtko)
>> - Remove the masking of GuC interrupt, which was kept masked till the
>>    start of bottom half, its not really needed as there is only a
>>    single instance of work item & wq is ordered. (Tvrtko)
>>
>> v4:
>> - Rebase.
>> - Rename guc_events to pm_guc_events so as to be indicative of the
>>    register/control block it is associated with. (Chris)
>> - Add handling for back to back log buffer flush interrupts.
>>
>> v5:
>> - Move the read & clearing of register, containing Guc2Host message
>>    bits, outside the irq spinlock. (Tvrtko)
>>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |   1 +
>>   drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
>>   drivers/gpu/drm/i915/i915_irq.c            | 100
>> +++++++++++++++++++++++++++--
>>   drivers/gpu/drm/i915/i915_reg.h            |  11 ++++
>>   drivers/gpu/drm/i915/intel_drv.h           |   3 +
>>   drivers/gpu/drm/i915/intel_guc.h           |   4 ++
>>   drivers/gpu/drm/i915/intel_guc_loader.c    |   4 ++
>>   7 files changed, 124 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index a608a5c..28ffac5 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1779,6 +1779,7 @@ struct drm_i915_private {
>>       u32 pm_imr;
>>       u32 pm_ier;
>>       u32 pm_rps_events;
>> +    u32 pm_guc_events;
>>       u32 pipestat_irq_mask[I915_MAX_PIPES];
>>
>>       struct i915_hotplug hotplug;
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index ad3b55f..c7c679f 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>           return 0;
>>
>> +    gen9_disable_guc_interrupts(dev_priv);
>> +
>>       ctx = dev_priv->kernel_context;
>>
>>       data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
>> @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>           return 0;
>>
>> +    if (i915.guc_log_level >= 0)
>> +        gen9_enable_guc_interrupts(dev_priv);
>> +
>>       ctx = dev_priv->kernel_context;
>>
>>       data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>> b/drivers/gpu/drm/i915/i915_irq.c
>> index 5f93309..5f1974f 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
>> drm_i915_private *dev_priv,
>>   } while (0)
>>
>>   static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
>> u32 pm_iir);
>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>> u32 pm_iir);
>>
>>   /* For display hotplug interrupt */
>>   static inline void
>> @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
>> drm_i915_private *dev_priv)
>>       gen6_reset_rps_interrupts(dev_priv);
>>   }
>>
>> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
>> +{
>> +    spin_lock_irq(&dev_priv->irq_lock);
>> +    gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
>> +    spin_unlock_irq(&dev_priv->irq_lock);
>> +}
>> +
>> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
>> +{
>> +    spin_lock_irq(&dev_priv->irq_lock);
>> +    if (!dev_priv->guc.interrupts_enabled) {
>> +        WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
>> +                        dev_priv->pm_guc_events);
>> +        dev_priv->guc.interrupts_enabled = true;
>> +        gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>> +    }
>> +    spin_unlock_irq(&dev_priv->irq_lock);
>> +}
>> +
>> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
>> +{
>> +    spin_lock_irq(&dev_priv->irq_lock);
>> +    dev_priv->guc.interrupts_enabled = false;
>> +
>> +    gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>> +
>> +    spin_unlock_irq(&dev_priv->irq_lock);
>> +    synchronize_irq(dev_priv->drm.irq);
>> +
>> +    gen9_reset_guc_interrupts(dev_priv);
>> +}
>> +
>>   /**
>>    * bdw_update_port_irq - update DE port interrupt
>>    * @dev_priv: driver private
>> @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct
>> *work)
>>       mutex_unlock(&dev_priv->rps.hw_lock);
>>   }
>>
>> +static void gen9_guc2host_events_work(struct work_struct *work)
>> +{
>> +    struct drm_i915_private *dev_priv =
>> +        container_of(work, struct drm_i915_private, guc.events_work);
>> +
>> +    spin_lock_irq(&dev_priv->irq_lock);
>> +    /* Speed up work cancellation during disabling guc interrupts. */
>> +    if (!dev_priv->guc.interrupts_enabled) {
>> +        spin_unlock_irq(&dev_priv->irq_lock);
>> +        return;
>
> I suppose locking for early exit is something about ensuring the worker
> sees the update to dev_priv->guc.interrupts_enabled done on another CPU?

Yes, the locking (which provides an implicit barrier) ensures that an
update made from another CPU is immediately visible to the worker.

> synchronize_irq there is not enough for some reason?
>
synchronize_irq would not be enough; it serves a different purpose, which
is to ensure that any ongoing handling of the irq completes (after the
caller has disabled the irq).

As per my understanding synchronize_irq won't have an effect on the
worker, with respect to the moment when the update of
'interrupts_enabled' flag is visible to the worker.

>> +    }
>> +    spin_unlock_irq(&dev_priv->irq_lock);
>> +
>> +    /* TODO: Handle the events for which GuC interrupted host */
>> +}
>>
>>   /**
>>    * ivybridge_parity_work - Workqueue called when a parity error
>> interrupt
>> @@ -1339,11 +1387,13 @@ static irqreturn_t gen8_gt_irq_ack(struct
>> drm_i915_private *dev_priv,
>>               DRM_ERROR("The master control interrupt lied (GT3)!\n");
>>       }
>>
>> -    if (master_ctl & GEN8_GT_PM_IRQ) {
>> +    if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
>>           gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
>> -        if (gt_iir[2] & dev_priv->pm_rps_events) {
>> +        if (gt_iir[2] & (dev_priv->pm_rps_events |
>> +                 dev_priv->pm_guc_events)) {
>>               I915_WRITE_FW(GEN8_GT_IIR(2),
>> -                      gt_iir[2] & dev_priv->pm_rps_events);
>> +                      gt_iir[2] & (dev_priv->pm_rps_events |
>> +                           dev_priv->pm_guc_events));
>>               ret = IRQ_HANDLED;
>>           } else
>>               DRM_ERROR("The master control interrupt lied (PM)!\n");
>> @@ -1375,6 +1425,9 @@ static void gen8_gt_irq_handler(struct
>> drm_i915_private *dev_priv,
>>
>>       if (gt_iir[2] & dev_priv->pm_rps_events)
>>           gen6_rps_irq_handler(dev_priv, gt_iir[2]);
>> +
>> +    if (gt_iir[2] & dev_priv->pm_guc_events)
>> +        gen9_guc_irq_handler(dev_priv, gt_iir[2]);
>>   }
>>
>>   static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
>> @@ -1621,6 +1674,41 @@ static void gen6_rps_irq_handler(struct
>> drm_i915_private *dev_priv, u32 pm_iir)
>>       }
>>   }
>>
>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>> u32 gt_iir)
>> +{
>> +    bool interrupts_enabled;
>> +
>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>> +        spin_lock(&dev_priv->irq_lock);
>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>> +        spin_unlock(&dev_priv->irq_lock);
>
> Not sure that taking a lock around only this read is needed.
>
Again, same reason as above: to make sure an update made on another CPU
is immediately visible to the irq handler.

>> +        if (interrupts_enabled) {
>> +            /* Sample the log buffer flush related bits & clear them
>> +             * out now itself from the message identity register to
>> +             * minimize the probability of losing a flush interrupt,
>> +             * when there are back to back flush interrupts.
>> +             * There can be a new flush interrupt, for different log
>> +             * buffer type (like for ISR), whilst Host is handling
>> +             * one (for DPC). Since same bit is used in message
>> +             * register for ISR & DPC, it could happen that GuC
>> +             * sets the bit for 2nd interrupt but Host clears out
>> +             * the bit on handling the 1st interrupt.
>> +             */
>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>> +            if (msg) {
>> +                /* Clear the message bits that are handled */
>> +                I915_WRITE(SOFT_SCRATCH(15),
>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>
> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
> twice?
>
I thought reading it again (just before the update) is a bit safer than
reading it once, as there is a potential race here: GuC could also write
to the SOFT_SCRATCH(15) register and set new event bits while Host is
clearing the bits of the handled events.

> Also, is the RMW outside any locks safe?
>

Ideally we need a way to do the RMW atomically, i.e. read the register
value, clear the handled event bits and update the register with the
modified value.

Please suggest how to address the above. Or can this be left as a TODO
for when we start handling other events as well?
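
The window under discussion can be modelled in userspace roughly as
follows. This is only a sketch under stated assumptions: scratch_reg is a
plain variable standing in for SOFT_SCRATCH(15), firmware_posts() is a
hypothetical stand-in for GuC writing the register at a chosen moment,
and the bit values are invented for the example.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)
#define MSG_OTHER_EVENT       (1u << 5)
#define MSG_HANDLED (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER)

/* Plain variable standing in for the shared message register. */
static uint32_t scratch_reg;

static uint32_t reg_read(void) { return scratch_reg; }
static void reg_write(uint32_t v) { scratch_reg = v; }

/* Pretend the firmware sets new bits at this point in time. */
static void firmware_posts(uint32_t bits) { scratch_reg |= bits; }

/*
 * Clear handled bits using the value cached from the first read.
 * Any bit the firmware set after that read is overwritten.
 */
static uint32_t clear_with_cached_read(uint32_t late_bits)
{
	uint32_t val = reg_read();
	uint32_t msg = val & MSG_HANDLED;

	firmware_posts(late_bits);	/* lands after the only read */
	reg_write(val & ~msg);		/* stale value: late_bits lost */
	return msg;
}

/*
 * Clear handled bits after reading the register a second time. This
 * narrows the window but does not close it: a post between the second
 * read and the write would still be lost.
 */
static uint32_t clear_with_second_read(uint32_t late_bits)
{
	uint32_t msg = reg_read() & MSG_HANDLED;

	firmware_posts(late_bits);	/* lands before the second read */
	reg_write(reg_read() & ~msg);	/* fresh value: late_bits kept */
	return msg;
}
```

In this model the second read preserves a late bit for a different
event, but a late post of the same bit that is being cleared is lost in
both variants, so neither is a substitute for an atomic RMW.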

>> +
>> +                /* Handle flush interrupt event in bottom half */
>> +                queue_work(dev_priv->wq, &dev_priv->guc.events_work);
>
> IMHO it would be nicer if the code started straight away with a final wq
> solution.
>
> Especially since the next patch in the series is called "Handle log
> buffer flush interrupt event from GuC" and the actual handling of the
> log buffer flush interrupt is split between this one
> (GUC2HOST_MSG_FLUSH_LOG_BUFFER above) and that one.
>
> So it would almost be nicer that the above chunk which handles
> GUC2HOST_MSG_FLUSH_LOG_BUFFER and the worker init is only added in the
> next patch and this one only does the generic bits.
>

Fine, I will move the log buffer flush interrupt event related stuff to
the next patch, so the irq handler in this patch will just be a
placeholder.

> I don't know.. I'll leave it on your conscience - if you think the split
> (series) can't be done any nicer or it makes sense to have it in this
> order then ok.
>
>> +            }
>
> Maybe:
>
>     } else
>
> And log something unexpected has happened in the !msg case?
>
> Since it won't clear the message in that case so would it keep triggering?
>

Actually, after the GuC interrupt is enabled, there can be interrupts from
the GuC side for some other events which are not handled by Host right now.

But leaving the unhandled event bits uncleared won't result in the
interrupt being re-triggered.

Best regards
Akash

>> +        }
>> +    }
>> +}
>> +
>>   static bool intel_pipe_handle_vblank(struct drm_i915_private *dev_priv,
>>                        enum pipe pipe)
>>   {
>> @@ -3722,7 +3810,7 @@ static void gen8_gt_irq_postinstall(struct
>> drm_i915_private *dev_priv)
>>       GEN8_IRQ_INIT_NDX(GT, 1, ~gt_interrupts[1], gt_interrupts[1]);
>>       /*
>>        * RPS interrupts will get enabled/disabled on demand when RPS
>> itself
>> -     * is enabled/disabled.
>> +     * is enabled/disabled. Same wil be the case for GuC interrupts.
>>        */
>>       GEN8_IRQ_INIT_NDX(GT, 2, dev_priv->pm_imr, dev_priv->pm_ier);
>>       GEN8_IRQ_INIT_NDX(GT, 3, ~gt_interrupts[3], gt_interrupts[3]);
>> @@ -4507,6 +4595,10 @@ void intel_irq_init(struct drm_i915_private
>> *dev_priv)
>>
>>       INIT_WORK(&dev_priv->rps.work, gen6_pm_rps_work);
>>       INIT_WORK(&dev_priv->l3_parity.error_work, ivybridge_parity_work);
>> +    INIT_WORK(&dev_priv->guc.events_work, gen9_guc2host_events_work);
>> +
>> +    if (HAS_GUC_UCODE(dev))
>> +        dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>>
>>       /* Let's track the enabled rps events */
>>       if (IS_VALLEYVIEW(dev_priv))
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index da82744..62046dc 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -6011,6 +6011,7 @@ enum {
>>   #define  GEN8_DE_PIPE_A_IRQ        (1<<16)
>>   #define  GEN8_DE_PIPE_IRQ(pipe)        (1<<(16+(pipe)))
>>   #define  GEN8_GT_VECS_IRQ        (1<<6)
>> +#define  GEN8_GT_GUC_IRQ        (1<<5)
>>   #define  GEN8_GT_PM_IRQ            (1<<4)
>>   #define  GEN8_GT_VCS2_IRQ        (1<<3)
>>   #define  GEN8_GT_VCS1_IRQ        (1<<2)
>> @@ -6022,6 +6023,16 @@ enum {
>>   #define GEN8_GT_IIR(which) _MMIO(0x44308 + (0x10 * (which)))
>>   #define GEN8_GT_IER(which) _MMIO(0x4430c + (0x10 * (which)))
>>
>> +#define GEN9_GUC_TO_HOST_INT_EVENT    (1<<31)
>> +#define GEN9_GUC_EXEC_ERROR_EVENT    (1<<30)
>> +#define GEN9_GUC_DISPLAY_EVENT        (1<<29)
>> +#define GEN9_GUC_SEMA_SIGNAL_EVENT    (1<<28)
>> +#define GEN9_GUC_IOMMU_MSG_EVENT    (1<<27)
>> +#define GEN9_GUC_DB_RING_EVENT        (1<<26)
>> +#define GEN9_GUC_DMA_DONE_EVENT        (1<<25)
>> +#define GEN9_GUC_FATAL_ERROR_EVENT    (1<<24)
>> +#define GEN9_GUC_NOTIFICATION_EVENT    (1<<23)
>> +
>>   #define GEN8_RCS_IRQ_SHIFT 0
>>   #define GEN8_BCS_IRQ_SHIFT 16
>>   #define GEN8_VCS1_IRQ_SHIFT 0
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h
>> b/drivers/gpu/drm/i915/intel_drv.h
>> index 80cd05f..9619ce9 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1119,6 +1119,9 @@ void gen8_irq_power_well_post_enable(struct
>> drm_i915_private *dev_priv,
>>                        unsigned int pipe_mask);
>>   void gen8_irq_power_well_pre_disable(struct drm_i915_private *dev_priv,
>>                        unsigned int pipe_mask);
>> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv);
>> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv);
>> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv);
>>
>>   /* intel_crt.c */
>>   void intel_crt_init(struct drm_device *dev);
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index 7e22803..be1e04d 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -130,6 +130,10 @@ struct intel_guc {
>>       struct intel_guc_fw guc_fw;
>>       struct intel_guc_log log;
>>
>> +    /* GuC2Host interrupt related state */
>> +    struct work_struct events_work;
>> +    bool interrupts_enabled;
>> +
>>       struct drm_i915_gem_object *ads_obj;
>>
>>       struct drm_i915_gem_object *ctx_pool_obj;
>> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c
>> b/drivers/gpu/drm/i915/intel_guc_loader.c
>> index f23bb33..b7e97cc 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
>> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
>> @@ -464,6 +464,7 @@ int intel_guc_setup(struct drm_device *dev)
>>       }
>>
>>       direct_interrupts_to_host(dev_priv);
>> +    gen9_reset_guc_interrupts(dev_priv);
>>
>>       guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
>>
>> @@ -510,6 +511,9 @@ int intel_guc_setup(struct drm_device *dev)
>>           intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
>>
>>       if (i915.enable_guc_submission) {
>> +        if (i915.guc_log_level >= 0)
>> +            gen9_enable_guc_interrupts(dev_priv);
>> +
>>           err = i915_guc_submission_enable(dev_priv);
>>           if (err)
>>               goto fail;
>>
>
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12  6:25 ` [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC akash.goel
  2016-08-12  6:28   ` Chris Wilson
@ 2016-08-12 13:17   ` Tvrtko Ursulin
  2016-08-12 13:45     ` Goel, Akash
  1 sibling, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 13:17 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>
> GuC ukernel sends an interrupt to Host to flush the log buffer
> and expects Host to correspondingly update the read pointer
> information in the state structure, once it has consumed the
> log buffer contents by copying them to a file or buffer.
> Even if Host couldn't copy the contents, it can still update the
> read pointer so that logging state is not disturbed on GuC side.
>
> v2:
> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
> - Reduce the overall log buffer copying time by skipping the copy of
>    crash buffer area for regular cases and copying only the state
>    structure data in first page.
>
> v3:
>   - Create a vmalloc mapping of log buffer. (Chris)
>   - Cover the flush acknowledgment under rpm get & put.(Chris)
>   - Revert the change of skipping the copy of crash dump area, as
>     not really needed, will be covered by subsequent patch.
>
> v4:
>   - Destroy the wq under the same condition in which it was created,
>     pass dev_piv pointer instead of dev to newly added GuC function,
>     add more comments & rename variable for clarity. (Tvrtko)
>
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c            |  14 +++
>   drivers/gpu/drm/i915/i915_guc_submission.c | 150 +++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_irq.c            |   5 +-
>   drivers/gpu/drm/i915/intel_guc.h           |   3 +
>   4 files changed, 170 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 0fcd1c0..fc2da32 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>   	if (dev_priv->hotplug.dp_wq == NULL)
>   		goto out_free_wq;
>
> +	if (HAS_GUC_SCHED(dev_priv)) {

This just reminded me that a previous patch had:

+	if (HAS_GUC_UCODE(dev))
+		dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;

In the interrupt setup. I don't think there is a bug right now, but 
there is a disagreement between the two which would be good to resolve.

This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED 
for correctness. I think.

> +		/* Need a dedicated wq to process log buffer flush interrupts
> +		 * from GuC without much delay so as to avoid any loss of logs.
> +		 */
> +		dev_priv->guc.log.wq =
> +			alloc_ordered_workqueue("i915-guc_log", 0);
> +		if (dev_priv->guc.log.wq == NULL)
> +			goto out_free_hotplug_dp_wq;
> +	}
> +
>   	return 0;
>
> +out_free_hotplug_dp_wq:
> +	destroy_workqueue(dev_priv->hotplug.dp_wq);
>   out_free_wq:
>   	destroy_workqueue(dev_priv->wq);
>   out_err:
> @@ -782,6 +794,8 @@ out_err:
>
>   static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>   {
> +	if (HAS_GUC_SCHED(dev_priv))
> +		destroy_workqueue(dev_priv->guc.log.wq);
>   	destroy_workqueue(dev_priv->hotplug.dp_wq);
>   	destroy_workqueue(dev_priv->wq);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index c7c679f..2635b67 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
>   	return host2guc_action(guc, data, ARRAY_SIZE(data));
>   }
>
> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
> +{
> +	u32 data[1];
> +
> +	data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
> +
> +	return host2guc_action(guc, data, 1);
> +}
> +
>   /*
>    * Initialise, update, or clear doorbell data shared with the GuC
>    *
> @@ -840,6 +849,127 @@ err:
>   	return NULL;
>   }
>
> +static void guc_move_to_next_buf(struct intel_guc *guc)
> +{
> +	return;
> +}
> +
> +static void* guc_get_write_buffer(struct intel_guc *guc)
> +{
> +	return NULL;
> +}
> +
> +static void guc_read_update_log_buffer(struct intel_guc *guc)
> +{
> +	struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
> +	struct guc_log_buffer_state log_buffer_state_local;
> +	void *src_data_ptr, *dst_data_ptr;
> +	u32 i, buffer_size;

unsigned int i if you can be bothered.

> +
> +	if (!guc->log.buf_addr)
> +		return;

Can it hit this? If yes, I think it would be better to disable GuC logging
when pin map on the object fails rather than let it generate interrupts in
vain.

> +
> +	/* Get the pointer to shared GuC log buffer */
> +	log_buffer_state = src_data_ptr = guc->log.buf_addr;
> +
> +	/* Get the pointer to local buffer to store the logs */
> +	dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc);
> +
> +	/* Actual logs are present from the 2nd page */
> +	src_data_ptr += PAGE_SIZE;
> +	dst_data_ptr += PAGE_SIZE;
> +
> +	for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
> +		/* Make a copy of the state structure in GuC log buffer (which
> +		 * is uncached mapped) on the stack to avoid reading from it
> +		 * multiple times.
> +		 */
> +		memcpy(&log_buffer_state_local, log_buffer_state,
> +				sizeof(struct guc_log_buffer_state));
> +		buffer_size = log_buffer_state_local.size;

This needs checking (and logging) so that a bad firmware or some other
error cannot report a dangerously large size, which would then overwrite
the memory pointed to by dst_data_ptr (and/or read from random memory).
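
Something along these lines is what I have in mind. This is only a
minimal standalone sketch, not driver code: struct log_state,
log_copy_checked() and the src_left/dst_left parameters are hypothetical
names for illustration, and in the driver the limits would have to come
from however the log object and the destination buffer are actually sized.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the firmware-written state; only 'size' matters. */
struct log_state {
	uint32_t size;
};

/*
 * Copy state->size bytes from src to dst only if that size fits in what
 * remains of both buffers. Returns the number of bytes copied, or 0 (after
 * logging) when the reported size is out of range.
 */
static size_t log_copy_checked(void *dst, size_t dst_left,
			       const void *src, size_t src_left,
			       const struct log_state *state)
{
	size_t size = state->size;

	if (size > src_left || size > dst_left) {
		fprintf(stderr, "bogus log buffer size %zu (src %zu, dst %zu)\n",
			size, src_left, dst_left);
		return 0;
	}

	memcpy(dst, src, size);
	return size;
}
```

The caller would advance its source and destination pointers by the
returned value, so a rejected size leaves both untouched.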

> +
> +		if (log_buffer_snapshot_state) {
> +			/* First copy the state structure in local buffer */

Maybe "destination buffer" would be clearer?

> +			memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
> +					sizeof(struct guc_log_buffer_state));
> +
> +			/* The write pointer could have been updated by the GuC
> +			 * firmware, after sending the flush interrupt to Host,
> +			 * for consistency set the write pointer value to same
> +			 * value of sampled_write_ptr in the snapshot buffer.
> +			 */
> +			log_buffer_snapshot_state->write_ptr =
> +				log_buffer_snapshot_state->sampled_write_ptr;
> +
> +			log_buffer_snapshot_state++;
> +
> +			/* Now copy the actual logs */
> +			memcpy(dst_data_ptr, src_data_ptr, buffer_size);
> +
> +			src_data_ptr += buffer_size;
> +			dst_data_ptr += buffer_size;
> +		}
> +
> +		/* FIXME: invalidate/flush for log buffer needed */

Yes no maybe? :)

> +
> +		/* Update the read pointer in the shared log buffer */
> +		log_buffer_state->read_ptr =
> +			log_buffer_state_local.sampled_write_ptr;
> +
> +		/* Clear the 'flush to file' flag */
> +		log_buffer_state->flush_to_file = 0;
> +		log_buffer_state++;
> +	}
> +
> +	if (log_buffer_snapshot_state)
> +		guc_move_to_next_buf(guc);
> +}
> +
> +static void guc_log_cleanup(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +
> +	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> +
> +	if (i915.guc_log_level < 0)
> +		return;
> +
> +	/* First disable the flush interrupt */
> +	gen9_disable_guc_interrupts(dev_priv);
> +
> +	if (guc->log.buf_addr)
> +		i915_gem_object_unpin_map(guc->log.obj);
> +
> +	guc->log.buf_addr = NULL;
> +}
> +
> +static int guc_create_log_extras(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	void *vaddr;
> +	int ret;
> +
> +	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> +
> +	/* Nothing to do */
> +	if (i915.guc_log_level < 0)
> +		return 0;
> +
> +	if (!guc->log.buf_addr) {
> +		/* Create a vmalloc mapping of log buffer pages */
> +		vaddr = i915_gem_object_pin_map(guc->log.obj);
> +		if (IS_ERR(vaddr)) {
> +			ret = PTR_ERR(vaddr);
> +			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
> +			return ret;
> +		}
> +
> +		guc->log.buf_addr = vaddr;
> +	}
> +
> +	return 0;
> +}
> +
>   static void guc_create_log(struct intel_guc *guc)
>   {
>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> @@ -866,6 +996,13 @@ static void guc_create_log(struct intel_guc *guc)
>   		}
>
>   		guc->log.obj = obj;
> +
> +		if (guc_create_log_extras(guc)) {
> +			gem_release_guc_obj(guc->log.obj);
> +			guc->log.obj = NULL;
> +			i915.guc_log_level = -1;
> +			return;
> +		}
>   	}
>
>   	/* each allocated unit is a page */
> @@ -1048,6 +1185,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
>   	gem_release_guc_obj(dev_priv->guc.ads_obj);
>   	guc->ads_obj = NULL;
>
> +	guc_log_cleanup(guc);
>   	gem_release_guc_obj(dev_priv->guc.log.obj);
>   	guc->log.obj = NULL;
>
> @@ -1111,3 +1249,15 @@ int intel_guc_resume(struct drm_device *dev)
>
>   	return host2guc_action(guc, data, ARRAY_SIZE(data));
>   }
> +
> +void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
> +{
> +	guc_read_update_log_buffer(&dev_priv->guc);
> +
> +	/* Generally device is expected to be active only at this
> +	 * time, so get/put should be really quick.
> +	 */
> +	intel_runtime_pm_get(dev_priv);
> +	host2guc_logbuffer_flush_complete(&dev_priv->guc);
> +	intel_runtime_pm_put(dev_priv);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 5f1974f..d4d6f0a 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1213,7 +1213,7 @@ static void gen9_guc2host_events_work(struct work_struct *work)
>   	}
>   	spin_unlock_irq(&dev_priv->irq_lock);
>
> -	/* TODO: Handle the events for which GuC interrupted host */
> +	i915_guc_capture_logs(dev_priv);
>   }
>
>   /**
> @@ -1703,7 +1703,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
>   					I915_READ(SOFT_SCRATCH(15)) & ~msg);
>
>   				/* Handle flush interrupt event in bottom half */
> -				queue_work(dev_priv->wq, &dev_priv->guc.events_work);
> +				queue_work(dev_priv->guc.log.wq,
> +						&dev_priv->guc.events_work);
>   			}
>   		}
>   	}
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index be1e04d..7c0bdba 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -124,6 +124,8 @@ struct intel_guc_fw {
>   struct intel_guc_log {
>   	uint32_t flags;
>   	struct drm_i915_gem_object *obj;
> +	struct workqueue_struct *wq;
> +	void *buf_addr;
>   };
>
>   struct intel_guc {
> @@ -169,5 +171,6 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv);
>   int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
>   void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
>   void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
> +void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
>
>   #endif
>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12 13:10     ` Goel, Akash
@ 2016-08-12 13:31       ` Tvrtko Ursulin
  2016-08-12 14:31         ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 13:31 UTC (permalink / raw)
  To: Goel, Akash, intel-gfx


On 12/08/16 14:10, Goel, Akash wrote:
> On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:
>>
>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>
>>> There are certain types of interrupts which Host can recieve from GuC.
>>> GuC ukernel sends an interrupt to Host for certain events, like for
>>> example retrieve/consume the logs generated by ukernel.
>>> This patch adds support to receive interrupts from GuC but currently
>>> enables & partially handles only the interrupt sent by GuC ukernel.
>>> Future patches will add support for handling other interrupt types.
>>>
>>> v2:
>>> - Use common low level routines for PM IER/IIR programming (Chris)
>>> - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
>>> - Replace disabling of wake ref asserts with rpm get/put (Chris)
>>>
>>> v3:
>>> - Update comments for more clarity. (Tvrtko)
>>> - Remove the masking of GuC interrupt, which was kept masked till the
>>>    start of bottom half, its not really needed as there is only a
>>>    single instance of work item & wq is ordered. (Tvrtko)
>>>
>>> v4:
>>> - Rebase.
>>> - Rename guc_events to pm_guc_events so as to be indicative of the
>>>    register/control block it is associated with. (Chris)
>>> - Add handling for back to back log buffer flush interrupts.
>>>
>>> v5:
>>> - Move the read & clearing of register, containing Guc2Host message
>>>    bits, outside the irq spinlock. (Tvrtko)
>>>
>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h            |   1 +
>>>   drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
>>>   drivers/gpu/drm/i915/i915_irq.c            | 100
>>> +++++++++++++++++++++++++++--
>>>   drivers/gpu/drm/i915/i915_reg.h            |  11 ++++
>>>   drivers/gpu/drm/i915/intel_drv.h           |   3 +
>>>   drivers/gpu/drm/i915/intel_guc.h           |   4 ++
>>>   drivers/gpu/drm/i915/intel_guc_loader.c    |   4 ++
>>>   7 files changed, 124 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index a608a5c..28ffac5 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -1779,6 +1779,7 @@ struct drm_i915_private {
>>>       u32 pm_imr;
>>>       u32 pm_ier;
>>>       u32 pm_rps_events;
>>> +    u32 pm_guc_events;
>>>       u32 pipestat_irq_mask[I915_MAX_PIPES];
>>>
>>>       struct i915_hotplug hotplug;
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index ad3b55f..c7c679f 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
>>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>>           return 0;
>>>
>>> +    gen9_disable_guc_interrupts(dev_priv);
>>> +
>>>       ctx = dev_priv->kernel_context;
>>>
>>>       data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
>>> @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
>>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>>           return 0;
>>>
>>> +    if (i915.guc_log_level >= 0)
>>> +        gen9_enable_guc_interrupts(dev_priv);
>>> +
>>>       ctx = dev_priv->kernel_context;
>>>
>>>       data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>>> b/drivers/gpu/drm/i915/i915_irq.c
>>> index 5f93309..5f1974f 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
>>> drm_i915_private *dev_priv,
>>>   } while (0)
>>>
>>>   static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
>>> u32 pm_iir);
>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>> u32 pm_iir);
>>>
>>>   /* For display hotplug interrupt */
>>>   static inline void
>>> @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
>>> drm_i915_private *dev_priv)
>>>       gen6_reset_rps_interrupts(dev_priv);
>>>   }
>>>
>>> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
>>> +{
>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>> +    gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>> +}
>>> +
>>> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
>>> +{
>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>> +        WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
>>> +                        dev_priv->pm_guc_events);
>>> +        dev_priv->guc.interrupts_enabled = true;
>>> +        gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>>> +    }
>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>> +}
>>> +
>>> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
>>> +{
>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>> +    dev_priv->guc.interrupts_enabled = false;
>>> +
>>> +    gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>>> +
>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>> +    synchronize_irq(dev_priv->drm.irq);
>>> +
>>> +    gen9_reset_guc_interrupts(dev_priv);
>>> +}
>>> +
>>>   /**
>>>    * bdw_update_port_irq - update DE port interrupt
>>>    * @dev_priv: driver private
>>> @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct
>>> *work)
>>>       mutex_unlock(&dev_priv->rps.hw_lock);
>>>   }
>>>
>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>> +{
>>> +    struct drm_i915_private *dev_priv =
>>> +        container_of(work, struct drm_i915_private, guc.events_work);
>>> +
>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>> +    /* Speed up work cancellation during disabling guc interrupts. */
>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>> +        return;
>>
>> I suppose locking for early exit is something about ensuring the worker
>> sees the update to dev_priv->guc.interrupts_enabled done on another CPU?
>
> Yes locking (providing implicit barrier) will ensure that update made
> from another CPU is immediately visible to the worker.

What if the disable happens after the unlock above? It would wait in
disable until the irq handler exits. So it is the same as not bothering
with the spinlock above, no?

>> synchronize_irq there is not enough for some reason?
>>
> synchronize_irq would not be enough, its for a different purpose, to
> ensure that any ongoing handling of irq completes (after the caller has
> disabled the irq).
>
> As per my understanding synchronize_irq won't have an effect on the
> worker, with respect to the moment when the update of
> 'interrupts_enabled' flag is visible to the worker.
>
>>> +    }
>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>> +
>>> +    /* TODO: Handle the events for which GuC interrupted host */
>>> +}
>>>
>>>   /**
>>>    * ivybridge_parity_work - Workqueue called when a parity error
>>> interrupt
>>> @@ -1339,11 +1387,13 @@ static irqreturn_t gen8_gt_irq_ack(struct
>>> drm_i915_private *dev_priv,
>>>               DRM_ERROR("The master control interrupt lied (GT3)!\n");
>>>       }
>>>
>>> -    if (master_ctl & GEN8_GT_PM_IRQ) {
>>> +    if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
>>>           gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
>>> -        if (gt_iir[2] & dev_priv->pm_rps_events) {
>>> +        if (gt_iir[2] & (dev_priv->pm_rps_events |
>>> +                 dev_priv->pm_guc_events)) {
>>>               I915_WRITE_FW(GEN8_GT_IIR(2),
>>> -                      gt_iir[2] & dev_priv->pm_rps_events);
>>> +                      gt_iir[2] & (dev_priv->pm_rps_events |
>>> +                           dev_priv->pm_guc_events));
>>>               ret = IRQ_HANDLED;
>>>           } else
>>>               DRM_ERROR("The master control interrupt lied (PM)!\n");
>>> @@ -1375,6 +1425,9 @@ static void gen8_gt_irq_handler(struct
>>> drm_i915_private *dev_priv,
>>>
>>>       if (gt_iir[2] & dev_priv->pm_rps_events)
>>>           gen6_rps_irq_handler(dev_priv, gt_iir[2]);
>>> +
>>> +    if (gt_iir[2] & dev_priv->pm_guc_events)
>>> +        gen9_guc_irq_handler(dev_priv, gt_iir[2]);
>>>   }
>>>
>>>   static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
>>> @@ -1621,6 +1674,41 @@ static void gen6_rps_irq_handler(struct
>>> drm_i915_private *dev_priv, u32 pm_iir)
>>>       }
>>>   }
>>>
>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>> u32 gt_iir)
>>> +{
>>> +    bool interrupts_enabled;
>>> +
>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>> +        spin_lock(&dev_priv->irq_lock);
>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>> +        spin_unlock(&dev_priv->irq_lock);
>>
>> Not sure that taking a lock around only this read is needed.
>>
> Again same reason as above, to make sure an update made on another CPU
> is immediately visible to the irq handler.

I don't get it, see above. :)

>>> +        if (interrupts_enabled) {
>>> +            /* Sample the log buffer flush related bits & clear them
>>> +             * out now itself from the message identity register to
>>> +             * minimize the probability of losing a flush interrupt,
>>> +             * when there are back to back flush interrupts.
>>> +             * There can be a new flush interrupt, for different log
>>> +             * buffer type (like for ISR), whilst Host is handling
>>> +             * one (for DPC). Since same bit is used in message
>>> +             * register for ISR & DPC, it could happen that GuC
>>> +             * sets the bit for 2nd interrupt but Host clears out
>>> +             * the bit on handling the 1st interrupt.
>>> +             */
>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>> +            if (msg) {
>>> +                /* Clear the message bits that are handled */
>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>
>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>> twice?
>>
> Thought reading it again (just before the update) is bit safer compared
> to reading it once, as there is a potential race problem here.
> GuC could also write to the SOFT_SCRATCH(15) register, set new events
> bit, while Host clears off the bit of handled events.

I don't get it. If there is a race between the read and the write, it is
still there; I don't see how a second read makes it safer.
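
To illustrate what I mean by caching the value, here is a standalone
sketch with a plain variable standing in for SOFT_SCRATCH(15). MSG_FLUSH,
MSG_CRASH and ack_handled_msgs() are made-up names and the bit positions
are arbitrary, so treat it as an assumption-laden example rather than the
real register layout.

```c
#include <stdint.h>

/* Arbitrary bit positions, for illustration only. */
#define MSG_FLUSH (1u << 3)
#define MSG_CRASH (1u << 6)

/*
 * Read the message register once, clear only the bits we handle and
 * return them. Bits for events we do not handle are written back as-is.
 */
static uint32_t ack_handled_msgs(volatile uint32_t *reg)
{
	uint32_t val = *reg;	/* single read, reused for the write-back */
	uint32_t msg = val & (MSG_FLUSH | MSG_CRASH);

	if (msg)
		*reg = val & ~msg;

	return msg;
}
```

A bit set by the other side between the read and the write-back is lost
either way, so the second read only adds an mmio access.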

>> Also, is the RMW outside any locks safe?
>>
>
> Ideally need a way to atomically do the RMW, i.e. read the register
> value, clear off the handled events bit and update the register with the
> modified value.
>
> Please kindly suggest how to address the above.
> Or can this be left as a TODO, when we do start handling other events also.

From the comment in code above it sounds like a GuC fw interface
shortcoming - that there is a single bit for two different interrupt
sources, is that right? Is there any other register or something that
you can read to detect that the interrupt has been re-asserted while in
the irq handler? Although I thought you said before that the GuC will
not do that - that it won't re-assert the interrupt before we send the
flush command.

>>> +
>>> +                /* Handle flush interrupt event in bottom half */
>>> +                queue_work(dev_priv->wq, &dev_priv->guc.events_work);
>>
>> IMHO it would be nicer if the code started straight away with a final wq
>> solution.
>>
>> Especially since the next patch in the series is called "Handle log
>> buffer flush interrupt event from GuC" and the actual handling of the
>> log buffer flush interrupt is split between this one
>> (GUC2HOST_MSG_FLUSH_LOG_BUFFER above) and that one.
>>
>> So it would almost be nicer that the above chunk which handles
>> GUC2HOST_MSG_FLUSH_LOG_BUFFER and the worker init is only added in the
>> next patch and this one only does the generic bits.
>>
>
> Fine will move the log buffer flush interrupt event related stuff to the
> next patch and so irq handler in this patch will just be a
> placeholder.

Great thanks!

>> I don't know.. I'll leave it on your conscience - if you think the split
>> (series) can't be done any nicer or it makes sense to have it in this
>> order then ok.
>>
>>> +            }
>>
>> Mabye:
>>
>>     } else
>>
>> And log something unexpected has happened in the !msg case?
>>
>> Since it won't clear the message in that case so would it keep
>> triggering?
>>
>
> Actually after enabling of GuC interrupt, there can be interrupts from
> GuC side for some other events which are right now not handled by Host.
>
> But not clearing of unhandled event bits won't result in re-triggering
> of the interrupt.

OK, I suggest documenting that as a comment in the code then.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12 13:17   ` Tvrtko Ursulin
@ 2016-08-12 13:45     ` Goel, Akash
  2016-08-12 14:07       ` Tvrtko Ursulin
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 13:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>
>> GuC ukernel sends an interrupt to Host to flush the log buffer
>> and expects Host to correspondingly update the read pointer
>> information in the state structure, once it has consumed the
>> log buffer contents by copying them to a file or buffer.
>> Even if Host couldn't copy the contents, it can still update the
>> read pointer so that logging state is not disturbed on GuC side.
>>
>> v2:
>> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
>> - Reduce the overall log buffer copying time by skipping the copy of
>>    crash buffer area for regular cases and copying only the state
>>    structure data in first page.
>>
>> v3:
>>   - Create a vmalloc mapping of log buffer. (Chris)
>>   - Cover the flush acknowledgment under rpm get & put.(Chris)
>>   - Revert the change of skipping the copy of crash dump area, as
>>     not really needed, will be covered by subsequent patch.
>>
>> v4:
>>   - Destroy the wq under the same condition in which it was created,
>>     pass dev_piv pointer instead of dev to newly added GuC function,
>>     add more comments & rename variable for clarity. (Tvrtko)
>>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c            |  14 +++
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 150
>> +++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_irq.c            |   5 +-
>>   drivers/gpu/drm/i915/intel_guc.h           |   3 +
>>   4 files changed, 170 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 0fcd1c0..fc2da32 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
>> drm_i915_private *dev_priv)
>>       if (dev_priv->hotplug.dp_wq == NULL)
>>           goto out_free_wq;
>>
>> +    if (HAS_GUC_SCHED(dev_priv)) {
>
> This just reminded me that a previous patch had:
>
> +    if (HAS_GUC_UCODE(dev))
> +        dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>
> In the interrupt setup. I don't think there is a bug right now, but
> there is a disagreement between the two which would be good to resolve.
>
> This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
> for correctness. I think.

Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, I will move the wq init/destroy to the GuC
logging setup/teardown routines (guc_create_log_extras, guc_log_cleanup).
Are you fine with that?

>
>> +        /* Need a dedicated wq to process log buffer flush interrupts
>> +         * from GuC without much delay so as to avoid any loss of logs.
>> +         */
>> +        dev_priv->guc.log.wq =
>> +            alloc_ordered_workqueue("i915-guc_log", 0);
>> +        if (dev_priv->guc.log.wq == NULL)
>> +            goto out_free_hotplug_dp_wq;
>> +    }
>> +
>>       return 0;
>>
>> +out_free_hotplug_dp_wq:
>> +    destroy_workqueue(dev_priv->hotplug.dp_wq);
>>   out_free_wq:
>>       destroy_workqueue(dev_priv->wq);
>>   out_err:
>> @@ -782,6 +794,8 @@ out_err:
>>
>>   static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>   {
>> +    if (HAS_GUC_SCHED(dev_priv))
>> +        destroy_workqueue(dev_priv->guc.log.wq);
>>       destroy_workqueue(dev_priv->hotplug.dp_wq);
>>       destroy_workqueue(dev_priv->wq);
>>   }
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index c7c679f..2635b67 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
>> intel_guc *guc,
>>       return host2guc_action(guc, data, ARRAY_SIZE(data));
>>   }
>>
>> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
>> +{
>> +    u32 data[1];
>> +
>> +    data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
>> +
>> +    return host2guc_action(guc, data, 1);
>> +}
>> +
>>   /*
>>    * Initialise, update, or clear doorbell data shared with the GuC
>>    *
>> @@ -840,6 +849,127 @@ err:
>>       return NULL;
>>   }
>>
>> +static void guc_move_to_next_buf(struct intel_guc *guc)
>> +{
>> +    return;
>> +}
>> +
>> +static void* guc_get_write_buffer(struct intel_guc *guc)
>> +{
>> +    return NULL;
>> +}
>> +
>> +static void guc_read_update_log_buffer(struct intel_guc *guc)
>> +{
>> +    struct guc_log_buffer_state *log_buffer_state,
>> *log_buffer_snapshot_state;
>> +    struct guc_log_buffer_state log_buffer_state_local;
>> +    void *src_data_ptr, *dst_data_ptr;
>> +    u32 i, buffer_size;
>
> unsigned int i if you can be bothered.

Fine, will do that for both i & buffer_size.

But I remember that earlier, in one of the patches, you suggested using
u32 as the type for some variables.
Could you please share the guideline?
Should u32, u64 be used when we are exactly sure of the range of the
variable, like for variables containing register values?


>
>> +
>> +    if (!guc->log.buf_addr)
>> +        return;
>
> Can it hit this? If yes, I think better disable GuC logging when pin map
> on the object fails rather than let it generate interrupts in vain.
>
>> +
>> +    /* Get the pointer to shared GuC log buffer */
>> +    log_buffer_state = src_data_ptr = guc->log.buf_addr;
>> +
>> +    /* Get the pointer to local buffer to store the logs */
>> +    dst_data_ptr = log_buffer_snapshot_state =
>> guc_get_write_buffer(guc);
>> +
>> +    /* Actual logs are present from the 2nd page */
>> +    src_data_ptr += PAGE_SIZE;
>> +    dst_data_ptr += PAGE_SIZE;
>> +
>> +    for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
>> +        /* Make a copy of the state structure in GuC log buffer (which
>> +         * is uncached mapped) on the stack to avoid reading from it
>> +         * multiple times.
>> +         */
>> +        memcpy(&log_buffer_state_local, log_buffer_state,
>> +                sizeof(struct guc_log_buffer_state));
>> +        buffer_size = log_buffer_state_local.size;
>
> Needs checking (and logging) that a bad firmware or some other error
> does not report a dangerous size (too big) which would then overwrite
> memory pointed by dst_data_ptr memory. (And/or read from random memory.)
>
I have done the range checking for the read/write offset values, which
are updated repeatedly by the GuC firmware.
The buffer size is updated only at init time by the GuC firmware, hence
it is less vulnerable.

But nevertheless I will add the checks.

>> +
>> +        if (log_buffer_snapshot_state) {
>> +            /* First copy the state structure in local buffer */
>
> Maybe "destination buffer" would be clearer?

Sorry, my bad, well spotted.

>
>> +            memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
>> +                    sizeof(struct guc_log_buffer_state));
>> +
>> +            /* The write pointer could have been updated by the GuC
>> +             * firmware, after sending the flush interrupt to Host,
>> +             * for consistency set the write pointer value to same
>> +             * value of sampled_write_ptr in the snapshot buffer.
>> +             */
>> +            log_buffer_snapshot_state->write_ptr =
>> +                log_buffer_snapshot_state->sampled_write_ptr;
>> +
>> +            log_buffer_snapshot_state++;
>> +
>> +            /* Now copy the actual logs */
>> +            memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>> +
>> +            src_data_ptr += buffer_size;
>> +            dst_data_ptr += buffer_size;
>> +        }
>> +
>> +        /* FIXME: invalidate/flush for log buffer needed */
>
> Yes no maybe? :)

I would like to keep it, if you allow.
>
>> +
>> +        /* Update the read pointer in the shared log buffer */
>> +        log_buffer_state->read_ptr =
>> +            log_buffer_state_local.sampled_write_ptr;
>> +
>> +        /* Clear the 'flush to file' flag */
>> +        log_buffer_state->flush_to_file = 0;
>> +        log_buffer_state++;
>> +    }
>> +
>> +    if (log_buffer_snapshot_state)
>> +        guc_move_to_next_buf(guc);
>> +}
>> +
>> +static void guc_log_cleanup(struct intel_guc *guc)
>> +{
>> +    struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +
>> +    lockdep_assert_held(&dev_priv->drm.struct_mutex);
>> +
>> +    if (i915.guc_log_level < 0)
>> +        return;
>> +
>> +    /* First disable the flush interrupt */
>> +    gen9_disable_guc_interrupts(dev_priv);
>> +
>> +    if (guc->log.buf_addr)
>> +        i915_gem_object_unpin_map(guc->log.obj);
>> +
>> +    guc->log.buf_addr = NULL;
>> +}
>> +
>> +static int guc_create_log_extras(struct intel_guc *guc)
>> +{
>> +    struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +    void *vaddr;
>> +    int ret;
>> +
>> +    lockdep_assert_held(&dev_priv->drm.struct_mutex);
>> +
>> +    /* Nothing to do */
>> +    if (i915.guc_log_level < 0)
>> +        return 0;
>> +
>> +    if (!guc->log.buf_addr) {
>> +        /* Create a vmalloc mapping of log buffer pages */
>> +        vaddr = i915_gem_object_pin_map(guc->log.obj);
>> +        if (IS_ERR(vaddr)) {
>> +            ret = PTR_ERR(vaddr);
>> +            DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
>> +            return ret;
>> +        }
>> +
>> +        guc->log.buf_addr = vaddr;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static void guc_create_log(struct intel_guc *guc)
>>   {
>>       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> @@ -866,6 +996,13 @@ static void guc_create_log(struct intel_guc *guc)
>>           }
>>
>>           guc->log.obj = obj;
>> +
>> +        if (guc_create_log_extras(guc)) {
>> +            gem_release_guc_obj(guc->log.obj);
>> +            guc->log.obj = NULL;
>> +            i915.guc_log_level = -1;
>> +            return;
>> +        }
>>       }
>>
>>       /* each allocated unit is a page */
>> @@ -1048,6 +1185,7 @@ void i915_guc_submission_fini(struct
>> drm_i915_private *dev_priv)
>>       gem_release_guc_obj(dev_priv->guc.ads_obj);
>>       guc->ads_obj = NULL;
>>
>> +    guc_log_cleanup(guc);
>>       gem_release_guc_obj(dev_priv->guc.log.obj);
>>       guc->log.obj = NULL;
>>
>> @@ -1111,3 +1249,15 @@ int intel_guc_resume(struct drm_device *dev)
>>
>>       return host2guc_action(guc, data, ARRAY_SIZE(data));
>>   }
>> +
>> +void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
>> +{
>> +    guc_read_update_log_buffer(&dev_priv->guc);
>> +
>> +    /* Generally device is expected to be active only at this
>> +     * time, so get/put should be really quick.
>> +     */
>> +    intel_runtime_pm_get(dev_priv);
>> +    host2guc_logbuffer_flush_complete(&dev_priv->guc);
>> +    intel_runtime_pm_put(dev_priv);
>> +}
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>> b/drivers/gpu/drm/i915/i915_irq.c
>> index 5f1974f..d4d6f0a 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -1213,7 +1213,7 @@ static void gen9_guc2host_events_work(struct
>> work_struct *work)
>>       }
>>       spin_unlock_irq(&dev_priv->irq_lock);
>>
>> -    /* TODO: Handle the events for which GuC interrupted host */
>> +    i915_guc_capture_logs(dev_priv);
>>   }
>>
>>   /**
>> @@ -1703,7 +1703,8 @@ static void gen9_guc_irq_handler(struct
>> drm_i915_private *dev_priv, u32 gt_iir)
>>                       I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>
>>                   /* Handle flush interrupt event in bottom half */
>> -                queue_work(dev_priv->wq, &dev_priv->guc.events_work);
>> +                queue_work(dev_priv->guc.log.wq,
>> +                        &dev_priv->guc.events_work);
>>               }
>>           }
>>       }
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index be1e04d..7c0bdba 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -124,6 +124,8 @@ struct intel_guc_fw {
>>   struct intel_guc_log {
>>       uint32_t flags;
>>       struct drm_i915_gem_object *obj;
>> +    struct workqueue_struct *wq;
>> +    void *buf_addr;
>>   };
>>
>>   struct intel_guc {
>> @@ -169,5 +171,6 @@ int i915_guc_submission_enable(struct
>> drm_i915_private *dev_priv);
>>   int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
>>   void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
>>   void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
>> +void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
>>
>>   #endif
>>
>
> Regards,
>
> Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  2016-08-12  6:25 ` [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs akash.goel
@ 2016-08-12 13:53   ` Tvrtko Ursulin
  2016-08-12 16:10     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 13:53 UTC (permalink / raw)
  To: akash.goel, intel-gfx; +Cc: Sourab Gupta


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
> User to capture GuC firmware logs. Availed relay framework to implement
> the interface, where Driver will have to just use a relay API to store
> snapshots of the GuC log buffer in the buffer managed by relay.
> The snapshot will be taken when GuC firmware sends a log buffer flush
> interrupt and up to four snaphots could be stored in the relay buffer.

snapshots

> The relay buffer will be operated in a mode where it will overwrite the
> data not yet collected by User.
> Besides mmap method, through which User can directly access the relay
> buffer contents, relay also supports the 'poll' method. Through the 'poll'
> call on log file, User can come to know whenever a new snapshot of the
> log buffer is taken by Driver, so can run in tandem with the Driver and
> capture the logs in a sustained/streaming manner, without any loss of data.
>
> v2: Defer the creation of relay channel & associated debugfs file, as
>      debugfs setup is now done at the end of i915 Driver load. (Chris)
>
> v3:
> - Switch to no-overwrite mode for relay.
> - Fix the relay sub buffer switching sequence.
>
> v4:
> - Update i915 Kconfig to select RELAY config. (Tvrtko)
> - Log a message when there is no sub buffer available to capture
>    the GuC log buffer. (Tvrtko)
> - Increase the number of relay sub buffers to 8 from 4, to have
>    sufficient buffering for boot time logs
>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/Kconfig               |   1 +
>   drivers/gpu/drm/i915/i915_drv.c            |   2 +
>   drivers/gpu/drm/i915/i915_guc_submission.c | 206 ++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/intel_guc.h           |   3 +
>   4 files changed, 209 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> index 7769e46..fc900d2 100644
> --- a/drivers/gpu/drm/i915/Kconfig
> +++ b/drivers/gpu/drm/i915/Kconfig
> @@ -11,6 +11,7 @@ config DRM_I915
>   	select DRM_KMS_HELPER
>   	select DRM_PANEL
>   	select DRM_MIPI_DSI
> +	select RELAY
>   	# i915 depends on ACPI_VIDEO when ACPI is enabled
>   	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
>   	select BACKLIGHT_LCD_SUPPORT if ACPI
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index fc2da32..cb8c943 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1145,6 +1145,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
>   	/* Reveal our presence to userspace */
>   	if (drm_dev_register(dev, 0) == 0) {
>   		i915_debugfs_register(dev_priv);
> +		i915_guc_register(dev_priv);
>   		i915_setup_sysfs(dev);
>   	} else
>   		DRM_ERROR("Failed to register driver for userspace access!\n");
> @@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
>   	intel_opregion_unregister(dev_priv);
>
>   	i915_teardown_sysfs(&dev_priv->drm);
> +	i915_guc_unregister(dev_priv);
>   	i915_debugfs_unregister(dev_priv);
>   	drm_dev_unregister(&dev_priv->drm);
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 2635b67..1a2d648 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -23,6 +23,8 @@
>    */
>   #include <linux/firmware.h>
>   #include <linux/circ_buf.h>
> +#include <linux/debugfs.h>
> +#include <linux/relay.h>
>   #include "i915_drv.h"
>   #include "intel_guc.h"
>
> @@ -851,12 +853,33 @@ err:
>
>   static void guc_move_to_next_buf(struct intel_guc *guc)
>   {
> -	return;
> +	/* Make sure the updates made in the sub buffer are visible when
> +	 * Consumer sees the following update to offset inside the sub buffer.
> +	 */
> +	smp_wmb();
> +
> +	/* All data has been written, so now move the offset of sub buffer. */
> +	relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
> +
> +	/* Switch to the next sub buffer */
> +	relay_flush(guc->log.relay_chan);
>   }
>
>   static void* guc_get_write_buffer(struct intel_guc *guc)
>   {
> -	return NULL;
> +	/* FIXME: Cover the check under a lock ? */

This needs to be resolved before r-b in any case.

> +	if (!guc->log.relay_chan)
> +		return NULL;
> +
> +	/* Just get the base address of a new sub buffer and copy data into it
> +	 * ourselves. NULL will be returned in no-overwrite mode, if all sub
> +	 * buffers are full. Could have used the relay_write() to indirectly
> +	 * copy the data, but that would have been bit convoluted, as we need to
> +	 * write to only certain locations inside a sub buffer which cannot be
> +	 * done without using relay_reserve() along with relay_write(). So its
> +	 * better to use relay_reserve() alone.
> +	 */
> +	return relay_reserve(guc->log.relay_chan, 0);
>   }
>
>   static void guc_read_update_log_buffer(struct intel_guc *guc)
> @@ -923,6 +946,130 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>
>   	if (log_buffer_snapshot_state)
>   		guc_move_to_next_buf(guc);
> +	else {
> +		/* Used rate limited to avoid deluge of messages, logs might be
> +		 * getting consumed by User at a slow rate.
> +		 */
> +		DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
> +	}
> +}
> +
> +/*
> + * Sub buffer switch callback. Called whenever relay has to switch to a new
> + * sub buffer, relay stays on the same sub buffer if 0 is returned.
> + */
> +static int subbuf_start_callback(struct rchan_buf *buf,
> +				 void *subbuf,
> +				 void *prev_subbuf,
> +				 size_t prev_padding)
> +{
> +	/* Use no-overwrite mode by default, where relay will stop accepting
> +	 * new data if there are no empty sub buffers left.
> +	 * There is no strict synchronization enforced by relay between Consumer
> +	 * and Producer. In overwrite mode, there is a possibility of getting
> +	 * inconsistent/garbled data, the producer could be writing on to the
> +	 * same sub buffer from which Consumer is reading. This can't be avoided
> +	 * unless Consumer is fast enough and can always run in tandem with
> +	 * Producer.
> +	 */
> +	if (relay_buf_full(buf))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +/*
> + * file_create() callback. Creates relay file in debugfs.
> + */
> +static struct dentry *create_buf_file_callback(const char *filename,
> +					       struct dentry *parent,
> +					       umode_t mode,
> +					       struct rchan_buf *buf,
> +					       int *is_global)
> +{
> +	struct dentry *buf_file = NULL;
> +

Would this function be a tiny bit simpler as:

	if (!parent)
		return NULL;

?
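
A minimal user-space sketch of that early-return shape, under stated
assumptions: struct mock_dentry and mock_create_file() are hypothetical
stand-ins, not the debugfs or relay API. One detail to watch is that the
posted version sets *is_global even when parent is NULL, so the sketch
sets it before the early return to keep that behaviour.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dentry; not the kernel type. */
struct mock_dentry {
	const char *name;
};

/* Hypothetical stand-in for debugfs_create_file(). */
static struct mock_dentry *mock_create_file(const char *name,
					    struct mock_dentry *parent)
{
	static struct mock_dentry file;

	(void)parent;
	file.name = name;
	return &file;
}

/* Early-return variant of the create_buf_file callback shape. */
static struct mock_dentry *create_buf_file_sketch(struct mock_dentry *parent,
						  int *is_global)
{
	/* Set first, so the early return below does not skip it. */
	*is_global = 1;

	if (!parent)
		return NULL;

	return mock_create_file("guc_log", parent);
}
```

Whether a NULL parent still needs *is_global set is a question for the
relay side; the sketch only preserves what the posted code does.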

> +	if (parent) {
> +		/* Not using the channel filename passed as an argument, since
> +		 * for each channel relay appends the corresponding CPU number
> +		 * to the filename passed in relay_open(). This should be fine
> +		 * as relay just needs a dentry of the file associated with the
> +		 * channel buffer and that file's name need not be same as the
> +		 * filename passed as an argument.
> +		 */
> +		buf_file = debugfs_create_file("guc_log", mode,
> +				parent, buf, &relay_file_operations);

Alignment is wrong.

> +	}
> +
> +	/* This to enable the use of a single buffer for the relay channel and
> +	 * correspondingly have a single file exposed to User, through which
> +	 * it can collect the logs inorder without any post-processing.

"in order"

> +	 */
> +	*is_global = 1;
> +
> +	return buf_file;
> +}
> +
> +/*
> + * file_remove() default callback. Removes relay file in debugfs.
> + */
> +static int remove_buf_file_callback(struct dentry *dentry)
> +{
> +	debugfs_remove(dentry);
> +	return 0;
> +}
> +
> +/* relay channel callbacks */
> +static struct rchan_callbacks relay_callbacks = {
> +	.subbuf_start = subbuf_start_callback,
> +	.create_buf_file = create_buf_file_callback,
> +	.remove_buf_file = remove_buf_file_callback,
> +};
> +
> +static void guc_remove_log_relay_file(struct intel_guc *guc)
> +{
> +	relay_close(guc->log.relay_chan);
> +}
> +
> +static int guc_create_log_relay_file(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct rchan *guc_log_relay_chan;
> +	struct dentry *log_dir;
> +	size_t n_subbufs, subbuf_size;
> +
> +	/* For now create the log file in /sys/kernel/debug/dri/0 dir */

Where should it be, or where will it be later?

> +	log_dir = dev_priv->drm.primary->debugfs_root;
> +
> +	/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
> +	 * not mounted and so can't create the relay file.
> +	 * The relay API seems to fit well with debugfs only.
> +	 */
> +	if (!log_dir) {
> +		DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n");
> +		return -ENODEV;
> +	}
> +
> +	/* Keep the size of sub buffers same as shared log buffer */
> +	subbuf_size = guc->log.obj->base.size;

Insert blank line for readability probably.

> +	/* Store up to 8 snaphosts, which is large enough to buffer sufficient

snapshots

> +	 * boot time logs and provides enough leeway to User, in terms of
> +	 * latency, for consuming the logs from relay. Also doesn't take
> +	 * up too much memory.
> +         */

Indentation is off.

> +	n_subbufs = 8;
> +
> +	guc_log_relay_chan = relay_open("guc_log", log_dir,
> +			subbuf_size, n_subbufs, &relay_callbacks, dev_priv);

Alignment is wrong.

> +
> +	if (!guc_log_relay_chan) {
> +		DRM_DEBUG_DRIVER("Couldn't create relay chan for guc logs\n");
> +		return -ENOMEM;
> +	}
> +
> +	/* FIXME: Cover the update under a lock ? */

Another FIXME to be resolved.

> +	guc->log.relay_chan = guc_log_relay_chan;
> +	return 0;
>   }
>
>   static void guc_log_cleanup(struct intel_guc *guc)
> @@ -937,6 +1084,11 @@ static void guc_log_cleanup(struct intel_guc *guc)
>   	/* First disable the flush interrupt */
>   	gen9_disable_guc_interrupts(dev_priv);
>
> +	if (guc->log.relay_chan)
> +		guc_remove_log_relay_file(guc);
> +
> +	guc->log.relay_chan = NULL;
> +
>   	if (guc->log.buf_addr)
>   		i915_gem_object_unpin_map(guc->log.obj);
>
> @@ -1015,6 +1167,35 @@ static void guc_create_log(struct intel_guc *guc)
>   	guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
>   }
>
> +static int guc_log_late_setup(struct intel_guc *guc)

Make this static void if the failure cannot be handled, or otherwise act
on the return value?

> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	int ret;
> +
> +	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> +
> +	if (i915.guc_log_level < 0)
> +		return -EINVAL;
> +
> +	/* If log_level was set as -1 at boot time, then vmalloc mapping would
> +	 * not have been created for the log buffer, so create one now.
> +	 */
> +	ret = guc_create_log_extras(guc);
> +	if (ret)
> +		goto err;
> +
> +	ret = guc_create_log_relay_file(guc);
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +err:
> +	guc_log_cleanup(guc);
> +	/* logging will remain off */
> +	i915.guc_log_level = -1;
> +	return ret;
> +}
> +
>   static void init_guc_policies(struct guc_policies *policies)
>   {
>   	struct guc_policy *policy;
> @@ -1185,7 +1366,6 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
>   	gem_release_guc_obj(dev_priv->guc.ads_obj);
>   	guc->ads_obj = NULL;
>
> -	guc_log_cleanup(guc);
>   	gem_release_guc_obj(dev_priv->guc.log.obj);
>   	guc->log.obj = NULL;
>
> @@ -1261,3 +1441,23 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
>   	host2guc_logbuffer_flush_complete(&dev_priv->guc);
>   	intel_runtime_pm_put(dev_priv);
>   }
> +
> +void i915_guc_unregister(struct drm_i915_private *dev_priv)
> +{
> +	if (!i915.enable_guc_submission)
> +		return;
> +
> +	mutex_lock(&dev_priv->drm.struct_mutex);
> +	guc_log_cleanup(&dev_priv->guc);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +}
> +
> +void i915_guc_register(struct drm_i915_private *dev_priv)
> +{
> +	if (!i915.enable_guc_submission)
> +		return;
> +
> +	mutex_lock(&dev_priv->drm.struct_mutex);
> +	guc_log_late_setup(&dev_priv->guc);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +}
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 7c0bdba..96ef7dc 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -126,6 +126,7 @@ struct intel_guc_log {
>   	struct drm_i915_gem_object *obj;
>   	struct workqueue_struct *wq;
>   	void *buf_addr;
> +	struct rchan *relay_chan;
>   };
>
>   struct intel_guc {
> @@ -172,5 +173,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
>   void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
>   void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
>   void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
> +void i915_guc_register(struct drm_i915_private *dev_priv);
> +void i915_guc_unregister(struct drm_i915_private *dev_priv);
>
>   #endif
>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions
  2016-08-12  6:25 ` [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions akash.goel
@ 2016-08-12 13:55   ` Tvrtko Ursulin
  2016-08-12 15:01     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 13:55 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> With the addition of new Host2GuC actions related to GuC logging, there
> is a need of a lock to serialize them, as they can execute concurrently
> with each other and also with other existing actions.
>
> v2: Use mutex in place of spinlock to serialize, as sleep can happen
>      while waiting for the action's response from GuC. (Tvrtko)
>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
>   drivers/gpu/drm/i915/intel_guc.h           | 3 +++
>   2 files changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 1a2d648..cb9672b 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   		return -EINVAL;
>
>   	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +	mutex_lock(&guc->action_lock);

I would probably take the mutex before grabbing forcewake as a general 
rule. Not that I think it matters in this case since we don't expect any 
contention on this one.
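
A rough user-space sketch of the ordering I mean, assuming nothing about
the real helpers: the record() calls merely stand in for mutex_lock(),
intel_uncore_forcewake_get() and their counterparts, and all names here
are hypothetical. It only shows the mutex being taken first and dropped
last.

```c
#include <stddef.h>

/* The four steps whose relative order is being illustrated. */
enum step { TAKE_MUTEX, TAKE_FORCEWAKE, DROP_FORCEWAKE, DROP_MUTEX };

/* Records the order in which the steps actually ran. */
struct order_trace {
	enum step steps[4];
	size_t count;
};

static void record(struct order_trace *trace, enum step s)
{
	if (trace->count < 4)
		trace->steps[trace->count++] = s;
}

/*
 * Hypothetical action path: mutex first, then forcewake, and release in
 * the reverse order.
 */
static int send_action_sketch(struct order_trace *trace)
{
	record(trace, TAKE_MUTEX);
	record(trace, TAKE_FORCEWAKE);

	/* ... write the action registers and wait for the response ... */

	record(trace, DROP_FORCEWAKE);
	record(trace, DROP_MUTEX);
	return 0;
}
```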

>
>   	dev_priv->guc.action_count += 1;
>   	dev_priv->guc.action_cmd = data[0];
> @@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   	}
>   	dev_priv->guc.action_status = status;
>
> +	mutex_unlock(&guc->action_lock);
>   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>
>   	return ret;
> @@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>   		return -ENOMEM;
>
>   	ida_init(&guc->ctx_ids);
> +	mutex_init(&guc->action_lock);
>   	guc_create_log(guc);
>   	guc_create_ads(guc);
>
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 96ef7dc..e4ec8d8 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -156,6 +156,9 @@ struct intel_guc {
>
>   	uint64_t submissions[I915_NUM_ENGINES];
>   	uint32_t last_seqno[I915_NUM_ENGINES];
> +
> +	/* To serialize the Host2GuC actions */
> +	struct mutex action_lock;
>   };
>
>   /* intel_guc_loader.c */
>

With or without the mutex vs forcewake ordering change:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12 13:45     ` Goel, Akash
@ 2016-08-12 14:07       ` Tvrtko Ursulin
  2016-08-12 16:17         ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 14:07 UTC (permalink / raw)
  To: Goel, Akash, intel-gfx


On 12/08/16 14:45, Goel, Akash wrote:
>
>
> On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:
>>
>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>
>>> GuC ukernel sends an interrupt to Host to flush the log buffer
>>> and expects Host to correspondingly update the read pointer
>>> information in the state structure, once it has consumed the
>>> log buffer contents by copying them to a file or buffer.
>>> Even if Host couldn't copy the contents, it can still update the
>>> read pointer so that logging state is not disturbed on GuC side.
>>>
>>> v2:
>>> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
>>> - Reduce the overall log buffer copying time by skipping the copy of
>>>    crash buffer area for regular cases and copying only the state
>>>    structure data in first page.
>>>
>>> v3:
>>>   - Create a vmalloc mapping of log buffer. (Chris)
>>>   - Cover the flush acknowledgment under rpm get & put.(Chris)
>>>   - Revert the change of skipping the copy of crash dump area, as
>>>     not really needed, will be covered by subsequent patch.
>>>
>>> v4:
>>>   - Destroy the wq under the same condition in which it was created,
>>>     pass dev_priv pointer instead of dev to newly added GuC function,
>>>     add more comments & rename variable for clarity. (Tvrtko)
>>>
>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.c            |  14 +++
>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 150
>>> +++++++++++++++++++++++++++++
>>>   drivers/gpu/drm/i915/i915_irq.c            |   5 +-
>>>   drivers/gpu/drm/i915/intel_guc.h           |   3 +
>>>   4 files changed, 170 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index 0fcd1c0..fc2da32 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
>>> drm_i915_private *dev_priv)
>>>       if (dev_priv->hotplug.dp_wq == NULL)
>>>           goto out_free_wq;
>>>
>>> +    if (HAS_GUC_SCHED(dev_priv)) {
>>
>> This just reminded me that a previous patch had:
>>
>> +    if (HAS_GUC_UCODE(dev))
>> +        dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>>
>> In the interrupt setup. I don't think there is a bug right now, but
>> there is a disagreement between the two which would be good to resolve.
>>
>> This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
>> for correctness. I think.
>
> Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.
>
> As per Chris's comments, will move the wq init/destroy to the GuC logging
> setup/teardown routines (guc_create_log_extras, guc_log_cleanup).
> Are you fine with that?

Yes, that's OK I think.

>>
>>> +        /* Need a dedicated wq to process log buffer flush interrupts
>>> +         * from GuC without much delay so as to avoid any loss of logs.
>>> +         */
>>> +        dev_priv->guc.log.wq =
>>> +            alloc_ordered_workqueue("i915-guc_log", 0);
>>> +        if (dev_priv->guc.log.wq == NULL)
>>> +            goto out_free_hotplug_dp_wq;
>>> +    }
>>> +
>>>       return 0;
>>>
>>> +out_free_hotplug_dp_wq:
>>> +    destroy_workqueue(dev_priv->hotplug.dp_wq);
>>>   out_free_wq:
>>>       destroy_workqueue(dev_priv->wq);
>>>   out_err:
>>> @@ -782,6 +794,8 @@ out_err:
>>>
>>>   static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>>   {
>>> +    if (HAS_GUC_SCHED(dev_priv))
>>> +        destroy_workqueue(dev_priv->guc.log.wq);
>>>       destroy_workqueue(dev_priv->hotplug.dp_wq);
>>>       destroy_workqueue(dev_priv->wq);
>>>   }
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index c7c679f..2635b67 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
>>> intel_guc *guc,
>>>       return host2guc_action(guc, data, ARRAY_SIZE(data));
>>>   }
>>>
>>> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
>>> +{
>>> +    u32 data[1];
>>> +
>>> +    data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
>>> +
>>> +    return host2guc_action(guc, data, 1);
>>> +}
>>> +
>>>   /*
>>>    * Initialise, update, or clear doorbell data shared with the GuC
>>>    *
>>> @@ -840,6 +849,127 @@ err:
>>>       return NULL;
>>>   }
>>>
>>> +static void guc_move_to_next_buf(struct intel_guc *guc)
>>> +{
>>> +    return;
>>> +}
>>> +
>>> +static void* guc_get_write_buffer(struct intel_guc *guc)
>>> +{
>>> +    return NULL;
>>> +}
>>> +
>>> +static void guc_read_update_log_buffer(struct intel_guc *guc)
>>> +{
>>> +    struct guc_log_buffer_state *log_buffer_state,
>>> *log_buffer_snapshot_state;
>>> +    struct guc_log_buffer_state log_buffer_state_local;
>>> +    void *src_data_ptr, *dst_data_ptr;
>>> +    u32 i, buffer_size;
>>
>> unsigned int i if you can be bothered.
>
> Fine, will do that for both i & buffer_size.

buffer_size can match the type of log_buffer_state_local.size or use 
something else if more appropriate.

> But I remember that earlier, in one of the patches, you suggested using
> u32 as a type for some variables.
> Could you please share the guideline?
> Should u32, u64 be used when we are exactly sure of the range of the
> variable, like for variables containing register values?

Depends what the variable is. "i" in this case is just a standard 
counter so not appropriate to use u32. It is not that wrong on x86, just 
looks a bit out of place.

grep "u32 i;" has no hits in i915. :)

They can/should be used for variables tied with hardware or protocols. 
Or in some cases where you really want to save space by using a small type.
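
A small stand-alone sketch of that split, with made-up names
(NUM_LOG_BUFFERS and struct fw_state are hypothetical, not the i915
definitions): fixed-width types for values whose layout is dictated by
firmware or hardware, plain unsigned int for the loop counter.

```c
#include <stdint.h>

/* Hypothetical; stands in for GUC_MAX_LOG_BUFFER. */
#define NUM_LOG_BUFFERS 3

/* Hypothetical firmware-defined state: fixed-width fields. */
struct fw_state {
	uint32_t size;
};

static uint32_t total_size(const struct fw_state *state)
{
	uint32_t total = 0;	/* sum of protocol-defined 32-bit values */
	unsigned int i;		/* plain loop counter: no fixed width needed */

	for (i = 0; i < NUM_LOG_BUFFERS; i++)
		total += state[i].size;

	return total;
}
```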

>
>>
>>> +
>>> +    if (!guc->log.buf_addr)
>>> +        return;
>>
>> Can it hit this? If yes, I think better disable GuC logging when pin map
>> on the object fails rather than let it generate interrupts in vain.
>>
>>> +
>>> +    /* Get the pointer to shared GuC log buffer */
>>> +    log_buffer_state = src_data_ptr = guc->log.buf_addr;
>>> +
>>> +    /* Get the pointer to local buffer to store the logs */
>>> +    dst_data_ptr = log_buffer_snapshot_state =
>>> guc_get_write_buffer(guc);
>>> +
>>> +    /* Actual logs are present from the 2nd page */
>>> +    src_data_ptr += PAGE_SIZE;
>>> +    dst_data_ptr += PAGE_SIZE;
>>> +
>>> +    for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
>>> +        /* Make a copy of the state structure in GuC log buffer (which
>>> +         * is uncached mapped) on the stack to avoid reading from it
>>> +         * multiple times.
>>> +         */
>>> +        memcpy(&log_buffer_state_local, log_buffer_state,
>>> +                sizeof(struct guc_log_buffer_state));
>>> +        buffer_size = log_buffer_state_local.size;
>>
>> Needs checking (and logging) that a bad firmware or some other error
>> does not report a dangerous size (too big) which would then overwrite
>> memory pointed by dst_data_ptr memory. (And/or read from random memory.)
>>
> Have done the range checking for the read/write offset values, which are
> updated repeatedly by GuC firmware.
> The buffer size is updated only at init time by GuC firmware, hence less
> vulnerable.
>
> But nevertheless will add the checks.

Ok good.
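
For reference, a minimal user-space sketch of the kind of check I have in
mind. The helper name and its parameters are hypothetical, and how the
remaining source and destination space is tracked in the driver is left
open; the point is only that the reported size is validated before the
memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy one log region only if the firmware-reported size fits in what is
 * left of both the source and the destination. Returns 0 on success, -1
 * if the size is rejected (nothing is copied in that case).
 */
static int copy_log_region(void *dst, size_t dst_left,
			   const void *src, size_t src_left,
			   uint32_t reported_size)
{
	if (reported_size > dst_left || reported_size > src_left)
		return -1;

	memcpy(dst, src, reported_size);
	return 0;
}
```

On rejection the caller could log a rate limited error and skip the
copy for that buffer type.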

>>> +
>>> +        if (log_buffer_snapshot_state) {
>>> +            /* First copy the state structure in local buffer */
>>
>> Maybe "destination buffer" would be clearer?
>
> Sorry my bad, well spotted.
>
>>
>>> +            memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
>>> +                    sizeof(struct guc_log_buffer_state));
>>> +
>>> +            /* The write pointer could have been updated by the GuC
>>> +             * firmware, after sending the flush interrupt to Host,
>>> +             * for consistency set the write pointer value to same
>>> +             * value of sampled_write_ptr in the snapshot buffer.
>>> +             */
>>> +            log_buffer_snapshot_state->write_ptr =
>>> +                log_buffer_snapshot_state->sampled_write_ptr;
>>> +
>>> +            log_buffer_snapshot_state++;
>>> +
>>> +            /* Now copy the actual logs */
>>> +            memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>>> +
>>> +            src_data_ptr += buffer_size;
>>> +            dst_data_ptr += buffer_size;
>>> +        }
>>> +
>>> +        /* FIXME: invalidate/flush for log buffer needed */
>>
>> Yes no maybe? :)
>
> Would like to keep it, if you allow.

I think you need a really good justification to get r-b on patches with 
FIXMEs, especially like this one. Do you maybe handle it or remove it in 
a following patch or something?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts
  2016-08-12  6:25 ` [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts akash.goel
@ 2016-08-12 14:26   ` Tvrtko Ursulin
  2016-08-12 14:56     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 14:26 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> GuC firmware sends an interrupt to flush the log buffer when it
> becomes half full. GuC firmware also tracks how many times the
> buffer overflowed.
> It would be useful to maintain a statistics of how many flush
> interrupts were received and for which type of log buffer,
> along with the overflow count of each buffer type.
> Augmented i915_log_info debugfs to report back these statistics.
>
> v2:
> - Update the logic to detect multiple overflows between the 2
>    flush interrupts and also log a message for overflow (Tvrtko)
> - Track the number of times there was no free sub buffer to capture
>    the GuC log buffer. (Tvrtko)
>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        | 28 ++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++++++++++++++++++
>   drivers/gpu/drm/i915/i915_irq.c            |  2 ++
>   drivers/gpu/drm/i915/intel_guc.h           |  7 +++++++
>   4 files changed, 56 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 51b59d5..14e0dcf 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
>   	return 0;
>   }
>
> +static void i915_guc_log_info(struct seq_file *m,
> +				 struct drm_i915_private *dev_priv)
> +{
> +	struct intel_guc *guc = &dev_priv->guc;
> +
> +	seq_printf(m, "\nGuC logging stats:\n");
> +
> +	seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
> +		guc->log.flush_count[GUC_ISR_LOG_BUFFER],
> +		guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
> +
> +	seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
> +		guc->log.flush_count[GUC_DPC_LOG_BUFFER],
> +		guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
> +
> +	seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
> +		guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
> +		guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);

Why is the width for overflow only 8 chars and not 10 like for flush 
since both are u32?
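
To illustrate: the largest u32 value, 4294967295, has 10 digits, so a
field width of 8 is a minimum that large counts will exceed, which would
break column alignment. A quick user-space check (printed_width() is a
hypothetical helper, plain snprintf underneath):

```c
#include <stdint.h>
#include <stdio.h>

/* Number of characters "%<width>u" produces for a given value. */
static int printed_width(int width, uint32_t value)
{
	char buf[32];

	return snprintf(buf, sizeof(buf), "%*u", width, (unsigned int)value);
}
```

With width 10 every u32 value prints in exactly 10 characters; with
width 8 the largest values spill over to 10.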

> +
> +	seq_printf(m, "\tTotal flush interrupt count: %u\n",
> +		       guc->log.flush_interrupt_count);
> +
> +	seq_printf(m, "\tCapture miss count: %u\n",
> +		       guc->log.capture_miss_count);
> +}
> +
>   static void i915_guc_client_info(struct seq_file *m,
>   				 struct drm_i915_private *dev_priv,
>   				 struct i915_guc_client *client)
> @@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
>   	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
>   	i915_guc_client_info(m, dev_priv, &client);
>
> +	i915_guc_log_info(m, dev_priv);
> +
>   	/* Add more as required ... */
>
>   	return 0;
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index cb9672b..1ca1866 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   				sizeof(struct guc_log_buffer_state));
>   		buffer_size = log_buffer_state_local.size;
>
> +		guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
> +		if (log_buffer_state_local.buffer_full_cnt !=
> +					guc->log.prev_overflow_count[i]) {
> +			guc->log.total_overflow_count[i] +=
> +				(log_buffer_state_local.buffer_full_cnt -
> +				 guc->log.prev_overflow_count[i]);
> +
> +			if (log_buffer_state_local.buffer_full_cnt <
> +					guc->log.prev_overflow_count[i]) {
> +				/* buffer_full_cnt is a 4 bit counter */
> +				guc->log.total_overflow_count[i] += 16;
> +			}
> +
> +			guc->log.prev_overflow_count[i] =
> +					log_buffer_state_local.buffer_full_cnt;
> +			DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
> +		}
> +
>   		if (log_buffer_snapshot_state) {
>   			/* First copy the state structure in local buffer */
>   			memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
> @@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   		 * getting consumed by User at a slow rate.
>   		 */
>   		DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
> +		guc->log.capture_miss_count++;
>   	}
>   }
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index d4d6f0a..b08d1d2 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
>   				/* Handle flush interrupt event in bottom half */
>   				queue_work(dev_priv->guc.log.wq,
>   						&dev_priv->guc.events_work);
> +
> +				dev_priv->guc.log.flush_interrupt_count++;
>   			}
>   		}
>   	}
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index e4ec8d8..ed87e98 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -127,6 +127,13 @@ struct intel_guc_log {
>   	struct workqueue_struct *wq;
>   	void *buf_addr;
>   	struct rchan *relay_chan;
> +
> +	/* logging related stats */
> +	u32 capture_miss_count;
> +	u32 flush_interrupt_count;
> +	u32 prev_overflow_count[GUC_MAX_LOG_BUFFER];
> +	u32 total_overflow_count[GUC_MAX_LOG_BUFFER];
> +	u32 flush_count[GUC_MAX_LOG_BUFFER];
>   };
>
>   struct intel_guc {
>

Whether you keep the formatting widths above as they are or adjust them:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
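
For reference, the wrap handling of the 4 bit buffer_full_cnt in the hunk above can be tried in isolation. This is only a minimal userspace sketch, not the driver code: overflow_delta(), overflow_update() and OVERFLOW_CNT_MODULUS are hypothetical names, and it assumes fewer than 16 overflows happen between two samples, since a 4 bit counter cannot tell more than that apart.

```c
#include <stdint.h>

/* buffer_full_cnt is a 4 bit counter, so it wraps modulo 16. */
#define OVERFLOW_CNT_MODULUS 16u

/*
 * Number of new overflows between two samples of the 4 bit counter.
 * Hypothetical helper: assumes fewer than 16 overflows between samples.
 */
static uint32_t overflow_delta(uint32_t prev, uint32_t cur)
{
	/* Unsigned subtraction wraps; reducing modulo 16 undoes the wrap. */
	return (cur - prev) % OVERFLOW_CNT_MODULUS;
}

/* Fold a new sample into the running total and remember it. */
static void overflow_update(uint32_t *total, uint32_t *prev, uint32_t cur)
{
	*total += overflow_delta(*prev, cur);
	*prev = cur;
}
```

For a previous value of 14 and a current value of 2 this gives 4, the same result as adding the (wrapped) difference and then 16 as the patch does.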


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12 13:31       ` Tvrtko Ursulin
@ 2016-08-12 14:31         ` Goel, Akash
  2016-08-12 15:05           ` Tvrtko Ursulin
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 14:31 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 14:10, Goel, Akash wrote:
>> On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:
>>>
>>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>>
>>>> There are certain types of interrupts which Host can receive from GuC.
>>>> GuC ukernel sends an interrupt to Host for certain events, like for
>>>> example retrieve/consume the logs generated by ukernel.
>>>> This patch adds support to receive interrupts from GuC but currently
>>>> enables & partially handles only the interrupt sent by GuC ukernel.
>>>> Future patches will add support for handling other interrupt types.
>>>>
>>>> v2:
>>>> - Use common low level routines for PM IER/IIR programming (Chris)
>>>> - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
>>>> - Replace disabling of wake ref asserts with rpm get/put (Chris)
>>>>
>>>> v3:
>>>> - Update comments for more clarity. (Tvrtko)
>>>> - Remove the masking of GuC interrupt, which was kept masked till the
>>>>    start of bottom half, it's not really needed as there is only a
>>>>    single instance of work item & wq is ordered. (Tvrtko)
>>>>
>>>> v4:
>>>> - Rebase.
>>>> - Rename guc_events to pm_guc_events so as to be indicative of the
>>>>    register/control block it is associated with. (Chris)
>>>> - Add handling for back to back log buffer flush interrupts.
>>>>
>>>> v5:
>>>> - Move the read & clearing of register, containing Guc2Host message
>>>>    bits, outside the irq spinlock. (Tvrtko)
>>>>
>>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_drv.h            |   1 +
>>>>   drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
>>>>   drivers/gpu/drm/i915/i915_irq.c            | 100
>>>> +++++++++++++++++++++++++++--
>>>>   drivers/gpu/drm/i915/i915_reg.h            |  11 ++++
>>>>   drivers/gpu/drm/i915/intel_drv.h           |   3 +
>>>>   drivers/gpu/drm/i915/intel_guc.h           |   4 ++
>>>>   drivers/gpu/drm/i915/intel_guc_loader.c    |   4 ++
>>>>   7 files changed, 124 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>> index a608a5c..28ffac5 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -1779,6 +1779,7 @@ struct drm_i915_private {
>>>>       u32 pm_imr;
>>>>       u32 pm_ier;
>>>>       u32 pm_rps_events;
>>>> +    u32 pm_guc_events;
>>>>       u32 pipestat_irq_mask[I915_MAX_PIPES];
>>>>
>>>>       struct i915_hotplug hotplug;
>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> index ad3b55f..c7c679f 100644
>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
>>>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>>>           return 0;
>>>>
>>>> +    gen9_disable_guc_interrupts(dev_priv);
>>>> +
>>>>       ctx = dev_priv->kernel_context;
>>>>
>>>>       data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
>>>> @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
>>>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>>>           return 0;
>>>>
>>>> +    if (i915.guc_log_level >= 0)
>>>> +        gen9_enable_guc_interrupts(dev_priv);
>>>> +
>>>>       ctx = dev_priv->kernel_context;
>>>>
>>>>       data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
>>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>>>> b/drivers/gpu/drm/i915/i915_irq.c
>>>> index 5f93309..5f1974f 100644
>>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>>> @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
>>>> drm_i915_private *dev_priv,
>>>>   } while (0)
>>>>
>>>>   static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
>>>> u32 pm_iir);
>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>> u32 pm_iir);
>>>>
>>>>   /* For display hotplug interrupt */
>>>>   static inline void
>>>> @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
>>>> drm_i915_private *dev_priv)
>>>>       gen6_reset_rps_interrupts(dev_priv);
>>>>   }
>>>>
>>>> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
>>>> +{
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +}
>>>> +
>>>> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
>>>> +{
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>> +        WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
>>>> +                        dev_priv->pm_guc_events);
>>>> +        dev_priv->guc.interrupts_enabled = true;
>>>> +        gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>>>> +    }
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +}
>>>> +
>>>> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
>>>> +{
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    dev_priv->guc.interrupts_enabled = false;
>>>> +
>>>> +    gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>>>> +
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +    synchronize_irq(dev_priv->drm.irq);
>>>> +
>>>> +    gen9_reset_guc_interrupts(dev_priv);
>>>> +}
>>>> +
>>>>   /**
>>>>    * bdw_update_port_irq - update DE port interrupt
>>>>    * @dev_priv: driver private
>>>> @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct
>>>> *work)
>>>>       mutex_unlock(&dev_priv->rps.hw_lock);
>>>>   }
>>>>
>>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>>> +{
>>>> +    struct drm_i915_private *dev_priv =
>>>> +        container_of(work, struct drm_i915_private, guc.events_work);
>>>> +
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    /* Speed up work cancellation during disabling guc interrupts. */
>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>>> +        return;
>>>
>>> I suppose locking for early exit is something about ensuring the worker
>>> sees the update to dev_priv->guc.interrupts_enabled done on another CPU?
>>
>> Yes locking (providing implicit barrier) will ensure that update made
>> from another CPU is immediately visible to the worker.
>
> What if the disable happens after the unlock above? It would wait in
> disable until the irq handler exits.
Most probably it will not have to wait: if the work item has begun
executing, the irq handler will already have completed.
The irq handler just queues the work item, which gets scheduled later on.

Taking the lock helps in the case where the work item executes at around
the same time as the interrupts are being disabled.
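
To make the pattern under discussion concrete, here is a userspace sketch of a worker that samples the flag under the same lock that the disable path takes. It is an analogue only: a pthread mutex stands in for irq_lock, and struct guc_irq_state, guc_irq_disable() and guc_events_work() are hypothetical names, not the driver's.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state guarded by irq_lock. */
struct guc_irq_state {
	pthread_mutex_t lock;
	bool interrupts_enabled;
	unsigned int events_handled;
};

/* Disable path: clear the flag while holding the lock. */
static void guc_irq_disable(struct guc_irq_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->interrupts_enabled = false;
	pthread_mutex_unlock(&s->lock);
}

/*
 * Worker: sample the flag under the same lock and bail out early if
 * interrupts were disabled. Returns true if events were handled.
 */
static bool guc_events_work(struct guc_irq_state *s)
{
	bool enabled;

	pthread_mutex_lock(&s->lock);
	enabled = s->interrupts_enabled;
	pthread_mutex_unlock(&s->lock);

	if (!enabled)
		return false;

	/* The flag may change from here on; the check only narrows the window. */
	s->events_handled++;
	return true;
}
```

As the comment in the worker notes, a disable that lands right after the unlock is not caught by this check, which is the case raised in the question above.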

> So the same as if not bothering
> with the spinlock above, no?
>
>>> synchronize_irq there is not enough for some reason?
>>>
>> synchronize_irq would not be enough, its for a different purpose, to
>> ensure that any ongoing handling of irq completes (after the caller has
>> disabled the irq).
>>
>> As per my understanding synchronize_irq won't have an effect on the
>> worker, with respect to the moment when the update of
>> 'interrupts_enabled' flag is visible to the worker.
>>
>>>> +    }
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +
>>>> +    /* TODO: Handle the events for which GuC interrupted host */
>>>> +}
>>>>
>>>>   /**
>>>>    * ivybridge_parity_work - Workqueue called when a parity error
>>>> interrupt
>>>> @@ -1339,11 +1387,13 @@ static irqreturn_t gen8_gt_irq_ack(struct
>>>> drm_i915_private *dev_priv,
>>>>               DRM_ERROR("The master control interrupt lied (GT3)!\n");
>>>>       }
>>>>
>>>> -    if (master_ctl & GEN8_GT_PM_IRQ) {
>>>> +    if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
>>>>           gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
>>>> -        if (gt_iir[2] & dev_priv->pm_rps_events) {
>>>> +        if (gt_iir[2] & (dev_priv->pm_rps_events |
>>>> +                 dev_priv->pm_guc_events)) {
>>>>               I915_WRITE_FW(GEN8_GT_IIR(2),
>>>> -                      gt_iir[2] & dev_priv->pm_rps_events);
>>>> +                      gt_iir[2] & (dev_priv->pm_rps_events |
>>>> +                           dev_priv->pm_guc_events));
>>>>               ret = IRQ_HANDLED;
>>>>           } else
>>>>               DRM_ERROR("The master control interrupt lied (PM)!\n");
>>>> @@ -1375,6 +1425,9 @@ static void gen8_gt_irq_handler(struct
>>>> drm_i915_private *dev_priv,
>>>>
>>>>       if (gt_iir[2] & dev_priv->pm_rps_events)
>>>>           gen6_rps_irq_handler(dev_priv, gt_iir[2]);
>>>> +
>>>> +    if (gt_iir[2] & dev_priv->pm_guc_events)
>>>> +        gen9_guc_irq_handler(dev_priv, gt_iir[2]);
>>>>   }
>>>>
>>>>   static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
>>>> @@ -1621,6 +1674,41 @@ static void gen6_rps_irq_handler(struct
>>>> drm_i915_private *dev_priv, u32 pm_iir)
>>>>       }
>>>>   }
>>>>
>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>> u32 gt_iir)
>>>> +{
>>>> +    bool interrupts_enabled;
>>>> +
>>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>>> +        spin_lock(&dev_priv->irq_lock);
>>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>>> +        spin_unlock(&dev_priv->irq_lock);
>>>
>>> Not sure that taking a lock around only this read is needed.
>>>
>> Again same reason as above, to make sure an update made on another CPU
>> is immediately visible to the irq handler.
>
> I don't get it, see above. :)

Here also, if interrupt disabling & ISR execution happen at around the same
time, the ISR might miss the reset of the 'interrupts_enabled' flag and
queue new work.
The same applies when the interrupt is re-enabled: the ISR might
still see the 'interrupts_enabled' flag as false.
It will eventually see the update though.

>
>>>> +        if (interrupts_enabled) {
>>>> +            /* Sample the log buffer flush related bits & clear them
>>>> +             * out now itself from the message identity register to
>>>> +             * minimize the probability of losing a flush interrupt,
>>>> +             * when there are back to back flush interrupts.
>>>> +             * There can be a new flush interrupt, for different log
>>>> +             * buffer type (like for ISR), whilst Host is handling
>>>> +             * one (for DPC). Since same bit is used in message
>>>> +             * register for ISR & DPC, it could happen that GuC
>>>> +             * sets the bit for 2nd interrupt but Host clears out
>>>> +             * the bit on handling the 1st interrupt.
>>>> +             */
>>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>>> +            if (msg) {
>>>> +                /* Clear the message bits that are handled */
>>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>>
>>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>>> twice?
>>>
>> Thought reading it again (just before the update) is bit safer compared
>> to reading it once, as there is a potential race problem here.
>> GuC could also write to the SOFT_SCRATCH(15) register, set new events
>> bit, while Host clears off the bit of handled events.
>
> Don't get it. If there is a race between read and write there still is,
> don't see how a second read makes it safer.
>
Yes, the race can't be avoided completely by double reads, but they can
reduce the race window size.

Also I felt the code looked better in its current form, as the macros
GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
only once.

Will change as per the initial implementation.

	u32 msg = I915_READ(SOFT_SCRATCH(15));
	if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
		   GUC2HOST_MSG_FLUSH_LOG_BUFFER)) {
		msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
			 GUC2HOST_MSG_FLUSH_LOG_BUFFER);
		I915_WRITE(SOFT_SCRATCH(15), msg);
	}
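
The single-read approach can also be tried outside the driver. The following is a minimal sketch under stated assumptions: a plain variable stands in for SOFT_SCRATCH(15), the bit positions are made up for illustration, and guc_take_handled_msg() is a hypothetical helper. It is still not atomic with respect to GuC writing the register between the read and the write.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)
#define MSG_HANDLED_MASK (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER)

/*
 * Read the register once, clear only the bits handled here and write
 * the rest back. Returns the handled bits that were set.
 * 'reg' is a plain variable standing in for the MMIO register.
 */
static uint32_t guc_take_handled_msg(volatile uint32_t *reg)
{
	uint32_t val = *reg;                   /* single read */
	uint32_t msg = val & MSG_HANDLED_MASK; /* bits we handle */

	if (msg)
		*reg = val & ~msg;             /* leave other bits untouched */

	return msg;
}
```

Returning the handled bits keeps them available to the caller after the register has been updated, so the value does not need to be read again to decide what to queue.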


>>> Also, is the RMW outside any locks safe?
>>>
>>
>> Ideally need a way to atomically do the RMW, i.e. read the register
>> value, clear off the handled events bit and update the register with the
>> modified value.
>>
>> Please kindly suggest how to address the above.
>> Or can this be left as a TODO, when we do start handling other events
>> also.
>
> From the comment in code above it sounds like a GuC fw interface
> shortcoming - that there is a single bit for two different interrupt
> sources, is that right?
Yes, that shortcoming is there: the GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used
for conveying the flush for both the ISR & DPC log buffers.

> Is there any other register or something that
> you can read to detect that the interrupt has been re-asserted while in
> the irq handler?


> Although I thought you said before that the GuC will
> not do that - that it won't re-assert the interrupt before we send the
> flush command.
Yes that is the case, but only with respect to one type of log buffer. For
example, unless the GuC firmware receives the ack for the DPC log
buffer it won't send a new flush for the DPC buffer, but if meanwhile the ISR
buffer becomes half full it will send a flush interrupt.
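
Since the one message bit cannot say which buffer asked for the flush, one way to picture the handling is to walk every buffer's state and look at its per-buffer flush flag. The sketch below assumes that the flush_to_file field read elsewhere in this series is such a flag; struct log_state, the enum and pending_flushes() are hypothetical and cut down for illustration.

```c
#include <stdint.h>

/* Hypothetical buffer indices mirroring the three log buffer types. */
enum { LOG_ISR, LOG_DPC, LOG_CRASH_DUMP, LOG_BUFFER_COUNT };

/* Cut-down, hypothetical stand-in for struct guc_log_buffer_state. */
struct log_state {
	uint32_t flush_to_file; /* non-zero when firmware asked for a flush */
};

/*
 * Return a bitmask of buffer types whose flush flag is set, clearing
 * each flag as it is consumed.
 */
static uint32_t pending_flushes(struct log_state st[LOG_BUFFER_COUNT])
{
	uint32_t mask = 0;
	int i;

	for (i = 0; i < LOG_BUFFER_COUNT; i++) {
		if (st[i].flush_to_file) {
			mask |= 1u << i;
			st[i].flush_to_file = 0;
		}
	}
	return mask;
}
```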

>
>>>> +
>>>> +                /* Handle flush interrupt event in bottom half */
>>>> +                queue_work(dev_priv->wq, &dev_priv->guc.events_work);
>>>
>>> IMHO it would be nicer if the code started straight away with a final wq
>>> solution.
>>>
>>> Especially since the next patch in the series is called "Handle log
>>> buffer flush interrupt event from GuC" and the actual handling of the
>>> log buffer flush interrupt is split between this one
>>> (GUC2HOST_MSG_FLUSH_LOG_BUFFER above) and that one.
>>>
>>> So it would almost be nicer that the above chunk which handles
>>> GUC2HOST_MSG_FLUSH_LOG_BUFFER and the worker init is only added in the
>>> next patch and this one only does the generic bits.
>>>
>>
>> Fine will move the log buffer flush interrupt event related stuff to the
>> next patch and so irq handler in this patch will just be a
>> placeholder.
>
> Great thanks!
>
>>> I don't know.. I'll leave it on your conscience - if you think the split
>>> (series) can't be done any nicer or it makes sense to have it in this
>>> order then ok.
>>>
>>>> +            }
>>>
>>> Maybe:
>>>
>>>     } else
>>>
>>> And log something unexpected has happened in the !msg case?
>>>
>>> Since it won't clear the message in that case so would it keep
>>> triggering?
>>>
>>
>> Actually after enabling of GuC interrupt, there can be interrupts from
>> GuC side for some other events which are right now not handled by Host.
>>
>> But not clearing of unhandled event bits won't result in re-triggering
>> of the interrupt.
>
> Ok I suggest documenting that as a comment in code then.
>
Fine will add a comment in the else case.
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
  2016-08-12  6:25 ` [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer akash.goel
@ 2016-08-12 14:42   ` Tvrtko Ursulin
  2016-08-12 14:48     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 14:42 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> GuC firmware sends an interrupt to flush the log buffer when it becomes
> half full, so Driver doesn't really need to sample the complete buffer
> and can just copy only the newly written data by GuC into the local
> buffer, i.e. as per the read & write pointer values.
> Moreover the flush interrupt would generally come for one type of log
> buffer, when it becomes half full, so at that time the other 2 types of
> log buffer would comparatively have much less unread data in them.
> In case of overflow reported by GuC, Driver does need to copy the entire
> buffer as the whole buffer would contain the unread data.
>
> v2: Rebase.
>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 40 +++++++++++++++++++++++++-----
>   1 file changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 1ca1866..8e0f360 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   	struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
>   	struct guc_log_buffer_state log_buffer_state_local;
>   	void *src_data_ptr, *dst_data_ptr;
> -	u32 i, buffer_size;
> +	bool new_overflow;
> +	u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;
>
>   	if (!guc->log.buf_addr)
>   		return;
> @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   		memcpy(&log_buffer_state_local, log_buffer_state,
>   				sizeof(struct guc_log_buffer_state));
>   		buffer_size = log_buffer_state_local.size;
> +		read_offset = log_buffer_state_local.read_ptr;
> +		write_offset = log_buffer_state_local.sampled_write_ptr;
>
>   		guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
>   		if (log_buffer_state_local.buffer_full_cnt !=
>   					guc->log.prev_overflow_count[i]) {

Wrong alignment. You can try checkpatch.pl for all of those.

> +			new_overflow = 1;

true/false since it is a bool

>   			guc->log.total_overflow_count[i] +=
>   				(log_buffer_state_local.buffer_full_cnt -
>   				 guc->log.prev_overflow_count[i]);
> @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   			guc->log.prev_overflow_count[i] =
>   					log_buffer_state_local.buffer_full_cnt;
>   			DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
> -		}
> +		} else
> +			new_overflow = 0;
>
>   		if (log_buffer_snapshot_state) {
>   			/* First copy the state structure in local buffer */
> @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   			 * for consistency set the write pointer value to same
>   			 * value of sampled_write_ptr in the snapshot buffer.
>   			 */
> -			log_buffer_snapshot_state->write_ptr =
> -				log_buffer_snapshot_state->sampled_write_ptr;
> +			log_buffer_snapshot_state->write_ptr = write_offset;
>
>   			log_buffer_snapshot_state++;
>
>   			/* Now copy the actual logs */
>   			memcpy(dst_data_ptr, src_data_ptr, buffer_size);

The confusing bit - the memcpy above still copies the whole buffer, no?

> +			if (unlikely(new_overflow)) {
> +				/* copy the whole buffer in case of overflow */
> +				read_offset = 0;
> +				write_offset = buffer_size;
> +			} else if (unlikely((read_offset > buffer_size) ||
> +					    (write_offset > buffer_size))) {
> +				DRM_ERROR("invalid log buffer state\n");
> +				/* copy whole buffer as offsets are unreliable */
> +				read_offset = 0;
> +				write_offset = buffer_size;
> +			}
> +
> +			/* Just copy the newly written data */
> +			if (read_offset <= write_offset) {
> +				bytes_to_copy = write_offset - read_offset;
> +				memcpy(dst_data_ptr + read_offset,
> +				     src_data_ptr + read_offset, bytes_to_copy);
> +			} else {
> +				bytes_to_copy = buffer_size - read_offset;
> +				memcpy(dst_data_ptr + read_offset,
> +				     src_data_ptr + read_offset, bytes_to_copy);
> +
> +				bytes_to_copy = write_offset;
> +				memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
> +			}
>
>   			src_data_ptr += buffer_size;
>   			dst_data_ptr += buffer_size;
> @@ -956,8 +985,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   		/* FIXME: invalidate/flush for log buffer needed */
>
>   		/* Update the read pointer in the shared log buffer */
> -		log_buffer_state->read_ptr =
> -			log_buffer_state_local.sampled_write_ptr;
> +		log_buffer_state->read_ptr = write_offset;
>
>   		/* Clear the 'flush to file' flag */
>   		log_buffer_state->flush_to_file = 0;
>

Regards,

Tvrtko
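
For reference, the partial copy that the patch is aiming for, without the leftover full-buffer memcpy, can be sketched as a standalone helper. This is a userspace sketch under assumptions, not the driver code: copy_new_log_data() is a hypothetical name, and the offsets are taken to be byte offsets into a buffer of buffer_size bytes, as in the hunk above.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy only the bytes between read_offset and write_offset from src to
 * dst, both buffer_size bytes long, handling wrap-around. Falls back to
 * copying everything on overflow or out-of-range offsets.
 * Returns the number of bytes copied. Hypothetical helper name.
 */
static uint32_t copy_new_log_data(uint8_t *dst, const uint8_t *src,
				  uint32_t buffer_size, uint32_t read_offset,
				  uint32_t write_offset, int overflow)
{
	uint32_t n;

	if (overflow || read_offset > buffer_size ||
	    write_offset > buffer_size) {
		read_offset = 0;
		write_offset = buffer_size;
	}

	if (read_offset <= write_offset) {
		n = write_offset - read_offset;
		memcpy(dst + read_offset, src + read_offset, n);
		return n;
	}

	/* Wrapped: tail of the buffer first, then the head. */
	n = buffer_size - read_offset;
	memcpy(dst + read_offset, src + read_offset, n);
	memcpy(dst, src, write_offset);
	return n + write_offset;
}
```

With an 8 byte buffer, read_offset 6 and write_offset 2, this copies bytes 6..7 and then 0..1, and leaves the rest of the destination untouched.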

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
  2016-08-12 14:42   ` Tvrtko Ursulin
@ 2016-08-12 14:48     ` Goel, Akash
  2016-08-12 15:06       ` Tvrtko Ursulin
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 14:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> GuC firmware sends an interrupt to flush the log buffer when it becomes
>> half full, so Driver doesn't really need to sample the complete buffer
>> and can just copy only the newly written data by GuC into the local
>> buffer, i.e. as per the read & write pointer values.
>> Moreover the flush interrupt would generally come for one type of log
>> buffer, when it becomes half full, so at that time the other 2 types of
>> log buffer would comparatively have much less unread data in them.
>> In case of overflow reported by GuC, Driver does need to copy the entire
>> buffer as the whole buffer would contain the unread data.
>>
>> v2: Rebase.
>>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 40
>> +++++++++++++++++++++++++-----
>>   1 file changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 1ca1866..8e0f360 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>       struct guc_log_buffer_state *log_buffer_state,
>> *log_buffer_snapshot_state;
>>       struct guc_log_buffer_state log_buffer_state_local;
>>       void *src_data_ptr, *dst_data_ptr;
>> -    u32 i, buffer_size;
>> +    bool new_overflow;
>> +    u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;
>>
>>       if (!guc->log.buf_addr)
>>           return;
>> @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>           memcpy(&log_buffer_state_local, log_buffer_state,
>>                   sizeof(struct guc_log_buffer_state));
>>           buffer_size = log_buffer_state_local.size;
>> +        read_offset = log_buffer_state_local.read_ptr;
>> +        write_offset = log_buffer_state_local.sampled_write_ptr;
>>
>>           guc->log.flush_count[i] +=
>> log_buffer_state_local.flush_to_file;
>>           if (log_buffer_state_local.buffer_full_cnt !=
>>                       guc->log.prev_overflow_count[i]) {
>
> Wrong alignment. You can try checkpatch.pl for all of those.
>
Sorry for all the alignment & indentation issues.

Should the above condition be written like this?

	if (log_buffer_state_local.buffer_full_cnt !=
	    guc->log.prev_overflow_count[i]) {


>> +            new_overflow = 1;
>
> true/false since it is a bool
Fine, will do that.
>
>>               guc->log.total_overflow_count[i] +=
>>                   (log_buffer_state_local.buffer_full_cnt -
>>                    guc->log.prev_overflow_count[i]);
>> @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>               guc->log.prev_overflow_count[i] =
>>                       log_buffer_state_local.buffer_full_cnt;
>>               DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
>> -        }
>> +        } else
>> +            new_overflow = 0;
>>
>>           if (log_buffer_snapshot_state) {
>>               /* First copy the state structure in local buffer */
>> @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>                * for consistency set the write pointer value to same
>>                * value of sampled_write_ptr in the snapshot buffer.
>>                */
>> -            log_buffer_snapshot_state->write_ptr =
>> -                log_buffer_snapshot_state->sampled_write_ptr;
>> +            log_buffer_snapshot_state->write_ptr = write_offset;
>>
>>               log_buffer_snapshot_state++;
>>
>>               /* Now copy the actual logs */
>>               memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>
> The confusing bit - the memcpy above still copies the whole buffer, no?
>
Really very sorry for this blooper.

Best regards
Akash

>> +            if (unlikely(new_overflow)) {
>> +                /* copy the whole buffer in case of overflow */
>> +                read_offset = 0;
>> +                write_offset = buffer_size;
>> +            } else if (unlikely((read_offset > buffer_size) ||
>> +                        (write_offset > buffer_size))) {
>> +                DRM_ERROR("invalid log buffer state\n");
>> +                /* copy whole buffer as offsets are unreliable */
>> +                read_offset = 0;
>> +                write_offset = buffer_size;
>> +            }
>> +
>> +            /* Just copy the newly written data */
>> +            if (read_offset <= write_offset) {
>> +                bytes_to_copy = write_offset - read_offset;
>> +                memcpy(dst_data_ptr + read_offset,
>> +                     src_data_ptr + read_offset, bytes_to_copy);
>> +            } else {
>> +                bytes_to_copy = buffer_size - read_offset;
>> +                memcpy(dst_data_ptr + read_offset,
>> +                     src_data_ptr + read_offset, bytes_to_copy);
>> +
>> +                bytes_to_copy = write_offset;
>> +                memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
>> +            }
>>
>>               src_data_ptr += buffer_size;
>>               dst_data_ptr += buffer_size;
>> @@ -956,8 +985,7 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>           /* FIXME: invalidate/flush for log buffer needed */
>>
>>           /* Update the read pointer in the shared log buffer */
>> -        log_buffer_state->read_ptr =
>> -            log_buffer_state_local.sampled_write_ptr;
>> +        log_buffer_state->read_ptr = write_offset;
>>
>>           /* Clear the 'flush to file' flag */
>>           log_buffer_state->flush_to_file = 0;
>>
>
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts
  2016-08-12 14:26   ` Tvrtko Ursulin
@ 2016-08-12 14:56     ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 14:56 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 7:56 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> GuC firmware sends an interrupt to flush the log buffer when it
>> becomes half full. GuC firmware also tracks how many times the
>> buffer overflowed.
>> It would be useful to maintain statistics of how many flush
>> interrupts were received and for which type of log buffer,
>> along with the overflow count of each buffer type.
>> Augmented i915_log_info debugfs to report back these statistics.
>>
>> v2:
>> - Update the logic to detect multiple overflows between the 2
>>    flush interrupts and also log a message for overflow (Tvrtko)
>> - Track the number of times there was no free sub buffer to capture
>>    the GuC log buffer. (Tvrtko)
>>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c        | 28
>> ++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_irq.c            |  2 ++
>>   drivers/gpu/drm/i915/intel_guc.h           |  7 +++++++
>>   4 files changed, 56 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 51b59d5..14e0dcf 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct
>> seq_file *m, void *data)
>>       return 0;
>>   }
>>
>> +static void i915_guc_log_info(struct seq_file *m,
>> +                 struct drm_i915_private *dev_priv)
>> +{
>> +    struct intel_guc *guc = &dev_priv->guc;
>> +
>> +    seq_printf(m, "\nGuC logging stats:\n");
>> +
>> +    seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
>> +        guc->log.flush_count[GUC_ISR_LOG_BUFFER],
>> +        guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
>> +
>> +    seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
>> +        guc->log.flush_count[GUC_DPC_LOG_BUFFER],
>> +        guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
>> +
>> +    seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
>> +        guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
>> +        guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
>
> Why is the width for overflow only 8 chars and not 10 like for flush
> since both are u32?

Looks to be a discrepancy. I will check.
Both should be 10, since the max value of u32 (4294967295) takes 10 digits
in decimal form.
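
The width can be checked quickly in userspace. A minimal sketch, assuming unsigned int is 32 bits wide as on the platforms of interest; u32_max_decimal_width() and format_stats() are hypothetical helpers, and the format string only mirrors the shape of the seq_printf lines above.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Number of characters needed to print the largest u32 in decimal.
 * Hypothetical helper used only to check the field width.
 */
static int u32_max_decimal_width(void)
{
	char buf[16];

	/* UINT32_MAX is 4294967295, which has 10 digits. */
	return snprintf(buf, sizeof(buf), "%u", (unsigned int)UINT32_MAX);
}

/* Format one stats line with both counters in 10 character columns. */
static int format_stats(char *out, size_t len, uint32_t flush, uint32_t overflow)
{
	return snprintf(out, len, "flush count %10u, overflow count %10u",
			(unsigned int)flush, (unsigned int)overflow);
}
```

With both fields at %10u the line has the same length for small and maximal counter values, so the columns stay aligned.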

>
>> +
>> +    seq_printf(m, "\tTotal flush interrupt count: %u\n",
>> +               guc->log.flush_interrupt_count);
>> +
>> +    seq_printf(m, "\tCapture miss count: %u\n",
>> +               guc->log.capture_miss_count);
>> +}
>> +
>>   static void i915_guc_client_info(struct seq_file *m,
>>                    struct drm_i915_private *dev_priv,
>>                    struct i915_guc_client *client)
>> @@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m,
>> void *data)
>>       seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
>>       i915_guc_client_info(m, dev_priv, &client);
>>
>> +    i915_guc_log_info(m, dev_priv);
>> +
>>       /* Add more as required ... */
>>
>>       return 0;
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index cb9672b..1ca1866 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>                   sizeof(struct guc_log_buffer_state));
>>           buffer_size = log_buffer_state_local.size;
>>
>> +        guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
>> +        if (log_buffer_state_local.buffer_full_cnt !=
>> +                    guc->log.prev_overflow_count[i]) {
>> +            guc->log.total_overflow_count[i] +=
>> +                (log_buffer_state_local.buffer_full_cnt -
>> +                 guc->log.prev_overflow_count[i]);
>> +
>> +            if (log_buffer_state_local.buffer_full_cnt <
>> +                    guc->log.prev_overflow_count[i]) {
>> +                /* buffer_full_cnt is a 4 bit counter */
>> +                guc->log.total_overflow_count[i] += 16;
>> +            }
>> +
>> +            guc->log.prev_overflow_count[i] =
>> +                    log_buffer_state_local.buffer_full_cnt;
>> +            DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
>> +        }
>> +
>>           if (log_buffer_snapshot_state) {
>>               /* First copy the state structure in local buffer */
>>               memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
>> @@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>            * getting consumed by User at a slow rate.
>>            */
>>           DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
>> +        guc->log.capture_miss_count++;
>>       }
>>   }
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>> b/drivers/gpu/drm/i915/i915_irq.c
>> index d4d6f0a..b08d1d2 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct
>> drm_i915_private *dev_priv, u32 gt_iir)
>>                   /* Handle flush interrupt event in bottom half */
>>                   queue_work(dev_priv->guc.log.wq,
>>                           &dev_priv->guc.events_work);
>> +
>> +                dev_priv->guc.log.flush_interrupt_count++;
>>               }
>>           }
>>       }
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index e4ec8d8..ed87e98 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -127,6 +127,13 @@ struct intel_guc_log {
>>       struct workqueue_struct *wq;
>>       void *buf_addr;
>>       struct rchan *relay_chan;
>> +
>> +    /* logging related stats */
>> +    u32 capture_miss_count;
>> +    u32 flush_interrupt_count;
>> +    u32 prev_overflow_count[GUC_MAX_LOG_BUFFER];
>> +    u32 total_overflow_count[GUC_MAX_LOG_BUFFER];
>> +    u32 flush_count[GUC_MAX_LOG_BUFFER];
>>   };
>>
>>   struct intel_guc {
>>
>
> Either if the formatting widths above are fine or you adjust them:
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
>
> Tvrtko
>
>
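
The overflow accounting in the guc_read_update_log_buffer() hunk quoted above can be sketched in isolation. This is a userspace illustration only: the struct and the accumulate_overflow() helper are hypothetical names, and it assumes, as the comment in the patch states, that buffer_full_cnt is a 4 bit counter wrapping from 15 back to 0. Masking the unsigned difference gives the same result as the "+= 16" adjustment in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define FULL_CNT_MODULUS 16u /* assumed: buffer_full_cnt is 4 bits wide */

/* Hypothetical per-buffer bookkeeping, mirroring prev/total in the patch. */
struct overflow_stats {
	uint32_t prev;  /* last buffer_full_cnt value seen */
	uint32_t total; /* accumulated overflow count */
};

/* Returns 1 if a new overflow was observed since the last sample. */
static int accumulate_overflow(struct overflow_stats *s, uint32_t full_cnt)
{
	if (full_cnt == s->prev)
		return 0;

	/* Unsigned subtraction wraps; the mask folds it back into 1..15. */
	s->total += (full_cnt - s->prev) & (FULL_CNT_MODULUS - 1);
	s->prev = full_cnt;
	return 1;
}
```

Like the patch, this cannot tell apart 16 or more overflows between two samples, since the firmware counter has already wrapped fully by then.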
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions
  2016-08-12 13:55   ` Tvrtko Ursulin
@ 2016-08-12 15:01     ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 15:01 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 7:25 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> With the addition of new Host2GuC actions related to GuC logging, there
>> is a need of a lock to serialize them, as they can execute concurrently
>> with each other and also with other existing actions.
>>
>> v2: Use mutex in place of spinlock to serialize, as sleep can happen
>>      while waiting for the action's response from GuC. (Tvrtko)
>>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
>>   drivers/gpu/drm/i915/intel_guc.h           | 3 +++
>>   2 files changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 1a2d648..cb9672b 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc,
>> u32 *data, u32 len)
>>           return -EINVAL;
>>
>>       intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>> +    mutex_lock(&guc->action_lock);
>
> I would probably take the mutex before grabbing forcewake as a general
> rule. Not that I think it matters in this case since we don't expect any
> contention on this one.
>
Yes, I did not expect any contention on this mutex, hence thought to use
it just around the code where it is actually needed.
Will move it before the forcewake, as you suggested, to conform to the
general rule.
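
A minimal userspace sketch of the reordering, under stated assumptions: a pthread mutex stands in for guc->action_lock, and the hypothetical fake_forcewake_get()/fake_forcewake_put() helpers stand in for the forcewake calls by just counting references. It only illustrates the ordering, not the driver code.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t action_lock = PTHREAD_MUTEX_INITIALIZER;
static int forcewake_refs; /* stand-in for the forcewake reference count */
static int action_count;

/* Hypothetical stand-ins for intel_uncore_forcewake_get()/put(). */
static void fake_forcewake_get(void) { forcewake_refs++; }
static void fake_forcewake_put(void) { forcewake_refs--; }

/* Returns the forcewake reference count seen while the action ran. */
static int send_action(void)
{
	int refs_seen;

	/* Take the sleeping lock first ... */
	pthread_mutex_lock(&action_lock);
	/* ... and only then wake the hardware. */
	fake_forcewake_get();

	action_count++;
	refs_seen = forcewake_refs;

	/* Release in the reverse order of acquisition. */
	fake_forcewake_put();
	pthread_mutex_unlock(&action_lock);

	return refs_seen;
}
```

One consequence of this order is that a caller blocked on the mutex does not hold a forcewake reference while it waits.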

Best regards
Akash
>>
>>       dev_priv->guc.action_count += 1;
>>       dev_priv->guc.action_cmd = data[0];
>> @@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc,
>> u32 *data, u32 len)
>>       }
>>       dev_priv->guc.action_status = status;
>>
>> +    mutex_unlock(&guc->action_lock);
>>       intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>>
>>       return ret;
>> @@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct
>> drm_i915_private *dev_priv)
>>           return -ENOMEM;
>>
>>       ida_init(&guc->ctx_ids);
>> +    mutex_init(&guc->action_lock);
>>       guc_create_log(guc);
>>       guc_create_ads(guc);
>>
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index 96ef7dc..e4ec8d8 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -156,6 +156,9 @@ struct intel_guc {
>>
>>       uint64_t submissions[I915_NUM_ENGINES];
>>       uint32_t last_seqno[I915_NUM_ENGINES];
>> +
>> +    /* To serialize the Host2GuC actions */
>> +    struct mutex action_lock;
>>   };
>>
>>   /* intel_guc_loader.c */
>>
>
> With or without the mutex vs forcewake ordering change:
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
>
> Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12 14:31         ` Goel, Akash
@ 2016-08-12 15:05           ` Tvrtko Ursulin
  2016-08-12 15:32             ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 15:05 UTC (permalink / raw)
  To: Goel, Akash, intel-gfx


On 12/08/16 15:31, Goel, Akash wrote:
> On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
>>>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>>>> +{
>>>>> +    struct drm_i915_private *dev_priv =
>>>>> +        container_of(work, struct drm_i915_private, guc.events_work);
>>>>> +
>>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>>> +    /* Speed up work cancellation during disabling guc interrupts. */
>>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>>>> +        return;
>>>>
>>>> I suppose locking for early exit is something about ensuring the worker
>>>> sees the update to dev_priv->guc.interrupts_enabled done on another
>>>> CPU?
>>>
>>> Yes locking (providing implicit barrier) will ensure that update made
>>> from another CPU is immediately visible to the worker.
>>
>> What if the disable happens after the unlock above? It would wait in
>> disable until the irq handler exits.
> Most probably it will not have to wait, as irq handler would have
> completed if work item began the execution.
> Irq handler just queues the work item, which gets scheduled later on.
>
> Using the lock is beneficial for the case where the execution of work
> item and interrupt disabling is done around the same time.

Ok maybe I am missing something.

When can the interrupt disabling happen? Will it be controlled by the 
debugfs file or is it driver load/unload and suspend/resume?

>>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>>> u32 gt_iir)
>>>>> +{
>>>>> +    bool interrupts_enabled;
>>>>> +
>>>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>>>> +        spin_lock(&dev_priv->irq_lock);
>>>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>>>> +        spin_unlock(&dev_priv->irq_lock);
>>>>
>>>> Not sure that taking a lock around only this read is needed.
>>>>
>>> Again same reason as above, to make sure an update made on another CPU
>>> is immediately visible to the irq handler.
>>
>> I don't get it, see above. :)
>
> Here also If interrupt disabling & ISR execution happens around the same
> time then ISR might miss the reset of 'interrupts_enabled' flag and
> queue the new work.

What if reset of interrupts_enabled happens just as the ISR releases the 
lock?

> And same applies to the case when interrupt is re-enabled, ISR might
> still see the 'interrupts_enabled' flag as false.
> It will eventually see the update though.
>
>>
>>>>> +        if (interrupts_enabled) {
>>>>> +            /* Sample the log buffer flush related bits & clear them
>>>>> +             * out now itself from the message identity register to
>>>>> +             * minimize the probability of losing a flush interrupt,
>>>>> +             * when there are back to back flush interrupts.
>>>>> +             * There can be a new flush interrupt, for different log
>>>>> +             * buffer type (like for ISR), whilst Host is handling
>>>>> +             * one (for DPC). Since same bit is used in message
>>>>> +             * register for ISR & DPC, it could happen that GuC
>>>>> +             * sets the bit for 2nd interrupt but Host clears out
>>>>> +             * the bit on handling the 1st interrupt.
>>>>> +             */
>>>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>>>> +            if (msg) {
>>>>> +                /* Clear the message bits that are handled */
>>>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>>>
>>>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>>>> twice?
>>>>
>>> Thought reading it again (just before the update) is bit safer compared
>>> to reading it once, as there is a potential race problem here.
>>> GuC could also write to the SOFT_SCRATCH(15) register, set new events
>>> bit, while Host clears off the bit of handled events.
>>
>> Don't get it. If there is a race between read and write there still is,
>> don't see how a second read makes it safer.
>>
> Yes can't avoid the race completely by double reads, but can reduce the
> race window size.

There was only one thing between the two reads, and that was "if (msg)":

  +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
  +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
  +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);

  +            if (msg) {

  +                /* Clear the message bits that are handled */
  +                I915_WRITE(SOFT_SCRATCH(15),
  +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);

>
> Also I felt code looked better in current form, as macros
> GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
> only once.
>
> Will change as per the initial implementation.
>
>      u32 msg = I915_READ(SOFT_SCRATCH(15));
>      if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>             GUC2HOST_MSG_FLUSH_LOG_BUFFER)) {
>          msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
>               GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>          I915_WRITE(SOFT_SCRATCH(15), msg);
>      }

Or:
	u32 msg, flush;

	msg = I915_READ(SOFT_SCRATCH(15));
	flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
		       GUC2HOST_MSG_FLUSH_LOG_BUFFER);
	if (flush) {
		/* Clear only the handled bits, keep the others as read */
		I915_WRITE(SOFT_SCRATCH(15), msg & ~flush);
		... queue worker ...
	}

?
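
To make the single-read variant concrete, here is a self-contained sketch that models SOFT_SCRATCH(15) as a plain variable. The bit positions and the fake_scratch15, reg_read(), reg_write() and ack_flush_events() names are assumptions for illustration, not the driver's definitions. It writes back only the value that was read with the handled bits cleared, so bits belonging to other events survive. The race with GuC updating the register between the read and the write is still there.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)

static uint32_t fake_scratch15; /* stand-in for the MMIO register */

static uint32_t reg_read(void) { return fake_scratch15; }
static void reg_write(uint32_t v) { fake_scratch15 = v; }

/* Returns the handled event bits, or 0 if none were pending. */
static uint32_t ack_flush_events(void)
{
	uint32_t msg = reg_read(); /* single read of the register */
	uint32_t flush = msg & (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER);

	if (flush)
		reg_write(msg & ~flush); /* clear only what we handle */

	return flush;
}
```

A non-zero return value is where the real handler would queue the worker.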

>
>>>> Also, is the RMW outside any locks safe?
>>>>
>>>
>>> Ideally need a way to atomically do the RMW, i.e. read the register
>>> value, clear off the handled events bit and update the register with the
>>> modified value.
>>>
>>> Please kindly suggest how to address the above.
>>> Or can this be left as a TODO, when we do start handling other events
>>> also.
>>
>> From the comment in code above it sounds like a GuC fw interface
>> shortcoming - that there is a single bit for two different interrupt
>> sources, is that right?
> Yes that shortcoming is there, GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used
> for conveying the flush for ISR & DPC log buffers.
>
>> Is there any other register or something that
>> you can read to detect that the interrupt has been re-asserted while in
>> the irq handler?
>
>
>> Although I thought you said before that the GuC will
>> not do that - that it won't re-assert the interrupt before we send the
>> flush command.
> Yes that is the case, but with respect to one type of a log buffer, like
> for example unless GuC firmware receives the ack for DPC log
> buffer it won't send a new flush for DPC buffer, but if meanwhile ISR
> buffer becomes half full it will send a flush interrupt.

So the potential for losing interrupts is unavoidable it seems. :( If I 
am understanding this correctly.

Regards,

Tvrtko

* Re: [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
  2016-08-12 14:48     ` Goel, Akash
@ 2016-08-12 15:06       ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 15:06 UTC (permalink / raw)
  To: Goel, Akash, intel-gfx


On 12/08/16 15:48, Goel, Akash wrote:
> On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote:
>>
>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>> From: Akash Goel <akash.goel@intel.com>
>>>
>>> GuC firmware sends an interrupt to flush the log buffer when it becomes
>>> half full, so Driver doesn't really need to sample the complete buffer
>>> and can just copy only the newly written data by GuC into the local
>>> buffer, i.e. as per the read & write pointer values.
>>> Moreover the flush interrupt would generally come for one type of log
>>> buffer, when it becomes half full, so at that time the other 2 types of
>>> log buffer would comparatively have much lesser unread data in them.
>>> In case of overflow reported by GuC, Driver do need to copy the entire
>>> buffer as the whole buffer would contain the unread data.
>>>
>>> v2: Rebase.
>>>
>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 40
>>> +++++++++++++++++++++++++-----
>>>   1 file changed, 34 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index 1ca1866..8e0f360 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct
>>> intel_guc *guc)
>>>       struct guc_log_buffer_state *log_buffer_state,
>>> *log_buffer_snapshot_state;
>>>       struct guc_log_buffer_state log_buffer_state_local;
>>>       void *src_data_ptr, *dst_data_ptr;
>>> -    u32 i, buffer_size;
>>> +    bool new_overflow;
>>> +    u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;
>>>
>>>       if (!guc->log.buf_addr)
>>>           return;
>>> @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct
>>> intel_guc *guc)
>>>           memcpy(&log_buffer_state_local, log_buffer_state,
>>>                   sizeof(struct guc_log_buffer_state));
>>>           buffer_size = log_buffer_state_local.size;
>>> +        read_offset = log_buffer_state_local.read_ptr;
>>> +        write_offset = log_buffer_state_local.sampled_write_ptr;
>>>
>>>           guc->log.flush_count[i] +=
>>> log_buffer_state_local.flush_to_file;
>>>           if (log_buffer_state_local.buffer_full_cnt !=
>>>                       guc->log.prev_overflow_count[i]) {
>>
>> Wrong alignment. You can try checkpatch.pl for all of those.
>>
> Sorry for all the alignment & indentation issues.
>
> Should the above condition be written like this ?
>
>      if (log_buffer_state_local.buffer_full_cnt !=
>          guc->log.prev_overflow_count[i]) {

Yes, but checkpatch.pl is your friend. :)

>>> +            new_overflow = 1;
>>
>> true/false since it is a bool
> fine will do that.
>>
>>>               guc->log.total_overflow_count[i] +=
>>>                   (log_buffer_state_local.buffer_full_cnt -
>>>                    guc->log.prev_overflow_count[i]);
>>> @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct
>>> intel_guc *guc)
>>>               guc->log.prev_overflow_count[i] =
>>>                       log_buffer_state_local.buffer_full_cnt;
>>>               DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
>>> -        }
>>> +        } else
>>> +            new_overflow = 0;
>>>
>>>           if (log_buffer_snapshot_state) {
>>>               /* First copy the state structure in local buffer */
>>> @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct
>>> intel_guc *guc)
>>>                * for consistency set the write pointer value to same
>>>                * value of sampled_write_ptr in the snapshot buffer.
>>>                */
>>> -            log_buffer_snapshot_state->write_ptr =
>>> -                log_buffer_snapshot_state->sampled_write_ptr;
>>> +            log_buffer_snapshot_state->write_ptr = write_offset;
>>>
>>>               log_buffer_snapshot_state++;
>>>
>>>               /* Now copy the actual logs */
>>>               memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>>
>> The confusing bit - the memcpy above still copies the whole buffer, no?
>>
> Really very sorry for this blooper.

No worries, it happens to everyone!
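
For reference, the partial copy that the commit message describes could look roughly like the sketch below. This is a userspace illustration under assumptions, not the code from the series: copy_unread() is a hypothetical helper, the offsets are byte offsets into a circular buffer, and the overflow case falls back to copying the whole buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy only the unread bytes of a circular log buffer, keeping them at
 * the same offsets in dst. On overflow the whole buffer is copied.
 * Returns the number of bytes copied.
 */
static uint32_t copy_unread(uint8_t *dst, const uint8_t *src, uint32_t size,
			    uint32_t read_off, uint32_t write_off,
			    int overflow)
{
	uint32_t copied = 0;
	uint32_t len;

	if (overflow) {
		/* Everything in the buffer is unread data. */
		read_off = 0;
		write_off = size;
	}

	assert(read_off <= size && write_off <= size);

	if (read_off > write_off) {
		/* Writer wrapped: copy the head, then the tail below. */
		memcpy(dst, src, write_off);
		copied = write_off;
		len = size - read_off;
	} else {
		len = write_off - read_off;
	}

	memcpy(dst + read_off, src + read_off, len);
	return copied + len;
}
```

The unconditional memcpy of buffer_size bytes in the quoted hunk would be replaced by a call of this shape, using read_offset, write_offset and new_overflow.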

Regards,

Tvrtko

* Re: [PATCH 16/20] drm/i915: Support to create write combined type vmaps
  2016-08-12 10:49   ` Tvrtko Ursulin
@ 2016-08-12 15:13     ` Goel, Akash
  2016-08-12 15:16       ` Chris Wilson
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 15:13 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> vmaps has a provision for controlling the page protection bits, with
>> which
>> we can use to control the mapping type, e.g. WB, WC, UC or even WT.
>> To allow the caller to choose their mapping type, we add a parameter to
>> i915_gem_object_pin_map - but we still only allow one vmap to be cached
>> per object. If the object is currently not pinned, then we recreate the
>> previous vmap with the new access type, but if it was pinned we report an
>> error. This effectively limits the access via i915_gem_object_pin_map
>> to a
>> single mapping type for the lifetime of the object. Not usually a
>> problem,
>> but something to be aware of when setting up the object's vmap.
>>
>> We will want to vary the access type to enable WC mappings of ringbuffer
>> and context objects on !llc platforms, as well as other objects where we
>> need coherent access to the GPU's pages without going through the GTT
>>
>> v2: Remove the redundant braces around pin count check and fix the marker
>>      in documentation (Chris)
>>
>> v3:
>> - Add a new enum for the vmalloc mapping type & pass that as an
>> argument to
>>    i915_object_pin_map. (Tvrtko)
>> - Use PAGE_MASK to extract or filter the mapping type info and remove a
>>    superfluous BUG_ON.(Tvrtko)
>>
>> v4:
>> - Rename the enums and clean up the pin_map function. (Chris)
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |  9 ++++-
>>   drivers/gpu/drm/i915/i915_gem.c            | 58
>> +++++++++++++++++++++++-------
>>   drivers/gpu/drm/i915/i915_gem_dmabuf.c     |  2 +-
>>   drivers/gpu/drm/i915/i915_guc_submission.c |  2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c           |  8 ++---
>>   drivers/gpu/drm/i915/intel_ringbuffer.c    |  2 +-
>>   6 files changed, 60 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 4bd3790..6603812 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -834,6 +834,11 @@ enum i915_cache_level {
>>       I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
>>   };
>>
>> +enum i915_map_type {
>> +    I915_MAP_WB = 0,
>> +    I915_MAP_WC,
>> +};
>> +
>>   struct i915_ctx_hang_stats {
>>       /* This context had batch pending when hang was declared */
>>       unsigned batch_pending;
>> @@ -3150,6 +3155,7 @@ static inline void
>> i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>>   /**
>>    * i915_gem_object_pin_map - return a contiguous mapping of the
>> entire object
>>    * @obj - the object to map into kernel address space
>> + * @map_type - whether the vmalloc mapping should be using WC or WB
>> pgprot_t
>>    *
>>    * Calls i915_gem_object_pin_pages() to prevent reaping of the object's
>>    * pages and then returns a contiguous mapping of the backing
>> storage into
>> @@ -3161,7 +3167,8 @@ static inline void
>> i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>>    * Returns the pointer through which to access the mapped object, or an
>>    * ERR_PTR() on error.
>>    */
>> -void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
>> *obj);
>> +void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
>> *obj,
>> +                    enum i915_map_type map_type);
>>
>>   /**
>>    * i915_gem_object_unpin_map - releases an earlier mapping
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index 03548db..7dabbc3f 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct
>> drm_i915_gem_object *obj)
>>       list_del(&obj->global_list);
>>
>>       if (obj->mapping) {
>> -        if (is_vmalloc_addr(obj->mapping))
>> -            vunmap(obj->mapping);
>> +        void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
>> +        if (is_vmalloc_addr(ptr))
>> +            vunmap(ptr);
>>           else
>> -            kunmap(kmap_to_page(obj->mapping));
>> +            kunmap(kmap_to_page(ptr));
>>           obj->mapping = NULL;
>>       }
>>
>> @@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct
>> drm_i915_gem_object *obj)
>>   }
>>
>>   /* The 'mapping' part of i915_gem_object_pin_map() below */
>> -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
>> +static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
>> +                 enum i915_map_type type)
>>   {
>>       unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
>>       struct sg_table *sgt = obj->pages;
>> @@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct
>> drm_i915_gem_object *obj)
>>       struct page **pages = stack_pages;
>>       unsigned long i = 0;
>>       void *addr;
>> +    bool use_wc = (type == I915_MAP_WC);
>>
>>       /* A single page can always be kmapped */
>> -    if (n_pages == 1)
>> +    if (n_pages == 1 && !use_wc)
>>           return kmap(sg_page(sgt->sgl));
>>
>>       if (n_pages > ARRAY_SIZE(stack_pages)) {
>> @@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct
>> drm_i915_gem_object *obj)
>>       /* Check that we have the expected number of pages */
>>       GEM_BUG_ON(i != n_pages);
>>
>> -    addr = vmap(pages, n_pages, 0, PAGE_KERNEL);
>> +    addr = vmap(pages, n_pages, VM_NO_GUARD,
>
> Unreleated and unmentioned change to no guard page. Best to remove IMHO.
> Can keep the RB in that case.

Sorry that it was not called out, but isn't it better to avoid using the
guard page? That would save 4KB of vmalloc virtual space (which is
scarce) for every mapping created by the driver.

Would it be fine to update the commit message to mention this?
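
The arithmetic behind the 4KB figure, as a small sketch. It assumes a 4KB page size and one guard page reserved per mapping unless the no-guard flag is passed; vmap_va_bytes() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/*
 * Hypothetical helper: virtual address space consumed by one mapping of
 * n_pages pages, with or without the trailing guard page.
 */
static size_t vmap_va_bytes(size_t n_pages, int no_guard)
{
	size_t bytes = n_pages * SKETCH_PAGE_SIZE;

	if (!no_guard)
		bytes += SKETCH_PAGE_SIZE; /* one guard page per mapping */

	return bytes;
}
```

The saving is a fixed one page per mapping, so it matters most when there are many small mappings.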

Best regards
Akash

>
>> +            use_wc ? pgprot_writecombine(PAGE_KERNEL_IO) : PAGE_KERNEL);
>>
>>       if (pages != stack_pages)
>>           drm_free_large(pages);
>> @@ -2290,11 +2294,16 @@ static void *i915_gem_object_map(const struct
>> drm_i915_gem_object *obj)
>>   }
>>
>>   /* get, pin, and map the pages of the object into kernel space */
>> -void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
>> +void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
>> +                  enum i915_map_type type)
>>   {
>> +    enum i915_map_type has_type;
>> +    bool pinned;
>> +    void *ptr;
>>       int ret;
>>
>>       lockdep_assert_held(&obj->base.dev->struct_mutex);
>> +    GEM_BUG_ON((obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE) ==
>> 0);
>>
>>       ret = i915_gem_object_get_pages(obj);
>>       if (ret)
>> @@ -2302,15 +2311,38 @@ void *i915_gem_object_pin_map(struct
>> drm_i915_gem_object *obj)
>>
>>       i915_gem_object_pin_pages(obj);
>>
>> -    if (!obj->mapping) {
>> -        obj->mapping = i915_gem_object_map(obj);
>> -        if (!obj->mapping) {
>> -            i915_gem_object_unpin_pages(obj);
>> -            return ERR_PTR(-ENOMEM);
>> +    pinned = obj->pages_pin_count > 1;
>> +    ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
>> +    has_type = (uintptr_t)obj->mapping & ~PAGE_MASK;
>> +
>> +    if (ptr && has_type != type) {
>> +        if (pinned) {
>> +            ret = -EBUSY;
>> +            goto err;
>> +        }
>> +
>> +        if (is_vmalloc_addr(ptr))
>> +            vunmap(ptr);
>> +        else
>> +            kunmap(kmap_to_page(ptr));
>> +        ptr = obj->mapping = NULL;
>> +    }
>> +
>> +    if (!ptr) {
>> +        ptr = i915_gem_object_map(obj, type);
>> +        if (!ptr) {
>> +            ret = -ENOMEM;
>> +            goto err;
>>           }
>> +
>> +        obj->mapping = (void *)((uintptr_t)ptr | type);
>>       }
>>
>> -    return obj->mapping;
>> +    return ptr;
>> +
>> +err:
>> +    i915_gem_object_unpin_pages(obj);
>> +    return ERR_PTR(ret);
>>   }
>>
>>   static void
>> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> index c60a8d5b..10265bb 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> @@ -119,7 +119,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf
>> *dma_buf)
>>       if (ret)
>>           return ERR_PTR(ret);
>>
>> -    addr = i915_gem_object_pin_map(obj);
>> +    addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>       mutex_unlock(&dev->struct_mutex);
>>
>>       return addr;
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 041cf68..1d58d36 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -1178,7 +1178,7 @@ static int guc_create_log_extras(struct
>> intel_guc *guc)
>>
>>       if (!guc->log.buf_addr) {
>>           /* Create a vmalloc mapping of log buffer pages */
>> -        vaddr = i915_gem_object_pin_map(guc->log.obj);
>> +        vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB);
>>           if (IS_ERR(vaddr)) {
>>               ret = PTR_ERR(vaddr);
>>               DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index c7f4b64..c24ac39 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -780,7 +780,7 @@ static int intel_lr_context_pin(struct
>> i915_gem_context *ctx,
>>       if (ret)
>>           goto err;
>>
>> -    vaddr = i915_gem_object_pin_map(ce->state);
>> +    vaddr = i915_gem_object_pin_map(ce->state, I915_MAP_WB);
>>       if (IS_ERR(vaddr)) {
>>           ret = PTR_ERR(vaddr);
>>           goto unpin_ctx_obj;
>> @@ -1755,7 +1755,7 @@ lrc_setup_hws(struct intel_engine_cs *engine,
>>       /* The HWSP is part of the default context object in LRC mode. */
>>       engine->status_page.gfx_addr = i915_gem_obj_ggtt_offset(dctx_obj) +
>>                          LRC_PPHWSP_PN * PAGE_SIZE;
>> -    hws = i915_gem_object_pin_map(dctx_obj);
>> +    hws = i915_gem_object_pin_map(dctx_obj, I915_MAP_WB);
>>       if (IS_ERR(hws))
>>           return PTR_ERR(hws);
>>       engine->status_page.page_addr = hws + LRC_PPHWSP_PN * PAGE_SIZE;
>> @@ -1968,7 +1968,7 @@ populate_lr_context(struct i915_gem_context *ctx,
>>           return ret;
>>       }
>>
>> -    vaddr = i915_gem_object_pin_map(ctx_obj);
>> +    vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
>>       if (IS_ERR(vaddr)) {
>>           ret = PTR_ERR(vaddr);
>>           DRM_DEBUG_DRIVER("Could not map object pages! (%d)\n", ret);
>> @@ -2189,7 +2189,7 @@ void intel_lr_context_reset(struct
>> drm_i915_private *dev_priv,
>>           if (!ctx_obj)
>>               continue;
>>
>> -        vaddr = i915_gem_object_pin_map(ctx_obj);
>> +        vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
>>           if (WARN_ON(IS_ERR(vaddr)))
>>               continue;
>>
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index e8fa26c..69ec5da 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -1951,7 +1951,7 @@ int intel_ring_pin(struct intel_ring *ring)
>>           if (ret)
>>               goto err_unpin;
>>
>> -        addr = i915_gem_object_pin_map(obj);
>> +        addr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>           if (IS_ERR(addr)) {
>>               ret = PTR_ERR(addr);
>>               goto err_unpin;
>>
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
>
> Tvrtko
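
The pin_map change quoted above stores the mapping type in the low bits of the page aligned mapping address and filters it back out with PAGE_MASK. A userspace sketch of that idea follows, with made-up names (pack_mapping(), mapping_ptr(), mapping_type()) and an assumed 4KB page size; it is an illustration of the technique, not the driver code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define SKETCH_PAGE_MASK (~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

enum map_type { MAP_WB = 0, MAP_WC = 1 };

/* Store the type in the low bits, which are zero for an aligned address. */
static void *pack_mapping(void *ptr, enum map_type type)
{
	assert(((uintptr_t)ptr & ~SKETCH_PAGE_MASK) == 0);
	return (void *)((uintptr_t)ptr | (uintptr_t)type);
}

/* Recover the usable, page aligned address. */
static void *mapping_ptr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the type that was stored alongside the address. */
static enum map_type mapping_type(void *packed)
{
	return (enum map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

Because MAP_WB is 0, a packed write-back mapping is identical to the plain address, which is why the enum in the patch starts with I915_MAP_WB = 0.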
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 16/20] drm/i915: Support to create write combined type vmaps
  2016-08-12 15:13     ` Goel, Akash
@ 2016-08-12 15:16       ` Chris Wilson
  2016-08-12 16:46         ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Chris Wilson @ 2016-08-12 15:16 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote:
> On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:
> >Unreleated and unmentioned change to no guard page. Best to remove IMHO.
> >Can keep the RB in that case.
> 
> Sorry that it was not called out, but isn't it better to avoid the
> guard page? That saves 4KB of vmalloc virtual space (which is scarce)
> for every mapping created by the driver.
> 
> Would it be fine to update the commit message to mention this?

Too late, already applied without the new flag.

Yes, that's why I dropped the guard page when I found out it was being
added. Send a patch to add the flag and we can discuss whether we think
our code is adequate to not require the protection.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12  6:25 ` [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer akash.goel
@ 2016-08-12 15:20   ` Tvrtko Ursulin
  2016-08-12 15:32     ` Chris Wilson
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 15:20 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> Added the dump of GuC log buffer to i915 error state, as the contents of
> GuC log buffer would also be useful to determine that why the GPU reset
> was triggered.
>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h       |  1 +
>   drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++++++++++++++++++++++++++
>   2 files changed, 28 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 28ffac5..4bd3790 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -509,6 +509,7 @@ struct drm_i915_error_state {
>   	struct intel_overlay_error_state *overlay;
>   	struct intel_display_error_state *display;
>   	struct drm_i915_error_object *semaphore_obj;
> +	struct drm_i915_error_object *guc_log_obj;
>
>   	struct drm_i915_error_engine {
>   		int engine_id;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index eecb870..561b523 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>   		}
>   	}
>
> +	if ((obj = error->guc_log_obj)) {
> +		err_printf(m, "GuC log buffer = 0x%08x\n",
> +			   lower_32_bits(obj->gtt_offset));
> +		for (i = 0; i < obj->page_count; i++) {
> +			for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {

Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like it is 
counting in u32 * 4 chunks so it might be. Or I might be confused..

> +				err_printf(m, "[%08x] %08x %08x %08x %08x\n",
> +					   (u32)(i*PAGE_SIZE) + elt*4,
> +					   obj->pages[i][elt],
> +					   obj->pages[i][elt+1],
> +					   obj->pages[i][elt+2],
> +					   obj->pages[i][elt+3]);
> +			}
> +		}
> +	}
> +
>   	if (error->overlay)
>   		intel_overlay_print_error_state(m, error->overlay);
>
> @@ -625,6 +640,7 @@ static void i915_error_state_free(struct kref *error_ref)
>   	}
>
>   	i915_error_object_free(error->semaphore_obj);
> +	i915_error_object_free(error->guc_log_obj);
>
>   	for (i = 0; i < error->vm_count; i++)
>   		kfree(error->active_bo[i]);
> @@ -1210,6 +1226,16 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
>   	}
>   }
>
> +static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
> +				     struct drm_i915_error_state *error)

Alignment.

> +{
> +	if (!dev_priv->guc.log.obj)
> +		return;
> +
> +	error->guc_log_obj = i915_error_ggtt_object_create(dev_priv,
> +						dev_priv->guc.log.obj);
> +}
> +
>   /* FIXME: Since pin count/bound list is global, we duplicate what we capture per
>    * VM.
>    */
> @@ -1439,6 +1465,7 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
>   	i915_gem_capture_buffers(dev_priv, error);
>   	i915_gem_record_fences(dev_priv, error);
>   	i915_gem_record_rings(dev_priv, error);
> +	i915_gem_capture_guc_log_buffer(dev_priv, error);
>
>   	do_gettimeofday(&error->time);
>
>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12 15:20   ` Tvrtko Ursulin
@ 2016-08-12 15:32     ` Chris Wilson
  2016-08-12 15:46       ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Chris Wilson @ 2016-08-12 15:32 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: akash.goel, intel-gfx

On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 12/08/16 07:25, akash.goel@intel.com wrote:
> >From: Akash Goel <akash.goel@intel.com>
> >
> >Added the dump of GuC log buffer to i915 error state, as the contents of
> >GuC log buffer would also be useful to determine that why the GPU reset
> >was triggered.
> >
> >Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Signed-off-by: Akash Goel <akash.goel@intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_drv.h       |  1 +
> >  drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++++++++++++++++++++++++++
> >  2 files changed, 28 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index 28ffac5..4bd3790 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -509,6 +509,7 @@ struct drm_i915_error_state {
> >  	struct intel_overlay_error_state *overlay;
> >  	struct intel_display_error_state *display;
> >  	struct drm_i915_error_object *semaphore_obj;
> >+	struct drm_i915_error_object *guc_log_obj;
> >
> >  	struct drm_i915_error_engine {
> >  		int engine_id;
> >diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> >index eecb870..561b523 100644
> >--- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> >  		}
> >  	}
> >
> >+	if ((obj = error->guc_log_obj)) {
> >+		err_printf(m, "GuC log buffer = 0x%08x\n",
> >+			   lower_32_bits(obj->gtt_offset));
> >+		for (i = 0; i < obj->page_count; i++) {
> >+			for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {
> 
> Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like
> it is counting in u32 * 4 chunks so it might be. Or I might be
> confused..

There's (or will be) a function to dump the error object in a uniform
manner. This patch is obsolete.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/20] drm/i915: Support for GuC interrupts
  2016-08-12 15:05           ` Tvrtko Ursulin
@ 2016-08-12 15:32             ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 15:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 15:31, Goel, Akash wrote:
>> On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
>>>>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>>>>> +{
>>>>>> +    struct drm_i915_private *dev_priv =
>>>>>> +        container_of(work, struct drm_i915_private,
>>>>>> guc.events_work);
>>>>>> +
>>>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>>>> +    /* Speed up work cancellation during disabling guc
>>>>>> interrupts. */
>>>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>>>>> +        return;
>>>>>
>>>>> I suppose locking for early exit is something about ensuring the
>>>>> worker
>>>>> sees the update to dev_priv->guc.interrupts_enabled done on another
>>>>> CPU?
>>>>
>>>> Yes locking (providing implicit barrier) will ensure that update made
>>>> from another CPU is immediately visible to the worker.
>>>
>>> What if the disable happens after the unlock above? It would wait in
>>> disable until the irq handler exits.
>> Most probably it will not have to wait, as irq handler would have
>> completed if work item began the execution.
>> Irq handler just queues the work item, which gets scheduled later on.
>>
>> Using the lock is beneficial for the case where the execution of work
>> item and interrupt disabling is done around the same time.
>
> Ok maybe I am missing something.
>
> When can the interrupt disabling happen? Will it be controlled by the
> debugfs file or is it driver load/unload and suspend/resume?
>
Yes, disabling will happen in all three of the above scenarios.

>>>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>>>> u32 gt_iir)
>>>>>> +{
>>>>>> +    bool interrupts_enabled;
>>>>>> +
>>>>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>>>>> +        spin_lock(&dev_priv->irq_lock);
>>>>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>>>>> +        spin_unlock(&dev_priv->irq_lock);
>>>>>
>>>>> Not sure that taking a lock around only this read is needed.
>>>>>
>>>> Again same reason as above, to make sure an update made on another CPU
>>>> is immediately visible to the irq handler.
>>>
>>> I don't get it, see above. :)
>>
>> Here also If interrupt disabling & ISR execution happens around the same
>> time then ISR might miss the reset of 'interrupts_enabled' flag and
>> queue the new work.
>
> What if reset of interrupts_enabled happens just as the ISR releases the
> lock?
>
Then the ISR will proceed and queue the work item.

The lock helps when the interrupts_enabled flag is reset just before
the ISR inspects its value.
It also helps when the flag is set again: the next ISR will definitely
see it as set.

>> And same applies to the case when interrupt is re-enabled, ISR might
>> still see the 'interrupts_enabled' flag as false.
>> It will eventually see the update though.
>>
>>>
>>>>>> +        if (interrupts_enabled) {
>>>>>> +            /* Sample the log buffer flush related bits & clear them
>>>>>> +             * out now itself from the message identity register to
>>>>>> +             * minimize the probability of losing a flush interrupt,
>>>>>> +             * when there are back to back flush interrupts.
>>>>>> +             * There can be a new flush interrupt, for different log
>>>>>> +             * buffer type (like for ISR), whilst Host is handling
>>>>>> +             * one (for DPC). Since same bit is used in message
>>>>>> +             * register for ISR & DPC, it could happen that GuC
>>>>>> +             * sets the bit for 2nd interrupt but Host clears out
>>>>>> +             * the bit on handling the 1st interrupt.
>>>>>> +             */
>>>>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>>>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>>>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>>>>> +            if (msg) {
>>>>>> +                /* Clear the message bits that are handled */
>>>>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>>>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>>>>
>>>>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>>>>> twice?
>>>>>
>>>> Thought reading it again (just before the update) is bit safer compared
>>>> to reading it once, as there is a potential race problem here.
>>>> GuC could also write to the SOFT_SCRATCH(15) register, set new events
>>>> bit, while Host clears off the bit of handled events.
>>>
>>> Don't get it. If there is a race between read and write there still is,
>>> don't see how a second read makes it safer.
>>>
>> Yes can't avoid the race completely by double reads, but can reduce the
>> race window size.
>
> There was only one thing between the two reads, and that was "if (msg)":
>
>  +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>  +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>  +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>
>  +            if (msg) {
>
>  +                /* Clear the message bits that are handled */
>  +                I915_WRITE(SOFT_SCRATCH(15),
>  +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>
>>
>> Also I felt code looked better in current form, as macros
>> GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
>> only once.
>>
>> Will change as per the initial implementation.
>>
>>      u32 msg = I915_READ(SOFT_SCRATCH(15));
>>      if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>             GUC2HOST_MSG_FLUSH_LOG_BUFFER)) {
>>          msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>               GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>          I915_WRITE(SOFT_SCRATCH(15), msg);
>>      }
>
> Or:
>     u32 msg, flush;
>
>     msg = I915_READ(SOFT_SCRATCH(15));
>     flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>                    GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>     if (flush) {
>         I915_WRITE(SOFT_SCRATCH(15), msg & ~flush);
>         ... queue worker ...
>
> ?
Thanks much, will change as per the above.
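
For reference, the single-read variant discussed above can be sketched in plain C as follows. This is only an illustration, not the driver code: the register is a plain variable standing in for SOFT_SCRATCH(15), the bit positions are made up, and reg_read(), reg_write() and ack_guc_events() are hypothetical names. It does not close the race with GuC writing the register between the read and the write; it only shows reading once and clearing just the handled bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; the real values
 * come from the GuC firmware interface header. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)

/* Plain variable standing in for the SOFT_SCRATCH(15) MMIO register. */
static uint32_t fake_scratch15;

static uint32_t reg_read(void)
{
	return fake_scratch15;
}

static void reg_write(uint32_t val)
{
	fake_scratch15 = val;
}

/* Read the register once, clear only the bits we are about to handle,
 * and return those bits so the caller can decide whether to queue work. */
static uint32_t ack_guc_events(void)
{
	uint32_t msg = reg_read();
	uint32_t flush = msg & (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER);

	if (flush) {
		/* Keep every bit we did not handle. */
		reg_write(msg & ~flush);

		/* Holds for the fake register only; on real hardware the
		 * firmware could set the bit again at any time. */
		assert((reg_read() & flush) == 0);
	}

	return flush;
}
```

A non-zero return value would be the point at which the ISR queues the worker. Writing ~flush alone would set every unrelated bit in the register, which is why the sketch writes msg & ~flush.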

>
>>
>>>>> Also, is the RMW outside any locks safe?
>>>>>
>>>>
>>>> Ideally need a way to atomically do the RMW, i.e. read the register
>>>> value, clear off the handled events bit and update the register with
>>>> the
>>>> modified value.
>>>>
>>>> Please kindly suggest how to address the above.
>>>> Or can this be left as a TODO, when we do start handling other events
>>>> also.
>>>
>>> From the comment in code above it sounds like a GuC fw interface
>>> shortcoming - that there is a single bit for two different interrupt
>>> sources, is that right?
>> Yes that shortcoming is there, GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used
>> for conveying the flush for ISR & DPC log buffers.
>>
>>> Is there any other register or something that
>>> you can read to detect that the interrupt has been re-asserted while in
>>> the irq handler?
>>
>>
>>> Although I thought you said before that the GuC will
>>> not do that - that it won't re-assert the interrupt before we send the
>>> flush command.
>> Yes that is the case, but with respect to one type of a log buffer, like
>> for example unless GuC firmware receives the ack for DPC log
>> buffer it won't send a new flush for DPC buffer, but if meanwhile ISR
>> buffer becomes half full it will send a flush interrupt.
>
> So the potential for losing interrupts is unavoidable it seems. :( If I
> am understanding this correctly.
>
Yes, unavoidable, but the window is very small.
I have informed the GuC folks about this.

Best Regards
Akash

> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12 15:32     ` Chris Wilson
@ 2016-08-12 15:46       ` Goel, Akash
  2016-08-12 15:52         ` Chris Wilson
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 15:46 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin; +Cc: intel-gfx, akash.goel



On 8/12/2016 9:02 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote:
>>
>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>> From: Akash Goel <akash.goel@intel.com>
>>>
>>> Added the dump of GuC log buffer to i915 error state, as the contents of
>>> GuC log buffer would also be useful to determine that why the GPU reset
>>> was triggered.
>>>
>>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/i915_drv.h       |  1 +
>>>  drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++++++++++++++++++++++++++
>>>  2 files changed, 28 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 28ffac5..4bd3790 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -509,6 +509,7 @@ struct drm_i915_error_state {
>>>  	struct intel_overlay_error_state *overlay;
>>>  	struct intel_display_error_state *display;
>>>  	struct drm_i915_error_object *semaphore_obj;
>>> +	struct drm_i915_error_object *guc_log_obj;
>>>
>>>  	struct drm_i915_error_engine {
>>>  		int engine_id;
>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> index eecb870..561b523 100644
>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> @@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>>>  		}
>>>  	}
>>>
>>> +	if ((obj = error->guc_log_obj)) {
>>> +		err_printf(m, "GuC log buffer = 0x%08x\n",
>>> +			   lower_32_bits(obj->gtt_offset));
>>> +		for (i = 0; i < obj->page_count; i++) {
>>> +			for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {
>>
>> Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like
>> it is counting in u32 * 4 chunks so it might be. Or I might be
>> confused..
It should be PAGE_SIZE / 4. It took me some iterations to get it right.
PAGE_SIZE/4 is the number of dwords in a page, and
elt += 4 covers 4 dwords in every iteration.
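
To make the loop bound concrete, here is a small userspace sketch of the same walk. It is not the patch code: SKETCH_PAGE_SIZE is an assumed 4096-byte page, and row_offset() and dump_page() are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Byte offset printed at the start of the row that begins at dword 'elt'. */
static unsigned int row_offset(unsigned int page_idx, unsigned int elt)
{
	return page_idx * SKETCH_PAGE_SIZE + elt * 4;
}

/* Print one page as rows of 4 dwords; returns the number of rows printed. */
static unsigned int dump_page(FILE *out, unsigned int page_idx,
			      const uint32_t *page)
{
	unsigned int elt, rows = 0;

	assert(out != NULL && page != NULL);

	/* SKETCH_PAGE_SIZE / 4 dwords per page, 4 dwords consumed per pass. */
	for (elt = 0; elt < SKETCH_PAGE_SIZE / 4; elt += 4) {
		fprintf(out, "[%08x] %08x %08x %08x %08x\n",
			row_offset(page_idx, elt),
			(unsigned int)page[elt],
			(unsigned int)page[elt + 1],
			(unsigned int)page[elt + 2],
			(unsigned int)page[elt + 3]);
		rows++;
	}

	return rows;
}
```

With a 4096-byte page this visits 1024 dwords in 256 rows. A bound of PAGE_SIZE / 16 with the same elt += 4 step would stop after 64 rows, i.e. after the first quarter of the page.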


>
> There's (or will be) a function to dump the error object in a uniform
> manner. This patch is obsolete.

There is a print_error_obj() function, but it prints one dword per line.
For the GuC log buffer it is better (for ease of interpretation) to print 4
dwords per line, as each sample is 4 dwords and the headers are 8 dwords.
The other benefit is a lower line count in the error state file: compared
to the other captured buffers (ring buffer, batch buffers, status page),
the log buffer is large, at 76 KB.

Best regards
Akash



> -Chris
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12 15:46       ` Goel, Akash
@ 2016-08-12 15:52         ` Chris Wilson
  2016-08-12 16:04           ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Chris Wilson @ 2016-08-12 15:52 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> On 8/12/2016 9:02 PM, Chris Wilson wrote:
> >There's (or will be) a function to dump the error object in a uniform
> >manner. This patch is obsolete.
> 
> There is a print_error_obj() function, but that prints one dword per line.

It used to. It will shortly be a compressed stream. Pretty printing is
left to userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 15/20] drm/i915: Debugfs support for GuC logging control
  2016-08-12  6:25 ` [PATCH 15/20] drm/i915: Debugfs support for GuC logging control akash.goel
@ 2016-08-12 15:57   ` Tvrtko Ursulin
  2016-08-12 17:08     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 15:57 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>
> This patch provides debugfs interface i915_guc_output_control for
> on the fly enabling/disabling of logging in GuC firmware and controlling
> the verbosity level of logs.
> The value written to the file should have bit 0 set to enable logging, and
> bits 4-7 should contain the verbosity info.
>
> v2: Add a forceful flush, to collect left over logs, on disabling logging.
>      Useful for Validation.
>
> v3: Besides minor cleanup, implement read method for the debugfs file and
>      set the guc_log_level to -1 when logging is disabled. (Tvrtko)
>
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        | 44 ++++++++++++++++++++-
>   drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_guc.h           |  1 +
>   3 files changed, 107 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 14e0dcf..f472fbcd3 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
>   	return 0;
>   }
>
> +static int i915_guc_log_control_get(void *data, u64 *val)
> +{
> +	struct drm_device *dev = data;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +
> +	if (!dev_priv->guc.log.obj)
> +		return -EINVAL;
> +
> +	*val = i915.guc_log_level;
> +
> +	return 0;
> +}
> +
> +static int i915_guc_log_control_set(void *data, u64 val)
> +{
> +	struct drm_device *dev = data;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&dev->struct_mutex);
> +	if (ret)
> +		return ret;
> +
> +	if (!dev_priv->guc.log.obj) {
> +		ret = -EINVAL;
> +		goto end;
> +	}
> +
> +	intel_runtime_pm_get(dev_priv);
> +	ret = i915_guc_log_control(dev_priv, val);
> +	intel_runtime_pm_put(dev_priv);
> +
> +end:
> +	mutex_unlock(&dev->struct_mutex);
> +	return ret;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
> +			i915_guc_log_control_get, i915_guc_log_control_set,
> +			"%lld\n");
> +
>   static int i915_edp_psr_status(struct seq_file *m, void *data)
>   {
>   	struct drm_info_node *node = m->private;
> @@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files {
>   	{"i915_fbc_false_color", &i915_fbc_fc_fops},
>   	{"i915_dp_test_data", &i915_displayport_test_data_fops},
>   	{"i915_dp_test_type", &i915_displayport_test_type_fops},
> -	{"i915_dp_test_active", &i915_displayport_test_active_fops}
> +	{"i915_dp_test_active", &i915_displayport_test_active_fops},
> +	{"i915_guc_log_control", &i915_guc_log_control_fops}
>   };
>
>   void intel_display_crc_init(struct drm_device *dev)
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 4a75c16..041cf68 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
>   	return host2guc_action(guc, data, 2);
>   }
>
> +static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
> +{
> +	u32 data[2];
> +
> +	data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
> +	data[1] = control_val;
> +
> +	return host2guc_action(guc, data, 2);
> +}
> +
>   /*
>    * Initialise, update, or clear doorbell data shared with the GuC
>    *
> @@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
>   	guc_log_late_setup(&dev_priv->guc);
>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>   }
> +
> +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
> +{
> +	union guc_log_control log_param;
> +	int ret;
> +
> +	log_param.logging_enabled = control_val & 0x1;
> +	log_param.verbosity = (control_val >> 4) & 0xF;

Maybe "log_param.value = control_val" would also work, since
guc_log_control is conveniently defined as a union. It doesn't matter though.

> +
> +	if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
> +	    log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
> +		return -EINVAL;
> +
> +	/* This combination doesn't make sense & won't have any effect */
> +	if (!log_param.logging_enabled && (i915.guc_log_level < 0))
> +		return 0;

I wonder if it would work and maybe look nicer to generalize as:

	int guc_log_level;

	guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1;
	if (i915.guc_log_level == guc_log_level)
		return 0;
> +
> +	ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
> +	if (ret < 0) {
> +		DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
> +		return ret;
> +	}
> +
> +	i915.guc_log_level = log_param.verbosity;

This would then become i915.guc_log_level = guc_log_level.

> +
> +	/* If log_level was set as -1 at boot time, then the relay channel file
> +	 * wouldn't have been created by now and interrupts also would not have
> +	 * been enabled.
> +	 */
> +	if (!dev_priv->guc.log.relay_chan) {
> +		ret = guc_log_late_setup(&dev_priv->guc);
> +		if (!ret)
> +			gen9_enable_guc_interrupts(dev_priv);
> +	} else if (!log_param.logging_enabled) {
> +		/* Once logging is disabled, GuC won't generate logs & send an
> +		 * interrupt. But there could be some data in the log buffer
> +		 * which is yet to be captured. So request GuC to update the log
> +		 * buffer state and then collect the left over logs.
> +		 */
> +		i915_guc_flush_logs(dev_priv);
> +
> +		/* GuC would have updated the log buffer by now, so capture it */
> +		i915_guc_capture_logs(dev_priv);
> +
> +		/* As logging is disabled, update the log level to reflect that */
> +		i915.guc_log_level = -1;
> +	} else {
> +		/* In case interrupts were disabled, enable them now */
> +		gen9_enable_guc_interrupts(dev_priv);
> +	}

And this block would need some adjustments with my guc_log_level idea.

Well not sure, see what you think. I am just attracted to the idea of 
operating in the same value domain as much as possible for readability 
and simplicity. Maybe it would not improve anything, I did not bother 
with typing it all to check.

> +
> +	return ret;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index d3a5447..2f8c408 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -186,5 +186,6 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
>   void i915_guc_flush_logs(struct drm_i915_private *dev_priv);
>   void i915_guc_register(struct drm_i915_private *dev_priv);
>   void i915_guc_unregister(struct drm_i915_private *dev_priv);
> +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val);
>
>   #endif
>

Patch looks correct as is, so:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Although I would be happier if my suggestion to use the same
value domain as for the module parameter was used. In other words:

	{"i915_guc_log_level", &i915_guc_log_control_fops}

...

int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
...
	int guc_log_level = (int)control_val;
...
	log_param.logging_enabled = guc_log_level > -1;
	log_param.verbosity = guc_log_level > -1 ? guc_log_level : 0;
...

I think it would be simpler for the user and developer to only have to
think about one set of values when dealing with guc logging.

But maybe extensions to guc_log_control are imminent and expected so it 
would not be worth it in the long run. No idea.
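
As a rough illustration of the two value domains being compared, the sketch below decodes the bit layout described in the commit message (bit 0 enables logging, bits 4-7 carry the verbosity) and then collapses it into the module parameter domain (-1 for disabled, otherwise the verbosity). The struct and function names are hypothetical and exist only for this example; range checking against the driver's verbosity limits is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded form of the value written to the debugfs file. */
struct log_request {
	int enabled;            /* bit 0 */
	unsigned int verbosity; /* bits 4-7 */
};

/* Split the written value into its two fields. */
static struct log_request decode_control(uint64_t val)
{
	struct log_request req;

	req.enabled = (int)(val & 0x1);
	req.verbosity = (unsigned int)((val >> 4) & 0xF);

	/* A 4-bit field can never exceed 15. */
	assert(req.verbosity <= 0xF);

	return req;
}

/* Collapse the request into the module parameter domain:
 * -1 means logging disabled, otherwise the verbosity level. */
static int to_log_level(struct log_request req)
{
	return req.enabled ? (int)req.verbosity : -1;
}

/* Nothing needs to be sent to the firmware if the level is unchanged. */
static int level_changed(int current_level, uint64_t val)
{
	return to_log_level(decode_control(val)) != current_level;
}
```

For example, writing 0x21 decodes to enabled with verbosity 2, so the level is 2, while 0x20 decodes to level -1 because bit 0 is clear. Comparing levels in one domain covers both the "already disabled" case and the "same verbosity requested again" case with a single check.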

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12 15:52         ` Chris Wilson
@ 2016-08-12 16:04           ` Goel, Akash
  2016-08-12 16:09             ` Chris Wilson
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 16:04 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin; +Cc: intel-gfx, akash.goel



On 8/12/2016 9:22 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
>> On 8/12/2016 9:02 PM, Chris Wilson wrote:
>>> There's (or will be) a function to dump the error object in a uniform
>>> manner. This patch is obsolete.
>>
>> There is a print_error_obj() function, but that prints one dword per line.
>
> It used to. It will shortly be a compressed stream.

> Pretty printing is left to userspace.
But in practice only we will be interpreting the error state or the GuC
log buffer dump, and it would be really convenient to have 4 dwords
per line, matching the log sample size.


Best regards
Akash
> -Chris
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer
  2016-08-12  6:25 ` [PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer akash.goel
@ 2016-08-12 16:05   ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 16:05 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> Host needs to sample the GuC log buffer on every flush interrupt from GuC.
> To ensure that we always get up-to-date data from the log buffer, it is
> better to access the buffer through an uncached CPU mapping. Also, given the
> way the buffer is accessed from the GuC & Host sides, manually flushing the
> cache may not always be effective if a cached CPU mapping is used.
> Uncached reads may have some performance implication, but the
> reliability of the data will be ensured.
>
> v2: Rebase.
>
> v3: Rebase.
>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 1d58d36..1818343 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -1002,8 +1002,6 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   			dst_data_ptr += buffer_size;
>   		}
>
> -		/* FIXME: invalidate/flush for log buffer needed */
> -
>   		/* Update the read pointer in the shared log buffer */
>   		log_buffer_state->read_ptr = write_offset;
>
> @@ -1177,8 +1175,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
>   		return 0;
>
>   	if (!guc->log.buf_addr) {
> -		/* Create a vmalloc mapping of log buffer pages */
> -		vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB);
> +		/* Create a WC (Uncached for read) vmalloc mapping of log
> +		 * buffer pages, so that we can directly get the data
> +		 * (up-to-date) from memory.
> +		 */
> +		vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WC);
>   		if (IS_ERR(vaddr)) {
>   			ret = PTR_ERR(vaddr);
>   			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Hopefully no one applies this without 19/20. :)

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
  2016-08-12  6:25 ` [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer akash.goel
@ 2016-08-12 16:06   ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 16:06 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> In order to have fast reads from the GuC log buffer, use the SSE4.1 movntdqa
> based memcpy function i915_memcpy_from_wc.
> The GuC log buffer has a WC type vmalloc mapping, and copying from WC type
> memory using movntdqa is almost as fast as reading from WB memory.
> This further reduces the log buffer sampling time, which is badly needed
> to deal with the flush interrupt storm when GuC is generating logs at a very
> high rate.
> Ideally SSE 4.1 should be present on all chipsets supporting GuC based
> submissions, but if not then logging will not be enabled.
>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++++++++++++++---
>   1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 1818343..af48f62 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -987,15 +987,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   			/* Just copy the newly written data */
>   			if (read_offset <= write_offset) {
>   				bytes_to_copy = write_offset - read_offset;
> -				memcpy(dst_data_ptr + read_offset,
> +				i915_memcpy_from_wc(dst_data_ptr + read_offset,
>   				     src_data_ptr + read_offset, bytes_to_copy);
>   			} else {
>   				bytes_to_copy = buffer_size - read_offset;
> -				memcpy(dst_data_ptr + read_offset,
> +				i915_memcpy_from_wc(dst_data_ptr + read_offset,
>   				     src_data_ptr + read_offset, bytes_to_copy);
>
>   				bytes_to_copy = write_offset;
> -				memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
> +				i915_memcpy_from_wc(dst_data_ptr, src_data_ptr,
> +				     bytes_to_copy);
>   			}
>
>   			src_data_ptr += buffer_size;
> @@ -1210,6 +1211,16 @@ static void guc_create_log(struct intel_guc *guc)
>
>   	obj = guc->log.obj;
>   	if (!obj) {
> +		/* We require SSE 4.1 for fast reads from the GuC log buffer and
> +		 * it should be present on the chipsets supporting GuC based
> +		 * submisssions.
> +		 */
> +		if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
> +			/* logging will not be enabled */
> +			i915.guc_log_level = -1;
> +			return;
> +		}
> +
>   		obj = gem_allocate_guc_obj(dev_priv, size);
>   		if (!obj) {
>   			/* logging will be off */
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
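
For readers following the first hunk above, the wrap-around copy it performs can be sketched in plain userspace C. This is only an illustration under assumptions: copy_new_log_data() is a hypothetical name, and plain memcpy() stands in for i915_memcpy_from_wc(), which is available only inside the kernel. The offsets mirror read_offset, write_offset and buffer_size from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the copy done in
 * guc_read_update_log_buffer(); memcpy() replaces i915_memcpy_from_wc().
 * Copies only the region written since the last read, keeping every byte at
 * the same offset in dst as in src. Returns the number of bytes copied.
 */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
                                size_t buffer_size, size_t read_offset,
                                size_t write_offset)
{
    size_t bytes_copied;

    assert(dst && src);
    assert(read_offset <= buffer_size && write_offset <= buffer_size);

    if (read_offset <= write_offset) {
        /* No wrap: one contiguous region [read_offset, write_offset) */
        bytes_copied = write_offset - read_offset;
        memcpy(dst + read_offset, src + read_offset, bytes_copied);
    } else {
        /* Wrapped: tail [read_offset, buffer_size), then head [0, write_offset) */
        bytes_copied = buffer_size - read_offset;
        memcpy(dst + read_offset, src + read_offset, bytes_copied);
        memcpy(dst, src, write_offset);
        bytes_copied += write_offset;
    }

    return bytes_copied;
}
```

With an 8-byte buffer, read_offset 6 and write_offset 2, the sketch copies bytes 6..7 and then bytes 0..1, which is the "else" branch of the hunk.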

* Re: [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
  2016-08-12 16:04           ` Goel, Akash
@ 2016-08-12 16:09             ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2016-08-12 16:09 UTC (permalink / raw)
  To: Goel, Akash; +Cc: intel-gfx

On Fri, Aug 12, 2016 at 09:34:23PM +0530, Goel, Akash wrote:
> 
> 
> On 8/12/2016 9:22 PM, Chris Wilson wrote:
> >On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> >>On 8/12/2016 9:02 PM, Chris Wilson wrote:
> >>>There's (or will be) a function to dump the error object in a uniform
> >>>manner. This patch is obsolete.
> >>
> >>There is a print_error_obj() function, but that prints one dword per line.
> >
> >It used to. It will shortly be a compressed stream.
> 
> >Pretty printing is left to userspace.
> But invariably, we only will be interpreting the error state or Guc
> log buffer dump, and it will be really convenient if we can have 4
> dwords per line matching the log sample size.

That's fine. Do it in userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
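
As an illustration of the "do it in userspace" suggestion, a tool that has already read the raw dump could lay it out four dwords per line itself. The following is a minimal sketch under assumptions: format_log_dwords() and DWORDS_PER_LINE are hypothetical names, and the dump is assumed to be available as an array of 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DWORDS_PER_LINE 4

/*
 * Hypothetical userspace helper: formats 'count' dwords into 'out', four per
 * line, each as 8-digit hex. Returns the number of lines produced, or 0 if
 * 'out' is too small to hold the whole dump.
 */
static size_t format_log_dwords(const uint32_t *dwords, size_t count,
                                char *out, size_t out_size)
{
    size_t used = 0, lines = 0, i;

    assert(dwords && out && out_size > 0);
    out[0] = '\0';

    for (i = 0; i < count; i++) {
        /* End the line after every fourth dword and after the last one */
        int last_in_line = (i % DWORDS_PER_LINE == DWORDS_PER_LINE - 1) ||
                           (i == count - 1);
        int n = snprintf(out + used, out_size - used, "%08x%c",
                         (unsigned int)dwords[i], last_in_line ? '\n' : ' ');

        if (n < 0 || (size_t)n >= out_size - used)
            return 0; /* output truncated, report failure */

        used += (size_t)n;
        if (last_in_line)
            lines++;
    }

    return lines;
}
```

Five input dwords produce two lines: one with four values and one with the remaining value.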

* Re: [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  2016-08-12 13:53   ` Tvrtko Ursulin
@ 2016-08-12 16:10     ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 16:10 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Sourab Gupta, akash.goel



On 8/12/2016 7:23 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
>> User to capture GuC firmware logs. Availed relay framework to implement
>> the interface, where Driver will have to just use a relay API to store
>> snapshots of the GuC log buffer in the buffer managed by relay.
>> The snapshot will be taken when GuC firmware sends a log buffer flush
>> interrupt and up to four snaphots could be stored in the relay buffer.
>
> snapshots
>
>> The relay buffer will be operated in a mode where it will overwrite the
>> data not yet collected by User.
>> Besides mmap method, through which User can directly access the relay
>> buffer contents, relay also supports the 'poll' method. Through the
>> 'poll'
>> call on log file, User can come to know whenever a new snapshot of the
>> log buffer is taken by Driver, so can run in tandem with the Driver and
>> capture the logs in a sustained/streaming manner, without any loss of
>> data.
>>
>> v2: Defer the creation of relay channel & associated debugfs file, as
>>      debugfs setup is now done at the end of i915 Driver load. (Chris)
>>
>> v3:
>> - Switch to no-overwrite mode for relay.
>> - Fix the relay sub buffer switching sequence.
>>
>> v4:
>> - Update i915 Kconfig to select RELAY config. (Tvrtko)
>> - Log a message when there is no sub buffer available to capture
>>    the GuC log buffer. (Tvrtko)
>> - Increase the number of relay sub buffers to 8 from 4, to have
>>    sufficient buffering for boot time logs
>>
>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/Kconfig               |   1 +
>>   drivers/gpu/drm/i915/i915_drv.c            |   2 +
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 206
>> ++++++++++++++++++++++++++++-
>>   drivers/gpu/drm/i915/intel_guc.h           |   3 +
>>   4 files changed, 209 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
>> index 7769e46..fc900d2 100644
>> --- a/drivers/gpu/drm/i915/Kconfig
>> +++ b/drivers/gpu/drm/i915/Kconfig
>> @@ -11,6 +11,7 @@ config DRM_I915
>>       select DRM_KMS_HELPER
>>       select DRM_PANEL
>>       select DRM_MIPI_DSI
>> +    select RELAY
>>       # i915 depends on ACPI_VIDEO when ACPI is enabled
>>       # but for select to work, need to select ACPI_VIDEO's
>> dependencies, ick
>>       select BACKLIGHT_LCD_SUPPORT if ACPI
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index fc2da32..cb8c943 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1145,6 +1145,7 @@ static void i915_driver_register(struct
>> drm_i915_private *dev_priv)
>>       /* Reveal our presence to userspace */
>>       if (drm_dev_register(dev, 0) == 0) {
>>           i915_debugfs_register(dev_priv);
>> +        i915_guc_register(dev_priv);
>>           i915_setup_sysfs(dev);
>>       } else
>>           DRM_ERROR("Failed to register driver for userspace access!\n");
>> @@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct
>> drm_i915_private *dev_priv)
>>       intel_opregion_unregister(dev_priv);
>>
>>       i915_teardown_sysfs(&dev_priv->drm);
>> +    i915_guc_unregister(dev_priv);
>>       i915_debugfs_unregister(dev_priv);
>>       drm_dev_unregister(&dev_priv->drm);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 2635b67..1a2d648 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -23,6 +23,8 @@
>>    */
>>   #include <linux/firmware.h>
>>   #include <linux/circ_buf.h>
>> +#include <linux/debugfs.h>
>> +#include <linux/relay.h>
>>   #include "i915_drv.h"
>>   #include "intel_guc.h"
>>
>> @@ -851,12 +853,33 @@ err:
>>
>>   static void guc_move_to_next_buf(struct intel_guc *guc)
>>   {
>> -    return;
>> +    /* Make sure the updates made in the sub buffer are visible when
>> +     * Consumer sees the following update to offset inside the sub
>> buffer.
>> +     */
>> +    smp_wmb();
>> +
>> +    /* All data has been written, so now move the offset of sub
>> buffer. */
>> +    relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
>> +
>> +    /* Switch to the next sub buffer */
>> +    relay_flush(guc->log.relay_chan);
>>   }
>>
>>   static void* guc_get_write_buffer(struct intel_guc *guc)
>>   {
>> -    return NULL;
>> +    /* FIXME: Cover the check under a lock ? */
>
> Need to resolve before r-b in any case.
After the last patch in this series, the relay channel is created before
the GuC interrupts are enabled, so the lock is no longer needed. I will
remove these FIXME comments in that patch.

>
>> +    if (!guc->log.relay_chan)
>> +        return NULL;
>> +
>> +    /* Just get the base address of a new sub buffer and copy data
>> into it
>> +     * ourselves. NULL will be returned in no-overwrite mode, if all sub
>> +     * buffers are full. Could have used the relay_write() to indirectly
>> +     * copy the data, but that would have been bit convoluted, as we
>> need to
>> +     * write to only certain locations inside a sub buffer which
>> cannot be
>> +     * done without using relay_reserve() along with relay_write().
>> So its
>> +     * better to use relay_reserve() alone.
>> +     */
>> +    return relay_reserve(guc->log.relay_chan, 0);
>>   }
>>
>>   static void guc_read_update_log_buffer(struct intel_guc *guc)
>> @@ -923,6 +946,130 @@ static void guc_read_update_log_buffer(struct
>> intel_guc *guc)
>>
>>       if (log_buffer_snapshot_state)
>>           guc_move_to_next_buf(guc);
>> +    else {
>> +        /* Used rate limited to avoid deluge of messages, logs might be
>> +         * getting consumed by User at a slow rate.
>> +         */
>> +        DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
>> +    }
>> +}
>> +
>> +/*
>> + * Sub buffer switch callback. Called whenever relay has to switch to
>> a new
>> + * sub buffer, relay stays on the same sub buffer if 0 is returned.
>> + */
>> +static int subbuf_start_callback(struct rchan_buf *buf,
>> +                 void *subbuf,
>> +                 void *prev_subbuf,
>> +                 size_t prev_padding)
>> +{
>> +    /* Use no-overwrite mode by default, where relay will stop accepting
>> +     * new data if there are no empty sub buffers left.
>> +     * There is no strict synchronization enforced by relay between
>> Consumer
>> +     * and Producer. In overwrite mode, there is a possibility of
>> getting
>> +     * inconsistent/garbled data, the producer could be writing on to
>> the
>> +     * same sub buffer from which Consumer is reading. This can't be
>> avoided
>> +     * unless Consumer is fast enough and can always run in tandem with
>> +     * Producer.
>> +     */
>> +    if (relay_buf_full(buf))
>> +        return 0;
>> +
>> +    return 1;
>> +}
>> +
>> +/*
>> + * file_create() callback. Creates relay file in debugfs.
>> + */
>> +static struct dentry *create_buf_file_callback(const char *filename,
>> +                           struct dentry *parent,
>> +                           umode_t mode,
>> +                           struct rchan_buf *buf,
>> +                           int *is_global)
>> +{
>> +    struct dentry *buf_file = NULL;
>> +
>
> Would this function be a tiny bit simpler as:
>
>     if (!parent)
>         return NULL;
>
Fine, will do like this.

> ?
>
>> +    if (parent) {
>> +        /* Not using the channel filename passed as an argument, since
>> +         * for each channel relay appends the corresponding CPU number
>> +         * to the filename passed in relay_open(). This should be fine
>> +         * as relay just needs a dentry of the file associated with the
>> +         * channel buffer and that file's name need not be same as the
>> +         * filename passed as an argument.
>> +         */
>> +        buf_file = debugfs_create_file("guc_log", mode,
>> +                parent, buf, &relay_file_operations);
>
> Alignment is wrong.
>
Will change like this.

	buf_file = debugfs_create_file("guc_log", mode,
				       parent, buf, &relay_file_operations);

>> +    }
>> +
>> +    /* This to enable the use of a single buffer for the relay
>> channel and
>> +     * correspondingly have a single file exposed to User, through which
>> +     * it can collect the logs inorder without any post-processing.
>
> "in order"

Fine.
>
>> +     */
>> +    *is_global = 1;
>> +
>> +    return buf_file;
>> +}
>> +
>> +/*
>> + * file_remove() default callback. Removes relay file in debugfs.
>> + */
>> +static int remove_buf_file_callback(struct dentry *dentry)
>> +{
>> +    debugfs_remove(dentry);
>> +    return 0;
>> +}
>> +
>> +/* relay channel callbacks */
>> +static struct rchan_callbacks relay_callbacks = {
>> +    .subbuf_start = subbuf_start_callback,
>> +    .create_buf_file = create_buf_file_callback,
>> +    .remove_buf_file = remove_buf_file_callback,
>> +};
>> +
>> +static void guc_remove_log_relay_file(struct intel_guc *guc)
>> +{
>> +    relay_close(guc->log.relay_chan);
>> +}
>> +
>> +static int guc_create_log_relay_file(struct intel_guc *guc)
>> +{
>> +    struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +    struct rchan *guc_log_relay_chan;
>> +    struct dentry *log_dir;
>> +    size_t n_subbufs, subbuf_size;
>> +
>> +    /* For now create the log file in /sys/kernel/debug/dri/0 dir */
>
> Where it should be or will be later?

There was a proposal to create a new custom file system for the drm
subsystem (with a different mount point), and all the debugfs files and
some sysfs files for drm/i915 would be moved there.

>
>> +    log_dir = dev_priv->drm.primary->debugfs_root;
>> +
>> +    /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
>> +     * not mounted and so can't create the relay file.
>> +     * The relay API seems to fit well with debugfs only.
>> +     */
>> +    if (!log_dir) {
>> +        DRM_DEBUG_DRIVER("Parent debugfs directory not available
>> yet\n");
>> +        return -ENODEV;
>> +    }
>> +
>> +    /* Keep the size of sub buffers same as shared log buffer */
>> +    subbuf_size = guc->log.obj->base.size;
>
> Insert blank line for readability probably.
Fine will add a newline.
>
>> +    /* Store up to 8 snaphosts, which is large enough to buffer
>> sufficient
>
> snapshots
Sorry for typo.
>
>> +     * boot time logs and provides enough leeway to User, in terms of
>> +     * latency, for consuming the logs from relay. Also doesn't take
>> +     * up too much memory.
>> +         */
>
> Indentation is off.
Sorry.
>
>> +    n_subbufs = 8;
>> +
>> +    guc_log_relay_chan = relay_open("guc_log", log_dir,
>> +            subbuf_size, n_subbufs, &relay_callbacks, dev_priv);
>
> Alignment is wrong.
Sorry.
>
>> +
>> +    if (!guc_log_relay_chan) {
>> +        DRM_DEBUG_DRIVER("Couldn't create relay chan for guc logs\n");
>> +        return -ENOMEM;
>> +    }
>> +
>> +    /* FIXME: Cover the update under a lock ? */
>
> Another FIXME to be resolved.
Same as above, will remove it in the last patch of this series.

>
>> +    guc->log.relay_chan = guc_log_relay_chan;
>> +    return 0;
>>   }
>>
>>   static void guc_log_cleanup(struct intel_guc *guc)
>> @@ -937,6 +1084,11 @@ static void guc_log_cleanup(struct intel_guc *guc)
>>       /* First disable the flush interrupt */
>>       gen9_disable_guc_interrupts(dev_priv);
>>
>> +    if (guc->log.relay_chan)
>> +        guc_remove_log_relay_file(guc);
>> +
>> +    guc->log.relay_chan = NULL;
>> +
>>       if (guc->log.buf_addr)
>>           i915_gem_object_unpin_map(guc->log.obj);
>>
>> @@ -1015,6 +1167,35 @@ static void guc_create_log(struct intel_guc *guc)
>>       guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
>>   }
>>
>> +static int guc_log_late_setup(struct intel_guc *guc)
>
> static void if failure cannot be handled or otherwise act on the return
> value?
>
The return value is acted upon when this is called from the debugfs function.

It is not used when called from the Driver load path.

Best regards
Akash

>> +{
>> +    struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +    int ret;
>> +
>> +    lockdep_assert_held(&dev_priv->drm.struct_mutex);
>> +
>> +    if (i915.guc_log_level < 0)
>> +        return -EINVAL;
>> +
>> +    /* If log_level was set as -1 at boot time, then vmalloc mapping
>> would
>> +     * not have been created for the log buffer, so create one now.
>> +     */
>> +    ret = guc_create_log_extras(guc);
>> +    if (ret)
>> +        goto err;
>> +
>> +    ret = guc_create_log_relay_file(guc);
>> +    if (ret)
>> +        goto err;
>> +
>> +    return 0;
>> +err:
>> +    guc_log_cleanup(guc);
>> +    /* logging will remain off */
>> +    i915.guc_log_level = -1;
>> +    return ret;
>> +}
>> +
>>   static void init_guc_policies(struct guc_policies *policies)
>>   {
>>       struct guc_policy *policy;
>> @@ -1185,7 +1366,6 @@ void i915_guc_submission_fini(struct
>> drm_i915_private *dev_priv)
>>       gem_release_guc_obj(dev_priv->guc.ads_obj);
>>       guc->ads_obj = NULL;
>>
>> -    guc_log_cleanup(guc);
>>       gem_release_guc_obj(dev_priv->guc.log.obj);
>>       guc->log.obj = NULL;
>>
>> @@ -1261,3 +1441,23 @@ void i915_guc_capture_logs(struct
>> drm_i915_private *dev_priv)
>>       host2guc_logbuffer_flush_complete(&dev_priv->guc);
>>       intel_runtime_pm_put(dev_priv);
>>   }
>> +
>> +void i915_guc_unregister(struct drm_i915_private *dev_priv)
>> +{
>> +    if (!i915.enable_guc_submission)
>> +        return;
>> +
>> +    mutex_lock(&dev_priv->drm.struct_mutex);
>> +    guc_log_cleanup(&dev_priv->guc);
>> +    mutex_unlock(&dev_priv->drm.struct_mutex);
>> +}
>> +
>> +void i915_guc_register(struct drm_i915_private *dev_priv)
>> +{
>> +    if (!i915.enable_guc_submission)
>> +        return;
>> +
>> +    mutex_lock(&dev_priv->drm.struct_mutex);
>> +    guc_log_late_setup(&dev_priv->guc);
>> +    mutex_unlock(&dev_priv->drm.struct_mutex);
>> +}
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index 7c0bdba..96ef7dc 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -126,6 +126,7 @@ struct intel_guc_log {
>>       struct drm_i915_gem_object *obj;
>>       struct workqueue_struct *wq;
>>       void *buf_addr;
>> +    struct rchan *relay_chan;
>>   };
>>
>>   struct intel_guc {
>> @@ -172,5 +173,7 @@ int i915_guc_wq_check_space(struct
>> drm_i915_gem_request *rq);
>>   void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
>>   void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
>>   void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
>> +void i915_guc_register(struct drm_i915_private *dev_priv);
>> +void i915_guc_unregister(struct drm_i915_private *dev_priv);
>>
>>   #endif
>>
>
> Regards,
>
> Tvrtko
>
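
The no-overwrite behaviour described in the subbuf_start_callback() comment can be modelled in a few lines of userspace C. This is only a sketch of the accounting, not the relay implementation: struct snapshot_ring, ring_produce() and ring_consume() are hypothetical names, and N_SUBBUFS mirrors the n_subbufs = 8 chosen in the patch. The producer is refused a slot while all sub buffers hold unconsumed snapshots, which is the case where the patch logs "no sub-buffer to capture log buffer".

```c
#include <assert.h>
#include <stddef.h>

#define N_SUBBUFS 8

/* Hypothetical model of the produced/consumed accounting of sub buffers */
struct snapshot_ring {
    size_t produced; /* snapshots stored by the Driver (Producer) */
    size_t consumed; /* snapshots collected by the User (Consumer) */
};

/* Full when every sub buffer holds a snapshot not yet consumed */
static int ring_full(const struct snapshot_ring *r)
{
    return r->produced - r->consumed >= N_SUBBUFS;
}

/* Returns the slot to write the next snapshot to, or -1 if it is dropped */
static int ring_produce(struct snapshot_ring *r)
{
    int slot;

    assert(r);
    if (ring_full(r))
        return -1; /* no-overwrite mode: never clobber unread data */

    slot = (int)(r->produced % N_SUBBUFS);
    r->produced++;
    return slot;
}

/* Returns the slot holding the oldest unread snapshot, or -1 if none */
static int ring_consume(struct snapshot_ring *r)
{
    int slot;

    assert(r);
    if (r->produced == r->consumed)
        return -1;

    slot = (int)(r->consumed % N_SUBBUFS);
    r->consumed++;
    return slot;
}
```

After eight snapshots without a reader, the ninth is refused; consuming one snapshot frees exactly one slot for the producer.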

* Re: [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
  2016-08-12 14:07       ` Tvrtko Ursulin
@ 2016-08-12 16:17         ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 16:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 7:37 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 14:45, Goel, Akash wrote:
>>
>>
>> On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:
>>>
>>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>>
>>>> GuC ukernel sends an interrupt to Host to flush the log buffer
>>>> and expects Host to correspondingly update the read pointer
>>>> information in the state structure, once it has consumed the
>>>> log buffer contents by copying them to a file or buffer.
>>>> Even if Host couldn't copy the contents, it can still update the
>>>> read pointer so that logging state is not disturbed on GuC side.
>>>>
>>>> v2:
>>>> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
>>>> - Reduce the overall log buffer copying time by skipping the copy of
>>>>    crash buffer area for regular cases and copying only the state
>>>>    structure data in first page.
>>>>
>>>> v3:
>>>>   - Create a vmalloc mapping of log buffer. (Chris)
>>>>   - Cover the flush acknowledgment under rpm get & put. (Chris)
>>>>   - Revert the change of skipping the copy of crash dump area, as
>>>>     not really needed, will be covered by subsequent patch.
>>>>
>>>> v4:
>>>>   - Destroy the wq under the same condition in which it was created,
>>>>     pass dev_priv pointer instead of dev to newly added GuC function,
>>>>     add more comments & rename variable for clarity. (Tvrtko)
>>>>
>>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_drv.c            |  14 +++
>>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 150
>>>> +++++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/i915/i915_irq.c            |   5 +-
>>>>   drivers/gpu/drm/i915/intel_guc.h           |   3 +
>>>>   4 files changed, 170 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>>> b/drivers/gpu/drm/i915/i915_drv.c
>>>> index 0fcd1c0..fc2da32 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
>>>> drm_i915_private *dev_priv)
>>>>       if (dev_priv->hotplug.dp_wq == NULL)
>>>>           goto out_free_wq;
>>>>
>>>> +    if (HAS_GUC_SCHED(dev_priv)) {
>>>
>>> This just reminded me that a previous patch had:
>>>
>>> +    if (HAS_GUC_UCODE(dev))
>>> +        dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>>>
>>> In the interrupt setup. I don't think there is a bug right now, but
>>> there is a disagreement between the two which would be good to resolve.
>>>
>>> This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
>>> for correctness. I think.
>>
>> Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch.
>>
>> As per Chris's comments will move the wq init/destroy to the GuC logging
>> setup/teardown routines (guc_create_log_extras, guc_log_cleanup)
>> You are fine with that ?.
>
> Yes thats OK I think.
>
>>>
>>>> +        /* Need a dedicated wq to process log buffer flush interrupts
>>>> +         * from GuC without much delay so as to avoid any loss of
>>>> logs.
>>>> +         */
>>>> +        dev_priv->guc.log.wq =
>>>> +            alloc_ordered_workqueue("i915-guc_log", 0);
>>>> +        if (dev_priv->guc.log.wq == NULL)
>>>> +            goto out_free_hotplug_dp_wq;
>>>> +    }
>>>> +
>>>>       return 0;
>>>>
>>>> +out_free_hotplug_dp_wq:
>>>> +    destroy_workqueue(dev_priv->hotplug.dp_wq);
>>>>   out_free_wq:
>>>>       destroy_workqueue(dev_priv->wq);
>>>>   out_err:
>>>> @@ -782,6 +794,8 @@ out_err:
>>>>
>>>>   static void i915_workqueues_cleanup(struct drm_i915_private
>>>> *dev_priv)
>>>>   {
>>>> +    if (HAS_GUC_SCHED(dev_priv))
>>>> +        destroy_workqueue(dev_priv->guc.log.wq);
>>>>       destroy_workqueue(dev_priv->hotplug.dp_wq);
>>>>       destroy_workqueue(dev_priv->wq);
>>>>   }
>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> index c7c679f..2635b67 100644
>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
>>>> intel_guc *guc,
>>>>       return host2guc_action(guc, data, ARRAY_SIZE(data));
>>>>   }
>>>>
>>>> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
>>>> +{
>>>> +    u32 data[1];
>>>> +
>>>> +    data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
>>>> +
>>>> +    return host2guc_action(guc, data, 1);
>>>> +}
>>>> +
>>>>   /*
>>>>    * Initialise, update, or clear doorbell data shared with the GuC
>>>>    *
>>>> @@ -840,6 +849,127 @@ err:
>>>>       return NULL;
>>>>   }
>>>>
>>>> +static void guc_move_to_next_buf(struct intel_guc *guc)
>>>> +{
>>>> +    return;
>>>> +}
>>>> +
>>>> +static void* guc_get_write_buffer(struct intel_guc *guc)
>>>> +{
>>>> +    return NULL;
>>>> +}
>>>> +
>>>> +static void guc_read_update_log_buffer(struct intel_guc *guc)
>>>> +{
>>>> +    struct guc_log_buffer_state *log_buffer_state,
>>>> *log_buffer_snapshot_state;
>>>> +    struct guc_log_buffer_state log_buffer_state_local;
>>>> +    void *src_data_ptr, *dst_data_ptr;
>>>> +    u32 i, buffer_size;
>>>
>>> unsigned int i if you can be bothered.
>>
>> Fine will do that for both i & buffer_size.
>
> buffer_size can match the type of log_buffer_state_local.size or use
> something else if more appropriate.
>
>> But I remember earlier in one of the patch, you suggested to use u32 as
>> a type for some variables.
>> Please could you share the guideline.
>> Should u32, u64 be used we are exactly sure of the range of the
>> variable, like for variables containing the register values ?
>
> Depends what the variable is. "i" in this case is just a standard
> counter so not appropriate to use u32. It is not that wrong on x86, just
> looks a bit out of place.
>
> grep "u32 i;" has no hits in i915. :)
>
> They can/should be used for variables tied with hardware or protocols.
> Or in some cases where you really want to save space by using a small type.
>
Thanks will be mindful of this.

>>
>>>
>>>> +
>>>> +    if (!guc->log.buf_addr)
>>>> +        return;
>>>
>>> Can it hit this? If yes, I think better disable GuC logging when pin map
>>> on the object fails rather than let it generate interrupts in vain.
>>>
Ideally it should not, as logging itself is disabled (interrupts also 
disabled) on log buffer allocation failure.

>>>> +
>>>> +    /* Get the pointer to shared GuC log buffer */
>>>> +    log_buffer_state = src_data_ptr = guc->log.buf_addr;
>>>> +
>>>> +    /* Get the pointer to local buffer to store the logs */
>>>> +    dst_data_ptr = log_buffer_snapshot_state =
>>>> guc_get_write_buffer(guc);
>>>> +
>>>> +    /* Actual logs are present from the 2nd page */
>>>> +    src_data_ptr += PAGE_SIZE;
>>>> +    dst_data_ptr += PAGE_SIZE;
>>>> +
>>>> +    for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
>>>> +        /* Make a copy of the state structure in GuC log buffer (which
>>>> +         * is uncached mapped) on the stack to avoid reading from it
>>>> +         * multiple times.
>>>> +         */
>>>> +        memcpy(&log_buffer_state_local, log_buffer_state,
>>>> +                sizeof(struct guc_log_buffer_state));
>>>> +        buffer_size = log_buffer_state_local.size;
>>>
>>> Needs checking (and logging) that a bad firmware or some other error
>>> does not report a dangerous size (too big) which would then overwrite
>>> memory pointed by dst_data_ptr memory. (And/or read from random memory.)
>>>
>> Have done the range checking for the read/write offset values, which are
>> updated repeatedly by GuC firmware.
>> The buffer size is updated only at init time by GuC firmware, hence less
>> vulnerable.
>>
>> But nevertheless will add the checks.
>
> Ok good.
>
>>>> +
>>>> +        if (log_buffer_snapshot_state) {
>>>> +            /* First copy the state structure in local buffer */
>>>
>>> Maybe "destination buffer" would be clearer?
>>
>> Sorry my bad, well spotted.
>>
>>>
>>>> +            memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
>>>> +                    sizeof(struct guc_log_buffer_state));
>>>> +
>>>> +            /* The write pointer could have been updated by the GuC
>>>> +             * firmware, after sending the flush interrupt to Host,
>>>> +             * for consistency set the write pointer value to same
>>>> +             * value of sampled_write_ptr in the snapshot buffer.
>>>> +             */
>>>> +            log_buffer_snapshot_state->write_ptr =
>>>> +                log_buffer_snapshot_state->sampled_write_ptr;
>>>> +
>>>> +            log_buffer_snapshot_state++;
>>>> +
>>>> +            /* Now copy the actual logs */
>>>> +            memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>>>> +
>>>> +            src_data_ptr += buffer_size;
>>>> +            dst_data_ptr += buffer_size;
>>>> +        }
>>>> +
>>>> +        /* FIXME: invalidate/flush for log buffer needed */
>>>
>>> Yes no maybe? :)
>>
>> Will like to keep it, if you allow.
>
> I think you need a really good justification to get r-b on patches with
> FIXMEs, especially like this one. Do you maybe handle it or remove it in
> a following patch or something?
>
Have removed this FIXME in the later patch
"[PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC 
log buffer"

Best regards
Akash

> Regards,
>
> Tvrtko
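
The range checks discussed above (on the firmware-reported size as well as the read/write offsets) could look roughly like the following userspace sketch. It is an assumption-laden illustration, not the code from the series: struct log_state_sketch is a hypothetical subset of the shared state structure, and log_state_is_sane() and max_size are made-up names. The idea is simply to refuse to copy when any value read from shared memory would take memcpy() outside the buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the fields of the shared log buffer state */
struct log_state_sketch {
    uint32_t size;      /* buffer size reported by firmware */
    uint32_t read_ptr;  /* offset up to which Host has consumed */
    uint32_t write_ptr; /* offset up to which firmware has written */
};

/*
 * Returns true only if the size fits within 'max_size' (the space actually
 * allocated for this buffer) and both offsets lie within the reported size.
 */
static bool log_state_is_sane(const struct log_state_sketch *s,
                              uint32_t max_size)
{
    assert(s);

    if (s->size == 0 || s->size > max_size)
        return false;

    if (s->read_ptr > s->size || s->write_ptr > s->size)
        return false;

    return true;
}
```

A caller would take a local copy of the state first, as the patch does, and validate that copy, so that the values checked are the same values later used for the copy.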

* Re: [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
  2016-08-12  6:25 ` [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs akash.goel
@ 2016-08-12 16:22   ` Tvrtko Ursulin
  2016-08-12 16:31     ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-12 16:22 UTC (permalink / raw)
  To: akash.goel, intel-gfx


On 12/08/16 07:25, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> As per the current i915 Driver load sequence, debugfs registration is done
> at the end, so the relay channel debugfs file is also created only then,
> but the GuC firmware is loaded much earlier in the sequence.
> As a result the Driver could miss capturing the boot-time logs of GuC firmware
> if there are flush interrupts from the GuC side.
> Relay has a provision to support early logging, where initially only the relay
> channel is created, to have buffers for storing logs, and later on the
> channel is associated with a debugfs file at the appropriate time.
> Use that provision, which allows the Driver to capture boot time logs as well;
> these can be collected once Userspace comes up.
>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 61 +++++++++++++++++++++---------
>   1 file changed, 44 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index af48f62..1c287d7 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc)
>   	relay_close(guc->log.relay_chan);
>   }
>
> -static int guc_create_log_relay_file(struct intel_guc *guc)
> +static int guc_create_relay_channel(struct intel_guc *guc)
>   {
>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>   	struct rchan *guc_log_relay_chan;
> -	struct dentry *log_dir;
>   	size_t n_subbufs, subbuf_size;
>
> -	/* For now create the log file in /sys/kernel/debug/dri/0 dir */
> -	log_dir = dev_priv->drm.primary->debugfs_root;
> -
> -	/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
> -	 * not mounted and so can't create the relay file.
> -	 * The relay API seems to fit well with debugfs only.

It only needs a dentry, I don't see that it has to be a debugfs one.

> -	 */
> -	if (!log_dir) {
> -		DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n");
> -		return -ENODEV;
> -	}
> -
>   	/* Keep the size of sub buffers same as shared log buffer */
>   	subbuf_size = guc->log.obj->base.size;
>   	/* Store up to 8 snaphosts, which is large enough to buffer sufficient
> @@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
>            */
>   	n_subbufs = 8;
>
> -	guc_log_relay_chan = relay_open("guc_log", log_dir,
> +	guc_log_relay_chan = relay_open(NULL, NULL,
>   			subbuf_size, n_subbufs, &relay_callbacks, dev_priv);
>
>   	if (!guc_log_relay_chan) {
> @@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
>   	return 0;
>   }
>
> +static int guc_create_log_relay_file(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct dentry *log_dir;
> +	int ret;
> +
> +	/* For now create the log file in /sys/kernel/debug/dri/0 dir */
> +	log_dir = dev_priv->drm.primary->debugfs_root;
> +
> +	/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
> +	 * not mounted and so can't create the relay file.
> +	 * The relay API seems to fit well with debugfs only.
> +	 */
> +	if (!log_dir) {
> +		DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n");
> +		return -ENODEV;
> +	}
> +
> +	ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
> +	if (ret) {
> +		DRM_DEBUG_DRIVER("Couldn't associate the channel with file %d\n", ret);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
>   static void guc_log_cleanup(struct intel_guc *guc)
>   {
>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> @@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct intel_guc *guc)
>   {
>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>   	void *vaddr;
> -	int ret;
> +	int ret = 0;
>
>   	lockdep_assert_held(&dev_priv->drm.struct_mutex);
>
> @@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct intel_guc *guc)
>   		guc->log.buf_addr = vaddr;
>   	}
>
> -	return 0;
> +	if (!guc->log.relay_chan) {
> +		/* Create a relay channel, so that we have buffers for storing
> +		 * the GuC firmware logs, the channel will be linked with a file
> +		 * later on when debugfs is registered.
> +		 */
> +		ret = guc_create_relay_channel(guc);
> +	}
> +
> +	return ret;
>   }
>
>   static void guc_create_log(struct intel_guc *guc)
> @@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
>   		guc->log.obj = obj;
>
>   		if (guc_create_log_extras(guc)) {
> +			guc_log_cleanup(guc);
>   			gem_release_guc_obj(guc->log.obj);
>   			guc->log.obj = NULL;
>   			i915.guc_log_level = -1;
> @@ -1265,6 +1288,10 @@ static int guc_log_late_setup(struct intel_guc *guc)
>   	if (ret)
>   		goto err;
>
> +	/* Parent debugfs dir should be available by now, associate the already
> +	 * opened relay channel with a debugfs file, which will then allow User
> +	 * to pull the logs.
> +	 */
>   	ret = guc_create_log_relay_file(guc);
>   	if (ret)
>   		goto err;
>

Can't spot any problems.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
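
For readers following the thread, the two-step setup in this patch can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: the model_* names and the sub-buffer size are made up for illustration and are not relay or i915 APIs, and refusing writes when full is a choice of the model. The early step stands in for relay_open(NULL, NULL, ...) and the late step for relay_late_setup_files(), the two calls used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only: none of these names are kernel relay APIs. */
#define N_SUBBUFS   8   /* the patch stores up to 8 snapshots */
#define SUBBUF_SIZE 16  /* hypothetical stand-in for the shared log buffer size */

struct model_chan {
	char subbuf[N_SUBBUFS][SUBBUF_SIZE];
	size_t produced;  /* snapshots written so far */
	size_t consumed;  /* snapshots already read by the user */
	int has_file;     /* 0 until the late setup step has run */
};

/* Early step: buffers exist but no file is attached yet. */
static void model_open(struct model_chan *c)
{
	memset(c, 0, sizeof(*c));
}

/* Store one snapshot; the model refuses the write when all sub-buffers are unread. */
static int model_write(struct model_chan *c, const char *snapshot)
{
	assert(snapshot != NULL);
	if (c->produced - c->consumed == N_SUBBUFS)
		return -1;
	/* strncpy pads with NULs, so the last byte of the slot stays 0 */
	strncpy(c->subbuf[c->produced % N_SUBBUFS], snapshot, SUBBUF_SIZE - 1);
	c->produced++;
	return 0;
}

/* Late step: attach the "file" once the parent directory is available. */
static int model_late_setup(struct model_chan *c)
{
	if (c->has_file)
		return -1;
	c->has_file = 1;
	return 0;
}

/* Return the oldest unread snapshot; only possible once the file exists. */
static const char *model_read(struct model_chan *c)
{
	if (!c->has_file || c->consumed == c->produced)
		return NULL;
	return c->subbuf[c->consumed++ % N_SUBBUFS];
}
```

The point of the model is that snapshots written before model_late_setup() are kept and become readable afterwards, which is the property the patch relies on for boot-time logs.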

I suppose you will also have to add some IGTs to exercise the whole 
thing. Stopping/starting the logging, reading back, capturing some 
commands etc. Or contract someone from validation to do it. :) Are there 
any such plans?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
  2016-08-12 16:22   ` Tvrtko Ursulin
@ 2016-08-12 16:31     ` Goel, Akash
  2016-08-15  9:20       ` Tvrtko Ursulin
  0 siblings, 1 reply; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 16:31 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> As per the current i915 Driver load sequence, debugfs registration is
>> done
>> at the end and so the relay channel debugfs file is also created after
>> that
>> but the GuC firmware is loaded much earlier in the sequence.
>> As a result Driver could miss capturing the boot-time logs of GuC
>> firmware
>> if there are flush interrupts from the GuC side.
>> Relay has a provision to support early logging where initially only relay
>> channel can be created, to have buffers for storing logs, and later on
>> channel can be associated with a debugfs file at appropriate time.
>> Have availed that, which allows Driver to capture boot time logs also,
>> which can be collected once Userspace comes up.
>>
>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 61
>> +++++++++++++++++++++---------
>>   1 file changed, 44 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index af48f62..1c287d7 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
>> intel_guc *guc)
>>       relay_close(guc->log.relay_chan);
>>   }
>>
>> -static int guc_create_log_relay_file(struct intel_guc *guc)
>> +static int guc_create_relay_channel(struct intel_guc *guc)
>>   {
>>       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>       struct rchan *guc_log_relay_chan;
>> -    struct dentry *log_dir;
>>       size_t n_subbufs, subbuf_size;
>>
>> -    /* For now create the log file in /sys/kernel/debug/dri/0 dir */
>> -    log_dir = dev_priv->drm.primary->debugfs_root;
>> -
>> -    /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
>> -     * not mounted and so can't create the relay file.
>> -     * The relay API seems to fit well with debugfs only.
>
> It only needs a dentry, I don't see that it has to be a debugfs one.
>
Besides a dentry, relay has other requirements, which can be met only
by a debugfs file.
debugfs wasn't the preferred place for the log file, but there was no
other option, as the relay API is compatible with debugfs only.

Also, retrieving the dentry of a file is not as straightforward as it
might seem (I spent considerable time on this initially).


>> -     */
>> -    if (!log_dir) {
>> -        DRM_DEBUG_DRIVER("Parent debugfs directory not available
>> yet\n");
>> -        return -ENODEV;
>> -    }
>> -
>>       /* Keep the size of sub buffers same as shared log buffer */
>>       subbuf_size = guc->log.obj->base.size;
>>       /* Store up to 8 snaphosts, which is large enough to buffer
>> sufficient
>> @@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct
>> intel_guc *guc)
>>            */
>>       n_subbufs = 8;
>>
>> -    guc_log_relay_chan = relay_open("guc_log", log_dir,
>> +    guc_log_relay_chan = relay_open(NULL, NULL,
>>               subbuf_size, n_subbufs, &relay_callbacks, dev_priv);
>>
>>       if (!guc_log_relay_chan) {
>> @@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct
>> intel_guc *guc)
>>       return 0;
>>   }
>>
>> +static int guc_create_log_relay_file(struct intel_guc *guc)
>> +{
>> +    struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +    struct dentry *log_dir;
>> +    int ret;
>> +
>> +    /* For now create the log file in /sys/kernel/debug/dri/0 dir */
>> +    log_dir = dev_priv->drm.primary->debugfs_root;
>> +
>> +    /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
>> +     * not mounted and so can't create the relay file.
>> +     * The relay API seems to fit well with debugfs only.
>> +     */
>> +    if (!log_dir) {
>> +        DRM_DEBUG_DRIVER("Parent debugfs directory not available
>> yet\n");
>> +        return -ENODEV;
>> +    }
>> +
>> +    ret = relay_late_setup_files(guc->log.relay_chan, "guc_log",
>> log_dir);
>> +    if (ret) {
>> +        DRM_DEBUG_DRIVER("Couldn't associate the channel with file
>> %d\n", ret);
>> +        return ret;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static void guc_log_cleanup(struct intel_guc *guc)
>>   {
>>       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> @@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct
>> intel_guc *guc)
>>   {
>>       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>       void *vaddr;
>> -    int ret;
>> +    int ret = 0;
>>
>>       lockdep_assert_held(&dev_priv->drm.struct_mutex);
>>
>> @@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct
>> intel_guc *guc)
>>           guc->log.buf_addr = vaddr;
>>       }
>>
>> -    return 0;
>> +    if (!guc->log.relay_chan) {
>> +        /* Create a relay channel, so that we have buffers for storing
>> +         * the GuC firmware logs, the channel will be linked with a file
>> +         * later on when debugfs is registered.
>> +         */
>> +        ret = guc_create_relay_channel(guc);
>> +    }
>> +
>> +    return ret;
>>   }
>>
>>   static void guc_create_log(struct intel_guc *guc)
>> @@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
>>           guc->log.obj = obj;
>>
>>           if (guc_create_log_extras(guc)) {
>> +            guc_log_cleanup(guc);
>>               gem_release_guc_obj(guc->log.obj);
>>               guc->log.obj = NULL;
>>               i915.guc_log_level = -1;
>> @@ -1265,6 +1288,10 @@ static int guc_log_late_setup(struct intel_guc
>> *guc)
>>       if (ret)
>>           goto err;
>>
>> +    /* Parent debugfs dir should be available by now, associate the
>> already
>> +     * opened relay channel with a debugfs file, which will then
>> allow User
>> +     * to pull the logs.
>> +     */
>>       ret = guc_create_log_relay_file(guc);
>>       if (ret)
>>           goto err;
>>
>
> Can't spot any problems.
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> I suppose you will also have to add some IGTs to exercise the whole
> thing. Stopping/starting the logging, reading back, capturing some
> commands etc. Or contract someone from validation to do it. :) Are there
> any such plans?

Sure, will seek help from the Validation folks on this.

Best regards
Akash
>
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 16/20] drm/i915: Support to create write combined type vmaps
  2016-08-12 15:16       ` Chris Wilson
@ 2016-08-12 16:46         ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 16:46 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin; +Cc: intel-gfx, akash.goel



On 8/12/2016 8:46 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote:
>> On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:
>>> Unreleated and unmentioned change to no guard page. Best to remove IMHO.
>>> Can keep the RB in that case.
>>
>> Though its not called out, sorry for that, but isn't it better to
>> avoid using the guard page, which will save 4KB of vmalloc virtual
>> space (which is scarce) for every mapping created by Driver.
>>
>> Updating the commit message would be fine to mention about this ?.
>
> Too late, already applied without the new flag.
>
Oh, is the patch already queued for merge?

> Yes, that's why I dropped the guard page when I found out it was being
> added. Send a patch to add the flag and we can discuss whether we think
> our code is adequate to not require the protection.

Fine, will prepare a separate patch to avoid using the guard page.

Best regards
Akash

> -Chris
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 15/20] drm/i915: Debugfs support for GuC logging control
  2016-08-12 15:57   ` Tvrtko Ursulin
@ 2016-08-12 17:08     ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-12 17:08 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/12/2016 9:27 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.goel@intel.com wrote:
>> From: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>
>> This patch provides debugfs interface i915_guc_output_control for
>> on the fly enabling/disabling of logging in GuC firmware and controlling
>> the verbosity level of logs.
>> The value written to the file, should have bit 0 set to enable logging
>> and
>> bits 4-7 should contain the verbosity info.
>>
>> v2: Add a forceful flush, to collect left over logs, on disabling
>> logging.
>>      Useful for Validation.
>>
>> v3: Besides minor cleanup, implement read method for the debugfs file and
>>      set the guc_log_level to -1 when logging is disabled. (Tvrtko)
>>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c        | 44 ++++++++++++++++++++-
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 63
>> ++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/intel_guc.h           |  1 +
>>   3 files changed, 107 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 14e0dcf..f472fbcd3 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file
>> *m, void *data)
>>       return 0;
>>   }
>>
>> +static int i915_guc_log_control_get(void *data, u64 *val)
>> +{
>> +    struct drm_device *dev = data;
>> +    struct drm_i915_private *dev_priv = to_i915(dev);
>> +
>> +    if (!dev_priv->guc.log.obj)
>> +        return -EINVAL;
>> +
>> +    *val = i915.guc_log_level;
>> +
>> +    return 0;
>> +}
>> +
>> +static int i915_guc_log_control_set(void *data, u64 val)
>> +{
>> +    struct drm_device *dev = data;
>> +    struct drm_i915_private *dev_priv = to_i915(dev);
>> +    int ret;
>> +
>> +    ret = mutex_lock_interruptible(&dev->struct_mutex);
>> +    if (ret)
>> +        return ret;
>> +
>> +    if (!dev_priv->guc.log.obj) {
>> +        ret = -EINVAL;
>> +        goto end;
>> +    }
>> +
>> +    intel_runtime_pm_get(dev_priv);
>> +    ret = i915_guc_log_control(dev_priv, val);
>> +    intel_runtime_pm_put(dev_priv);
>> +
>> +end:
>> +    mutex_unlock(&dev->struct_mutex);
>> +    return ret;
>> +}
>> +
>> +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
>> +            i915_guc_log_control_get, i915_guc_log_control_set,
>> +            "%lld\n");
>> +
>>   static int i915_edp_psr_status(struct seq_file *m, void *data)
>>   {
>>       struct drm_info_node *node = m->private;
>> @@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files {
>>       {"i915_fbc_false_color", &i915_fbc_fc_fops},
>>       {"i915_dp_test_data", &i915_displayport_test_data_fops},
>>       {"i915_dp_test_type", &i915_displayport_test_type_fops},
>> -    {"i915_dp_test_active", &i915_displayport_test_active_fops}
>> +    {"i915_dp_test_active", &i915_displayport_test_active_fops},
>> +    {"i915_guc_log_control", &i915_guc_log_control_fops}
>>   };
>>
>>   void intel_display_crc_init(struct drm_device *dev)
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 4a75c16..041cf68 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct
>> intel_guc *guc)
>>       return host2guc_action(guc, data, 2);
>>   }
>>
>> +static int host2guc_logging_control(struct intel_guc *guc, u32
>> control_val)
>> +{
>> +    u32 data[2];
>> +
>> +    data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
>> +    data[1] = control_val;
>> +
>> +    return host2guc_action(guc, data, 2);
>> +}
>> +
>>   /*
>>    * Initialise, update, or clear doorbell data shared with the GuC
>>    *
>> @@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private
>> *dev_priv)
>>       guc_log_late_setup(&dev_priv->guc);
>>       mutex_unlock(&dev_priv->drm.struct_mutex);
>>   }
>> +
>> +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
>> control_val)
>> +{
>> +    union guc_log_control log_param;
>> +    int ret;
>> +
>> +    log_param.logging_enabled = control_val & 0x1;
>> +    log_param.verbosity = (control_val >> 4) & 0xF;
>
> Maybe "log_param.value = control_val" would also work since
> guc_log_control is conveniently defined as an union. Doesn't matter though.
>
>> +
>> +    if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
>> +        log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
>> +        return -EINVAL;
>> +
>> +    /* This combination doesn't make sense & won't have any effect */
>> +    if (!log_param.logging_enabled && (i915.guc_log_level < 0))
>> +        return 0;
>
> I wonder if it would work and maybe look nicer to generalize as:
>
>     int guc_log_level;
>
>     guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1;
>     if (i915.guc_log_level == guc_log_level)
>         return 0;

Fine, I will try to refactor the code as per your suggestions.
Thanks for the suggestions.
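
To make the suggested refactoring concrete, here is a minimal userspace sketch of the decode step. It assumes the encoding from the commit message (bit 0 enables logging, bits 4-7 carry the verbosity); the model_* names are hypothetical, and the MIN/MAX bounds are assumed values, since the real GUC_LOG_VERBOSITY_MIN/MAX definitions are not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bounds for illustration; the real limits are not quoted in the thread. */
#define MODEL_VERBOSITY_MIN 0
#define MODEL_VERBOSITY_MAX 3

/*
 * Decode the debugfs value into the module-parameter domain.
 * Returns the target log level (-1 means logging disabled),
 * or -2 when the verbosity field is out of range.
 */
static int model_target_level(uint64_t control_val)
{
	int enabled = control_val & 0x1;
	int verbosity = (control_val >> 4) & 0xF;

	if (verbosity < MODEL_VERBOSITY_MIN || verbosity > MODEL_VERBOSITY_MAX)
		return -2;
	return enabled ? verbosity : -1;
}

/* Returns 1 when the request changes nothing, so the host2guc action could be skipped. */
static int model_is_noop(int current_level, uint64_t control_val)
{
	assert(current_level >= -1);
	return model_target_level(control_val) == current_level;
}
```

With a single target level, the "disable while already disabled" case and the "same verbosity again" case both fall out of one comparison.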

>> +
>> +    ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
>> +    if (ret < 0) {
>> +        DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
>> +        return ret;
>> +    }
>> +
>> +    i915.guc_log_level = log_param.verbosity;
>
> This would then become i915.guc_log_level = guc_log_level.
>
>> +
>> +    /* If log_level was set as -1 at boot time, then the relay
>> channel file
>> +     * wouldn't have been created by now and interrupts also would
>> not have
>> +     * been enabled.
>> +     */
>> +    if (!dev_priv->guc.log.relay_chan) {
>> +        ret = guc_log_late_setup(&dev_priv->guc);
>> +        if (!ret)
>> +            gen9_enable_guc_interrupts(dev_priv);
>> +    } else if (!log_param.logging_enabled) {
>> +        /* Once logging is disabled, GuC won't generate logs & send an
>> +         * interrupt. But there could be some data in the log buffer
>> +         * which is yet to be captured. So request GuC to update the log
>> +         * buffer state and then collect the left over logs.
>> +         */
>> +        i915_guc_flush_logs(dev_priv);
>> +
>> +        /* GuC would have updated the log buffer by now, so capture
>> it */
>> +        i915_guc_capture_logs(dev_priv);
>> +
>> +        /* As logging is disabled, update the log level to reflect
>> that */
>> +        i915.guc_log_level = -1;
>> +    } else {
>> +        /* In case interrupts were disabled, enable them now */
>> +        gen9_enable_guc_interrupts(dev_priv);
>> +    }
>
> And this block would need some adjustments with my guc_log_level idea.
>
> Well not sure, see what you think. I am just attracted to the idea of
> operating in the same value domain as much as possible for readability
> and simplicity. Maybe it would not improve anything, I did not bother
> with typing it all to check.
>
>> +
>> +    return ret;
>> +}
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>> b/drivers/gpu/drm/i915/intel_guc.h
>> index d3a5447..2f8c408 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -186,5 +186,6 @@ void i915_guc_capture_logs(struct drm_i915_private
>> *dev_priv);
>>   void i915_guc_flush_logs(struct drm_i915_private *dev_priv);
>>   void i915_guc_register(struct drm_i915_private *dev_priv);
>>   void i915_guc_unregister(struct drm_i915_private *dev_priv);
>> +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
>> control_val);
>>
>>   #endif
>>
>
> Patch looks correct as is, so:
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Although I would be happier though if my suggestion to use the same
> value domain as for the module parameter was used. In other words:
>
>     {"i915_guc_log_level", &i915_guc_log_control_fops}
>
> ...
>
> int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
> control_val)
> ...
>     int guc_log_level = (int)control_val;
> ...
>     log_param.logging_enabled = guc_log_level > -1;
>     log_param.verbosity = guc_log_level > -1 ? guc_log_level : 0;
> ...
>
> It think it would be simpler for the user and developer to only have to
> think about one set of values when dealing with guc logging.
>
Really nice suggestion, but as you mentioned below, this log control
interface is most likely to be extended in the near future.
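
For reference, the mapping in the other direction (from a module-parameter style level to the value passed to the firmware) could look roughly like this in userspace. It is a sketch only: model_pack_level is a hypothetical name, and the bit layout (bit 0 for enable, bits 4-7 for verbosity) is taken from the commit message rather than from the guc_log_control definition, which is not quoted here.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a log level (-1 = disabled, 0..15 = verbosity) into the control value. */
static uint32_t model_pack_level(int guc_log_level)
{
	uint32_t enabled = guc_log_level > -1;
	uint32_t verbosity = guc_log_level > -1 ? (uint32_t)guc_log_level : 0;

	/* Only 4 bits are available for the verbosity field. */
	assert(verbosity <= 0xF);
	return enabled | (verbosity << 4);
}
```

A level of -1 packs to 0, and level 3 packs to 0x31, so the user only ever deals with the same values as the module parameter.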

Best regards
Akash

> But maybe extensions to guc_log_control are imminent and expected so it
> would not be worth it in the long run. No idea.
>

> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
  2016-08-12 16:31     ` Goel, Akash
@ 2016-08-15  9:20       ` Tvrtko Ursulin
  2016-08-15 10:20         ` Goel, Akash
  0 siblings, 1 reply; 68+ messages in thread
From: Tvrtko Ursulin @ 2016-08-15  9:20 UTC (permalink / raw)
  To: Goel, Akash, intel-gfx


On 12/08/16 17:31, Goel, Akash wrote:
> On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:
>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>> From: Akash Goel <akash.goel@intel.com>
>>>
>>> As per the current i915 Driver load sequence, debugfs registration is
>>> done
>>> at the end and so the relay channel debugfs file is also created after
>>> that
>>> but the GuC firmware is loaded much earlier in the sequence.
>>> As a result Driver could miss capturing the boot-time logs of GuC
>>> firmware
>>> if there are flush interrupts from the GuC side.
>>> Relay has a provision to support early logging where initially only
>>> relay
>>> channel can be created, to have buffers for storing logs, and later on
>>> channel can be associated with a debugfs file at appropriate time.
>>> Have availed that, which allows Driver to capture boot time logs also,
>>> which can be collected once Userspace comes up.
>>>
>>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 61
>>> +++++++++++++++++++++---------
>>>   1 file changed, 44 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index af48f62..1c287d7 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
>>> intel_guc *guc)
>>>       relay_close(guc->log.relay_chan);
>>>   }
>>>
>>> -static int guc_create_log_relay_file(struct intel_guc *guc)
>>> +static int guc_create_relay_channel(struct intel_guc *guc)
>>>   {
>>>       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>>       struct rchan *guc_log_relay_chan;
>>> -    struct dentry *log_dir;
>>>       size_t n_subbufs, subbuf_size;
>>>
>>> -    /* For now create the log file in /sys/kernel/debug/dri/0 dir */
>>> -    log_dir = dev_priv->drm.primary->debugfs_root;
>>> -
>>> -    /* If /sys/kernel/debug/dri/0 location do not exist, then
>>> debugfs is
>>> -     * not mounted and so can't create the relay file.
>>> -     * The relay API seems to fit well with debugfs only.
>>
>> It only needs a dentry, I don't see that it has to be a debugfs one.
>>
> Besides dentry, there are other requirements for using relay, which can
> be met only for a debugfs file.
> debugfs wasn't the preferred choice to place the log file, but had no
> other option, as relay API is compatible with debugfs only.

What are those and should they be mentioned in the comment above?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
  2016-08-15  9:20       ` Tvrtko Ursulin
@ 2016-08-15 10:20         ` Goel, Akash
  0 siblings, 0 replies; 68+ messages in thread
From: Goel, Akash @ 2016-08-15 10:20 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: akash.goel



On 8/15/2016 2:50 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 17:31, Goel, Akash wrote:
>> On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:
>>> On 12/08/16 07:25, akash.goel@intel.com wrote:
>>>> From: Akash Goel <akash.goel@intel.com>
>>>>
>>>> As per the current i915 Driver load sequence, debugfs registration is
>>>> done
>>>> at the end and so the relay channel debugfs file is also created after
>>>> that
>>>> but the GuC firmware is loaded much earlier in the sequence.
>>>> As a result Driver could miss capturing the boot-time logs of GuC
>>>> firmware
>>>> if there are flush interrupts from the GuC side.
>>>> Relay has a provision to support early logging where initially only
>>>> relay
>>>> channel can be created, to have buffers for storing logs, and later on
>>>> channel can be associated with a debugfs file at appropriate time.
>>>> Have availed that, which allows Driver to capture boot time logs also,
>>>> which can be collected once Userspace comes up.
>>>>
>>>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 61
>>>> +++++++++++++++++++++---------
>>>>   1 file changed, 44 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> index af48f62..1c287d7 100644
>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
>>>> intel_guc *guc)
>>>>       relay_close(guc->log.relay_chan);
>>>>   }
>>>>
>>>> -static int guc_create_log_relay_file(struct intel_guc *guc)
>>>> +static int guc_create_relay_channel(struct intel_guc *guc)
>>>>   {
>>>>       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>>>       struct rchan *guc_log_relay_chan;
>>>> -    struct dentry *log_dir;
>>>>       size_t n_subbufs, subbuf_size;
>>>>
>>>> -    /* For now create the log file in /sys/kernel/debug/dri/0 dir */
>>>> -    log_dir = dev_priv->drm.primary->debugfs_root;
>>>> -
>>>> -    /* If /sys/kernel/debug/dri/0 location do not exist, then
>>>> debugfs is
>>>> -     * not mounted and so can't create the relay file.
>>>> -     * The relay API seems to fit well with debugfs only.
>>>
>>> It only needs a dentry, I don't see that it has to be a debugfs one.
>>>
>> Besides dentry, there are other requirements for using relay, which can
>> be met only for a debugfs file.
>> debugfs wasn't the preferred choice to place the log file, but had no
>> other option, as relay API is compatible with debugfs only.
>
> What are those and
To use relay there are 3 requirements:
a) The 'dentry' pointer associated with the file is needed when opening
    the relay channel.
b) The file must be able to use the 'relay_file_operations' fops.
c) The 'i_private' field of the file's inode must be set to the pointer
    to the relay channel buffer.

All 3 requirements can be met for a debugfs file in a straightforward
manner. But not all of them can be met for a file created inside sysfs,
or for a file created inside /dev as a character device file.

> should they be mentioned in the comment above?

Or should I mention them in the cover letter or the commit message?

Best regards
Akash

> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2016-08-15 10:20 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-12  6:25 [PATCH v5 00/20] Support for sustained capturing of GuC firmware logs akash.goel
2016-08-12  6:25 ` [PATCH 01/20] drm/i915: Decouple GuC log setup from verbosity parameter akash.goel
2016-08-12  6:25 ` [PATCH 02/20] drm/i915: Add GuC ukernel logging related fields to fw interface file akash.goel
2016-08-12  6:25 ` [PATCH 03/20] drm/i915: New structure to contain GuC logging related fields akash.goel
2016-08-12  6:25 ` [PATCH 04/20] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set akash.goel
2016-08-12 11:15   ` Tvrtko Ursulin
2016-08-12  6:25 ` [PATCH 05/20] drm/i915: Support for GuC interrupts akash.goel
2016-08-12 11:54   ` Tvrtko Ursulin
2016-08-12 13:10     ` Goel, Akash
2016-08-12 13:31       ` Tvrtko Ursulin
2016-08-12 14:31         ` Goel, Akash
2016-08-12 15:05           ` Tvrtko Ursulin
2016-08-12 15:32             ` Goel, Akash
2016-08-12  6:25 ` [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC akash.goel
2016-08-12  6:28   ` Chris Wilson
2016-08-12  6:44     ` Goel, Akash
2016-08-12  6:51       ` Chris Wilson
2016-08-12  7:00         ` Goel, Akash
2016-08-12 13:17   ` Tvrtko Ursulin
2016-08-12 13:45     ` Goel, Akash
2016-08-12 14:07       ` Tvrtko Ursulin
2016-08-12 16:17         ` Goel, Akash
2016-08-12  6:25 ` [PATCH 07/20] relay: Use per CPU constructs for the relay channel buffer pointers akash.goel
2016-08-12  6:25 ` [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs akash.goel
2016-08-12 13:53   ` Tvrtko Ursulin
2016-08-12 16:10     ` Goel, Akash
2016-08-12  6:25 ` [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions akash.goel
2016-08-12 13:55   ` Tvrtko Ursulin
2016-08-12 15:01     ` Goel, Akash
2016-08-12  6:25 ` [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts akash.goel
2016-08-12 14:26   ` Tvrtko Ursulin
2016-08-12 14:56     ` Goel, Akash
2016-08-12  6:25 ` [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer akash.goel
2016-08-12 14:42   ` Tvrtko Ursulin
2016-08-12 14:48     ` Goel, Akash
2016-08-12 15:06       ` Tvrtko Ursulin
2016-08-12  6:25 ` [PATCH 12/20] drm/i915: Increase GuC log buffer size to reduce flush interrupts akash.goel
2016-08-12  6:25 ` [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer akash.goel
2016-08-12 15:20   ` Tvrtko Ursulin
2016-08-12 15:32     ` Chris Wilson
2016-08-12 15:46       ` Goel, Akash
2016-08-12 15:52         ` Chris Wilson
2016-08-12 16:04           ` Goel, Akash
2016-08-12 16:09             ` Chris Wilson
2016-08-12  6:25 ` [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset akash.goel
2016-08-12  6:33   ` Chris Wilson
2016-08-12  7:02     ` Goel, Akash
2016-08-12  6:25 ` [PATCH 15/20] drm/i915: Debugfs support for GuC logging control akash.goel
2016-08-12 15:57   ` Tvrtko Ursulin
2016-08-12 17:08     ` Goel, Akash
2016-08-12  6:25 ` [PATCH 16/20] drm/i915: Support to create write combined type vmaps akash.goel
2016-08-12 10:49   ` Tvrtko Ursulin
2016-08-12 15:13     ` Goel, Akash
2016-08-12 15:16       ` Chris Wilson
2016-08-12 16:46         ` Goel, Akash
2016-08-12  6:25 ` [PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer akash.goel
2016-08-12 16:05   ` Tvrtko Ursulin
2016-08-12  6:25 ` [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory akash.goel
2016-08-12 10:54   ` Tvrtko Ursulin
2016-08-12 12:22     ` Chris Wilson
2016-08-12  6:25 ` [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer akash.goel
2016-08-12 16:06   ` Tvrtko Ursulin
2016-08-12  6:25 ` [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs akash.goel
2016-08-12 16:22   ` Tvrtko Ursulin
2016-08-12 16:31     ` Goel, Akash
2016-08-15  9:20       ` Tvrtko Ursulin
2016-08-15 10:20         ` Goel, Akash
2016-08-12  6:58 ` ✗ Ro.CI.BAT: warning for Support for sustained capturing of GuC firmware logs (rev6) Patchwork
