* [PATCH 00/13 v4] Batch submission via GuC
@ 2015-07-09 18:29 Dave Gordon
  2015-07-09 18:29 ` [PATCH 01/13 v4] drm/i915: Add i915_gem_object_create_from_data() Dave Gordon
                   ` (13 more replies)
  0 siblings, 14 replies; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

This patch series enables command submission via the GuC. In this mode,
instead of the host CPU driving the execlist port directly, it hands
over work items to the GuC, using a doorbell mechanism to tell the GuC
that new items have been added to its work queue. The GuC then dispatches
contexts to the various GPU engines and manages the resulting
context-switch interrupts. Completion of a batch is, however, still
signalled to the CPU; the GuC is not involved in handling user interrupts.
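
As a rough illustration of the handover described above, here is a small
user-space model of a work queue with a doorbell. The item layout follows
struct guc_wq_item from patch 03; everything else (struct wq_model,
wq_submit(), wq_consume(), WQ_SLOTS) is hypothetical and exists only for
this sketch. It is not how the driver or the firmware implements the queue.

```c
#include <stdint.h>

/* Work item layout as defined in intel_guc_fwif.h (patch 03 of this series). */
struct guc_wq_item {
	uint32_t header;
	uint32_t context_desc;
	uint32_t ring_tail;
	uint32_t fence_id;
};

#define WQ_SLOTS 8	/* hypothetical queue depth, for illustration only */

/* Hypothetical model of one client's work queue plus its doorbell. */
struct wq_model {
	struct guc_wq_item slots[WQ_SLOTS];
	uint32_t head;		/* advanced by the consumer (the GuC) */
	uint32_t tail;		/* advanced by the producer (the host) */
	uint32_t db_cookie;	/* bumped to signal "new work queued" */
};

/* Queue one item and ring the doorbell; returns 0, or -1 if the queue is full. */
int wq_submit(struct wq_model *wq, const struct guc_wq_item *item)
{
	if (wq->tail - wq->head >= WQ_SLOTS)
		return -1;

	wq->slots[wq->tail % WQ_SLOTS] = *item;
	wq->tail++;
	wq->db_cookie++;	/* stands in for the real doorbell write */
	return 0;
}

/* Consumer side: take the oldest item; returns 1 if one was taken, else 0. */
int wq_consume(struct wq_model *wq, struct guc_wq_item *out)
{
	if (wq->head == wq->tail)
		return 0;

	*out = wq->slots[wq->head % WQ_SLOTS];
	wq->head++;
	return 1;
}
```

The point of the model is only that the host writes the item first and
signals second, so the consumer never sees a doorbell for an item that is
not yet in the queue.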

There are two subsequences within the patch series:

  drm/i915: Add i915_gem_object_create_from_data()
  drm/i915: Add GuC-related module parameters
  drm/i915: Add GuC-related header files
  drm/i915: GuC-specific firmware loader
  drm/i915: Debugfs interface to read GuC load status

These five patches make up the GuC loader and its prerequisites.  At this
point in the sequence we can load and activate the GuC firmware, but not
submit any batches through it. (This is nonetheless a potentially useful
state, as the GuC could do other useful work even when not handling batch
submissions.)

  drm/i915: Expose two LRC functions for GuC submission mode
  drm/i915: GuC submission setup, phase 1
  drm/i915: Enable GuC firmware log
  drm/i915: Implementation of GuC client
  drm/i915: Interrupt routing for GuC submission
  drm/i915: Integrate GuC-based command submission
  drm/i915: Debugfs interface for GuC submission statistics
  drm/i915: Enable GuC submission, where supported

In this second section, we implement the GuC submission mechanism, link
it into the (execlist-based) submission path, and finally enable it
(on supported platforms). On platforms where there is no GuC, or if
GuC submission is explicitly disabled, batch submission will revert to
using the execlist mechanism directly.

On the other hand, if the GuC firmware cannot be found or is invalid,
the GPU will be unusable.
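
The resulting choice of submission path can be sketched as a small decision
function. This is a simplified model that follows the i915_gem_init_hw()
hunk in patch 04, where a firmware load error is mapped to -EIO only when
enable_guc_submission is set. The enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names for the three possible outcomes. */
enum submit_mode {
	SUBMIT_EXECLIST,	/* host drives the execlist port directly */
	SUBMIT_GUC,		/* work items are handed to the GuC */
	SUBMIT_WEDGED,		/* GPU declared unusable */
};

/*
 * has_guc:               platform has a GuC at all
 * enable_guc_submission: value of the module parameter
 * fw_load_err:           nonzero if the firmware could not be loaded
 */
enum submit_mode pick_submit_mode(bool has_guc, bool enable_guc_submission,
				  int fw_load_err)
{
	/* No GuC, or GuC submission not requested: use execlists. */
	if (!has_guc || !enable_guc_submission)
		return SUBMIT_EXECLIST;

	/* GuC submission was requested but the firmware is unavailable. */
	if (fw_load_err)
		return SUBMIT_WEDGED;

	return SUBMIT_GUC;
}
```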

The GuC firmware itself is not included in this patchset; it is, or will
be, available for download from https://01.org/linuxgraphics/downloads/.
This driver requires GuC firmware revision 3.x. It will not work with any
firmware version 1.x, as the GuC protocol in those revisions was
incompatible and is no longer supported.
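
A version gate of this kind might look like the sketch below. The
wanted/found pairing mirrors the guc_fw_major_wanted, guc_fw_minor_wanted,
guc_fw_major_found and guc_fw_minor_found fields of struct intel_guc_fw in
patch 04. The comparison policy (exact major, minor at least as new as
wanted) and the function name are assumptions made for illustration; the
checker in the loader itself may differ.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical compatibility check.
 * Returns 0 if the found version is acceptable, -EINVAL otherwise.
 */
int guc_fw_version_check(uint16_t major_wanted, uint16_t minor_wanted,
			 uint16_t major_found, uint16_t minor_found)
{
	/* A different major revision means an incompatible protocol. */
	if (major_found != major_wanted)
		return -EINVAL;

	/* Assumption: a newer minor revision stays compatible. */
	if (minor_found < minor_wanted)
		return -EINVAL;

	return 0;
}
```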

Ben Widawsky (0):
Vinit Azad (0):
Michael H. Nguyen (0):
  created the original versions on which some of these patches are based.

Alex Dai (6):
  drm/i915: Add GuC-related module parameters
  drm/i915: GuC-specific firmware loader
  drm/i915: Debugfs interface to read GuC load status
  drm/i915: GuC submission setup, phase 1
  drm/i915: Enable GuC firmware log
  drm/i915: Integrate GuC-based command submission

Dave Gordon (7):
  drm/i915: Add i915_gem_object_create_from_data()
  drm/i915: Add GuC-related header files
  drm/i915: Expose two LRC functions for GuC submission mode
  drm/i915: Implementation of GuC client
  drm/i915: Interrupt routing for GuC submission
  drm/i915: Debugfs interface for GuC submission statistics
  drm/i915: Enable GuC submission, where supported

 Documentation/DocBook/drm.tmpl             |  14 +
 drivers/gpu/drm/i915/Makefile              |   4 +
 drivers/gpu/drm/i915/i915_debugfs.c        | 110 +++-
 drivers/gpu/drm/i915/i915_dma.c            |   4 +
 drivers/gpu/drm/i915/i915_drv.h            |  15 +
 drivers/gpu/drm/i915/i915_gem.c            |  53 ++
 drivers/gpu/drm/i915/i915_guc_reg.h        | 102 ++++
 drivers/gpu/drm/i915/i915_guc_submission.c | 853 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_params.c         |   9 +
 drivers/gpu/drm/i915/i915_reg.h            |  15 +-
 drivers/gpu/drm/i915/intel_guc.h           | 118 ++++
 drivers/gpu/drm/i915/intel_guc_fwif.h      | 245 +++++++++
 drivers/gpu/drm/i915/intel_guc_loader.c    | 618 +++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c           |  72 ++-
 drivers/gpu/drm/i915/intel_lrc.h           |   9 +
 15 files changed, 2211 insertions(+), 30 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_guc_reg.h
 create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
 create mode 100644 drivers/gpu/drm/i915/intel_guc.h
 create mode 100644 drivers/gpu/drm/i915/intel_guc_fwif.h
 create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/13 v4] drm/i915: Add i915_gem_object_create_from_data()
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-18  0:36   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 02/13 v4] drm/i915: Add GuC-related module parameters Dave Gordon
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

i915_gem_object_create_from_data() is a generic function to save data
from a plain linear buffer in a new pageable gem object that can later
be accessed by the CPU and/or GPU.

We will need this for the microcontroller firmware loading support code.
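
For readers unfamiliar with the pattern, a user-space analogue is sketched
below: round the size up to a whole number of pages, allocate zeroed
backing store, and copy the caller's data in. The names (struct
data_object, object_create_from_data(), SKETCH_PAGE_SIZE) are hypothetical,
and malloc/memcpy stand in for the GEM allocation and sg_copy_from_buffer()
used in the actual patch.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for a GEM object backed by whole pages. */
struct data_object {
	void *pages;	/* zero-padded backing store */
	size_t size;	/* allocation size, a multiple of the page size */
};

size_t sketch_round_up(size_t n, size_t align)
{
	return (n + align - 1) / align * align;
}

/* Allocate a new object and fill it with the supplied data. */
struct data_object *object_create_from_data(const void *data, size_t size)
{
	struct data_object *obj;

	if (!data || size == 0)
		return NULL;

	obj = malloc(sizeof(*obj));
	if (!obj)
		return NULL;

	obj->size = sketch_round_up(size, SKETCH_PAGE_SIZE);
	obj->pages = calloc(1, obj->size);
	if (!obj->pages) {
		free(obj);
		return NULL;
	}

	memcpy(obj->pages, data, size);
	return obj;
}

void object_free(struct data_object *obj)
{
	if (obj) {
		free(obj->pages);
		free(obj);
	}
}
```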

Derived from i915_gem_object_write(), originally by Alex Dai.

v2:
    Change of function: now allocates & fills a new object, rather than
        writing to an existing object
    New name courtesy of Chris Wilson
    Explicit domain-setting and other improvements per review comments
        by Chris Wilson & Daniel Vetter

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_gem.c | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 464b28d..3c91507 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2755,6 +2755,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			 const struct drm_i915_gem_object_ops *ops);
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
+struct drm_i915_gem_object *i915_gem_object_create_from_data(
+		struct drm_device *dev, const void *data, size_t size);
 void i915_init_vm(struct drm_i915_private *dev_priv,
 		  struct i915_address_space *vm);
 void i915_gem_free_object(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a0bff41..dbbb649 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5478,3 +5478,43 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj)
 
 	return false;
 }
+
+/* Allocate a new GEM object and fill it with the supplied data */
+struct drm_i915_gem_object *
+i915_gem_object_create_from_data(struct drm_device *dev,
+			         const void *data, size_t size)
+{
+	struct drm_i915_gem_object *obj;
+	struct sg_table *sg;
+	size_t bytes;
+	int ret;
+
+	obj = i915_gem_alloc_object(dev, round_up(size, PAGE_SIZE));
+	if (IS_ERR_OR_NULL(obj))
+		return obj;
+
+	ret = i915_gem_object_set_to_cpu_domain(obj, true);
+	if (ret)
+		goto fail;
+
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		goto fail;
+
+	i915_gem_object_pin_pages(obj);
+	sg = obj->pages;
+	bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
+	i915_gem_object_unpin_pages(obj);
+
+	if (WARN_ON(bytes != size)) {
+		DRM_ERROR("Incomplete copy, wrote %zu of %zu", bytes, size);
+		ret = -EFAULT;
+		goto fail;
+	}
+
+	return obj;
+
+fail:
+	drm_gem_object_unreference(&obj->base);
+	return ERR_PTR(ret);
+}
-- 
1.9.1


* [PATCH 02/13 v4] drm/i915: Add GuC-related module parameters
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
  2015-07-09 18:29 ` [PATCH 01/13 v4] drm/i915: Add i915_gem_object_create_from_data() Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-18  0:37   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 03/13 v4] drm/i915: Add GuC-related header files Dave Gordon
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

Two new module parameters: "enable_guc_submission" which will turn
on submission of batchbuffers via the GuC (when implemented), and
"guc_log_level" which controls the level of debugging logged by the
GuC and captured by the host.
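
One way to interpret "guc_log_level" is sketched below: negative values
disable logging, and 0-3 select a verbosity. The range limits reuse
GUC_LOG_VERBOSITY_MIN and GUC_LOG_VERBOSITY_MAX from patch 03. The function
name and the decision to clamp out-of-range values (rather than reject
them) are assumptions for illustration only.

```c
#include <stdbool.h>

/* Verbosity range-check limits, as defined in intel_guc_fwif.h (patch 03). */
#define GUC_LOG_VERBOSITY_MIN	0
#define GUC_LOG_VERBOSITY_MAX	3

/*
 * Hypothetical helper: translate the module parameter into a verbosity.
 * Returns false if logging is disabled; otherwise stores the verbosity
 * (0-3) in *verbosity and returns true.
 */
bool guc_log_level_to_verbosity(int guc_log_level, unsigned int *verbosity)
{
	if (guc_log_level < GUC_LOG_VERBOSITY_MIN)
		return false;	/* -1 (the default) means disabled */

	/* Assumption: values above the maximum are clamped, not rejected. */
	if (guc_log_level > GUC_LOG_VERBOSITY_MAX)
		guc_log_level = GUC_LOG_VERBOSITY_MAX;

	*verbosity = (unsigned int)guc_log_level;
	return true;
}
```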

Signed-off-by: Alex Dai <yu.dai@intel.com>

v4:
    Mark "enable_guc_submission" unsafe [Daniel Vetter]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h    | 2 ++
 drivers/gpu/drm/i915/i915_params.c | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3c91507..4a512da 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2606,6 +2606,8 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	bool enable_guc_submission;
+	int guc_log_level;
 	int use_mmio_flip;
 	int mmio_debug;
 	bool verbose_state_checks;
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 7983fe4..2791b5a 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -53,6 +53,8 @@ struct i915_params i915 __read_mostly = {
 	.verbose_state_checks = 1,
 	.nuclear_pageflip = 0,
 	.edp_vswing = 0,
+	.enable_guc_submission = false,
+	.guc_log_level = -1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -186,3 +188,10 @@ MODULE_PARM_DESC(edp_vswing,
 		 "Ignore/Override vswing pre-emph table selection from VBT "
 		 "(0=use value from vbt [default], 1=low power swing(200mV),"
 		 "2=default swing(400mV))");
+
+module_param_named_unsafe(enable_guc_submission, i915.enable_guc_submission, bool, 0400);
+MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission (default:false)");
+
+module_param_named(guc_log_level, i915.guc_log_level, int, 0400);
+MODULE_PARM_DESC(guc_log_level,
+	"GuC firmware logging level (-1:disabled (default), 0-3:enabled)");
-- 
1.9.1


* [PATCH 03/13 v4] drm/i915: Add GuC-related header files
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
  2015-07-09 18:29 ` [PATCH 01/13 v4] drm/i915: Add i915_gem_object_create_from_data() Dave Gordon
  2015-07-09 18:29 ` [PATCH 02/13 v4] drm/i915: Add GuC-related module parameters Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-18  0:38   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader Dave Gordon
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

intel_guc_fwif.h contains the subset of the GuC interface that we
will need for submission of commands through the GuC. These MUST
be kept in sync with the definitions used by the GuC firmware, and
updates to this file will (or should) be autogenerated from the
source files used to build the firmware. Editing this file is
therefore not recommended.
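
As an example of how these definitions are meant to be used, the sketch
below classifies a value read back from scratch register 0. Per the comment
in intel_guc_fwif.h, the GuC overwrites the command with a response in
which all the MASK bits are set. The macros are copied from the header
(with u32 spelled as uint32_t so the fragment builds in user space); the
function guc_classify_scratch0() and its return convention are
hypothetical.

```c
#include <stdint.h>

/* Copied from intel_guc_fwif.h in this patch. */
#define GUC2HOST_RESPONSE_MASK		((uint32_t)0xF0000000)
#define GUC2HOST_IS_RESPONSE(x)		((uint32_t)(x) >= GUC2HOST_RESPONSE_MASK)
#define GUC2HOST_STATUS(x)		(GUC2HOST_RESPONSE_MASK | (x))
#define GUC2HOST_STATUS_SUCCESS		GUC2HOST_STATUS(0x0)

/*
 * Hypothetical helper:
 *   0  the register still holds the host's command (no response yet)
 *   1  the GuC reported success
 *  -1  the GuC reported a failure status
 */
int guc_classify_scratch0(uint32_t ss0)
{
	if (!GUC2HOST_IS_RESPONSE(ss0))
		return 0;

	return ss0 == GUC2HOST_STATUS_SUCCESS ? 1 : -1;
}
```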

i915_guc_reg.h contains definitions of GuC-related hardware:
registers, bitmasks, etc. These should match the BSpec.
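
For illustration, the GUC_STATUS fields can be pulled apart with the shift
and mask pairs as follows. The macros are copied from i915_guc_reg.h; the
struct and function names are hypothetical, and treating GS_UKERNEL_READY
as the "firmware is up" condition is an assumption of this sketch rather
than something this patch states.

```c
#include <stdint.h>

/* Copied from i915_guc_reg.h in this patch. */
#define GS_BOOTROM_SHIFT	1
#define GS_BOOTROM_MASK		(0x7F << GS_BOOTROM_SHIFT)
#define GS_UKERNEL_SHIFT	8
#define GS_UKERNEL_MASK		(0xFF << GS_UKERNEL_SHIFT)
#define GS_UKERNEL_READY	(0xF0 << GS_UKERNEL_SHIFT)
#define GS_MIA_SHIFT		16
#define GS_MIA_MASK		(0x07 << GS_MIA_SHIFT)

/* Hypothetical container for the decoded fields. */
struct guc_status_fields {
	uint32_t bootrom;
	uint32_t ukernel;
	uint32_t mia;
};

/* Split a raw GUC_STATUS value into its three fields. */
struct guc_status_fields guc_status_decode(uint32_t status)
{
	struct guc_status_fields f;

	f.bootrom = (status & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT;
	f.ukernel = (status & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT;
	f.mia = (status & GS_MIA_MASK) >> GS_MIA_SHIFT;
	return f;
}

/* Assumption: the uKernel field reading READY means the firmware is up. */
int guc_status_is_ready(uint32_t status)
{
	return (status & GS_UKERNEL_MASK) == GS_UKERNEL_READY;
}
```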

v2:
    Files renamed & resliced per review comments by Chris Wilson

v4:
    Added DON'T-EDIT-ME warning [Tom O'Rourke]

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_reg.h   | 102 ++++++++++++++
 drivers/gpu/drm/i915/intel_guc_fwif.h | 245 ++++++++++++++++++++++++++++++++++
 2 files changed, 347 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_guc_reg.h
 create mode 100644 drivers/gpu/drm/i915/intel_guc_fwif.h

diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
new file mode 100644
index 0000000..ccdc6c8
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -0,0 +1,102 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+#ifndef _I915_GUC_REG_H_
+#define _I915_GUC_REG_H_
+
+/* Definitions of GuC H/W registers, bits, etc */
+
+#define GUC_STATUS			0xc000
+#define   GS_BOOTROM_SHIFT		1
+#define   GS_BOOTROM_MASK		  (0x7F << GS_BOOTROM_SHIFT)
+#define   GS_BOOTROM_RSA_FAILED		  (0x50 << GS_BOOTROM_SHIFT)
+#define   GS_UKERNEL_SHIFT		8
+#define   GS_UKERNEL_MASK		  (0xFF << GS_UKERNEL_SHIFT)
+#define   GS_UKERNEL_LAPIC_DONE		  (0x30 << GS_UKERNEL_SHIFT)
+#define   GS_UKERNEL_DPC_ERROR		  (0x60 << GS_UKERNEL_SHIFT)
+#define   GS_UKERNEL_READY		  (0xF0 << GS_UKERNEL_SHIFT)
+#define   GS_MIA_SHIFT			16
+#define   GS_MIA_MASK			  (0x07 << GS_MIA_SHIFT)
+
+#define GUC_WOPCM_SIZE			0xc050
+#define   GUC_WOPCM_SIZE_VALUE  	  (0x80 << 12)	/* 512KB */
+#define GUC_WOPCM_OFFSET		0x80000		/* 512KB */
+
+#define SOFT_SCRATCH(n)			(0xc180 + ((n) * 4))
+
+#define UOS_RSA_SCRATCH_0		0xc200
+#define DMA_ADDR_0_LOW			0xc300
+#define DMA_ADDR_0_HIGH			0xc304
+#define DMA_ADDR_1_LOW			0xc308
+#define DMA_ADDR_1_HIGH			0xc30c
+#define   DMA_ADDRESS_SPACE_WOPCM	  (7 << 16)
+#define   DMA_ADDRESS_SPACE_GTT		  (8 << 16)
+#define DMA_COPY_SIZE			0xc310
+#define DMA_CTRL			0xc314
+#define   UOS_MOVE			  (1<<4)
+#define   START_DMA			  (1<<0)
+#define DMA_GUC_WOPCM_OFFSET		0xc340
+
+#define GEN8_GT_PM_CONFIG		0x138140
+#define GEN9_GT_PM_CONFIG		0x13816c
+#define   GEN8_GT_DOORBELL_ENABLE	  (1<<0)
+
+#define GEN8_GTCR			0x4274
+#define   GEN8_GTCR_INVALIDATE		  (1<<0)
+
+#define GUC_ARAT_C6DIS			0xA178
+
+#define GUC_SHIM_CONTROL		0xc064
+#define   GUC_DISABLE_SRAM_INIT_TO_ZEROES	(1<<0)
+#define   GUC_ENABLE_READ_CACHE_LOGIC		(1<<1)
+#define   GUC_ENABLE_MIA_CACHING		(1<<2)
+#define   GUC_GEN10_MSGCH_ENABLE		(1<<4)
+#define   GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	(1<<9)
+#define   GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA	(1<<10)
+#define   GUC_ENABLE_MIA_CLOCK_GATING		(1<<15)
+#define   GUC_GEN10_SHIM_WC_ENABLE		(1<<21)
+
+#define GUC_SHIM_CONTROL_VALUE	(GUC_DISABLE_SRAM_INIT_TO_ZEROES	| \
+				 GUC_ENABLE_READ_CACHE_LOGIC		| \
+				 GUC_ENABLE_MIA_CACHING			| \
+				 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	| \
+				 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA)
+
+#define HOST2GUC_INTERRUPT		0xc4c8
+#define   HOST2GUC_TRIGGER		  (1<<0)
+
+#define DRBMISC1			0x1984
+#define   DOORBELL_ENABLE		  (1<<0)
+
+#define GEN8_DRBREGL(x)			(0x1000 + (x) * 8)
+#define   GEN8_DRB_VALID		  (1<<0)
+#define GEN8_DRBREGU(x)			(GEN8_DRBREGL(x) + 4)
+
+#define DE_GUCRMR			0x44054
+
+#define GUC_BCS_RCS_IER			0xC550
+#define GUC_VCS2_VCS1_IER		0xC554
+#define GUC_WD_VECS_IER			0xC558
+#define GUC_PM_P24C_IER			0xC55C
+
+#endif
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
new file mode 100644
index 0000000..18d7f20
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -0,0 +1,245 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+#ifndef _INTEL_GUC_FWIF_H
+#define _INTEL_GUC_FWIF_H
+
+/*
+ * This file is partially autogenerated, although currently with some manual
+ * fixups afterwards. In future, it should be entirely autogenerated, in order
+ * to ensure that the definitions herein remain in sync with those used by the
+ * GuC's own firmware.
+ *
+ * EDITING THIS FILE IS THEREFORE NOT RECOMMENDED - YOUR CHANGES MAY BE LOST.
+ */
+
+#define GFXCORE_FAMILY_GEN8		11
+#define GFXCORE_FAMILY_GEN9		12
+#define GFXCORE_FAMILY_FORCE_ULONG	0x7fffffff
+
+#define GUC_CTX_PRIORITY_CRITICAL	0
+#define GUC_CTX_PRIORITY_HIGH		1
+#define GUC_CTX_PRIORITY_NORMAL		2
+#define GUC_CTX_PRIORITY_LOW		3
+
+#define GUC_MAX_GPU_CONTEXTS		1024
+#define	GUC_INVALID_CTX_ID		(GUC_MAX_GPU_CONTEXTS + 1)
+
+/* Work queue item header definitions */
+#define WQ_STATUS_ACTIVE		1
+#define WQ_STATUS_SUSPENDED		2
+#define WQ_STATUS_CMD_ERROR		3
+#define WQ_STATUS_ENGINE_ID_NOT_USED	4
+#define WQ_STATUS_SUSPENDED_FROM_RESET	5
+#define WQ_TYPE_SHIFT			0
+#define   WQ_TYPE_BATCH_BUF		(0x1 << WQ_TYPE_SHIFT)
+#define   WQ_TYPE_PSEUDO		(0x2 << WQ_TYPE_SHIFT)
+#define   WQ_TYPE_INORDER		(0x3 << WQ_TYPE_SHIFT)
+#define WQ_TARGET_SHIFT			10
+#define WQ_LEN_SHIFT			16
+#define WQ_NO_WCFLUSH_WAIT		(1 << 27)
+#define WQ_PRESENT_WORKLOAD		(1 << 28)
+#define WQ_WORKLOAD_SHIFT		29
+#define   WQ_WORKLOAD_GENERAL		(0 << WQ_WORKLOAD_SHIFT)
+#define   WQ_WORKLOAD_GPGPU		(1 << WQ_WORKLOAD_SHIFT)
+#define   WQ_WORKLOAD_TOUCH		(2 << WQ_WORKLOAD_SHIFT)
+
+#define WQ_RING_TAIL_SHIFT		20
+#define WQ_RING_TAIL_MASK		(0x7FF << WQ_RING_TAIL_SHIFT)
+
+#define GUC_DOORBELL_ENABLED		1
+#define GUC_DOORBELL_DISABLED		0
+
+#define GUC_CTX_DESC_ATTR_ACTIVE	(1 << 0)
+#define GUC_CTX_DESC_ATTR_PENDING_DB	(1 << 1)
+#define GUC_CTX_DESC_ATTR_KERNEL	(1 << 2)
+#define GUC_CTX_DESC_ATTR_PREEMPT	(1 << 3)
+#define GUC_CTX_DESC_ATTR_RESET		(1 << 4)
+#define GUC_CTX_DESC_ATTR_WQLOCKED	(1 << 5)
+#define GUC_CTX_DESC_ATTR_PCH		(1 << 6)
+
+/* The guc control data is 10 DWORDs */
+#define GUC_CTL_CTXINFO			0
+#define   GUC_CTL_CTXNUM_IN16_SHIFT	0
+#define   GUC_CTL_BASE_ADDR_SHIFT	12
+#define GUC_CTL_ARAT_HIGH		1
+#define GUC_CTL_ARAT_LOW		2
+#define GUC_CTL_DEVICE_INFO		3
+#define   GUC_CTL_GTTYPE_SHIFT		0
+#define   GUC_CTL_COREFAMILY_SHIFT	7
+#define GUC_CTL_LOG_PARAMS		4
+#define   GUC_LOG_VALID			(1 << 0)
+#define   GUC_LOG_NOTIFY_ON_HALF_FULL	(1 << 1)
+#define   GUC_LOG_ALLOC_IN_MEGABYTE	(1 << 3)
+#define   GUC_LOG_CRASH_PAGES		1
+#define   GUC_LOG_CRASH_SHIFT		4
+#define   GUC_LOG_DPC_PAGES		3
+#define   GUC_LOG_DPC_SHIFT		6
+#define   GUC_LOG_ISR_PAGES		3
+#define   GUC_LOG_ISR_SHIFT		9
+#define   GUC_LOG_BUF_ADDR_SHIFT	12
+#define GUC_CTL_PAGE_FAULT_CONTROL	5
+#define GUC_CTL_WA			6
+#define   GUC_CTL_WA_UK_BY_DRIVER	(1 << 3)
+#define GUC_CTL_FEATURE			7
+#define   GUC_CTL_VCS2_ENABLED		(1 << 0)
+#define   GUC_CTL_KERNEL_SUBMISSIONS	(1 << 1)
+#define   GUC_CTL_FEATURE2		(1 << 2)
+#define   GUC_CTL_POWER_GATING		(1 << 3)
+#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 4)
+#define   GUC_CTL_PREEMPTION_LOG	(1 << 5)
+#define   GUC_CTL_ENABLE_SLPC		(1 << 7)
+#define GUC_CTL_DEBUG			8
+#define   GUC_LOG_VERBOSITY_SHIFT	0
+#define   GUC_LOG_VERBOSITY_LOW		(0 << GUC_LOG_VERBOSITY_SHIFT)
+#define   GUC_LOG_VERBOSITY_MED		(1 << GUC_LOG_VERBOSITY_SHIFT)
+#define   GUC_LOG_VERBOSITY_HIGH	(2 << GUC_LOG_VERBOSITY_SHIFT)
+#define   GUC_LOG_VERBOSITY_ULTRA	(3 << GUC_LOG_VERBOSITY_SHIFT)
+/* Verbosity range-check limits, without the shift */
+#define	  GUC_LOG_VERBOSITY_MIN		0
+#define	  GUC_LOG_VERBOSITY_MAX		3
+
+#define GUC_CTL_MAX_DWORDS		(GUC_CTL_DEBUG + 1)
+
+struct guc_doorbell_info {
+	u32 db_status;
+	u32 cookie;
+	u32 reserved[14];
+} __packed;
+
+union guc_doorbell_qw {
+	struct {
+		u32 db_status;
+		u32 cookie;
+	};
+	u64 value_qw;
+} __packed;
+
+#define GUC_MAX_DOORBELLS		256
+#define GUC_INVALID_DOORBELL_ID		(GUC_MAX_DOORBELLS)
+
+#define GUC_DB_SIZE			(PAGE_SIZE)
+#define GUC_WQ_SIZE			(PAGE_SIZE * 2)
+
+/* Work item for submitting workloads into work queue of GuC. */
+struct guc_wq_item {
+	u32 header;
+	u32 context_desc;
+	u32 ring_tail;
+	u32 fence_id;
+} __packed;
+
+struct guc_process_desc {
+	u32 context_id;
+	u64 db_base_addr;
+	u32 head;
+	u32 tail;
+	u32 error_offset;
+	u64 wq_base_addr;
+	u32 wq_size_bytes;
+	u32 wq_status;
+	u32 engine_presence;
+	u32 priority;
+	u32 reserved[30];
+} __packed;
+
+/* engine id and context id is packed into guc_execlist_context.context_id*/
+#define GUC_ELC_CTXID_OFFSET		0
+#define GUC_ELC_ENGINE_OFFSET		29
+
+/* The execlist context including software and HW information */
+struct guc_execlist_context {
+	u32 context_desc;
+	u32 context_id;
+	u32 ring_status;
+	u32 ring_lcra;
+	u32 ring_begin;
+	u32 ring_end;
+	u32 ring_next_free_location;
+	u32 ring_current_tail_pointer_value;
+	u8 engine_state_submit_value;
+	u8 engine_state_wait_value;
+	u16 pagefault_count;
+	u16 engine_submit_queue_count;
+} __packed;
+
+/*Context descriptor for communicating between uKernel and Driver*/
+struct guc_context_desc {
+	u32 sched_common_area;
+	u32 context_id;
+	u32 pas_id;
+	u8 engines_used;
+	u64 db_trigger_cpu;
+	u32 db_trigger_uk;
+	u64 db_trigger_phy;
+	u16 db_id;
+
+	struct guc_execlist_context lrc[I915_NUM_RINGS];
+
+	u8 attribute;
+
+	u32 priority;
+
+	u32 wq_sampled_tail_offset;
+	u32 wq_total_submit_enqueues;
+
+	u32 process_desc;
+	u32 wq_addr;
+	u32 wq_size;
+
+	u32 engine_presence;
+
+	u32 reserved0[1];
+	u64 reserved1[1];
+
+	u64 desc_private;
+} __packed;
+
+/* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
+enum host2guc_action {
+	HOST2GUC_ACTION_DEFAULT = 0x0,
+	HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
+	HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
+	HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+	HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+	HOST2GUC_ACTION_LIMIT
+};
+
+/*
+ * The GuC sends its response to a command by overwriting the
+ * command in SS0. The response is distinguishable from a command
+ * by the fact that all the MASK bits are set. The remaining bits
+ * give more detail.
+ */
+#define	GUC2HOST_RESPONSE_MASK		((u32)0xF0000000)
+#define	GUC2HOST_IS_RESPONSE(x) 	((u32)(x) >= GUC2HOST_RESPONSE_MASK)
+#define	GUC2HOST_STATUS(x)		(GUC2HOST_RESPONSE_MASK | (x))
+
+/* GUC will return status back to SOFT_SCRATCH_O_REG */
+enum guc2host_status {
+	GUC2HOST_STATUS_SUCCESS = GUC2HOST_STATUS(0x0),
+	GUC2HOST_STATUS_ALLOCATE_DOORBELL_FAIL = GUC2HOST_STATUS(0x10),
+	GUC2HOST_STATUS_DEALLOCATE_DOORBELL_FAIL = GUC2HOST_STATUS(0x20),
+	GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0x0000F000)
+};
+
+#endif
-- 
1.9.1


* [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (2 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 03/13 v4] drm/i915: Add GuC-related header files Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-13 15:35   ` Daniel Vetter
  2015-07-18  0:35   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 05/13 v4] drm/i915: Debugfs interface to read GuC load status Dave Gordon
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

This fetches the required firmware image from the filesystem,
then loads it into the GuC's memory via a dedicated DMA engine.
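
The two stages (fetch, then load) are tracked separately, using the
intel_guc_fw_status values introduced in intel_guc.h. The sketch below
models that bookkeeping in user space. The enum is copied from the patch;
struct fw_state, fw_try_load(), sketch_fw_status_repr() and the strings it
returns are hypothetical, and the dma_ok flag merely simulates the outcome
of the DMA transfer.

```c
/* Copied from intel_guc.h in this patch. */
enum intel_guc_fw_status {
	GUC_FIRMWARE_FAIL = -1,
	GUC_FIRMWARE_NONE = 0,
	GUC_FIRMWARE_PENDING,
	GUC_FIRMWARE_SUCCESS
};

/* Hypothetical subset of struct intel_guc_fw: just the two status fields. */
struct fw_state {
	enum intel_guc_fw_status fetch;
	enum intel_guc_fw_status load;
};

/* Hypothetical printable names; the real strings may differ. */
const char *sketch_fw_status_repr(enum intel_guc_fw_status status)
{
	switch (status) {
	case GUC_FIRMWARE_FAIL:
		return "FAIL";
	case GUC_FIRMWARE_NONE:
		return "NONE";
	case GUC_FIRMWARE_PENDING:
		return "PENDING";
	case GUC_FIRMWARE_SUCCESS:
		return "SUCCESS";
	}
	return "UNKNOWN";
}

/*
 * Attempt the load stage. Loading is only tried once the image has been
 * fetched successfully. Returns 0 on success, -1 on failure.
 */
int fw_try_load(struct fw_state *fw, int dma_ok)
{
	if (fw->fetch != GUC_FIRMWARE_SUCCESS) {
		fw->load = GUC_FIRMWARE_FAIL;
		return -1;
	}

	fw->load = GUC_FIRMWARE_PENDING;	/* transfer in progress */
	fw->load = dma_ok ? GUC_FIRMWARE_SUCCESS : GUC_FIRMWARE_FAIL;
	return dma_ok ? 0 : -1;
}
```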

This patch is derived from GuC loading work originally done by
Vinit Azad and Ben Widawsky.

v2:
    Various improvements per review comments by Chris Wilson

v3:
    Removed 'wait' parameter to intel_guc_ucode_load() as firmware
        prefetch is no longer supported in the common firmware loader,
        per Daniel Vetter's request.
    Firmware checker callback fn now returns errno rather than bool.

v4:
    Squash uC-independent code into GuC-specific loader [Daniel Vetter]
    Don't keep the driver working (by falling back to execlist mode)
        if GuC firmware loading fails [Daniel Vetter]

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/Makefile           |   3 +
 drivers/gpu/drm/i915/i915_dma.c         |   4 +
 drivers/gpu/drm/i915/i915_drv.h         |  11 +
 drivers/gpu/drm/i915/i915_gem.c         |  13 +
 drivers/gpu/drm/i915/i915_reg.h         |   4 +-
 drivers/gpu/drm/i915/intel_guc.h        |  67 ++++
 drivers/gpu/drm/i915/intel_guc_loader.c | 536 ++++++++++++++++++++++++++++++++
 7 files changed, 637 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/intel_guc.h
 create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index de21965..e604cfe 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -39,6 +39,9 @@ i915-y += i915_cmd_parser.o \
 	  intel_ringbuffer.o \
 	  intel_uncore.o
 
+# general-purpose microcontroller (GuC) support
+i915-y += intel_guc_loader.o
+
 # autogenerated null render state
 i915-y += intel_renderstate_gen6.o \
 	  intel_renderstate_gen7.o \
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 066c34c..958ab4f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -472,6 +472,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 cleanup_gem:
 	mutex_lock(&dev->struct_mutex);
+	intel_guc_ucode_fini(dev);
 	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
@@ -869,6 +870,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	intel_uncore_init(dev);
 
+	intel_guc_ucode_init(dev);
+
 	/* Load CSR Firmware for SKL */
 	intel_csr_ucode_init(dev);
 
@@ -1120,6 +1123,7 @@ int i915_driver_unload(struct drm_device *dev)
 	flush_workqueue(dev_priv->wq);
 
 	mutex_lock(&dev->struct_mutex);
+	intel_guc_ucode_fini(dev);
 	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4a512da..15b9202 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 #include <linux/pm_qos.h>
+#include "intel_guc.h"
 
 /* General customization:
  */
@@ -1694,6 +1695,8 @@ struct drm_i915_private {
 
 	struct i915_virtual_gpu vgpu;
 
+	struct intel_guc guc;
+
 	struct intel_csr csr;
 
 	/* Display CSR-related protection */
@@ -1938,6 +1941,11 @@ static inline struct drm_i915_private *dev_to_i915(struct device *dev)
 	return to_i915(dev_get_drvdata(dev));
 }
 
+static inline struct drm_i915_private *guc_to_i915(struct intel_guc *guc)
+{
+	return container_of(guc, struct drm_i915_private, guc);
+}
+
 /* Iterate over initialised rings */
 #define for_each_ring(ring__, dev_priv__, i__) \
 	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
@@ -2543,6 +2551,9 @@ struct drm_i915_cmd_table {
 
 #define HAS_CSR(dev)	(IS_SKYLAKE(dev))
 
+#define HAS_GUC_UCODE(dev)	(IS_GEN9(dev))
+#define HAS_GUC_SCHED(dev)	(IS_GEN9(dev))
+
 #define HAS_RESOURCE_STREAMER(dev) (IS_HASWELL(dev) || \
 				    INTEL_INFO(dev)->gen >= 8)
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dbbb649..e020309 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5074,6 +5074,19 @@ i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 	}
 
+	/* We can't enable contexts until all firmware is loaded */
+	ret = intel_guc_ucode_load(dev);
+
+	/*
+	 * If we got an error and GuC submission is enabled, map
+	 * the error to -EIO so the GPU will be declared wedged.
+	 * OTOH, if we didn't intend to use the GuC anyway, just
+	 * discard the error and carry on.
+	 */
+	ret = ret && i915.enable_guc_submission ? -EIO : 0;
+	if (ret)
+		goto out;
+
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
 		struct drm_i915_gem_request *req;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2a29bcc..63728c1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6843,7 +6843,9 @@ enum skl_disp_power_wells {
 #define   GEN9_PGCTL_SSB_EU311_ACK	(1 << 14)
 
 #define GEN7_MISCCPCTL			(0x9424)
-#define   GEN7_DOP_CLOCK_GATE_ENABLE	(1<<0)
+#define   GEN7_DOP_CLOCK_GATE_ENABLE		(1<<0)
+#define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1<<2)
+#define   GEN8_DOP_CLOCK_GATE_GUC_ENABLE	(1<<4)
 
 /* IVYBRIDGE DPF */
 #define GEN7_L3CDERRST1			0xB008 /* L3CD Error Status 1 */
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
new file mode 100644
index 0000000..2846b6d
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -0,0 +1,67 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+#ifndef _INTEL_GUC_H_
+#define _INTEL_GUC_H_
+
+#include "intel_guc_fwif.h"
+#include "i915_guc_reg.h"
+
+enum intel_guc_fw_status {
+	GUC_FIRMWARE_FAIL = -1,
+	GUC_FIRMWARE_NONE = 0,
+	GUC_FIRMWARE_PENDING,
+	GUC_FIRMWARE_SUCCESS
+};
+
+/*
+ * This structure encapsulates all the data needed during the process
+ * of fetching, caching, and loading the firmware image into the GuC.
+ */
+struct intel_guc_fw {
+	struct drm_device *		guc_dev;
+	const char *			guc_fw_path;
+	size_t				guc_fw_size;
+	struct drm_i915_gem_object *	guc_fw_obj;
+	enum intel_guc_fw_status	guc_fw_fetch_status;
+	enum intel_guc_fw_status	guc_fw_load_status;
+
+	uint16_t			guc_fw_major_wanted;
+	uint16_t			guc_fw_minor_wanted;
+	uint16_t			guc_fw_major_found;
+	uint16_t			guc_fw_minor_found;
+};
+
+struct intel_guc {
+	struct intel_guc_fw guc_fw;
+
+	uint32_t log_flags;
+};
+
+/* intel_guc_loader.c */
+extern void intel_guc_ucode_init(struct drm_device *dev);
+extern int intel_guc_ucode_load(struct drm_device *dev);
+extern void intel_guc_ucode_fini(struct drm_device *dev);
+extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
+
+#endif
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
new file mode 100644
index 0000000..2080bca
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -0,0 +1,536 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Vinit Azad <vinit.azad@intel.com>
+ *    Ben Widawsky <ben@bwidawsk.net>
+ *    Dave Gordon <david.s.gordon@intel.com>
+ *    Alex Dai <yu.dai@intel.com>
+ */
+#include <linux/firmware.h>
+#include "i915_drv.h"
+#include "intel_guc.h"
+
+/**
+ * DOC: GuC
+ *
+ * intel_guc:
+ * Top-level structure of the GuC. It handles firmware loading and manages the
+ * client pool and doorbells. intel_guc owns an i915_guc_client to replace the
+ * legacy ExecList submission.
+ *
+ * Firmware versioning:
+ * The firmware build process generates a version header file defining the major
+ * and minor versions, which are built into the CSS header of the firmware. The
+ * i915 kernel driver sets the minimum firmware version required per platform.
+ * The firmware installation package installs (as a symbolic link) the proper
+ * version of the firmware.
+ *
+ * GuC address space:
+ * The GuC cannot use any gfx GGTT address that falls in the range
+ * [0, WOPCM_TOP), which is reserved for Boot ROM, SRAM and WOPCM. Currently this
+ * top address is 512K. To keep them out of the 0-512K range, all gfx objects
+ * used by the GuC are pinned with PIN_OFFSET_BIAS plus the size of the WOPCM.
+ *
+ * Firmware log:
+ * The firmware log is enabled by setting i915.guc_log_level to a non-negative
+ * level. Log data is printed by reading debugfs i915_guc_log_dump. Reading
+ * i915_guc_load_status prints the firmware loading status and the values of
+ * the scratch registers.
+ *
+ */
+
+#define I915_SKL_GUC_UCODE "i915/skl_guc_ver3.bin"
+MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
+
+/* User-friendly representation of an enum */
+const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
+{
+	switch (status) {
+	case GUC_FIRMWARE_FAIL:
+		return "FAIL";
+	case GUC_FIRMWARE_NONE:
+		return "NONE";
+	case GUC_FIRMWARE_PENDING:
+		return "PENDING";
+	case GUC_FIRMWARE_SUCCESS:
+		return "SUCCESS";
+	default:
+		return "UNKNOWN!";
+	}
+}
+
+static u32 get_gttype(struct drm_i915_private *dev_priv)
+{
+	/* XXX: GT type based on PCI device ID? field seems unused by fw */
+	return 0;
+}
+
+static u32 get_core_family(struct drm_i915_private *dev_priv)
+{
+	switch (INTEL_INFO(dev_priv)->gen) {
+	case 8:
+		return GFXCORE_FAMILY_GEN8;
+	case 9:
+		return GFXCORE_FAMILY_GEN9;
+	default:
+		DRM_ERROR("GUC: unknown gen for scheduler init\n");
+		return GFXCORE_FAMILY_FORCE_ULONG;
+	}
+}
+
+static void set_guc_init_params(struct drm_i915_private *dev_priv)
+{
+	struct intel_guc *guc = &dev_priv->guc;
+	u32 params[GUC_CTL_MAX_DWORDS];
+	int i;
+
+	memset(&params, 0, sizeof(params));
+
+	params[GUC_CTL_DEVICE_INFO] |=
+		(get_gttype(dev_priv) << GUC_CTL_GTTYPE_SHIFT) |
+		(get_core_family(dev_priv) << GUC_CTL_COREFAMILY_SHIFT);
+
+	/* GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
+	 * second. The ARAT value is calculated as:
+	 * Scheduler-Quantum-in-ns / ARAT-increment-in-ns = 1000000000 / 10
+	 */
+	params[GUC_CTL_ARAT_HIGH] = 0;
+	params[GUC_CTL_ARAT_LOW] = 100000000;
+
+	params[GUC_CTL_WA] |= GUC_CTL_WA_UK_BY_DRIVER;
+
+	params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
+			GUC_CTL_VCS2_ENABLED;
+
+	if (i915.guc_log_level >= 0) {
+		params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+		params[GUC_CTL_DEBUG] =
+			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
+	}
+
+	I915_WRITE(SOFT_SCRATCH(0), 0);
+
+	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
+		I915_WRITE(SOFT_SCRATCH(1 + i), params[i]);
+}
+
+/*
+ * Read the GuC status register (GUC_STATUS) and store it in the
+ * specified location; then return a boolean indicating whether
+ * the value matches either of two values representing completion
+ * of the GuC boot process.
+ *
+ * This is used for polling the GuC status in a wait_for_atomic()
+ * loop below.
+ */
+static inline bool guc_ucode_response(struct drm_i915_private *dev_priv,
+				      u32 *status)
+{
+	u32 val = I915_READ(GUC_STATUS);
+	*status = val;
+	return ((val & GS_UKERNEL_MASK) == GS_UKERNEL_READY ||
+		(val & GS_UKERNEL_MASK) == GS_UKERNEL_LAPIC_DONE);
+}
+
+/*
+ * Transfer the firmware image to RAM for execution by the microcontroller.
+ *
+ * GuC Firmware layout:
+ * +-------------------------------+  ----
+ * |          CSS header           |  128B
+ * | contains major/minor version  |
+ * +-------------------------------+  ----
+ * |             uCode             |
+ * +-------------------------------+  ----
+ * |         RSA signature         |  256B
+ * +-------------------------------+  ----
+ * |         RSA public Key        |  256B
+ * +-------------------------------+  ----
+ * |       Public key modulus      |    4B
+ * +-------------------------------+  ----
+ *
+ * Architecturally, the DMA engine is bidirectional, and can potentially even
+ * transfer between GTT locations. This functionality is left out of the API
+ * for now as there is no need for it.
+ *
+ * Note that GuC needs the CSS header plus uKernel code to be copied by the
+ * DMA engine in one operation, whereas the RSA signature is loaded via MMIO.
+ */
+
+#define UOS_CSS_HEADER_OFFSET		0
+#define UOS_VER_MINOR_OFFSET		0x44
+#define UOS_VER_MAJOR_OFFSET		0x46
+#define UOS_CSS_HEADER_SIZE		0x80
+#define UOS_RSA_SIG_SIZE		0x100
+#define UOS_CSS_SIGNING_SIZE		0x204
+
+static int guc_ucode_xfer_dma(struct drm_i915_private *dev_priv)
+{
+	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+	struct drm_i915_gem_object *fw_obj = guc_fw->guc_fw_obj;
+	unsigned long offset;
+	struct sg_table *sg = fw_obj->pages;
+	u32 status, ucode_size, rsa[UOS_RSA_SIG_SIZE / sizeof(u32)];
+	int i, ret = 0;
+
+	/* uCode size, which is also where the RSA signature starts */
+	offset = ucode_size = guc_fw->guc_fw_size - UOS_CSS_SIGNING_SIZE;
+
+	/* Copy RSA signature from the fw image to HW for verification */
+	sg_pcopy_to_buffer(sg->sgl, sg->nents, rsa, UOS_RSA_SIG_SIZE, offset);
+	for (i = 0; i < UOS_RSA_SIG_SIZE / sizeof(u32); i++)
+		I915_WRITE(UOS_RSA_SCRATCH_0 + i * sizeof(u32), rsa[i]);
+
+	/* Set the source address for the new blob */
+	offset = i915_gem_obj_ggtt_offset(fw_obj);
+	I915_WRITE(DMA_ADDR_0_LOW, lower_32_bits(offset));
+	I915_WRITE(DMA_ADDR_0_HIGH, upper_32_bits(offset) & 0xFFFF);
+
+	/* Set the destination. Current uCode expects an 8k stack starting from
+	 * offset 0. */
+	I915_WRITE(DMA_ADDR_1_LOW, 0x2000);
+
+	/* XXX: The image is automatically transferred to SRAM after the RSA
+	 * verification. This is why the address space is chosen as such. */
+	I915_WRITE(DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM);
+
+	I915_WRITE(DMA_COPY_SIZE, ucode_size);
+
+	/* Finally start the DMA */
+	I915_WRITE(DMA_CTRL, _MASKED_BIT_ENABLE(UOS_MOVE | START_DMA));
+
+	/*
+	 * Spin-wait for the DMA to complete & the GuC to start up.
+	 * NB: Docs recommend not using the interrupt for completion.
+	 * Measurements indicate this should take no more than 20ms, so a
+	 * timeout here indicates that the GuC has failed and is unusable.
+	 * (Higher levels of the driver will attempt to fall back to
+	 * execlist mode if this happens.)
+	 */
+	ret = wait_for_atomic(guc_ucode_response(dev_priv, &status), 100);
+
+	DRM_DEBUG_DRIVER("DMA status 0x%x, GuC status 0x%x\n",
+			I915_READ(DMA_CTRL), status);
+
+	if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
+		DRM_ERROR("GuC firmware signature verification failed\n");
+		ret = -ENOEXEC;
+	}
+
+	DRM_DEBUG_DRIVER("returning %d\n", ret);
+
+	return ret;
+}
+
+/*
+ * Load the GuC firmware blob into the MinuteIA.
+ */
+static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
+{
+	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+	struct drm_device *dev = dev_priv->dev;
+	int ret;
+
+	ret = i915_gem_object_set_to_gtt_domain(guc_fw->guc_fw_obj, false);
+	if (ret) {
+		DRM_DEBUG_DRIVER("set-domain failed %d\n", ret);
+		return ret;
+	}
+
+	ret = i915_gem_obj_ggtt_pin(guc_fw->guc_fw_obj, 0, 0);
+	if (ret) {
+		DRM_DEBUG_DRIVER("pin failed %d\n", ret);
+		return ret;
+	}
+
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
+	/* init WOPCM */
+	I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+	I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET);
+
+	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+	/* Set MMIO/WA for GuC init */
+	I915_WRITE(DRBMISC1, DOORBELL_ENABLE);
+
+	/* Enable MIA caching. GuC clock gating is disabled. */
+	I915_WRITE(GUC_SHIM_CONTROL, GUC_SHIM_CONTROL_VALUE);
+
+	/* WaC6DisallowByGfxPause */
+	I915_WRITE(GEN6_GFXPAUSE, 0x30FFF);
+
+	if (IS_SKYLAKE(dev))
+		I915_WRITE(GEN9_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
+	else
+		I915_WRITE(GEN8_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
+
+	if (IS_GEN9(dev)) {
+		/* DOP Clock Gating Enable for GuC clocks */
+		I915_WRITE(GEN7_MISCCPCTL, (GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
+					    I915_READ(GEN7_MISCCPCTL)));
+
+		/* allows for 5us before GT can go to RC6 */
+		I915_WRITE(GUC_ARAT_C6DIS, 0x1FF);
+	}
+
+	set_guc_init_params(dev_priv);
+
+	ret = guc_ucode_xfer_dma(dev_priv);
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	/*
+	 * We keep the object pages for reuse during resume. But we can unpin it
+	 * now that DMA has completed, so it doesn't continue to take up space.
+	 */
+	i915_gem_object_ggtt_unpin(guc_fw->guc_fw_obj);
+
+	return ret;
+}
+
+static void guc_fw_fetch(struct drm_device *dev, struct intel_guc_fw *guc_fw)
+{
+	struct drm_i915_gem_object *obj;
+	const struct firmware *fw;
+	const u8 *css_header;
+	const size_t minsize = UOS_CSS_HEADER_SIZE + UOS_CSS_SIGNING_SIZE;
+	const size_t maxsize = GUC_WOPCM_SIZE_VALUE + UOS_CSS_SIGNING_SIZE
+			- 0x8000; /* 32k reserved (8K stack + 24k context) */
+
+	DRM_DEBUG_DRIVER("before requesting firmware: GuC fw fetch status %s\n",
+		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+
+	if (request_firmware(&fw, guc_fw->guc_fw_path, &dev->pdev->dev))
+		goto fail;
+	if (!fw)
+		goto fail;
+
+	DRM_DEBUG_DRIVER("fetch GuC fw from %s succeeded, fw %p\n",
+		guc_fw->guc_fw_path, fw);
+
+	DRM_DEBUG_DRIVER("firmware file size %zu (minimum %zu, maximum %zu)\n",
+		fw->size, minsize, maxsize);
+
+	/* Check the size of the blob before examining buffer contents */
+	if (fw->size < minsize || fw->size > maxsize)
+		goto fail;
+
+	/*
+	 * The GuC firmware image has the version number embedded at a well-known
+	 * offset within the firmware blob; note that major / minor version are
+	 * TWO bytes each (i.e. u16), although all pointers and offsets are defined
+	 * in terms of bytes (u8).
+	 */
+	css_header = fw->data + UOS_CSS_HEADER_OFFSET;
+	guc_fw->guc_fw_major_found = *(u16 *)(css_header + UOS_VER_MAJOR_OFFSET);
+	guc_fw->guc_fw_minor_found = *(u16 *)(css_header + UOS_VER_MINOR_OFFSET);
+
+	if (guc_fw->guc_fw_major_found != guc_fw->guc_fw_major_wanted ||
+	    guc_fw->guc_fw_minor_found < guc_fw->guc_fw_minor_wanted) {
+		DRM_ERROR("GuC firmware version %d.%d, required %d.%d\n",
+			guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found,
+			guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+		goto fail;
+	}
+
+	DRM_DEBUG_DRIVER("firmware version %d.%d OK (minimum %d.%d)\n",
+			guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found,
+			guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+
+	obj = i915_gem_object_create_from_data(dev, fw->data, fw->size);
+	if (!obj)
+		goto fail;
+
+	guc_fw->guc_fw_obj = obj;
+	guc_fw->guc_fw_size = fw->size;
+
+	DRM_DEBUG_DRIVER("GuC fw fetch status SUCCESS, obj %p\n",
+			guc_fw->guc_fw_obj);
+
+	release_firmware(fw);
+	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_SUCCESS;
+	return;
+
+fail:
+	DRM_DEBUG_DRIVER("GuC fw fetch status FAIL; fw %p, obj %p\n",
+		fw, guc_fw->guc_fw_obj);
+	DRM_ERROR("Failed to fetch GuC firmware from %s\n",
+		  guc_fw->guc_fw_path);
+
+	obj = guc_fw->guc_fw_obj;
+	if (obj)
+		drm_gem_object_unreference(&obj->base);
+	guc_fw->guc_fw_obj = NULL;
+
+	release_firmware(fw);		/* OK even if fw is NULL */
+	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
+}
+
+/**
+ * intel_guc_ucode_load() - load GuC uCode into the device
+ * @dev:	drm device
+ *
+ * Called from gem_init_hw() during driver loading and also after a GPU reset.
+ *
+ * On the first call only, this will fetch the blob from the filesystem;
+ * thereafter, we will already either have the blob in a GEM object, or
+ * have determined that no valid firmware image could be found.
+ *
+ * If we have a good firmware image, transfer it to the h/w.
+ *
+ * Return:	non-zero code on error
+ */
+int intel_guc_ucode_load(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+	int err = 0;
+
+	DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
+		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
+		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+
+	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
+		return 0;
+
+	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_SUCCESS &&
+	    guc_fw->guc_fw_load_status == GUC_FIRMWARE_FAIL)
+		return -ENOEXEC;
+
+	guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
+	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_PENDING) {
+		/* We only come here once */
+		guc_fw_fetch(dev, guc_fw);
+		/* status must now be FAIL or SUCCESS */
+	}
+
+	DRM_DEBUG_DRIVER("GuC fw fetch status %s\n",
+		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+
+	switch (guc_fw->guc_fw_fetch_status) {
+	case GUC_FIRMWARE_FAIL:
+		/* something went wrong :( */
+		err = -EIO;
+		goto fail;
+
+	case GUC_FIRMWARE_NONE:
+	case GUC_FIRMWARE_PENDING:
+	default:
+		/* "can't happen" */
+		WARN_ONCE(1, "GuC fw %s invalid guc_fw_fetch_status %s [%d]\n",
+			guc_fw->guc_fw_path,
+			intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
+			guc_fw->guc_fw_fetch_status);
+		err = -ENXIO;
+		goto fail;
+
+	case GUC_FIRMWARE_SUCCESS:
+		break;
+	}
+
+	err = guc_ucode_xfer(dev_priv);
+	if (err)
+		goto fail;
+
+	guc_fw->guc_fw_load_status = GUC_FIRMWARE_SUCCESS;
+
+	DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
+		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
+		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+
+	return 0;
+
+fail:
+	if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
+		guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
+
+	DRM_ERROR("Failed to initialize GuC, error %d\n", err);
+
+	return err;
+}
+
+/**
+ * intel_guc_ucode_init() - define parameters for fetching firmware
+ * @dev:	drm device
+ *
+ * Called early during driver load, before GEM is initialised.
+ * Driver is single threaded, so no mutex is required.
+ *
+ * This just sets parameters for use when intel_guc_ucode_load()
+ * is called later, after GEM initialisation is complete.
+ */
+void intel_guc_ucode_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+	const char *fw_path;
+
+	if (!HAS_GUC_SCHED(dev))
+		i915.enable_guc_submission = false;
+
+	if (!HAS_GUC_UCODE(dev)) {
+		fw_path = NULL;
+	} else if (IS_SKYLAKE(dev)) {
+		fw_path = I915_SKL_GUC_UCODE;
+		guc_fw->guc_fw_major_wanted = 3;
+		guc_fw->guc_fw_minor_wanted = 0;
+	} else {
+		i915.enable_guc_submission = false;
+		fw_path = "";	/* unknown device */
+	}
+
+	guc_fw->guc_dev = dev;
+	guc_fw->guc_fw_path = fw_path;
+	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
+	guc_fw->guc_fw_load_status = GUC_FIRMWARE_NONE;
+
+	if (fw_path == NULL)
+		return;
+
+	if (*fw_path == '\0') {
+		DRM_ERROR("No GuC firmware known for this platform\n");
+		guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
+		return;
+	}
+
+	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_PENDING;
+	DRM_DEBUG_DRIVER("GuC firmware pending, path %s\n", fw_path);
+}
+
+/**
+ * intel_guc_ucode_fini() - clean up all allocated resources
+ * @dev:	drm device
+ */
+void intel_guc_ucode_fini(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+
+	if (guc_fw->guc_fw_obj)
+		drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
+	guc_fw->guc_fw_obj = NULL;
+
+	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
+}
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 05/13 v4] drm/i915: Debugfs interface to read GuC load status
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (3 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-18  0:39   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 06/13 v4] drm/i915: Expose two LRC functions for GuC submission mode Dave Gordon
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

The new node provides access to the status of the GuC-specific loader,
and to the scratch registers used for communication between the i915
driver and the GuC firmware.

v2:
    Changes to output formats per Chris Wilson's suggestions

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
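Not part of the commit: a small userspace sketch of how the GuC status
word printed by this node splits into its three fields. The shift and
width values below are placeholders chosen for illustration (the real
GS_*_MASK and GS_*_SHIFT definitions live in i915_guc_reg.h), and all
ex_/EX_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: illustrative field layout, not copied from i915_guc_reg.h. */
#define EX_BOOTROM_SHIFT	1
#define EX_BOOTROM_MASK		(0x7Fu << EX_BOOTROM_SHIFT)
#define EX_UKERNEL_SHIFT	8
#define EX_UKERNEL_MASK		(0xFFu << EX_UKERNEL_SHIFT)
#define EX_MIA_SHIFT		16
#define EX_MIA_MASK		(0x07u << EX_MIA_SHIFT)

struct ex_guc_status {
	uint32_t bootrom;
	uint32_t ukernel;
	uint32_t mia;
};

/* Split a raw status word into fields: mask first, then shift down. */
struct ex_guc_status ex_decode_guc_status(uint32_t raw)
{
	struct ex_guc_status s;

	/* The decoding only makes sense if the fields do not overlap. */
	assert((EX_BOOTROM_MASK & EX_UKERNEL_MASK) == 0);
	assert((EX_UKERNEL_MASK & EX_MIA_MASK) == 0);

	s.bootrom = (raw & EX_BOOTROM_MASK) >> EX_BOOTROM_SHIFT;
	s.ukernel = (raw & EX_UKERNEL_MASK) >> EX_UKERNEL_SHIFT;
	s.mia = (raw & EX_MIA_MASK) >> EX_MIA_SHIFT;
	return s;
}
```

Bits outside the three masks are ignored by the decoder, which is why
the node also prints the raw register value alongside the fields.
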
 drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 98fd3c9..9ff5f17 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2359,6 +2359,44 @@ static int i915_llc(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_guc_load_status_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = m->private;
+	struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
+	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+	u32 tmp, i;
+
+	if (!HAS_GUC_UCODE(dev_priv->dev))
+		return 0;
+
+	seq_printf(m, "GuC firmware status:\n");
+	seq_printf(m, "\tpath: %s\n",
+		guc_fw->guc_fw_path);
+	seq_printf(m, "\tfetch: %s\n",
+		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+	seq_printf(m, "\tload: %s\n",
+		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+	seq_printf(m, "\tversion wanted: %d.%d\n",
+		guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+	seq_printf(m, "\tversion found: %d.%d\n",
+		guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found);
+
+	tmp = I915_READ(GUC_STATUS);
+
+	seq_printf(m, "\nGuC status 0x%08x:\n", tmp);
+	seq_printf(m, "\tBootrom status = 0x%x\n",
+		(tmp & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT);
+	seq_printf(m, "\tuKernel status = 0x%x\n",
+		(tmp & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT);
+	seq_printf(m, "\tMIA Core status = 0x%x\n",
+		(tmp & GS_MIA_MASK) >> GS_MIA_SHIFT);
+	seq_puts(m, "\nScratch registers:\n");
+	for (i = 0; i < 16; i++)
+		seq_printf(m, "\t%2d: \t0x%x\n", i, I915_READ(SOFT_SCRATCH(i)));
+
+	return 0;
+}
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = m->private;
@@ -5073,6 +5111,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
 	{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
 	{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
+	{"i915_guc_load_status", i915_guc_load_status_info, 0},
 	{"i915_frequency_info", i915_frequency_info, 0},
 	{"i915_hangcheck_info", i915_hangcheck_info, 0},
 	{"i915_drpc_info", i915_drpc_info, 0},
-- 
1.9.1

* [PATCH 06/13 v4] drm/i915: Expose two LRC functions for GuC submission mode
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (4 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 05/13 v4] drm/i915: Debugfs interface to read GuC load status Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-24 22:12   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1 Dave Gordon
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

GuC submission is basically execlist submission, but with the GuC
handling the actual writes to the ELSP and the resulting context
switch interrupts. So to prepare a context for submission via the
GuC, we need some of the same functions used in execlist mode.
This commit exposes two such functions, changing their names to
better describe what they do (they're related to logical ring
contexts rather than to execlists per se).

v2:
    Replaces previous "drm/i915: Move execlists defines from .c to .h"

v3:
    Incorporates a change to one of the functions exposed here that was
    previously part of an internal patch, but which was omitted from
    the version recently committed to drm-intel-nightly:
	7a01a0a drm/i915/lrc: Update PDPx registers with lri commands
    So we reinstate this change here.

v4:
    Drop v3 change, update function parameters due to collision with
    8ee3615 drm/i915: Convert execlists_ctx_descriptor() for requests

Issue: VIZ-4884
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
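Not part of the commit: a standalone sketch of the interface change
described above. Once the descriptor is computed from a (context,
engine) pair rather than from a request, a caller with no request to
hand can use the same helper. The types and the bit layout here are
simplified stand-ins (prefixed ex_/EX_); only the "lrca >> 12" context
ID mirrors intel_execlists_ctx_id() as shown in the diff context below.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NUM_ENGINES	2
#define EX_CTX_VALID	(1ull << 0)	/* ASSUMPTION: toy flag bit */
#define EX_CTX_ID_SHIFT	32		/* ASSUMPTION: toy ID position */

struct ex_engine {
	int id;
};

struct ex_context {
	/* Per-engine GGTT offset of the saved context image. */
	uint64_t lrca[EX_NUM_ENGINES];
};

struct ex_request {
	struct ex_context *ctx;
	struct ex_engine *ring;
};

/* Needs only the context and the engine, so no request is required. */
uint64_t ex_lr_context_descriptor(const struct ex_context *ctx,
				  const struct ex_engine *ring)
{
	uint64_t lrca = ctx->lrca[ring->id];

	assert((lrca & 0xFFF) == 0);	/* context image is page aligned */
	return EX_CTX_VALID | lrca | ((lrca >> 12) << EX_CTX_ID_SHIFT);
}

/* Execlist-style caller: unpacks the request before calling the helper. */
uint64_t ex_descriptor_for_request(const struct ex_request *rq)
{
	return ex_lr_context_descriptor(rq->ctx, rq->ring);
}
```

Both callers produce the same descriptor for the same context and
engine, which is the property the GuC submission path relies on.
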
 drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++-----------
 drivers/gpu/drm/i915/intel_lrc.h |  3 +++
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d4f8b43..9e121d3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -261,11 +261,11 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
 	return lrca >> 12;
 }
 
-static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq)
+uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
+				     struct intel_engine_cs *ring)
 {
-	struct intel_engine_cs *ring = rq->ring;
 	struct drm_device *dev = ring->dev;
-	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
 	uint64_t desc;
 	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
 
@@ -303,13 +303,13 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
 	uint64_t desc[2];
 
 	if (rq1) {
-		desc[1] = execlists_ctx_descriptor(rq1);
+		desc[1] = intel_lr_context_descriptor(rq1->ctx, rq1->ring);
 		rq1->elsp_submitted++;
 	} else {
 		desc[1] = 0;
 	}
 
-	desc[0] = execlists_ctx_descriptor(rq0);
+	desc[0] = intel_lr_context_descriptor(rq0->ctx, rq0->ring);
 	rq0->elsp_submitted++;
 
 	/* You must always write both descriptors in the order below. */
@@ -328,7 +328,8 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
 	spin_unlock(&dev_priv->uncore.lock);
 }
 
-static int execlists_update_context(struct drm_i915_gem_request *rq)
+/* Update the ringbuffer pointer and tail offset in a saved context image */
+void intel_lr_context_update(struct drm_i915_gem_request *rq)
 {
 	struct intel_engine_cs *ring = rq->ring;
 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
@@ -358,17 +359,15 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	}
 
 	kunmap_atomic(reg_state);
-
-	return 0;
 }
 
 static void execlists_submit_requests(struct drm_i915_gem_request *rq0,
 				      struct drm_i915_gem_request *rq1)
 {
-	execlists_update_context(rq0);
+	intel_lr_context_update(rq0);
 
 	if (rq1)
-		execlists_update_context(rq1);
+		intel_lr_context_update(rq1);
 
 	execlists_elsp_write(rq0, rq1);
 }
@@ -2051,7 +2050,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
 	reg_state[CTX_RING_TAIL+1] = 0;
 	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
 	/* Ring buffer start address is not known until the buffer is pinned.
-	 * It is written to the context image in execlists_update_context()
+	 * It is written to the context image in intel_lr_context_update()
 	 */
 	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
 	reg_state[CTX_RING_BUFFER_CONTROL+1] =
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index e0299fb..6ecc0b3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -73,6 +73,9 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 void intel_lr_context_unpin(struct drm_i915_gem_request *req);
 void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
+void intel_lr_context_update(struct drm_i915_gem_request *rq);
+uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
+				     struct intel_engine_cs *ring);
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
-- 
1.9.1

* [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (5 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 06/13 v4] drm/i915: Expose two LRC functions for GuC submission mode Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-24 22:31   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 08/13 v4] drm/i915: Enable GuC firmware log Dave Gordon
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

This adds the first of the data structures used to communicate with the
GuC (the pool of guc_context structures).

We create a GuC-specific wrapper round the GEM object allocator as all
GEM objects shared with the GuC must be pinned into GGTT space at an
address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
addresses is not accessible to the GuC (from the GuC's point of view,
it's permanently reserved for other objects such as the BootROM & SRAM).

Later, we will need to allocate additional GuC-sharable objects for the
submission client(s) and the GuC's debug log.

v2:
    Remove redundant initialisation [Chris Wilson]
    Defer adding struct members until needed [Chris Wilson]
    Local functions should pass dev_priv rather than dev [Chris Wilson]

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
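Not part of the commit: a standalone sketch of the two constraints
described above, namely that the pool is rounded up to whole pages and
that its GGTT offset must lie at or above the top of the range the GuC
reserves. The 4K page size is an assumption, the 512K bound is the
figure quoted in the loader's DOC comment, and the ex_ helpers are
hypothetical; the driver itself enforces the bound by passing
PIN_OFFSET_BIAS to i915_gem_obj_ggtt_pin().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE	4096u			/* ASSUMPTION: 4K pages */
#define EX_WOPCM_TOP	(512u * 1024u)		/* 512K, per the DOC comment */

/* Round n up to the next multiple of align (any non-zero align). */
size_t ex_round_up(size_t n, size_t align)
{
	assert(align != 0);
	return (n + align - 1) / align * align;
}

/* Size of the GEM object backing a pool of nctx descriptors. */
size_t ex_pool_gem_size(size_t nctx, size_t ctx_size)
{
	return ex_round_up(nctx * ctx_size, EX_PAGE_SIZE);
}

/* The GuC cannot reach GGTT offsets in [0, EX_WOPCM_TOP). */
bool ex_guc_can_address(uint64_t ggtt_offset)
{
	return ggtt_offset >= EX_WOPCM_TOP;
}
```

For example, a pool of 1024 descriptors of 101 bytes each (a made-up
descriptor size) needs 103424 bytes, which rounds up to 26 pages.
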
 drivers/gpu/drm/i915/Makefile              |   3 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 114 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_guc.h           |   7 ++
 drivers/gpu/drm/i915/intel_guc_loader.c    |  19 +++++
 4 files changed, 142 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e604cfe..23f5612 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -40,7 +40,8 @@ i915-y += i915_cmd_parser.o \
 	  intel_uncore.o
 
 # general-purpose microcontroller (GuC) support
-i915-y += intel_guc_loader.o
+i915-y += intel_guc_loader.o \
+	  i915_guc_submission.o
 
 # autogenerated null render state
 i915-y += intel_renderstate_gen6.o \
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
new file mode 100644
index 0000000..70a0405
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -0,0 +1,114 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+#include <linux/firmware.h>
+#include <linux/circ_buf.h>
+#include "i915_drv.h"
+#include "intel_guc.h"
+
+/**
+ * gem_allocate_guc_obj() - Allocate gem object for GuC usage
+ * @dev:	drm device
+ * @size:	size of object
+ *
+ * This is a wrapper to create a gem obj. For the GuC to use it, the object
+ * must stay pinned for its whole lifetime. It must also be pinned in GTT space
+ * outside [0, GUC_WOPCM_SIZE], because that range is reserved inside the GuC.
+ *
+ * Return:	A drm_i915_gem_object if successful, otherwise NULL.
+ */
+static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
+							u32 size)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = i915_gem_alloc_object(dev, size);
+	if (!obj)
+		return NULL;
+
+	if (i915_gem_object_get_pages(obj)) {
+		drm_gem_object_unreference(&obj->base);
+		return NULL;
+	}
+
+	if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
+			PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) {
+		drm_gem_object_unreference(&obj->base);
+		return NULL;
+	}
+
+	return obj;
+}
+
+/**
+ * gem_release_guc_obj() - Release gem object allocated for GuC usage
+ * @obj:	gem obj to be released
+ */
+static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
+{
+	if (!obj)
+		return;
+
+	if (i915_gem_obj_is_pinned(obj))
+		i915_gem_object_ggtt_unpin(obj);
+
+	drm_gem_object_unreference(&obj->base);
+}
+
+/*
+ * Set up the memory resources to be shared with the GuC.  At this point,
+ * we require just one object that can be mapped through the GGTT.
+ */
+int i915_guc_submission_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	const size_t ctxsize = sizeof(struct guc_context_desc);
+	const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
+	const size_t gemsize = round_up(poolsize, PAGE_SIZE);
+	struct intel_guc *guc = &dev_priv->guc;
+
+	if (!i915.enable_guc_submission)
+		return 0; /* not enabled */
+
+	if (guc->ctx_pool_obj)
+		return 0; /* already allocated */
+
+	guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
+	if (!guc->ctx_pool_obj)
+		return -ENOMEM;
+
+	ida_init(&guc->ctx_ids);
+
+	return 0;
+}
+
+void i915_guc_submission_fini(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+
+	if (guc->ctx_pool_obj)
+		ida_destroy(&guc->ctx_ids);
+	gem_release_guc_obj(guc->ctx_pool_obj);
+	guc->ctx_pool_obj = NULL;
+}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 2846b6d..be3cad8 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -56,6 +56,9 @@ struct intel_guc {
 	struct intel_guc_fw guc_fw;
 
 	uint32_t log_flags;
+
+	struct drm_i915_gem_object *ctx_pool_obj;
+	struct ida ctx_ids;
 };
 
 /* intel_guc_loader.c */
@@ -64,4 +67,8 @@ extern int intel_guc_ucode_load(struct drm_device *dev);
 extern void intel_guc_ucode_fini(struct drm_device *dev);
 extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
 
+/* i915_guc_submission.c */
+int i915_guc_submission_init(struct drm_device *dev);
+void i915_guc_submission_fini(struct drm_device *dev);
+
 #endif
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 2080bca..e5d7136 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -128,6 +128,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
 			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
 	}
 
+	/* If GuC scheduling is enabled, set up params here. */
+	if (i915.enable_guc_submission) {
+		u32 pgs = i915_gem_obj_ggtt_offset(dev_priv->guc.ctx_pool_obj);
+		u32 ctx_in_16 = GUC_MAX_GPU_CONTEXTS / 16;
+
+		pgs >>= PAGE_SHIFT;
+		params[GUC_CTL_CTXINFO] = (pgs << GUC_CTL_BASE_ADDR_SHIFT) |
+			(ctx_in_16 << GUC_CTL_CTXNUM_IN16_SHIFT);
+
+		params[GUC_CTL_FEATURE] |= GUC_CTL_KERNEL_SUBMISSIONS;
+
+		/* Unmask this bit to enable GuC scheduler */
+		params[GUC_CTL_FEATURE] &= ~GUC_CTL_DISABLE_SCHEDULER;
+	}
+
 	I915_WRITE(SOFT_SCRATCH(0), 0);
 
 	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
@@ -450,6 +465,10 @@ int intel_guc_ucode_load(struct drm_device *dev)
 		break;
 	}
 
+	err = i915_guc_submission_init(dev);
+	if (err)
+		goto fail;
+
 	err = guc_ucode_xfer(dev_priv);
 	if (err)
 		goto fail;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 08/13 v4] drm/i915: Enable GuC firmware log
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (6 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1 Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-24 22:40   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 09/13 v4] drm/i915: Implementation of GuC client Dave Gordon
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

Allocate a GEM object to hold GuC log data. A debugfs interface
(i915_guc_log_dump) is provided to print out the log content.
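
As a rough illustration of the sizing done in guc_create_log() in the diff
further down (one page of buffer state, then each log region with one spare
page), here is a standalone userspace sketch. The helper name and
SKETCH_PAGE_SHIFT are made up for this example and are not part of the patch;
4 KiB pages are assumed.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. Hypothetical name, not a kernel macro. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Size in bytes of a log object laid out as:
 *   1 state page, then (pages + 1 spare) for each of DPC, ISR and crash.
 * This mirrors the arithmetic in guc_create_log(); the page counts are
 * passed in because the real GUC_LOG_*_PAGES values are not shown here.
 */
static uint32_t guc_log_size_bytes(uint32_t dpc_pages, uint32_t isr_pages,
				   uint32_t crash_pages)
{
	uint32_t pages = 1 + (dpc_pages + 1) + (isr_pages + 1) +
			 (crash_pages + 1);

	return pages << SKETCH_PAGE_SHIFT;
}
```

With placeholder counts of 3 DPC pages, 1 ISR page and 1 crash page,
guc_log_size_bytes(3, 1, 1) comes to 9 pages, i.e. 36864 bytes.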

v2:
    Add struct members at point of use [Chris Wilson]

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 29 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_guc_submission.c | 46 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_guc.h           |  1 +
 3 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 9ff5f17..13e37d1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2397,6 +2397,34 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_guc_log_dump(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj;
+	u32 *log;
+	int i = 0, pg;
+
+	if (!log_obj)
+		return 0;
+
+	for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) {
+		log = kmap_atomic(i915_gem_object_get_page(log_obj, pg));
+
+		for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4)
+			seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
+				   *(log + i), *(log + i + 1),
+				   *(log + i + 2), *(log + i + 3));
+
+		kunmap_atomic(log);
+	}
+
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = m->private;
@@ -5112,6 +5140,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
 	{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
 	{"i915_guc_load_status", i915_guc_load_status_info, 0},
+	{"i915_guc_log_dump", i915_guc_log_dump, 0},
 	{"i915_frequency_info", i915_frequency_info, 0},
 	{"i915_hangcheck_info", i915_hangcheck_info, 0},
 	{"i915_drpc_info", i915_drpc_info, 0},
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 70a0405..e9d46d6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -75,6 +75,47 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
 	drm_gem_object_unreference(&obj->base);
 }
 
+static void guc_create_log(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct drm_i915_gem_object *obj;
+	unsigned long offset;
+	uint32_t size, flags;
+
+	if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
+		return;
+
+	if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
+		i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
+
+	/* The first page holds the log buffer state. Each of the other
+	 * buffers gets one extra page in case of overlap */
+	size = (1 + GUC_LOG_DPC_PAGES + 1 +
+		GUC_LOG_ISR_PAGES + 1 +
+		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
+
+	obj = guc->log_obj;
+	if (!obj) {
+		obj = gem_allocate_guc_obj(dev_priv->dev, size);
+		if (!obj) {
+			/* logging will be off */
+			i915.guc_log_level = -1;
+			return;
+		}
+
+		guc->log_obj = obj;
+	}
+
+	/* each allocated unit is a page */
+	flags = GUC_LOG_VALID | GUC_LOG_NOTIFY_ON_HALF_FULL |
+		(GUC_LOG_DPC_PAGES << GUC_LOG_DPC_SHIFT) |
+		(GUC_LOG_ISR_PAGES << GUC_LOG_ISR_SHIFT) |
+		(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
+
+	offset = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; /* in pages */
+	guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+}
+
 /*
  * Set up the memory resources to be shared with the GuC.  At this point,
  * we require just one object that can be mapped through the GGTT.
@@ -99,6 +140,8 @@ int i915_guc_submission_init(struct drm_device *dev)
 
 	ida_init(&guc->ctx_ids);
 
+	guc_create_log(guc);
+
 	return 0;
 }
 
@@ -107,6 +150,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_guc *guc = &dev_priv->guc;
 
+	gem_release_guc_obj(dev_priv->guc.log_obj);
+	guc->log_obj = NULL;
+
 	if (guc->ctx_pool_obj)
 		ida_destroy(&guc->ctx_ids);
 	gem_release_guc_obj(guc->ctx_pool_obj);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index be3cad8..5b51b05 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -56,6 +56,7 @@ struct intel_guc {
 	struct intel_guc_fw guc_fw;
 
 	uint32_t log_flags;
+	struct drm_i915_gem_object *log_obj;
 
 	struct drm_i915_gem_object *ctx_pool_obj;
 	struct ida ctx_ids;
-- 
1.9.1



* [PATCH 09/13 v4] drm/i915: Implementation of GuC client
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (7 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 08/13 v4] drm/i915: Enable GuC firmware log Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-25  2:31   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 10/13 v4] drm/i915: Interrupt routing for GuC submission Dave Gordon
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

A GuC client has its own doorbell and workqueue. It maintains the
doorbell cache line, the process descriptor object and the work queue items.
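
To make the doorbell handshake easier to follow, here is a minimal userspace
sketch of the compare-and-exchange loop used by guc_ring_doorbell() in the
diff further down. It uses C11 atomics instead of atomic64_cmpxchg(), and the
packing of status and cookie into one 64-bit word (status in the low dword,
cookie in the high dword) is an assumption made for the example; db_pack(),
next_cookie() and ring_doorbell() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DB_DISABLED 0u
#define DB_ENABLED  1u

/* Assumed layout: status in bits 0..31, cookie in bits 32..63. */
static uint64_t db_pack(uint32_t status, uint32_t cookie)
{
	return ((uint64_t)cookie << 32) | status;
}

/* Cookies skip zero when they wrap, as in the patch. */
static uint32_t next_cookie(uint32_t cookie)
{
	cookie++;
	return cookie ? cookie : 1;
}

/*
 * Try (at most twice) to advance the doorbell cookie atomically.
 * Returns 0 and updates *cookie on success, -1 if the doorbell is
 * disabled or both attempts saw an unexpected cookie.
 */
static int ring_doorbell(_Atomic uint64_t *db, uint32_t *cookie)
{
	uint32_t expect_cookie = *cookie;
	int attempt = 2;

	while (attempt--) {
		uint64_t expected = db_pack(DB_ENABLED, expect_cookie);
		uint64_t desired = db_pack(DB_ENABLED,
					   next_cookie(expect_cookie));

		if (atomic_compare_exchange_strong(db, &expected, desired)) {
			*cookie = next_cookie(expect_cookie);
			return 0;
		}

		/* On failure, 'expected' holds the value actually found. */
		if ((uint32_t)expected != DB_ENABLED)
			return -1;	/* doorbell was lost */

		/* Resynchronise with the cookie that was found and retry. */
		expect_cookie = (uint32_t)(expected >> 32);
	}

	return -1;
}
```

The second attempt is what lets the host recover when its cached cookie has
drifted from the value in the shared cache line.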

A default guc_client is created for the i915 driver to use for
normal-priority in-order submission.
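
In-order submission means work items are appended at the tail of a circular
work queue and consumed by the GuC from the head. The sketch below shows the
space check and tail advance that guc_get_workqueue_space() performs in the
diff further down, without the locking and timeout handling. wq_space() uses
the same arithmetic as the kernel's CIRC_SPACE() for a power-of-two ring;
both function names are hypothetical and exist only in this example.

```c
#include <stdint.h>

/*
 * Free bytes in a power-of-two ring where 'tail' is the producer index
 * and 'head' is the consumer index. One byte is always left unused so
 * that a full ring can be told apart from an empty one.
 */
static uint32_t wq_space(uint32_t tail, uint32_t head, uint32_t size)
{
	return (head - tail - 1) & (size - 1);
}

/*
 * Reserve item_size bytes at the tail. Returns 0 and stores the
 * reserved offset in *offset, or -1 if the ring has too little room.
 */
static int wq_reserve(uint32_t *tail, uint32_t head, uint32_t size,
		      uint32_t item_size, uint32_t *offset)
{
	if (wq_space(*tail, head, size) < item_size)
		return -1;

	*offset = *tail;

	/* Advance the tail for the next item, wrapping at the ring size. */
	*tail = (*tail + item_size) & (size - 1);

	return 0;
}
```

With an 8192-byte ring and 16-byte items, 511 items fit before the consumer
has to advance the head, since one byte is always kept free.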

Note that the created client is not yet ready for use; doorbell
allocation will fail as we haven't yet linked the GuC's context
descriptor to the default contexts for each ring (see later patch).

v2:
    Defer adding structure members until needed [Chris Wilson]
    Rationalise type declarations [Chris Wilson]

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 649 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_guc.h           |  42 ++
 drivers/gpu/drm/i915/intel_guc_loader.c    |  12 +
 3 files changed, 703 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e9d46d6..25d8807 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -27,6 +27,512 @@
 #include "intel_guc.h"
 
 /**
+ * DOC: GuC Client
+ *
+ * i915_guc_client:
+ * We use the term client to avoid confusion with contexts. An i915_guc_client
+ * is equivalent to the GuC object guc_context_desc. This context descriptor is
+ * allocated from a pool of 1024 entries. The kernel driver allocates a
+ * doorbell and a workqueue for it, along with the process descriptor
+ * (guc_process_desc), which is mapped into client space so that the client can
+ * write a Work Item and then ring the doorbell.
+ *
+ * To simplify the implementation, we allocate one gem object that contains all
+ * pages for doorbell, process descriptor and workqueue.
+ *
+ * The Scratch registers:
+ * There are 16 MMIO-based registers starting at 0xC180. The kernel driver
+ * writes a value to the action register (SOFT_SCRATCH_0) along with any data,
+ * then triggers an interrupt on the GuC via another register write (0xC4C8).
+ * The firmware writes a success/fail code back to the action register after
+ * it has processed the request. The kernel driver polls for this update and
+ * then proceeds.
+ * See host2guc_action()
+ *
+ * Doorbells:
+ * Doorbells are interrupts to uKernel. A doorbell is a single cache line (QW)
+ * mapped into process space.
+ *
+ * Work Items:
+ * There are several types of work items that the host may place into a
+ * workqueue, each with its own requirements and limitations. Currently only
+ * WQ_TYPE_INORDER, which represents an in-order queue, is needed to support
+ * legacy submission via the GuC. The kernel driver packs the ring tail pointer
+ * and an ELSP context descriptor dword into each Work Item.
+ * See guc_add_workqueue_item()
+ *
+ */
+
+/*
+ * Read GuC command/status register (SOFT_SCRATCH_0)
+ * Return true if it contains a response rather than a command
+ */
+static inline bool host2guc_action_response(struct drm_i915_private *dev_priv,
+					    u32 *status)
+{
+	u32 val = I915_READ(SOFT_SCRATCH(0));
+	*status = val;
+	return GUC2HOST_IS_RESPONSE(val);
+}
+
+static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	u32 status;
+	int i;
+	int ret;
+
+	if (WARN_ON(len < 1 || len > 15))
+		return -EINVAL;
+
+	spin_lock(&dev_priv->guc.host2guc_lock);
+
+	dev_priv->guc.action_count += 1;
+	dev_priv->guc.action_cmd = data[0];
+
+	for (i = 0; i < len; i++)
+		I915_WRITE(SOFT_SCRATCH(i), data[i]);
+
+	POSTING_READ(SOFT_SCRATCH(i - 1));
+
+	I915_WRITE(HOST2GUC_INTERRUPT, HOST2GUC_TRIGGER);
+
+	ret = wait_for_atomic(host2guc_action_response(dev_priv, &status), 10);
+	if (status != GUC2HOST_STATUS_SUCCESS) {
+		/* either GuC doesn't respond, which is a TIMEOUT,
+		 * or a failure code is returned. */
+		if (ret != -ETIMEDOUT)
+			ret = -EIO;
+
+		DRM_ERROR("GUC: host2guc action 0x%X failed. ret=%d "
+				"status=0x%08X response=0x%08X\n",
+				data[0], ret, status,
+				I915_READ(SOFT_SCRATCH(15)));
+
+		dev_priv->guc.action_fail += 1;
+		dev_priv->guc.action_err = ret;
+	}
+	dev_priv->guc.action_status = status;
+
+	spin_unlock(&dev_priv->guc.host2guc_lock);
+
+	return ret;
+}
+
+/*
+ * Tell the GuC to allocate or deallocate a specific doorbell
+ */
+
+static int host2guc_allocate_doorbell(struct intel_guc *guc,
+				      struct i915_guc_client *client)
+{
+	u32 data[2];
+
+	data[0] = HOST2GUC_ACTION_ALLOCATE_DOORBELL;
+	data[1] = client->ctx_index;
+
+	return host2guc_action(guc, data, 2);
+}
+
+static int host2guc_release_doorbell(struct intel_guc *guc,
+				     struct i915_guc_client *client)
+{
+	u32 data[2];
+
+	data[0] = HOST2GUC_ACTION_DEALLOCATE_DOORBELL;
+	data[1] = client->ctx_index;
+
+	return host2guc_action(guc, data, 2);
+}
+
+/*
+ * Initialise, update, or clear doorbell data shared with the GuC
+ *
+ * These functions modify shared data and so need access to the mapped
+ * client object which contains the page being used for the doorbell
+ */
+
+static void guc_init_doorbell(struct intel_guc *guc,
+			      struct i915_guc_client *client)
+{
+	struct guc_doorbell_info *doorbell;
+	void *base;
+
+	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+	doorbell = base + client->doorbell_offset;
+
+	doorbell->db_status = 1;
+	doorbell->cookie = 0;
+
+	kunmap_atomic(base);
+}
+
+static int guc_ring_doorbell(struct i915_guc_client *gc)
+{
+	struct guc_process_desc *desc;
+	union guc_doorbell_qw db_cmp, db_exc, db_ret;
+	union guc_doorbell_qw *db;
+	void *base;
+	int attempt = 2, ret = -EAGAIN;
+
+	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
+	desc = base + gc->proc_desc_offset;
+
+	/* Update the tail so it is visible to GuC */
+	desc->tail = gc->wq_tail;
+
+	/* current cookie */
+	db_cmp.db_status = GUC_DOORBELL_ENABLED;
+	db_cmp.cookie = gc->cookie;
+
+	/* cookie to be updated */
+	db_exc.db_status = GUC_DOORBELL_ENABLED;
+	db_exc.cookie = gc->cookie + 1;
+	if (db_exc.cookie == 0)
+		db_exc.cookie = 1;
+
+	/* pointer of current doorbell cacheline */
+	db = base + gc->doorbell_offset;
+
+	while (attempt--) {
+		/* lets ring the doorbell */
+		db_ret.value_qw = atomic64_cmpxchg((atomic64_t *)db,
+			db_cmp.value_qw, db_exc.value_qw);
+
+		/* if the exchange was successfully executed */
+		if (db_ret.value_qw == db_cmp.value_qw) {
+			/* db was successfully rung */
+			gc->cookie = db_exc.cookie;
+			ret = 0;
+			break;
+		}
+
+		/* XXX: doorbell was lost and need to acquire it again */
+		if (db_ret.db_status == GUC_DOORBELL_DISABLED)
+			break;
+
+		DRM_ERROR("Cookie mismatch. Expected %d, returned %d\n",
+			  db_cmp.cookie, db_ret.cookie);
+
+		/* update the cookie to newly read cookie from GuC */
+		db_cmp.cookie = db_ret.cookie;
+		db_exc.cookie = db_ret.cookie + 1;
+		if (db_exc.cookie == 0)
+			db_exc.cookie = 1;
+	}
+
+	kunmap_atomic(base);
+	return ret;
+}
+
+static void guc_disable_doorbell(struct intel_guc *guc,
+				 struct i915_guc_client *client)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct guc_doorbell_info *doorbell;
+	void *base;
+	int drbreg = GEN8_DRBREGL(client->doorbell_id);
+	int value;
+
+	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+	doorbell = base + client->doorbell_offset;
+
+	doorbell->db_status = 0;
+
+	kunmap_atomic(base);
+
+	I915_WRITE(drbreg, I915_READ(drbreg) & ~GEN8_DRB_VALID);
+
+	value = I915_READ(drbreg);
+	WARN_ON((value & GEN8_DRB_VALID) != 0);
+
+	I915_WRITE(GEN8_DRBREGU(client->doorbell_id), 0);
+	I915_WRITE(drbreg, 0);
+
+	/* XXX: wait for any interrupts */
+	/* XXX: wait for workqueue to drain */
+}
+
+/*
+ * Select, assign and release doorbell cachelines
+ *
+ * These functions track which doorbell cachelines are in use.
+ * The data they manipulate is protected by the host2guc lock.
+ */
+
+static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
+{
+	const uint32_t cacheline_size = boot_cpu_data.x86_clflush_size;
+	uint32_t offset;
+
+	spin_lock(&guc->host2guc_lock);
+
+	/* Doorbell uses a single cache line within a page */
+	offset = guc->db_cacheline & ~PAGE_MASK;
+
+	/* Moving to next cache line to reduce contention */
+	guc->db_cacheline += cacheline_size;
+
+	spin_unlock(&guc->host2guc_lock);
+
+	return offset;
+}
+
+static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
+{
+	/* The bitmap is split into two halves - high and normal priority. */
+	const uint16_t half = GUC_MAX_DOORBELLS / 2;
+	const uint16_t start = (priority <= GUC_CTX_PRIORITY_HIGH) ? half : 0;
+	const uint16_t end = start + half;
+	uint16_t id;
+
+	spin_lock(&guc->host2guc_lock);
+	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
+	if (id == end)
+		id = GUC_INVALID_DOORBELL_ID;
+	else
+		bitmap_set(guc->doorbell_bitmap, id, 1);
+	spin_unlock(&guc->host2guc_lock);
+
+	return id;
+}
+
+static void release_doorbell(struct intel_guc *guc, uint16_t id)
+{
+	spin_lock(&guc->host2guc_lock);
+	bitmap_clear(guc->doorbell_bitmap, id, 1);
+	spin_unlock(&guc->host2guc_lock);
+}
+
+/*
+ * Initialise the process descriptor shared with the GuC firmware.
+ */
+static void guc_init_proc_desc(struct intel_guc *guc,
+			       struct i915_guc_client *client)
+{
+	struct guc_process_desc *desc;
+	void *base;
+
+	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+	desc = base + client->proc_desc_offset;
+
+	memset(desc, 0, sizeof(*desc));
+
+	/*
+	 * XXX: pDoorbell and WQVBaseAddress are pointers in process address
+	 * space for ring3 clients (set them as in mmap_ioctl) or kernel
+	 * space for kernel clients (map on demand instead? May make debug
+	 * easier to have it mapped).
+	 */
+	desc->wq_base_addr = 0;
+	desc->db_base_addr = 0;
+
+	desc->context_id = client->ctx_index;
+	desc->wq_size_bytes = client->wq_size;
+	desc->wq_status = WQ_STATUS_ACTIVE;
+	desc->priority = client->priority;
+
+	kunmap_atomic(base);
+}
+
+/*
+ * Initialise/clear the context descriptor shared with the GuC firmware.
+ *
+ * This descriptor tells the GuC where (in GGTT space) to find the important
+ * data structures relating to this client (doorbell, process descriptor,
+ * work queue, etc).
+ */
+
+static void guc_init_ctx_desc(struct intel_guc *guc,
+			      struct i915_guc_client *client)
+{
+	struct guc_context_desc desc;
+	struct sg_table *sg;
+
+	memset(&desc, 0, sizeof(desc));
+
+	desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
+	desc.context_id = client->ctx_index;
+	desc.priority = client->priority;
+	desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
+			    (1 << VECS) | (1 << VCS2); /* all engines */
+	desc.db_id = client->doorbell_id;
+
+	/*
+	 * The CPU address is only needed at certain points, so kmap_atomic on
+	 * demand instead of storing it in the ctx descriptor.
+	 * XXX: May make debug easier to have it mapped
+	 */
+	desc.db_trigger_cpu = 0;
+	desc.db_trigger_uk = client->doorbell_offset +
+		i915_gem_obj_ggtt_offset(client->client_obj);
+	desc.db_trigger_phy = client->doorbell_offset +
+		sg_dma_address(client->client_obj->pages->sgl);
+
+	desc.process_desc = client->proc_desc_offset +
+		i915_gem_obj_ggtt_offset(client->client_obj);
+
+	desc.wq_addr = client->wq_offset +
+		i915_gem_obj_ggtt_offset(client->client_obj);
+
+	desc.wq_size = client->wq_size;
+
+	/*
+	 * XXX: Take LRCs from an existing intel_context if this is not an
+	 * IsKMDCreatedContext client
+	 */
+	desc.desc_private = (uintptr_t)client;
+
+	/* Pool context is pinned already */
+	sg = guc->ctx_pool_obj->pages;
+	sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
+			     sizeof(desc) * client->ctx_index);
+}
+
+static void guc_fini_ctx_desc(struct intel_guc *guc,
+			      struct i915_guc_client *client)
+{
+	struct guc_context_desc desc;
+	struct sg_table *sg;
+
+	memset(&desc, 0, sizeof(desc));
+
+	sg = guc->ctx_pool_obj->pages;
+	sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
+			     sizeof(desc) * client->ctx_index);
+}
+
+/* Reserve space for one workqueue item and return its offset via *offset */
+static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+{
+	struct guc_process_desc *desc;
+	void *base;
+	u32 size = sizeof(struct guc_wq_item);
+	int ret = 0, timeout_counter = 200;
+	unsigned long flags;
+
+	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
+	desc = base + gc->proc_desc_offset;
+
+	while (timeout_counter-- > 0) {
+		spin_lock_irqsave(&gc->wq_lock, flags);
+
+		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
+				gc->wq_size) >= size, 1);
+
+		if (!ret) {
+			*offset = gc->wq_tail;
+
+			/* advance the tail for next workqueue item */
+			gc->wq_tail += size;
+			gc->wq_tail &= gc->wq_size - 1;
+
+			/* this will break the loop */
+			timeout_counter = 0;
+		}
+
+		spin_unlock_irqrestore(&gc->wq_lock, flags);
+	}
+
+	kunmap_atomic(base);
+
+	return ret;
+}
+
+static int guc_add_workqueue_item(struct i915_guc_client *gc,
+				  struct drm_i915_gem_request *rq)
+{
+	enum intel_ring_id ring_id = rq->ring->id;
+	struct guc_wq_item *wqi;
+	void *base;
+	u32 tail, wq_len, wq_off = 0;
+	int ret;
+
+	ret = guc_get_workqueue_space(gc, &wq_off);
+	if (ret)
+		return ret;
+
+	/* For now a workqueue item is 4 DWs and the workqueue buffer is 2
+	 * pages, so a wqi never straddles a page boundary or wraps around to
+	 * the beginning. This simplifies the implementation below.
+	 *
+	 * XXX: if that ever changes, we need to save the data to a temp wqi
+	 * and copy it to the workqueue buffer dw by dw.
+	 */
+	WARN_ON(sizeof(struct guc_wq_item) != 16);
+	WARN_ON(wq_off & 3);
+
+	/* wq starts from the page after doorbell / process_desc */
+	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj,
+			(wq_off + GUC_DB_SIZE) >> PAGE_SHIFT));
+	wq_off &= PAGE_SIZE - 1;
+	wqi = (struct guc_wq_item *)((char *)base + wq_off);
+
+	/* len does not include the header */
+	wq_len = sizeof(struct guc_wq_item) / sizeof(u32) - 1;
+	wqi->header = WQ_TYPE_INORDER |
+			(wq_len << WQ_LEN_SHIFT) |
+			(ring_id << WQ_TARGET_SHIFT) |
+			WQ_NO_WCFLUSH_WAIT;
+
+	/* The GuC wants only the low-order word of the context descriptor */
+	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx, rq->ring);
+
+	/* The GuC firmware wants the tail index in QWords, not bytes */
+	tail = rq->ringbuf->tail >> 3;
+	wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
+	wqi->fence_id = 0; /*XXX: what fence to be here */
+
+	kunmap_atomic(base);
+
+	return 0;
+}
+
+/**
+ * i915_guc_submit() - Submit commands through GuC
+ * @client:	the guc client where commands will go through
+ * @rq:		request whose LRC and HW engine (ring) supply the
+ *		commands to be executed
+ *
+ * Return:	0 on success, otherwise a negative error code
+ */
+int i915_guc_submit(struct i915_guc_client *client,
+		    struct drm_i915_gem_request *rq)
+{
+	unsigned long flags;
+	int q_ret, b_ret;
+
+	/* Need this because of the deferred pin ctx and ring */
+	/* Shall we move this right after ring is pinned? */
+	intel_lr_context_update(rq);
+
+	q_ret = guc_add_workqueue_item(client, rq);
+	if (q_ret == 0)
+		b_ret = guc_ring_doorbell(client);
+
+	spin_lock_irqsave(&client->wq_lock, flags);
+	client->submissions += 1;
+	if (q_ret) {
+		client->q_fail += 1;
+		client->retcode = q_ret;
+	} else if (b_ret) {
+		client->b_fail += 1;
+		client->retcode = q_ret = b_ret;
+	} else {
+		client->retcode = 0;
+	}
+	spin_unlock_irqrestore(&client->wq_lock, flags);
+
+	return q_ret;
+}
+
+/*
+ * Everything below here is concerned with setup & teardown, and is
+ * therefore not part of the somewhat time-critical batch-submission
+ * path of i915_guc_submit() above.
+ */
+
+/**
  * gem_allocate_guc_obj() - Allocate gem object for GuC usage
  * @dev:	drm device
  * @size:	size of object
@@ -75,6 +581,121 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
 	drm_gem_object_unreference(&obj->base);
 }
 
+static void guc_client_free(struct drm_device *dev,
+			    struct i915_guc_client *client)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+
+	if (!client)
+		return;
+
+	if (client->doorbell_id != GUC_INVALID_DOORBELL_ID) {
+		/*
+		 * First disable the doorbell, then tell the GuC we've
+		 * finished with it, finally deallocate it in our bitmap
+		 */
+		guc_disable_doorbell(guc, client);
+		host2guc_release_doorbell(guc, client);
+		release_doorbell(guc, client->doorbell_id);
+	}
+
+	/*
+	 * XXX: wait for any outstanding submissions before freeing memory.
+	 * Be sure to drop any locks
+	 */
+
+	gem_release_guc_obj(client->client_obj);
+
+	if (client->ctx_index != GUC_INVALID_CTX_ID) {
+		guc_fini_ctx_desc(guc, client);
+		ida_simple_remove(&guc->ctx_ids, client->ctx_index);
+	}
+
+	kfree(client);
+}
+
+/**
+ * guc_client_alloc() - Allocate an i915_guc_client
+ * @dev:	drm device
+ * @priority:	one of four priority levels: _CRITICAL, _HIGH, _NORMAL, _LOW.
+ *		The kernel client that replaces ExecList submission is created
+ *		with NORMAL priority. A client for the scheduler can use HIGH,
+ *		while a preemption context can use CRITICAL.
+ *
+ * Return:	An i915_guc_client object on success, otherwise NULL.
+ */
+static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
+						uint32_t priority)
+{
+	struct i915_guc_client *client;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+	struct drm_i915_gem_object *obj;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return NULL;
+
+	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
+	client->priority = priority;
+
+	client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
+			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
+	if (client->ctx_index >= GUC_MAX_GPU_CONTEXTS) {
+		client->ctx_index = GUC_INVALID_CTX_ID;
+		goto err;
+	}
+
+	/* The first page is doorbell/proc_desc. The next two pages are wq. */
+	obj = gem_allocate_guc_obj(dev, GUC_DB_SIZE + GUC_WQ_SIZE);
+	if (!obj)
+		goto err;
+
+	client->client_obj = obj;
+	client->wq_offset = GUC_DB_SIZE;
+	client->wq_size = GUC_WQ_SIZE;
+	spin_lock_init(&client->wq_lock);
+
+	client->doorbell_offset = select_doorbell_cacheline(guc);
+
+	/*
+	 * Since the doorbell only requires a single cacheline, we can save
+	 * space by putting the application process descriptor in the same
+	 * page. Use the half of the page that doesn't include the doorbell.
+	 */
+	if (client->doorbell_offset >= (GUC_DB_SIZE / 2))
+		client->proc_desc_offset = 0;
+	else
+		client->proc_desc_offset = (GUC_DB_SIZE / 2);
+
+	client->doorbell_id = assign_doorbell(guc, client->priority);
+	if (client->doorbell_id == GUC_INVALID_DOORBELL_ID)
+		/* XXX: evict a doorbell instead */
+		goto err;
+
+	guc_init_proc_desc(guc, client);
+	guc_init_ctx_desc(guc, client);
+	guc_init_doorbell(guc, client);
+
+	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+	/* XXX: Any cache flushes needed? General domain mgmt calls? */
+
+	if (host2guc_allocate_doorbell(guc, client))
+		goto err;
+
+	DRM_DEBUG_DRIVER("new priority %u client %p: ctx_index %u db_id %u\n",
+		priority, client, client->ctx_index, client->doorbell_id);
+
+	return client;
+
+err:
+	guc_client_free(dev, client);
+	return NULL;
+}
+
 static void guc_create_log(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -138,6 +759,8 @@ int i915_guc_submission_init(struct drm_device *dev)
 	if (!guc->ctx_pool_obj)
 		return -ENOMEM;
 
+	spin_lock_init(&dev_priv->guc.host2guc_lock);
+
 	ida_init(&guc->ctx_ids);
 
 	guc_create_log(guc);
@@ -145,6 +768,32 @@ int i915_guc_submission_init(struct drm_device *dev)
 	return 0;
 }
 
+int i915_guc_submission_enable(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+	struct i915_guc_client *client;
+
+	/* client for execbuf submission */
+	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL);
+	if (!client) {
+		DRM_ERROR("Failed to create execbuf guc_client\n");
+		return -ENOMEM;
+	}
+
+	guc->execbuf_client = client;
+	return 0;
+}
+
+void i915_guc_submission_disable(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+
+	guc_client_free(dev, guc->execbuf_client);
+	guc->execbuf_client = NULL;
+}
+
 void i915_guc_submission_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5b51b05..d249326 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -27,6 +27,30 @@
 #include "intel_guc_fwif.h"
 #include "i915_guc_reg.h"
 
+struct i915_guc_client {
+	struct drm_i915_gem_object *client_obj;
+	uint32_t priority;
+	uint32_t ctx_index;
+
+	uint32_t proc_desc_offset;
+	uint32_t doorbell_offset;
+	uint32_t cookie;
+	uint16_t doorbell_id;
+	uint16_t padding;		/* Maintain alignment		*/
+
+	uint32_t wq_offset;
+	uint32_t wq_size;
+
+	spinlock_t wq_lock;		/* Protects all data below	*/
+	uint32_t wq_tail;
+
+	/* GuC submission statistics & status */
+	uint64_t submissions;
+	uint32_t q_fail;
+	uint32_t b_fail;
+	int retcode;
+};
+
 enum intel_guc_fw_status {
 	GUC_FIRMWARE_FAIL = -1,
 	GUC_FIRMWARE_NONE = 0,
@@ -60,6 +84,20 @@ struct intel_guc {
 
 	struct drm_i915_gem_object *ctx_pool_obj;
 	struct ida ctx_ids;
+
+	struct i915_guc_client *execbuf_client;
+
+	spinlock_t host2guc_lock;	/* Protects all data below	*/
+
+	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
+	int db_cacheline;
+
+	/* Action status & statistics */
+	uint64_t action_count;		/* Total commands issued	*/
+	uint32_t action_cmd;		/* Last command word		*/
+	uint32_t action_status;		/* Last return status		*/
+	uint32_t action_fail;		/* Total number of failures	*/
+	int32_t action_err;		/* Last error code		*/
 };
 
 /* intel_guc_loader.c */
@@ -70,6 +108,10 @@ extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
 
 /* i915_guc_submission.c */
 int i915_guc_submission_init(struct drm_device *dev);
+int i915_guc_submission_enable(struct drm_device *dev);
+int i915_guc_submit(struct i915_guc_client *client,
+		    struct drm_i915_gem_request *rq);
+void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index e5d7136..25ba29f 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -427,6 +427,8 @@ int intel_guc_ucode_load(struct drm_device *dev)
 		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
 		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
 
+	i915_guc_submission_disable(dev);
+
 	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
 		return 0;
 
@@ -479,12 +481,20 @@ int intel_guc_ucode_load(struct drm_device *dev)
 		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
 		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
 
+	if (i915.enable_guc_submission) {
+		err = i915_guc_submission_enable(dev);
+		if (err)
+			goto fail;
+	}
+
 	return 0;
 
 fail:
 	if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
 		guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
 
+	i915_guc_submission_disable(dev);
+
 	DRM_ERROR("Failed to initialize GuC, error %d\n", err);
 
 	return err;
@@ -547,6 +557,8 @@ void intel_guc_ucode_fini(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
 
+	i915_guc_submission_fini(dev);
+
 	if (guc_fw->guc_fw_obj)
 		drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
 	guc_fw->guc_fw_obj = NULL;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 10/13 v4] drm/i915: Interrupt routing for GuC submission
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (8 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 09/13 v4] drm/i915: Implementation of GuC client Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-27 15:33   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission Dave Gordon
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

Turn on interrupt steering to route necessary interrupts to GuC.
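
The routing helpers in this patch build their RING_MODE writes from
masked-register macros. As a rough stand-alone sketch of that encoding,
assuming the usual i915 convention that the upper 16 bits of the written
word select which of the lower 16 bits get updated (the MASKED_* macros,
apply_masked_write() and the ring_mode_to_*() helpers are illustrative
stand-ins, not the driver's own definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the i915_reg.h hunk in this patch. */
#define GFX_INTERRUPT_STEERING		(1u << 14)
#define GFX_FORWARD_VBLANK_MASK		(3u << 5)
#define GFX_FORWARD_VBLANK_NEVER	(0u << 5)
#define GFX_FORWARD_VBLANK_ALWAYS	(1u << 5)

/* Stand-ins (assumed layout): write mask in bits 31:16, value in 15:0. */
#define MASKED_FIELD(mask, value) \
	(((uint32_t)(mask) << 16) | (uint32_t)(value))
#define MASKED_BIT_ENABLE(bit)	MASKED_FIELD((bit), (bit))
#define MASKED_BIT_DISABLE(bit)	MASKED_FIELD((bit), 0)

/* Model of what the hardware is assumed to do with a masked write. */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	/* A value bit without its mask bit would have no effect. */
	assert((value & ~mask) == 0);
	return (reg & ~mask) | value;
}

/* Word written to each ring's mode register to steer towards the GuC. */
static uint32_t ring_mode_to_guc(void)
{
	return MASKED_FIELD(GFX_FORWARD_VBLANK_MASK,
			    GFX_FORWARD_VBLANK_ALWAYS) |
	       MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
}

/* Word written to undo the steering and keep everything on the host. */
static uint32_t ring_mode_to_host(void)
{
	return MASKED_FIELD(GFX_FORWARD_VBLANK_MASK,
			    GFX_FORWARD_VBLANK_NEVER) |
	       MASKED_BIT_DISABLE(GFX_INTERRUPT_STEERING);
}
```

Under that assumption, bits outside the mask (GFX_RUN_LIST_ENABLE, for
instance) are left untouched by either write, which is why the helpers
need no read-modify-write cycle.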

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h         | 11 +++++--
 drivers/gpu/drm/i915/intel_guc_loader.c | 51 +++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 63728c1..1c2072b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1664,12 +1664,18 @@ enum skl_disp_power_wells {
 #define GFX_MODE_GEN7	0x0229c
 #define RING_MODE_GEN7(ring)	((ring)->mmio_base+0x29c)
 #define   GFX_RUN_LIST_ENABLE		(1<<15)
+#define   GFX_INTERRUPT_STEERING	(1<<14)
 #define   GFX_TLB_INVALIDATE_EXPLICIT	(1<<13)
 #define   GFX_SURFACE_FAULT_ENABLE	(1<<12)
 #define   GFX_REPLAY_MODE		(1<<11)
 #define   GFX_PSMI_GRANULARITY		(1<<10)
 #define   GFX_PPGTT_ENABLE		(1<<9)
 
+#define   GFX_FORWARD_VBLANK_MASK	(3<<5)
+#define   GFX_FORWARD_VBLANK_NEVER	(0<<5)
+#define   GFX_FORWARD_VBLANK_ALWAYS	(1<<5)
+#define   GFX_FORWARD_VBLANK_COND	(2<<5)
+
 #define VLV_DISPLAY_BASE 0x180000
 #define VLV_MIPI_BASE VLV_DISPLAY_BASE
 
@@ -5683,11 +5689,12 @@ enum skl_disp_power_wells {
 #define GEN8_GT_IIR(which) (0x44308 + (0x10 * (which)))
 #define GEN8_GT_IER(which) (0x4430c + (0x10 * (which)))
 
-#define GEN8_BCS_IRQ_SHIFT 16
 #define GEN8_RCS_IRQ_SHIFT 0
-#define GEN8_VCS2_IRQ_SHIFT 16
+#define GEN8_BCS_IRQ_SHIFT 16
 #define GEN8_VCS1_IRQ_SHIFT 0
+#define GEN8_VCS2_IRQ_SHIFT 16
 #define GEN8_VECS_IRQ_SHIFT 0
+#define GEN8_WD_IRQ_SHIFT 16
 
 #define GEN8_DE_PIPE_ISR(pipe) (0x44400 + (0x10 * (pipe)))
 #define GEN8_DE_PIPE_IMR(pipe) (0x44404 + (0x10 * (pipe)))
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 25ba29f..2aa9227 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -79,6 +79,53 @@ const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
 	}
 };
 
+static void direct_interrupts_to_host(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *ring;
+	int i, irqs;
+
+	/* tell all command streamers NOT to forward interrupts and vblank to GuC */
+	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_NEVER);
+	irqs |= _MASKED_BIT_DISABLE(GFX_INTERRUPT_STEERING);
+	for_each_ring(ring, dev_priv, i)
+		I915_WRITE(RING_MODE_GEN7(ring), irqs);
+
+	/* tell DE to send nothing to GuC */
+	I915_WRITE(DE_GUCRMR, ~0);
+
+	/* route all GT interrupts to the host */
+	I915_WRITE(GUC_BCS_RCS_IER, 0);
+	I915_WRITE(GUC_VCS2_VCS1_IER, 0);
+	I915_WRITE(GUC_WD_VECS_IER, 0);
+}
+
+static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *ring;
+	int i, irqs;
+
+	/* tell all command streamers to forward interrupts and vblank to GuC */
+	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
+	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
+	for_each_ring(ring, dev_priv, i)
+		I915_WRITE(RING_MODE_GEN7(ring), irqs);
+
+	/* tell DE to send (all) flip_done to GuC */
+	irqs = DERRMR_PIPEA_PRI_FLIP_DONE | DERRMR_PIPEA_SPR_FLIP_DONE |
+	       DERRMR_PIPEB_PRI_FLIP_DONE | DERRMR_PIPEB_SPR_FLIP_DONE |
+	       DERRMR_PIPEC_PRI_FLIP_DONE | DERRMR_PIPEC_SPR_FLIP_DONE;
+	/* Unmasked bits will cause GuC response message to be sent */
+	I915_WRITE(DE_GUCRMR, ~irqs);
+
+	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
+	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
+	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
+	/* These three registers have the same bit definitions */
+	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
+	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
+	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
+}
+
 static u32 get_gttype(struct drm_i915_private *dev_priv)
 {
 	/* XXX: GT type based on PCI device ID? field seems unused by fw */
@@ -427,6 +474,7 @@ int intel_guc_ucode_load(struct drm_device *dev)
 		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
 		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
 
+	direct_interrupts_to_host(dev_priv);
 	i915_guc_submission_disable(dev);
 
 	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
@@ -485,6 +533,7 @@ int intel_guc_ucode_load(struct drm_device *dev)
 		err = i915_guc_submission_enable(dev);
 		if (err)
 			goto fail;
+		direct_interrupts_to_guc(dev_priv);
 	}
 
 	return 0;
@@ -493,6 +542,7 @@ fail:
 	if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
 		guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
 
+	direct_interrupts_to_host(dev_priv);
 	i915_guc_submission_disable(dev);
 
 	DRM_ERROR("Failed to initialize GuC, error %d\n", err);
@@ -557,6 +607,7 @@ void intel_guc_ucode_fini(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
 
+	direct_interrupts_to_host(dev_priv);
 	i915_guc_submission_fini(dev);
 
 	if (guc_fw->guc_fw_obj)
-- 
1.9.1

* [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (9 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 10/13 v4] drm/i915: Interrupt routing for GuC submission Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-27 15:57   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 12/13 v4] drm/i915: Debugfs interface for GuC submission statistics Dave Gordon
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

GuC-based submission is mostly the same as execlist mode, up to
intel_logical_ring_advance_and_submit(), where the context being
dispatched would be added to the execlist queue; at this point
we submit the context to the GuC backend instead.
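
As a simplified, stand-alone model of that branch point (the struct and
function names below are hypothetical stand-ins for the driver's types;
only the "GuC client present or not" test mirrors the patch):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the request and GuC client objects. */
struct request { int seqno; };
struct guc_client { unsigned long submissions; };

struct device_state {
	/* NULL unless GuC submission has been enabled successfully. */
	struct guc_client *execbuf_client;
	unsigned long execlist_queued;
};

enum backend { BACKEND_EXECLIST, BACKEND_GUC };

/*
 * Everything before this point is common to both modes; only the final
 * hand-off differs, depending on whether a GuC client exists.
 */
static enum backend advance_and_submit(struct device_state *dev,
					struct request *rq)
{
	assert(rq != NULL);

	if (dev->execbuf_client) {
		dev->execbuf_client->submissions++;
		return BACKEND_GUC;
	}

	dev->execlist_queued++;
	return BACKEND_EXECLIST;
}
```

Keying the decision on the client pointer rather than on the module
parameter means that a failed or disabled GuC setup falls through to the
execlist queue.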

There are, however, a few other changes also required, notably:
1.  Contexts must be pinned at GGTT addresses accessible by the GuC,
    i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
    PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.

2.  The GuC's TLB must be invalidated after a context is pinned at
    a new GGTT address.

3.  The GuC firmware uses the page immediately before the Ring Context
    as shared data, so the context object gains one extra page at its
    start. Whenever the driver needs the base address of the LRC, it
    must therefore add a one-page offset. LRC_PPHWSP_PN is defined as
    the page number of the LRCA within the context object.

4.  In the work queue used to pass requests to the GuC, the GuC
    firmware requires the ring-tail-offset to be represented as an
    11-bit value, expressed in QWords. Therefore, the ringbuffer
    size must be reduced to the representable range (4 pages).
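
The arithmetic behind items 3 and 4 can be checked with a small
stand-alone sketch. The LRC_*_PN values are the ones this patch adds to
intel_lrc.h; PAGE_SIZE is assumed to be 4 KiB, and the helper names are
made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE	4096u	/* assumed 4 KiB pages */

/* Page numbers within the context object, as defined by this patch. */
#define LRC_GUCSHR_PN	0u
#define LRC_PPHWSP_PN	(LRC_GUCSHR_PN + 1)
#define LRC_STATE_PN	(LRC_PPHWSP_PN + 1)

#define WQ_TAIL_BITS	11u	/* width of the tail field, per item 4 */
#define QWORD_BYTES	8u

/* Item 3: the LRCA sits one page past the start of the context object. */
static uint32_t lrca_from_ggtt_offset(uint32_t ggtt_offset)
{
	uint32_t lrca = ggtt_offset + LRC_PPHWSP_PN * PAGE_SIZE;

	assert((lrca & (PAGE_SIZE - 1)) == 0);	/* must stay 4K aligned */
	return lrca;
}

/* Item 4: largest ring whose every tail offset fits in the tail field. */
static uint32_t max_ring_bytes(void)
{
	return (1u << WQ_TAIL_BITS) * QWORD_BYTES;
}

/* Convert a byte offset into the ring to the QWord units used there. */
static uint32_t tail_to_qwords(uint32_t tail_bytes)
{
	assert(tail_bytes % QWORD_BYTES == 0);
	assert(tail_bytes < max_ring_bytes());
	return tail_bytes / QWORD_BYTES;
}
```

With these numbers, 2^11 QWords * 8 bytes = 16384 bytes, i.e. exactly
4 pages, which matches the new ringbuffer size.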

v2:
    Defer adding #defines until needed [Chris Wilson]
    Rationalise type declarations [Chris Wilson]

v4:
    Squashed kerneldoc patch into here [Daniel Vetter]

Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 Documentation/DocBook/drm.tmpl             | 14 ++++++++
 drivers/gpu/drm/i915/i915_debugfs.c        |  2 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 52 +++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_guc.h           |  1 +
 drivers/gpu/drm/i915/intel_lrc.c           | 51 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_lrc.h           |  6 ++++
 6 files changed, 106 insertions(+), 20 deletions(-)

diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
index 596b11d..0ff5fd7 100644
--- a/Documentation/DocBook/drm.tmpl
+++ b/Documentation/DocBook/drm.tmpl
@@ -4223,6 +4223,20 @@ int num_ioctls;</synopsis>
       </sect2>
     </sect1>
     <sect1>
+      <title>GuC-based Command Submission</title>
+      <sect2>
+        <title>GuC</title>
+!Pdrivers/gpu/drm/i915/intel_guc_loader.c GuC-specific firmware loader
+!Idrivers/gpu/drm/i915/intel_guc_loader.c
+      </sect2>
+      <sect2>
+        <title>GuC Client</title>
+!Pdrivers/gpu/drm/i915/i915_guc_submission.c GuC-based command submissison
+!Idrivers/gpu/drm/i915/i915_guc_submission.c
+      </sect2>
+    </sect1>
+
+    <sect1>
       <title> Tracing </title>
       <para>
     This sections covers all things related to the tracepoints implemented in
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 13e37d1..d93732a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1982,7 +1982,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
 		return;
 	}
 
-	page = i915_gem_object_get_page(ctx_obj, 1);
+	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
 	if (!WARN_ON(page == NULL)) {
 		reg_state = kmap_atomic(page);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 25d8807..c5c9fbf 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -346,18 +346,58 @@ static void guc_init_proc_desc(struct intel_guc *guc,
 static void guc_init_ctx_desc(struct intel_guc *guc,
 			      struct i915_guc_client *client)
 {
+	struct intel_context *ctx = client->owner;
 	struct guc_context_desc desc;
 	struct sg_table *sg;
+	int i;
 
 	memset(&desc, 0, sizeof(desc));
 
 	desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
 	desc.context_id = client->ctx_index;
 	desc.priority = client->priority;
-	desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
-			    (1 << VECS) | (1 << VCS2); /* all engines */
 	desc.db_id = client->doorbell_id;
 
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		struct guc_execlist_context *lrc = &desc.lrc[i];
+		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
+		struct intel_engine_cs *ring;
+		struct drm_i915_gem_object *obj;
+		uint64_t ctx_desc;
+
+		/* TODO: A design issue remains to be solved here. We only know
+		 * which engine the user needs once the first batch arrives,
+		 * but the GuC expects the lrc and ring to be pinned here. That
+		 * is fine for the default context, currently the only owner of
+		 * a GuC client, but any future owner of a GuC client must make
+		 * sure its lrc is pinned before reaching this point.
+		 */
+		obj = ctx->engine[i].state;
+		if (!obj)
+			break;
+
+		ring = ringbuf->ring;
+		ctx_desc = intel_lr_context_descriptor(ctx, ring);
+		lrc->context_desc = (u32)ctx_desc;
+
+		/* The state page is after PPHWSP */
+		lrc->ring_lcra = i915_gem_obj_ggtt_offset(obj) +
+				LRC_STATE_PN * PAGE_SIZE;
+		lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
+				(ring->id << GUC_ELC_ENGINE_OFFSET);
+
+		obj = ringbuf->obj;
+
+		lrc->ring_begin = i915_gem_obj_ggtt_offset(obj);
+		lrc->ring_end = lrc->ring_begin + obj->base.size - 1;
+		lrc->ring_next_free_location = lrc->ring_begin;
+		lrc->ring_current_tail_pointer_value = 0;
+
+		desc.engines_used |= (1 << ring->id);
+	}
+
+	WARN_ON(desc.engines_used == 0);
+
 	/*
 	 * The CPU address is only needed at certain points, so kmap_atomic on
 	 * demand instead of storing it in the ctx descriptor.
@@ -622,11 +662,13 @@ static void guc_client_free(struct drm_device *dev,
  * 		The kernel client to replace ExecList submission is created with
  * 		NORMAL priority. Priority of a client for scheduler can be HIGH,
  * 		while a preemption context can use CRITICAL.
+ * @ctx		the context to own the client (we use the default render context)
  *
  * Return:	An i915_guc_client object if success.
  */
 static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
-						uint32_t priority)
+						uint32_t priority,
+						struct intel_context *ctx)
 {
 	struct i915_guc_client *client;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -639,6 +681,7 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 
 	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
 	client->priority = priority;
+	client->owner = ctx;
 
 	client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
 			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
@@ -772,10 +815,11 @@ int i915_guc_submission_enable(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_guc *guc = &dev_priv->guc;
+	struct intel_context *ctx = dev_priv->ring[RCS].default_context;
 	struct i915_guc_client *client;
 
 	/* client for execbuf submission */
-	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL);
+	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL, ctx);
 	if (!client) {
 		DRM_ERROR("Failed to create execbuf guc_client\n");
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index d249326..9571b56 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -29,6 +29,7 @@
 
 struct i915_guc_client {
 	struct drm_i915_gem_object *client_obj;
+	struct intel_context *owner;
 	uint32_t priority;
 	uint32_t ctx_index;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9e121d3..8294462 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -254,7 +254,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
  */
 u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
 {
-	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
+	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
+			LRC_PPHWSP_PN * PAGE_SIZE;
 
 	/* LRCA is required to be 4K aligned so the more significant 20 bits
 	 * are globally unique */
@@ -267,7 +268,8 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
 	uint64_t desc;
-	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
+	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
+			LRC_PPHWSP_PN * PAGE_SIZE;
 
 	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
 
@@ -342,7 +344,7 @@ void intel_lr_context_update(struct drm_i915_gem_request *rq)
 	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
 	WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
 
-	page = i915_gem_object_get_page(ctx_obj, 1);
+	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
 	reg_state = kmap_atomic(page);
 
 	reg_state[CTX_RING_TAIL+1] = rq->tail;
@@ -687,13 +689,17 @@ static void
 intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 {
 	struct intel_engine_cs *ring = request->ring;
+	struct drm_i915_private *dev_priv = request->i915;
 
 	intel_logical_ring_advance(request->ringbuf);
 
 	if (intel_ring_stopped(ring))
 		return;
 
-	execlists_context_queue(request);
+	if (dev_priv->guc.execbuf_client)
+		i915_guc_submit(dev_priv->guc.execbuf_client, request);
+	else
+		execlists_context_queue(request);
 }
 
 static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
@@ -984,6 +990,7 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
 
 static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
 {
+	struct drm_i915_private *dev_priv = rq->i915;
 	struct intel_engine_cs *ring = rq->ring;
 	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
 	struct intel_ringbuffer *ringbuf = rq->ringbuf;
@@ -991,14 +998,18 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
 
 	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
 	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
-		ret = i915_gem_obj_ggtt_pin(ctx_obj,
-				GEN8_LR_CONTEXT_ALIGN, 0);
+		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
+				PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE);
 		if (ret)
 			goto reset_pin_count;
 
 		ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
 		if (ret)
 			goto unpin_ctx_obj;
+
+		/* Invalidate GuC TLB. */
+		if (i915.enable_guc_submission)
+			I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
 	}
 
 	return ret;
@@ -1668,8 +1679,13 @@ out:
 
 static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
+	struct drm_i915_private *dev_priv = req->i915;
 	int ret;
 
+	/* Invalidate GuC TLB. */
+	if (i915.enable_guc_submission)
+		I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
 	ret = intel_logical_ring_workarounds_emit(req);
 	if (ret)
 		return ret;
@@ -2026,7 +2042,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
 
 	/* The second page of the context object contains some fields which must
 	 * be set up prior to the first execution. */
-	page = i915_gem_object_get_page(ctx_obj, 1);
+	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
 	reg_state = kmap_atomic(page);
 
 	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
@@ -2185,12 +2201,13 @@ static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
 		struct drm_i915_gem_object *default_ctx_obj)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct page *page;
 
-	/* The status page is offset 0 from the default context object
-	 * in LRC mode. */
-	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj);
-	ring->status_page.page_addr =
-			kmap(sg_page(default_ctx_obj->pages->sgl));
+	/* The HWSP is part of the default context object in LRC mode. */
+	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj)
+			+ LRC_PPHWSP_PN * PAGE_SIZE;
+	page = i915_gem_object_get_page(default_ctx_obj, LRC_PPHWSP_PN);
+	ring->status_page.page_addr = kmap(page);
 	ring->status_page.obj = default_ctx_obj;
 
 	I915_WRITE(RING_HWS_PGA(ring->mmio_base),
@@ -2226,6 +2243,9 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 
 	context_size = round_up(get_lr_context_size(ring), 4096);
 
+	/* One extra page for data shared between the driver and the GuC */
+	context_size += PAGE_SIZE * LRC_PPHWSP_PN;
+
 	ctx_obj = i915_gem_alloc_object(dev, context_size);
 	if (!ctx_obj) {
 		DRM_DEBUG_DRIVER("Alloc LRC backing obj failed.\n");
@@ -2233,7 +2253,8 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 	}
 
 	if (is_global_default_ctx) {
-		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
+		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
+				PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE);
 		if (ret) {
 			DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n",
 					ret);
@@ -2252,7 +2273,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
 
 	ringbuf->ring = ring;
 
-	ringbuf->size = 32 * PAGE_SIZE;
+	ringbuf->size = 4 * PAGE_SIZE;
 	ringbuf->effective_size = ringbuf->size;
 	ringbuf->head = 0;
 	ringbuf->tail = 0;
@@ -2352,7 +2373,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			WARN(1, "Failed get_pages for context obj\n");
 			continue;
 		}
-		page = i915_gem_object_get_page(ctx_obj, 1);
+		page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
 		reg_state = kmap_atomic(page);
 
 		reg_state[CTX_RING_HEAD+1] = 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 6ecc0b3..e04b5c2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -67,6 +67,12 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
 }
 
 /* Logical Ring Contexts */
+
+/* One extra page is added before LRC for GuC as shared data */
+#define LRC_GUCSHR_PN	(0)
+#define LRC_PPHWSP_PN	(LRC_GUCSHR_PN + 1)
+#define LRC_STATE_PN	(LRC_PPHWSP_PN + 1)
+
 void intel_lr_context_free(struct intel_context *ctx);
 int intel_lr_context_deferred_create(struct intel_context *ctx,
 				     struct intel_engine_cs *ring);
-- 
1.9.1

* [PATCH 12/13 v4] drm/i915: Debugfs interface for GuC submission statistics
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (10 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-27 15:36   ` O'Rourke, Tom
  2015-07-09 18:29 ` [PATCH 13/13 v4] drm/i915: Enable GuC submission, where supported Dave Gordon
  2015-07-18  0:45 ` [PATCH 00/13 v4] Batch submission via GuC O'Rourke, Tom
  13 siblings, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

This provides a means of reading status and counts relating
to GuC actions and submissions.
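
The debugfs handler in the diff below copies the counters while holding
the lock and only formats them afterwards. A user-space sketch of that
"snapshot, then dump at leisure" pattern, assuming a pthread mutex in
place of the kernel spinlock and hypothetical struct and function names:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical subset of the counters kept for GuC actions. */
struct guc_stats {
	uint64_t action_count;
	uint32_t action_fail;
	int32_t action_err;
};

struct guc_state {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	struct guc_stats stats;
};

/* Hold the lock only long enough to copy the counters. */
static struct guc_stats guc_stats_snapshot(struct guc_state *guc)
{
	struct guc_stats copy;

	pthread_mutex_lock(&guc->lock);
	copy = guc->stats;
	pthread_mutex_unlock(&guc->lock);

	return copy;
}

/* Formatting works on the private copy, so it needs no locking. */
static int guc_stats_format(char *buf, size_t len,
			    const struct guc_stats *s)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len,
			"GuC total action count: %llu\n"
			"GuC action failure count: %u\n"
			"GuC last action error code: %d\n",
			(unsigned long long)s->action_count,
			(unsigned int)s->action_fail,
			(int)s->action_err);
}
```

The copy may be slightly stale by the time it is printed, but every
field in it comes from the same instant.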

v2:
    Remove surplus blank line in output [Chris Wilson]

v4:
    Rebased

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 40 +++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d93732a..cebb93c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2397,6 +2397,45 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_guc_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc guc;
+	struct i915_guc_client client = { .client_obj = NULL };
+
+	if (!HAS_GUC_SCHED(dev_priv->dev))
+		return 0;
+
+	/* Take a local copy of the GuC data, so we can dump it at leisure */
+	spin_lock(&dev_priv->guc.host2guc_lock);
+	guc = dev_priv->guc;
+	if (guc.execbuf_client) {
+		spin_lock(&guc.execbuf_client->wq_lock);
+		client = *guc.execbuf_client;
+		spin_unlock(&guc.execbuf_client->wq_lock);
+	}
+	spin_unlock(&dev_priv->guc.host2guc_lock);
+
+	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
+	seq_printf(m, "GuC last action command: 0x%x\n", guc.action_cmd);
+	seq_printf(m, "GuC last action status: 0x%x\n", guc.action_status);
+
+	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
+	seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
+
+	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
+	seq_printf(m, "\tTotal submissions: %llu\n", client.submissions);
+	seq_printf(m, "\tFailed to queue: %u\n", client.q_fail);
+	seq_printf(m, "\tFailed doorbell: %u\n", client.b_fail);
+	seq_printf(m, "\tLast submission result: %d\n", client.retcode);
+
+	/* Add more as required ... */
+
+	return 0;
+}
+
 static int i915_guc_log_dump(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = m->private;
@@ -5139,6 +5178,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
 	{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
 	{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
+	{"i915_guc_info", i915_guc_info, 0},
 	{"i915_guc_load_status", i915_guc_load_status_info, 0},
 	{"i915_guc_log_dump", i915_guc_log_dump, 0},
 	{"i915_frequency_info", i915_frequency_info, 0},
-- 
1.9.1

* [PATCH 13/13 v4] drm/i915: Enable GuC submission, where supported
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (11 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 12/13 v4] drm/i915: Debugfs interface for GuC submission statistics Dave Gordon
@ 2015-07-09 18:29 ` Dave Gordon
  2015-07-18  0:45 ` [PATCH 00/13 v4] Batch submission via GuC O'Rourke, Tom
  13 siblings, 0 replies; 42+ messages in thread
From: Dave Gordon @ 2015-07-09 18:29 UTC (permalink / raw)
  To: intel-gfx

v4:
    Rebased

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 2791b5a..8bb1719 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -53,7 +53,7 @@ struct i915_params i915 __read_mostly = {
 	.verbose_state_checks = 1,
 	.nuclear_pageflip = 0,
 	.edp_vswing = 0,
-	.enable_guc_submission = false,
+	.enable_guc_submission = true,
 	.guc_log_level = -1,
 };
 
@@ -190,7 +190,7 @@ MODULE_PARM_DESC(edp_vswing,
 		 "2=default swing(400mV))");
 
 module_param_named_unsafe(enable_guc_submission, i915.enable_guc_submission, bool, 0400);
-MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission (default:false)");
+MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission (default:true)");
 
 module_param_named(guc_log_level, i915.guc_log_level, int, 0400);
 MODULE_PARM_DESC(guc_log_level,
-- 
1.9.1

* Re: [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader
  2015-07-09 18:29 ` [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader Dave Gordon
@ 2015-07-13 15:35   ` Daniel Vetter
  2015-07-18  0:35   ` O'Rourke, Tom
  1 sibling, 0 replies; 42+ messages in thread
From: Daniel Vetter @ 2015-07-13 15:35 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:05PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> This fetches the required firmware image from the filesystem,
> then loads it into the GuC's memory via a dedicated DMA engine.
> 
> This patch is derived from GuC loading work originally done by
> Vinit Azad and Ben Widawsky.
> 
> v2:
>     Various improvements per review comments by Chris Wilson
> 
> v3:
>     Removed 'wait' parameter to intel_guc_ucode_load() as firmware
>         prefetch is no longer supported in the common firmware loader,
>         per Daniel Vetter's request.
>     Firmware checker callback fn now returns errno rather than bool.
> 
> v4:
>     Squash uC-independent code into GuC-specific loader [Daniel Vetter]
>     Don't keep the driver working (by falling back to execlist mode)
>         if GuC firmware loading fails [Daniel Vetter]
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>

I think this is it, one comment below.

> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index dbbb649..e020309 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5074,6 +5074,19 @@ i915_gem_init_hw(struct drm_device *dev)
>  			goto out;
>  	}
>  
> +	/* We can't enable contexts until all firmware is loaded */
> +	ret = intel_guc_ucode_load(dev);

To stay in line with the current flow I think the request_firmware +
create fw bo object code should be moved into gem_init so that gem_init_hw
is only responsible for loading the fw in the bo into hw.

That will take care of not trying to re-request the firmware from
userspace in resume/gpu reset code, which should simplify the status
handling a lot.
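
A rough sketch of the split suggested above, purely for illustration:
all names here (fw_state, fw_init(), fw_init_hw()) are hypothetical,
and the bodies only count calls where the real code would call
request_firmware() or program the hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical status values, not the driver's enum. */
enum fw_status { FW_NONE, FW_FETCHED, FW_LOADED, FW_FAILED };

struct fw_state {
	enum fw_status status;
	int fetch_calls;	/* times the image was requested */
	int load_calls;		/* times it was loaded into hardware */
};

/* One-time setup: would fetch the image and create the backing object. */
static int fw_init(struct fw_state *fw, bool image_available)
{
	fw->fetch_calls++;
	fw->status = image_available ? FW_FETCHED : FW_FAILED;
	return image_available ? 0 : -ENOENT;
}

/* Per-hardware-init step: runs at load, resume and GPU reset. */
static int fw_init_hw(struct fw_state *fw)
{
	if (fw->status == FW_NONE)
		return 0;		/* no firmware wanted */
	if (fw->status == FW_FAILED)
		return -EIO;		/* caller only sees -EIO or success */

	assert(fw->status == FW_FETCHED || fw->status == FW_LOADED);
	fw->load_calls++;
	fw->status = FW_LOADED;
	return 0;
}
```

With this shape, resume and reset paths call only fw_init_hw(), so the
image is never re-requested from userspace.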

Also, with just declaring the gpu wedged, we could instead move the check
below into the init_hw part of the guc load process. That would nicely
encapsulate that decision and, I'd expect, take care of the other status
codes - in the end callers really only care about -EIO or not here.

But imo we can polish that after merging. All my other higher-level
concerns with this have been addressed, so I think this is good to go in
after detailed code review.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader
  2015-07-09 18:29 ` [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader Dave Gordon
  2015-07-13 15:35   ` Daniel Vetter
@ 2015-07-18  0:35   ` O'Rourke, Tom
  2015-07-20 16:18     ` Yu Dai
  1 sibling, 1 reply; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-18  0:35 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:05PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> This fetches the required firmware image from the filesystem,
> then loads it into the GuC's memory via a dedicated DMA engine.
> 
> This patch is derived from GuC loading work originally done by
> Vinit Azad and Ben Widawsky.
> 
> v2:
>     Various improvements per review comments by Chris Wilson
> 
> v3:
>     Removed 'wait' parameter to intel_guc_ucode_load() as firmware
>         prefetch is no longer supported in the common firmware loader,
> 	per Daniel Vetter's request.
>     Firmware checker callback fn now returns errno rather than bool.
> 
> v4:
>     Squash uC-independent code into GuC-specific loader [Daniel Vetter]
>     Don't keep the driver working (by falling back to execlist mode)
>         if GuC firmware loading fails [Daniel Vetter]
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile           |   3 +
>  drivers/gpu/drm/i915/i915_dma.c         |   4 +
>  drivers/gpu/drm/i915/i915_drv.h         |  11 +
>  drivers/gpu/drm/i915/i915_gem.c         |  13 +
>  drivers/gpu/drm/i915/i915_reg.h         |   4 +-
>  drivers/gpu/drm/i915/intel_guc.h        |  67 ++++
>  drivers/gpu/drm/i915/intel_guc_loader.c | 536 ++++++++++++++++++++++++++++++++
>  7 files changed, 637 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/i915/intel_guc.h
>  create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index de21965..e604cfe 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -39,6 +39,9 @@ i915-y += i915_cmd_parser.o \
>  	  intel_ringbuffer.o \
>  	  intel_uncore.o
>  
> +# general-purpose microcontroller (GuC) support
> +i915-y += intel_guc_loader.o
> +
>  # autogenerated null render state
>  i915-y += intel_renderstate_gen6.o \
>  	  intel_renderstate_gen7.o \
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 066c34c..958ab4f 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -472,6 +472,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>  
>  cleanup_gem:
>  	mutex_lock(&dev->struct_mutex);
> +	intel_guc_ucode_fini(dev);
>  	i915_gem_cleanup_ringbuffer(dev);
>  	i915_gem_context_fini(dev);
>  	mutex_unlock(&dev->struct_mutex);
> @@ -869,6 +870,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  
>  	intel_uncore_init(dev);
>  
> +	intel_guc_ucode_init(dev);
> +
>  	/* Load CSR Firmware for SKL */
>  	intel_csr_ucode_init(dev);
>  
> @@ -1120,6 +1123,7 @@ int i915_driver_unload(struct drm_device *dev)
>  	flush_workqueue(dev_priv->wq);
>  
>  	mutex_lock(&dev->struct_mutex);
> +	intel_guc_ucode_fini(dev);
>  	i915_gem_cleanup_ringbuffer(dev);
>  	i915_gem_context_fini(dev);
>  	mutex_unlock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4a512da..15b9202 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -50,6 +50,7 @@
>  #include <linux/intel-iommu.h>
>  #include <linux/kref.h>
>  #include <linux/pm_qos.h>
> +#include "intel_guc.h"
>  
>  /* General customization:
>   */
> @@ -1694,6 +1695,8 @@ struct drm_i915_private {
>  
>  	struct i915_virtual_gpu vgpu;
>  
> +	struct intel_guc guc;
> +
>  	struct intel_csr csr;
>  
>  	/* Display CSR-related protection */
> @@ -1938,6 +1941,11 @@ static inline struct drm_i915_private *dev_to_i915(struct device *dev)
>  	return to_i915(dev_get_drvdata(dev));
>  }
>  
> +static inline struct drm_i915_private *guc_to_i915(struct intel_guc *guc)
> +{
> +	return container_of(guc, struct drm_i915_private, guc);
> +}
> +
>  /* Iterate over initialised rings */
>  #define for_each_ring(ring__, dev_priv__, i__) \
>  	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> @@ -2543,6 +2551,9 @@ struct drm_i915_cmd_table {
>  
>  #define HAS_CSR(dev)	(IS_SKYLAKE(dev))
>  
> +#define HAS_GUC_UCODE(dev)	(IS_GEN9(dev))
> +#define HAS_GUC_SCHED(dev)	(IS_GEN9(dev))
> +
>  #define HAS_RESOURCE_STREAMER(dev) (IS_HASWELL(dev) || \
>  				    INTEL_INFO(dev)->gen >= 8)
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index dbbb649..e020309 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5074,6 +5074,19 @@ i915_gem_init_hw(struct drm_device *dev)
>  			goto out;
>  	}
>  
> +	/* We can't enable contexts until all firmware is loaded */
> +	ret = intel_guc_ucode_load(dev);
> +
> +	/*
> +	 * If we got an error and GuC submission is enabled, map
> +	 * the error to -EIO so the GPU will be declared wedged.
> +	 * OTOH, if we didn't intend to use the GuC anyway, just
> +	 * discard the error and carry on.
> +	 */
> +	ret = ret && i915.enable_guc_submission ? -EIO : 0;
> +	if (ret)
> +		goto out;
> +
>  	/* Now it is safe to go back round and do everything else: */
>  	for_each_ring(ring, dev_priv, i) {
>  		struct drm_i915_gem_request *req;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 2a29bcc..63728c1 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6843,7 +6843,9 @@ enum skl_disp_power_wells {
>  #define   GEN9_PGCTL_SSB_EU311_ACK	(1 << 14)
>  
>  #define GEN7_MISCCPCTL			(0x9424)
> -#define   GEN7_DOP_CLOCK_GATE_ENABLE	(1<<0)
> +#define   GEN7_DOP_CLOCK_GATE_ENABLE		(1<<0)
> +#define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1<<2)
> +#define   GEN8_DOP_CLOCK_GATE_GUC_ENABLE	(1<<4)
>  
>  /* IVYBRIDGE DPF */
>  #define GEN7_L3CDERRST1			0xB008 /* L3CD Error Status 1 */
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> new file mode 100644
> index 0000000..2846b6d
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -0,0 +1,67 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +#ifndef _INTEL_GUC_H_
> +#define _INTEL_GUC_H_
> +
> +#include "intel_guc_fwif.h"
> +#include "i915_guc_reg.h"
> +
> +enum intel_guc_fw_status {
> +	GUC_FIRMWARE_FAIL = -1,
> +	GUC_FIRMWARE_NONE = 0,
> +	GUC_FIRMWARE_PENDING,
> +	GUC_FIRMWARE_SUCCESS
> +};
> +
> +/*
> + * This structure encapsulates all the data needed during the process
> + * of fetching, caching, and loading the firmware image into the GuC.
> + */
> +struct intel_guc_fw {
> +	struct drm_device *		guc_dev;
> +	const char *			guc_fw_path;
> +	size_t				guc_fw_size;
> +	struct drm_i915_gem_object *	guc_fw_obj;
> +	enum intel_guc_fw_status	guc_fw_fetch_status;
> +	enum intel_guc_fw_status	guc_fw_load_status;
> +
> +	uint16_t			guc_fw_major_wanted;
> +	uint16_t			guc_fw_minor_wanted;
> +	uint16_t			guc_fw_major_found;
> +	uint16_t			guc_fw_minor_found;
> +};
> +
> +struct intel_guc {
> +	struct intel_guc_fw guc_fw;
> +
> +	uint32_t log_flags;
> +};
> +
> +/* intel_guc_loader.c */
> +extern void intel_guc_ucode_init(struct drm_device *dev);
> +extern int intel_guc_ucode_load(struct drm_device *dev);
> +extern void intel_guc_ucode_fini(struct drm_device *dev);
> +extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
> +
> +#endif
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> new file mode 100644
> index 0000000..2080bca
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -0,0 +1,536 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *    Vinit Azad <vinit.azad@intel.com>
> + *    Ben Widawsky <ben@bwidawsk.net>
> + *    Dave Gordon <david.s.gordon@intel.com>
> + *    Alex Dai <yu.dai@intel.com>
> + */
> +#include <linux/firmware.h>
> +#include "i915_drv.h"
> +#include "intel_guc.h"
> +
> +/**
> + * DOC: GuC
> + *
> + * intel_guc:
> + * Top level structure of guc. It handles firmware loading and manages client
> + * pool and doorbells. intel_guc owns an i915_guc_client to replace the legacy
> + * ExecList submission.
> + *
> + * Firmware versioning:
> + * The firmware build process will generate a version header file with major and
> + * minor version defined. The versions are built into CSS header of firmware.
> + * The i915 kernel driver sets the minimum firmware version required per platform.
> + * The firmware installation package will install (symbolic link) proper version
> + * of firmware.
> + *
> + * GuC address space:
> + * GuC does not allow any gfx GGTT address that falls into range [0, WOPCM_TOP),
> + * which is reserved for Boot ROM, SRAM and WOPCM. Currently this top address is
> + * 512K. In order to exclude 0-512K address space from GGTT, all gfx objects
> + * used by GuC are pinned with PIN_OFFSET_BIAS along with size of WOPCM.
> + *
> + * Firmware log:
> + * Firmware log is enabled by setting i915.guc_log_level to a non-negative level.
> + * Log data is printed out by reading debugfs i915_guc_log_dump. Reading from
> + * i915_guc_load_status will print out firmware loading status and scratch
> + * register values.
> + *
> + */
> +
> +#define I915_SKL_GUC_UCODE "i915/skl_guc_ver3.bin"
> +MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
> +
> +/* User-friendly representation of an enum */
> +const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
> +{
> +	switch (status) {
> +	case GUC_FIRMWARE_FAIL:
> +		return "FAIL";
> +	case GUC_FIRMWARE_NONE:
> +		return "NONE";
> +	case GUC_FIRMWARE_PENDING:
> +		return "PENDING";
> +	case GUC_FIRMWARE_SUCCESS:
> +		return "SUCCESS";
> +	default:
> +		return "UNKNOWN!";
> +	}
> +}
> +
> +static u32 get_gttype(struct drm_i915_private *dev_priv)
> +{
> +	/* XXX: GT type based on PCI device ID? field seems unused by fw */
> +	return 0;
> +}
> +
> +static u32 get_core_family(struct drm_i915_private *dev_priv)
> +{
> +	switch (INTEL_INFO(dev_priv)->gen) {
> +	case 8:
> +		return GFXCORE_FAMILY_GEN8;
[TOR:] Should the Gen 8 case be included here if only Gen 9 is supported?

> +	case 9:
> +		return GFXCORE_FAMILY_GEN9;
> +	default:
> +		DRM_ERROR("GUC: unknown gen for scheduler init\n");
> +		return GFXCORE_FAMILY_FORCE_ULONG;
> +	}
> +}
> +
> +static void set_guc_init_params(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_guc *guc = &dev_priv->guc;
> +	u32 params[GUC_CTL_MAX_DWORDS];
> +	int i;
> +
> +	memset(&params, 0, sizeof(params));
> +
> +	params[GUC_CTL_DEVICE_INFO] |=
> +		(get_gttype(dev_priv) << GUC_CTL_GTTYPE_SHIFT) |
> +		(get_core_family(dev_priv) << GUC_CTL_COREFAMILY_SHIFT);
> +
> +	/* GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
> +	 * second. This ARAT value is calculated by:
> +	 * Scheduler-Quantum-in-ns / ARAT-increment-in-ns = 1000000000 / 10
> +	 */
> +	params[GUC_CTL_ARAT_HIGH] = 0;
> +	params[GUC_CTL_ARAT_LOW] = 100000000;
> +
> +	params[GUC_CTL_WA] |= GUC_CTL_WA_UK_BY_DRIVER;
> +
> +	params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
> +			GUC_CTL_VCS2_ENABLED;
> +
> +	if (i915.guc_log_level >= 0) {
> +		params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
> +		params[GUC_CTL_DEBUG] =
> +			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
> +	}
> +
> +	I915_WRITE(SOFT_SCRATCH(0), 0);
> +
> +	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
> +		I915_WRITE(SOFT_SCRATCH(1 + i), params[i]);
> +}
> +
> +/*
> + * Read the GuC status register (GUC_STATUS) and store it in the
> + * specified location; then return a boolean indicating whether
> + * the value matches either of two values representing completion
> + * of the GuC boot process.
> + *
> + * This is used for polling the GuC status in a wait_for_atomic()
> + * loop below.
> + */
> +static inline bool guc_ucode_response(struct drm_i915_private *dev_priv,
> +				      u32 *status)
> +{
> +	u32 val = I915_READ(GUC_STATUS);
> +	*status = val;
> +	return ((val & GS_UKERNEL_MASK) == GS_UKERNEL_READY ||
> +		(val & GS_UKERNEL_MASK) == GS_UKERNEL_LAPIC_DONE);
> +}
> +
> +/*
> + * Transfer the firmware image to RAM for execution by the microcontroller.
> + *
> + * GuC Firmware layout:
> + * +-------------------------------+  ----
> + * |          CSS header           |  128B
> + * | contains major/minor version  |
> + * +-------------------------------+  ----
> + * |             uCode             |
> + * +-------------------------------+  ----
> + * |         RSA signature         |  256B
> + * +-------------------------------+  ----
> + * |         RSA public Key        |  256B
> + * +-------------------------------+  ----
> + * |       Public key modulus      |    4B
> + * +-------------------------------+  ----
> + *
> + * Architecturally, the DMA engine is bidirectional, and can potentially even
> + * transfer between GTT locations. This functionality is left out of the API
> + * for now as there is no need for it.
> + *
> + * Note that GuC needs the CSS header plus uKernel code to be copied by the
> + * DMA engine in one operation, whereas the RSA signature is loaded via MMIO.
> + */
> +
> +#define UOS_CSS_HEADER_OFFSET		0
> +#define UOS_VER_MINOR_OFFSET		0x44
> +#define UOS_VER_MAJOR_OFFSET		0x46
> +#define UOS_CSS_HEADER_SIZE		0x80
> +#define UOS_RSA_SIG_SIZE		0x100
> +#define UOS_CSS_SIGNING_SIZE		0x204
> +
> +static int guc_ucode_xfer_dma(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> +	struct drm_i915_gem_object *fw_obj = guc_fw->guc_fw_obj;
> +	unsigned long offset;
> +	struct sg_table *sg = fw_obj->pages;
> +	u32 status, ucode_size, rsa[UOS_RSA_SIG_SIZE / sizeof(u32)];
> +	int i, ret = 0;
> +
> +	/* uCode size, also is where RSA signature starts */
> +	offset = ucode_size = guc_fw->guc_fw_size - UOS_CSS_SIGNING_SIZE;
> +
> +	/* Copy RSA signature from the fw image to HW for verification */
> +	sg_pcopy_to_buffer(sg->sgl, sg->nents, rsa, UOS_RSA_SIG_SIZE, offset);
> +	for (i = 0; i < UOS_RSA_SIG_SIZE / sizeof(u32); i++)
> +		I915_WRITE(UOS_RSA_SCRATCH_0 + i * sizeof(u32), rsa[i]);
> +
> +	/* Set the source address for the new blob */
> +	offset = i915_gem_obj_ggtt_offset(fw_obj);
> +	I915_WRITE(DMA_ADDR_0_LOW, lower_32_bits(offset));
> +	I915_WRITE(DMA_ADDR_0_HIGH, upper_32_bits(offset) & 0xFFFF);
> +
> +	/* Set the destination. Current uCode expects an 8k stack starting from
> +	 * offset 0. */
> +	I915_WRITE(DMA_ADDR_1_LOW, 0x2000);
> +
> +	/* XXX: The image is automatically transferred to SRAM after the RSA
> +	 * verification. This is why the address space is chosen as such. */
> +	I915_WRITE(DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM);
> +
> +	I915_WRITE(DMA_COPY_SIZE, ucode_size);
> +
> +	/* Finally start the DMA */
> +	I915_WRITE(DMA_CTRL, _MASKED_BIT_ENABLE(UOS_MOVE | START_DMA));
> +
> +	/*
> +	 * Spin-wait for the DMA to complete & the GuC to start up.
> +	 * NB: Docs recommend not using the interrupt for completion.
> +	 * Measurements indicate this should take no more than 20ms, so a
> +	 * timeout here indicates that the GuC has failed and is unusable.
> +	 * (The caller maps the failure to -EIO, so the GPU is declared
> +	 * wedged, if GuC submission is enabled.)
> +	 */
> +	ret = wait_for_atomic(guc_ucode_response(dev_priv, &status), 100);
> +
> +	DRM_DEBUG_DRIVER("DMA status 0x%x, GuC status 0x%x\n",
> +			I915_READ(DMA_CTRL), status);
> +
> +	if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
> +		DRM_ERROR("GuC firmware signature verification failed\n");
> +		ret = -ENOEXEC;
> +	}
> +
> +	DRM_DEBUG_DRIVER("returning %d\n", ret);
> +
> +	return ret;
> +}
> +
> +/*
> + * Load the GuC firmware blob into the MinuteIA.
> + */
> +static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> +	struct drm_device *dev = dev_priv->dev;
> +	int ret;
> +
> +	ret = i915_gem_object_set_to_gtt_domain(guc_fw->guc_fw_obj, false);
> +	if (ret) {
> +		DRM_DEBUG_DRIVER("set-domain failed %d\n", ret);
> +		return ret;
> +	}
> +
> +	ret = i915_gem_obj_ggtt_pin(guc_fw->guc_fw_obj, 0, 0);
> +	if (ret) {
> +		DRM_DEBUG_DRIVER("pin failed %d\n", ret);
> +		return ret;
> +	}
> +
> +	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +
> +	/* init WOPCM */
> +	I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
> +	I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET);
> +
> +	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
> +	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> +
> +	/* Set MMIO/WA for GuC init */
> +	I915_WRITE(DRBMISC1, DOORBELL_ENABLE);
[TOR:] Should this DOORBELL_ENABLE be dropped?  A note in
the BSpec indicates this is not needed, but it should also
be harmless.

> +
> +	/* Enable MIA caching. GuC clock gating is disabled. */
> +	I915_WRITE(GUC_SHIM_CONTROL, GUC_SHIM_CONTROL_VALUE);
[TOR:] Should GuC clock gating be enabled?  A note in the
BSpec indicates this should be disabled for certain
pre-production steppings; this note may not apply to later
steppings.  Normally, the driver would enable GuC clock
gating (bit 15, GUC_ENABLE_MIA_CLOCK_GATING).
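
One way the question above could be resolved is to build the shim control value conditionally. This is only a sketch: bit 15 comes from the comment above, while EXAMPLE_SHIM_BASE_FLAGS and the needs_gating_disabled flag are hypothetical placeholders, not values from the patch or the BSpec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 15, as named in the review comment. */
#define GUC_ENABLE_MIA_CLOCK_GATING	(1u << 15)

/* Hypothetical placeholder for the remaining shim control flags. */
#define EXAMPLE_SHIM_BASE_FLAGS		0x0000005fu

/*
 * Compose the value to write to the shim control register. Clock gating
 * is enabled unless the caller reports a stepping that needs it off.
 */
static uint32_t guc_shim_control_value(bool needs_gating_disabled)
{
	uint32_t val = EXAMPLE_SHIM_BASE_FLAGS;

	if (!needs_gating_disabled)
		val |= GUC_ENABLE_MIA_CLOCK_GATING;

	return val;
}
```

How a driver would decide needs_gating_disabled (for instance from the device stepping) is left open here, since the thread does not say which steppings are affected.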

> +
> +	/* WaC6DisallowByGfxPause */
> +	I915_WRITE(GEN6_GFXPAUSE, 0x30FFF);
> +
> +	if (IS_SKYLAKE(dev))
> +		I915_WRITE(GEN9_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
> +	else
> +		I915_WRITE(GEN8_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
[TOR:] Would a comment be helpful here?  This line is correct
for Broxton (Gen 9 and not Skylake) but the constants are
reused from Gen 8.

> +
> +	if (IS_GEN9(dev)) {
> +		/* DOP Clock Gating Enable for GuC clocks */
> +		I915_WRITE(GEN7_MISCCPCTL, (GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
> +					    I915_READ(GEN7_MISCCPCTL)));
> +
> +		/* allows for 5us before GT can go to RC6 */
> +		I915_WRITE(GUC_ARAT_C6DIS, 0x1FF);
> +	}
> +
> +	set_guc_init_params(dev_priv);
> +
> +	ret = guc_ucode_xfer_dma(dev_priv);
> +
> +	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> +
> +	/*
> +	 * We keep the object pages for reuse during resume. But we can unpin it
> +	 * now that DMA has completed, so it doesn't continue to take up space.
> +	 */
> +	i915_gem_object_ggtt_unpin(guc_fw->guc_fw_obj);
> +
> +	return ret;
> +}
> +
> +static void guc_fw_fetch(struct drm_device *dev, struct intel_guc_fw *guc_fw)
> +{
> +	struct drm_i915_gem_object *obj;
> +	const struct firmware *fw;
> +	const u8 *css_header;
> +	const size_t minsize = UOS_CSS_HEADER_SIZE + UOS_CSS_SIGNING_SIZE;
> +	const size_t maxsize = GUC_WOPCM_SIZE_VALUE + UOS_CSS_SIGNING_SIZE
> +			- 0x8000; /* 32k reserved (8K stack + 24k context) */
> +
> +	DRM_DEBUG_DRIVER("before requesting firmware: GuC fw fetch status %s\n",
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
> +
> +	if (request_firmware(&fw, guc_fw->guc_fw_path, &dev->pdev->dev))
> +		goto fail;
> +	if (!fw)
> +		goto fail;
> +
> +	DRM_DEBUG_DRIVER("fetch GuC fw from %s succeeded, fw %p\n",
> +		guc_fw->guc_fw_path, fw);
> +
> +	DRM_DEBUG_DRIVER("firmware file size %zu (minimum %zu, maximum %zu)\n",
> +		fw->size, minsize, maxsize);
> +
> +	/* Check the size of the blob before examining buffer contents */
> +	if (fw->size < minsize || fw->size > maxsize)
> +		goto fail;
> +
> +	/*
> +	 * The GuC firmware image has the version number embedded at a well-known
> +	 * offset within the firmware blob; note that major / minor version are
> +	 * TWO bytes each (i.e. u16), although all pointers and offsets are defined
> +	 * in terms of bytes (u8).
> +	 */
> +	css_header = fw->data + UOS_CSS_HEADER_OFFSET;
> +	guc_fw->guc_fw_major_found = *(u16 *)(css_header + UOS_VER_MAJOR_OFFSET);
> +	guc_fw->guc_fw_minor_found = *(u16 *)(css_header + UOS_VER_MINOR_OFFSET);
> +
> +	if (guc_fw->guc_fw_major_found != guc_fw->guc_fw_major_wanted ||
> +	    guc_fw->guc_fw_minor_found < guc_fw->guc_fw_minor_wanted) {
> +		DRM_ERROR("GuC firmware version %d.%d, required %d.%d\n",
> +			guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found,
> +			guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
> +		goto fail;
> +	}
> +
> +	DRM_DEBUG_DRIVER("firmware version %d.%d OK (minimum %d.%d)\n",
> +			guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found,
> +			guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
> +
> +	obj = i915_gem_object_create_from_data(dev, fw->data, fw->size);
> +	if (IS_ERR_OR_NULL(obj))
> +		goto fail;
> +
> +	guc_fw->guc_fw_obj = obj;
> +	guc_fw->guc_fw_size = fw->size;
> +
> +	DRM_DEBUG_DRIVER("GuC fw fetch status SUCCESS, obj %p\n",
> +			guc_fw->guc_fw_obj);
> +
> +	release_firmware(fw);
> +	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_SUCCESS;
> +	return;
> +
> +fail:
> +	DRM_DEBUG_DRIVER("GuC fw fetch status FAIL; fw %p, obj %p\n",
> +		fw, guc_fw->guc_fw_obj);
> +	DRM_ERROR("Failed to fetch GuC firmware from %s\n",
> +		  guc_fw->guc_fw_path);
> +
> +	obj = guc_fw->guc_fw_obj;
> +	if (obj)
> +		drm_gem_object_unreference(&obj->base);
> +	guc_fw->guc_fw_obj = NULL;
> +
> +	release_firmware(fw);		/* OK even if fw is NULL */
> +	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
> +}
> +
> +/**
> + * intel_guc_ucode_load() - load GuC uCode into the device
> + * @dev:	drm device
> + *
> + * Called from gem_init_hw() during driver loading and also after a GPU reset.
> + *
> + * On the first call only, this will fetch the blob from the filesystem;
> + * thereafter, we will already either have the blob in a GEM object, or
> + * have determined that no valid firmware image could be found.
> + *
> + * If we have a good firmware image, transfer it to the h/w.
> + *
> + * Return:	non-zero code on error
> + */
> +int intel_guc_ucode_load(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> +	int err = 0;
> +
> +	DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
> +
> +	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
> +		return 0;
> +
> +	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_SUCCESS &&
> +	    guc_fw->guc_fw_load_status == GUC_FIRMWARE_FAIL)
> +		return -ENOEXEC;
> +
> +	guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
> +	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_PENDING) {
> +		/* We only come here once */
> +		guc_fw_fetch(dev, guc_fw);
> +		/* status must now be FAIL or SUCCESS */
> +	}
> +
> +	DRM_DEBUG_DRIVER("GuC fw fetch status %s\n",
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
> +
> +	switch (guc_fw->guc_fw_fetch_status) {
> +	case GUC_FIRMWARE_FAIL:
> +		/* something went wrong :( */
> +		err = -EIO;
> +		goto fail;
> +
> +	case GUC_FIRMWARE_NONE:
> +	case GUC_FIRMWARE_PENDING:
> +	default:
> +		/* "can't happen" */
> +		WARN_ONCE(1, "GuC fw %s invalid guc_fw_fetch_status %s [%d]\n",
> +			guc_fw->guc_fw_path,
> +			intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
> +			guc_fw->guc_fw_fetch_status);
> +		err = -ENXIO;
> +		goto fail;
> +
> +	case GUC_FIRMWARE_SUCCESS:
> +		break;
> +	}
> +
> +	err = guc_ucode_xfer(dev_priv);
> +	if (err)
> +		goto fail;
> +
> +	guc_fw->guc_fw_load_status = GUC_FIRMWARE_SUCCESS;
> +
> +	DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
> +
> +	return 0;
> +
> +fail:
> +	if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
> +		guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
> +
> +	DRM_ERROR("Failed to initialize GuC, error %d\n", err);
> +
> +	return err;
> +}
> +
> +/**
> + * intel_guc_ucode_init() - define parameters for fetching firmware
> + * @dev:	drm device
> + *
> + * Called early during driver load, before GEM is initialised.
> + * Driver is single threaded, so no mutex is required.
> + *
> + * This just sets parameters for use when intel_guc_ucode_load()
> + * is called later, after GEM initialisation is complete.
> + */
> +void intel_guc_ucode_init(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> +	const char *fw_path;
> +
> +	if (!HAS_GUC_SCHED(dev))
> +		i915.enable_guc_submission = false;
> +
> +	if (!HAS_GUC_UCODE(dev)) {
> +		fw_path = NULL;
> +	} else if (IS_SKYLAKE(dev)) {
> +		fw_path = I915_SKL_GUC_UCODE;
> +		guc_fw->guc_fw_major_wanted = 3;
> +		guc_fw->guc_fw_minor_wanted = 0;
> +	} else {
> +		i915.enable_guc_submission = false;
> +		fw_path = "";	/* unknown device */
> +	}
> +
> +	guc_fw->guc_dev = dev;
> +	guc_fw->guc_fw_path = fw_path;
> +	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
> +	guc_fw->guc_fw_load_status = GUC_FIRMWARE_NONE;
> +
> +	if (fw_path == NULL)
> +		return;
> +
> +	if (*fw_path == '\0') {
> +		DRM_ERROR("No GuC firmware known for this platform\n");
> +		guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
> +		return;
> +	}
> +
> +	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_PENDING;
> +	DRM_DEBUG_DRIVER("GuC firmware pending, path %s\n", fw_path);
> +}
> +
> +/**
> + * intel_guc_ucode_fini() - clean up all allocated resources
> + * @dev:	drm device
> + */
> +void intel_guc_ucode_fini(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> +
> +	if (guc_fw->guc_fw_obj)
> +		drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
> +	guc_fw->guc_fw_obj = NULL;
> +
> +	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
> +}
> -- 
> 1.9.1
>
[TOR:] I had some questions above.  These could be addressed
in later patches.

Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> 
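
For reference, the size and version checks performed by guc_fw_fetch() in the patch above can be sketched in standalone C as follows. The offsets and sizes are the ones defined in the patch; the helper names read_u16(), guc_fw_version_ok() and guc_fw_ucode_size() are hypothetical, and memcpy() is used here in place of the patch's pointer cast to avoid assuming alignment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets and sizes taken from the patch. */
#define UOS_VER_MINOR_OFFSET	0x44
#define UOS_VER_MAJOR_OFFSET	0x46
#define UOS_CSS_HEADER_SIZE	0x80
#define UOS_CSS_SIGNING_SIZE	0x204

/* Read a u16 in host byte order without assuming alignment. */
static uint16_t read_u16(const uint8_t *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Accept the blob only if it is large enough to hold the CSS header and
 * signing data, the major version matches exactly, and the minor version
 * is at least the wanted one.
 */
static bool guc_fw_version_ok(const uint8_t *blob, size_t size,
			      uint16_t major_wanted, uint16_t minor_wanted)
{
	uint16_t major, minor;

	if (size < UOS_CSS_HEADER_SIZE + UOS_CSS_SIGNING_SIZE)
		return false;

	major = read_u16(blob + UOS_VER_MAJOR_OFFSET);
	minor = read_u16(blob + UOS_VER_MINOR_OFFSET);

	return major == major_wanted && minor >= minor_wanted;
}

/* Bytes copied by DMA; also the offset at which the RSA signature starts. */
static size_t guc_fw_ucode_size(size_t fw_size)
{
	return fw_size - UOS_CSS_SIGNING_SIZE;
}
```

The sketch omits the upper size bound that the patch also enforces, since that depends on GUC_WOPCM_SIZE_VALUE, which is defined elsewhere in the series.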


* Re: [PATCH 01/13 v4] drm/i915: Add i915_gem_object_create_from_data()
  2015-07-09 18:29 ` [PATCH 01/13 v4] drm/i915: Add i915_gem_object_create_from_data() Dave Gordon
@ 2015-07-18  0:36   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-18  0:36 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:02PM +0100, Dave Gordon wrote:
> i915_gem_object_create_from_data() is a generic function to save data
> from a plain linear buffer in a new pageable gem object that can later
> be accessed by the CPU and/or GPU.
> 
> We will need this for the microcontroller firmware loading support code.
> 
> Derived from i915_gem_object_write(), originally by Alex Dai
> 
> v2:
>     Change of function: now allocates & fills a new object, rather than
>         writing to an existing object
>     New name courtesy of Chris Wilson
>     Explicit domain-setting and other improvements per review comments
>         by Chris Wilson & Daniel Vetter
> 
> v4:
>     Rebased
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
>  drivers/gpu/drm/i915/i915_drv.h |  2 ++
>  drivers/gpu/drm/i915/i915_gem.c | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 42 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 464b28d..3c91507 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2755,6 +2755,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  			 const struct drm_i915_gem_object_ops *ops);
>  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>  						  size_t size);
> +struct drm_i915_gem_object *i915_gem_object_create_from_data(
> +		struct drm_device *dev, const void *data, size_t size);
>  void i915_init_vm(struct drm_i915_private *dev_priv,
>  		  struct i915_address_space *vm);
>  void i915_gem_free_object(struct drm_gem_object *obj);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a0bff41..dbbb649 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5478,3 +5478,43 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj)
>  
>  	return false;
>  }
> +
> +/* Allocate a new GEM object and fill it with the supplied data */
> +struct drm_i915_gem_object *
> +i915_gem_object_create_from_data(struct drm_device *dev,
> +			         const void *data, size_t size)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct sg_table *sg;
> +	size_t bytes;
> +	int ret;
> +
> +	obj = i915_gem_alloc_object(dev, round_up(size, PAGE_SIZE));
> +	if (IS_ERR_OR_NULL(obj))
> +		return obj;
> +
> +	ret = i915_gem_object_set_to_cpu_domain(obj, true);
> +	if (ret)
> +		goto fail;
> +
> +	ret = i915_gem_object_get_pages(obj);
> +	if (ret)
> +		goto fail;
> +
> +	i915_gem_object_pin_pages(obj);
> +	sg = obj->pages;
> +	bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
> +	i915_gem_object_unpin_pages(obj);
> +
> +	if (WARN_ON(bytes != size)) {
> +		DRM_ERROR("Incomplete copy, wrote %zu of %zu\n", bytes, size);
> +		ret = -EFAULT;
> +		goto fail;
> +	}
> +
> +	return obj;
> +
> +fail:
> +	drm_gem_object_unreference(&obj->base);
> +	return ERR_PTR(ret);
> +}
> -- 
> 1.9.1
> 


* Re: [PATCH 02/13 v4] drm/i915: Add GuC-related module parameters
  2015-07-09 18:29 ` [PATCH 02/13 v4] drm/i915: Add GuC-related module parameters Dave Gordon
@ 2015-07-18  0:37   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-18  0:37 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:03PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> Two new module parameters: "enable_guc_submission" which will turn
> on submission of batchbuffers via the GuC (when implemented), and
> "guc_log_level" which controls the level of debugging logged by the
> GuC and captured by the host.
> 
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> 
> v4:
>     Mark "enable_guc_submission" unsafe [Daniel Vetter]
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>

>  drivers/gpu/drm/i915/i915_drv.h    | 2 ++
>  drivers/gpu/drm/i915/i915_params.c | 9 +++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 3c91507..4a512da 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2606,6 +2606,8 @@ struct i915_params {
>  	bool reset;
>  	bool disable_display;
>  	bool disable_vtd_wa;
> +	bool enable_guc_submission;
> +	int guc_log_level;
>  	int use_mmio_flip;
>  	int mmio_debug;
>  	bool verbose_state_checks;
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index 7983fe4..2791b5a 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -53,6 +53,8 @@ struct i915_params i915 __read_mostly = {
>  	.verbose_state_checks = 1,
>  	.nuclear_pageflip = 0,
>  	.edp_vswing = 0,
> +	.enable_guc_submission = false,
> +	.guc_log_level = -1,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -186,3 +188,10 @@ MODULE_PARM_DESC(edp_vswing,
>  		 "Ignore/Override vswing pre-emph table selection from VBT "
>  		 "(0=use value from vbt [default], 1=low power swing(200mV),"
>  		 "2=default swing(400mV))");
> +
> +module_param_named_unsafe(enable_guc_submission, i915.enable_guc_submission, bool, 0400);
> +MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission (default:false)");
> +
> +module_param_named(guc_log_level, i915.guc_log_level, int, 0400);
> +MODULE_PARM_DESC(guc_log_level,
> +	"GuC firmware logging level (-1:disabled (default), 0-3:enabled)");
> -- 
> 1.9.1
> 
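As an illustration of how the "guc_log_level" values described above (-1 disabled, 0-3 enabled) might be interpreted, here is a small standalone sketch. The two `GUC_LOG_VERBOSITY_*` limits are the ones defined in patch 03 of this series; the `sketch_*` helpers and the choice to clamp out-of-range values are assumptions made for the example, not behaviour taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define GUC_LOG_VERBOSITY_MIN	0
#define GUC_LOG_VERBOSITY_MAX	3

/* Hypothetical helper: negative values (default -1) mean "logging disabled". */
static bool sketch_guc_log_enabled(int guc_log_level)
{
	return guc_log_level >= GUC_LOG_VERBOSITY_MIN;
}

/*
 * Hypothetical helper: map an enabled level to a verbosity in 0..3.
 * Values above the maximum are clamped (an assumption for this sketch).
 */
static int sketch_guc_log_verbosity(int guc_log_level)
{
	int level = guc_log_level;

	/* Only meaningful once logging is enabled. */
	assert(sketch_guc_log_enabled(guc_log_level));

	if (level > GUC_LOG_VERBOSITY_MAX)
		level = GUC_LOG_VERBOSITY_MAX;
	return level;
}
```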

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 03/13 v4] drm/i915: Add GuC-related header files
  2015-07-09 18:29 ` [PATCH 03/13 v4] drm/i915: Add GuC-related header files Dave Gordon
@ 2015-07-18  0:38   ` O'Rourke, Tom
  2015-07-21  6:38     ` Daniel Vetter
  0 siblings, 1 reply; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-18  0:38 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:04PM +0100, Dave Gordon wrote:
> intel_guc_fwif.h contains the subset of the GuC interface that we
> will need for submission of commands through the GuC. These MUST
> be kept in sync with the definitions used by the GuC firmware, and
> updates to this file will (or should) be autogenerated from the
> source files used to build the firmware. Editing this file is
> therefore not recommended.
> 
> i915_guc_reg.h contains definitions of GuC-related hardware:
> registers, bitmasks, etc. These should match the BSpec.
> 
> v2:
>     Files renamed & resliced per review comments by Chris Wilson
> 
> v4:
>     Added DON'T-EDIT-ME warning [Tom O'Rourke]
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>

>  drivers/gpu/drm/i915/i915_guc_reg.h   | 102 ++++++++++++++
>  drivers/gpu/drm/i915/intel_guc_fwif.h | 245 ++++++++++++++++++++++++++++++++++
>  2 files changed, 347 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/i915_guc_reg.h
>  create mode 100644 drivers/gpu/drm/i915/intel_guc_fwif.h
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
> new file mode 100644
> index 0000000..ccdc6c8
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_guc_reg.h
> @@ -0,0 +1,102 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +#ifndef _I915_GUC_REG_H_
> +#define _I915_GUC_REG_H_
> +
> +/* Definitions of GuC H/W registers, bits, etc */
> +
> +#define GUC_STATUS			0xc000
> +#define   GS_BOOTROM_SHIFT		1
> +#define   GS_BOOTROM_MASK		  (0x7F << GS_BOOTROM_SHIFT)
> +#define   GS_BOOTROM_RSA_FAILED		  (0x50 << GS_BOOTROM_SHIFT)
> +#define   GS_UKERNEL_SHIFT		8
> +#define   GS_UKERNEL_MASK		  (0xFF << GS_UKERNEL_SHIFT)
> +#define   GS_UKERNEL_LAPIC_DONE		  (0x30 << GS_UKERNEL_SHIFT)
> +#define   GS_UKERNEL_DPC_ERROR		  (0x60 << GS_UKERNEL_SHIFT)
> +#define   GS_UKERNEL_READY		  (0xF0 << GS_UKERNEL_SHIFT)
> +#define   GS_MIA_SHIFT			16
> +#define   GS_MIA_MASK			  (0x07 << GS_MIA_SHIFT)
> +
> +#define GUC_WOPCM_SIZE			0xc050
> +#define   GUC_WOPCM_SIZE_VALUE  	  (0x80 << 12)	/* 512KB */
> +#define GUC_WOPCM_OFFSET		0x80000		/* 512KB */
> +
> +#define SOFT_SCRATCH(n)			(0xc180 + ((n) * 4))
> +
> +#define UOS_RSA_SCRATCH_0		0xc200
> +#define DMA_ADDR_0_LOW			0xc300
> +#define DMA_ADDR_0_HIGH			0xc304
> +#define DMA_ADDR_1_LOW			0xc308
> +#define DMA_ADDR_1_HIGH			0xc30c
> +#define   DMA_ADDRESS_SPACE_WOPCM	  (7 << 16)
> +#define   DMA_ADDRESS_SPACE_GTT		  (8 << 16)
> +#define DMA_COPY_SIZE			0xc310
> +#define DMA_CTRL			0xc314
> +#define   UOS_MOVE			  (1<<4)
> +#define   START_DMA			  (1<<0)
> +#define DMA_GUC_WOPCM_OFFSET		0xc340
> +
> +#define GEN8_GT_PM_CONFIG		0x138140
> +#define GEN9_GT_PM_CONFIG		0x13816c
> +#define   GEN8_GT_DOORBELL_ENABLE	  (1<<0)
> +
> +#define GEN8_GTCR			0x4274
> +#define   GEN8_GTCR_INVALIDATE		  (1<<0)
> +
> +#define GUC_ARAT_C6DIS			0xA178
> +
> +#define GUC_SHIM_CONTROL		0xc064
> +#define   GUC_DISABLE_SRAM_INIT_TO_ZEROES	(1<<0)
> +#define   GUC_ENABLE_READ_CACHE_LOGIC		(1<<1)
> +#define   GUC_ENABLE_MIA_CACHING		(1<<2)
> +#define   GUC_GEN10_MSGCH_ENABLE		(1<<4)
> +#define   GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	(1<<9)
> +#define   GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA	(1<<10)
> +#define   GUC_ENABLE_MIA_CLOCK_GATING		(1<<15)
> +#define   GUC_GEN10_SHIM_WC_ENABLE		(1<<21)
> +
> +#define GUC_SHIM_CONTROL_VALUE	(GUC_DISABLE_SRAM_INIT_TO_ZEROES	| \
> +				 GUC_ENABLE_READ_CACHE_LOGIC		| \
> +				 GUC_ENABLE_MIA_CACHING			| \
> +				 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	| \
> +				 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA)
> +
> +#define HOST2GUC_INTERRUPT		0xc4c8
> +#define   HOST2GUC_TRIGGER		  (1<<0)
> +
> +#define DRBMISC1			0x1984
> +#define   DOORBELL_ENABLE		  (1<<0)
> +
> +#define GEN8_DRBREGL(x)			(0x1000 + (x) * 8)
> +#define   GEN8_DRB_VALID		  (1<<0)
> +#define GEN8_DRBREGU(x)			(GEN8_DRBREGL(x) + 4)
> +
> +#define DE_GUCRMR			0x44054
> +
> +#define GUC_BCS_RCS_IER			0xC550
> +#define GUC_VCS2_VCS1_IER		0xC554
> +#define GUC_WD_VECS_IER			0xC558
> +#define GUC_PM_P24C_IER			0xC55C
> +
> +#endif
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> new file mode 100644
> index 0000000..18d7f20
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -0,0 +1,245 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +#ifndef _INTEL_GUC_FWIF_H
> +#define _INTEL_GUC_FWIF_H
> +
> +/*
> + * This file is partially autogenerated, although currently with some manual
> + * fixups afterwards. In future, it should be entirely autogenerated, in order
> + * to ensure that the definitions herein remain in sync with those used by the
> + * GuC's own firmware.
> + *
> + * EDITING THIS FILE IS THEREFORE NOT RECOMMENDED - YOUR CHANGES MAY BE LOST.
> + */
> +
> +#define GFXCORE_FAMILY_GEN8		11
> +#define GFXCORE_FAMILY_GEN9		12
> +#define GFXCORE_FAMILY_FORCE_ULONG	0x7fffffff
> +
> +#define GUC_CTX_PRIORITY_CRITICAL	0
> +#define GUC_CTX_PRIORITY_HIGH		1
> +#define GUC_CTX_PRIORITY_NORMAL		2
> +#define GUC_CTX_PRIORITY_LOW		3
> +
> +#define GUC_MAX_GPU_CONTEXTS		1024
> +#define	GUC_INVALID_CTX_ID		(GUC_MAX_GPU_CONTEXTS + 1)
> +
> +/* Work queue item header definitions */
> +#define WQ_STATUS_ACTIVE		1
> +#define WQ_STATUS_SUSPENDED		2
> +#define WQ_STATUS_CMD_ERROR		3
> +#define WQ_STATUS_ENGINE_ID_NOT_USED	4
> +#define WQ_STATUS_SUSPENDED_FROM_RESET	5
> +#define WQ_TYPE_SHIFT			0
> +#define   WQ_TYPE_BATCH_BUF		(0x1 << WQ_TYPE_SHIFT)
> +#define   WQ_TYPE_PSEUDO		(0x2 << WQ_TYPE_SHIFT)
> +#define   WQ_TYPE_INORDER		(0x3 << WQ_TYPE_SHIFT)
> +#define WQ_TARGET_SHIFT			10
> +#define WQ_LEN_SHIFT			16
> +#define WQ_NO_WCFLUSH_WAIT		(1 << 27)
> +#define WQ_PRESENT_WORKLOAD		(1 << 28)
> +#define WQ_WORKLOAD_SHIFT		29
> +#define   WQ_WORKLOAD_GENERAL		(0 << WQ_WORKLOAD_SHIFT)
> +#define   WQ_WORKLOAD_GPGPU		(1 << WQ_WORKLOAD_SHIFT)
> +#define   WQ_WORKLOAD_TOUCH		(2 << WQ_WORKLOAD_SHIFT)
> +
> +#define WQ_RING_TAIL_SHIFT		20
> +#define WQ_RING_TAIL_MASK		(0x7FF << WQ_RING_TAIL_SHIFT)
> +
> +#define GUC_DOORBELL_ENABLED		1
> +#define GUC_DOORBELL_DISABLED		0
> +
> +#define GUC_CTX_DESC_ATTR_ACTIVE	(1 << 0)
> +#define GUC_CTX_DESC_ATTR_PENDING_DB	(1 << 1)
> +#define GUC_CTX_DESC_ATTR_KERNEL	(1 << 2)
> +#define GUC_CTX_DESC_ATTR_PREEMPT	(1 << 3)
> +#define GUC_CTX_DESC_ATTR_RESET		(1 << 4)
> +#define GUC_CTX_DESC_ATTR_WQLOCKED	(1 << 5)
> +#define GUC_CTX_DESC_ATTR_PCH		(1 << 6)
> +
> +/* The guc control data is 10 DWORDs */
> +#define GUC_CTL_CTXINFO			0
> +#define   GUC_CTL_CTXNUM_IN16_SHIFT	0
> +#define   GUC_CTL_BASE_ADDR_SHIFT	12
> +#define GUC_CTL_ARAT_HIGH		1
> +#define GUC_CTL_ARAT_LOW		2
> +#define GUC_CTL_DEVICE_INFO		3
> +#define   GUC_CTL_GTTYPE_SHIFT		0
> +#define   GUC_CTL_COREFAMILY_SHIFT	7
> +#define GUC_CTL_LOG_PARAMS		4
> +#define   GUC_LOG_VALID			(1 << 0)
> +#define   GUC_LOG_NOTIFY_ON_HALF_FULL	(1 << 1)
> +#define   GUC_LOG_ALLOC_IN_MEGABYTE	(1 << 3)
> +#define   GUC_LOG_CRASH_PAGES		1
> +#define   GUC_LOG_CRASH_SHIFT		4
> +#define   GUC_LOG_DPC_PAGES		3
> +#define   GUC_LOG_DPC_SHIFT		6
> +#define   GUC_LOG_ISR_PAGES		3
> +#define   GUC_LOG_ISR_SHIFT		9
> +#define   GUC_LOG_BUF_ADDR_SHIFT	12
> +#define GUC_CTL_PAGE_FAULT_CONTROL	5
> +#define GUC_CTL_WA			6
> +#define   GUC_CTL_WA_UK_BY_DRIVER	(1 << 3)
> +#define GUC_CTL_FEATURE			7
> +#define   GUC_CTL_VCS2_ENABLED		(1 << 0)
> +#define   GUC_CTL_KERNEL_SUBMISSIONS	(1 << 1)
> +#define   GUC_CTL_FEATURE2		(1 << 2)
> +#define   GUC_CTL_POWER_GATING		(1 << 3)
> +#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 4)
> +#define   GUC_CTL_PREEMPTION_LOG	(1 << 5)
> +#define   GUC_CTL_ENABLE_SLPC		(1 << 7)
> +#define GUC_CTL_DEBUG			8
> +#define   GUC_LOG_VERBOSITY_SHIFT	0
> +#define   GUC_LOG_VERBOSITY_LOW		(0 << GUC_LOG_VERBOSITY_SHIFT)
> +#define   GUC_LOG_VERBOSITY_MED		(1 << GUC_LOG_VERBOSITY_SHIFT)
> +#define   GUC_LOG_VERBOSITY_HIGH	(2 << GUC_LOG_VERBOSITY_SHIFT)
> +#define   GUC_LOG_VERBOSITY_ULTRA	(3 << GUC_LOG_VERBOSITY_SHIFT)
> +/* Verbosity range-check limits, without the shift */
> +#define	  GUC_LOG_VERBOSITY_MIN		0
> +#define	  GUC_LOG_VERBOSITY_MAX		3
> +
> +#define GUC_CTL_MAX_DWORDS		(GUC_CTL_DEBUG + 1)
> +
> +struct guc_doorbell_info {
> +	u32 db_status;
> +	u32 cookie;
> +	u32 reserved[14];
> +} __packed;
> +
> +union guc_doorbell_qw {
> +	struct {
> +		u32 db_status;
> +		u32 cookie;
> +	};
> +	u64 value_qw;
> +} __packed;
> +
> +#define GUC_MAX_DOORBELLS		256
> +#define GUC_INVALID_DOORBELL_ID		(GUC_MAX_DOORBELLS)
> +
> +#define GUC_DB_SIZE			(PAGE_SIZE)
> +#define GUC_WQ_SIZE			(PAGE_SIZE * 2)
> +
> +/* Work item for submitting workloads into work queue of GuC. */
> +struct guc_wq_item {
> +	u32 header;
> +	u32 context_desc;
> +	u32 ring_tail;
> +	u32 fence_id;
> +} __packed;
> +
> +struct guc_process_desc {
> +	u32 context_id;
> +	u64 db_base_addr;
> +	u32 head;
> +	u32 tail;
> +	u32 error_offset;
> +	u64 wq_base_addr;
> +	u32 wq_size_bytes;
> +	u32 wq_status;
> +	u32 engine_presence;
> +	u32 priority;
> +	u32 reserved[30];
> +} __packed;
> +
> +/* engine id and context id is packed into guc_execlist_context.context_id*/
> +#define GUC_ELC_CTXID_OFFSET		0
> +#define GUC_ELC_ENGINE_OFFSET		29
> +
> +/* The execlist context including software and HW information */
> +struct guc_execlist_context {
> +	u32 context_desc;
> +	u32 context_id;
> +	u32 ring_status;
> +	u32 ring_lcra;
> +	u32 ring_begin;
> +	u32 ring_end;
> +	u32 ring_next_free_location;
> +	u32 ring_current_tail_pointer_value;
> +	u8 engine_state_submit_value;
> +	u8 engine_state_wait_value;
> +	u16 pagefault_count;
> +	u16 engine_submit_queue_count;
> +} __packed;
> +
> +/*Context descriptor for communicating between uKernel and Driver*/
> +struct guc_context_desc {
> +	u32 sched_common_area;
> +	u32 context_id;
> +	u32 pas_id;
> +	u8 engines_used;
> +	u64 db_trigger_cpu;
> +	u32 db_trigger_uk;
> +	u64 db_trigger_phy;
> +	u16 db_id;
> +
> +	struct guc_execlist_context lrc[I915_NUM_RINGS];
> +
> +	u8 attribute;
> +
> +	u32 priority;
> +
> +	u32 wq_sampled_tail_offset;
> +	u32 wq_total_submit_enqueues;
> +
> +	u32 process_desc;
> +	u32 wq_addr;
> +	u32 wq_size;
> +
> +	u32 engine_presence;
> +
> +	u32 reserved0[1];
> +	u64 reserved1[1];
> +
> +	u64 desc_private;
> +} __packed;
> +
> +/* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
> +enum host2guc_action {
> +	HOST2GUC_ACTION_DEFAULT = 0x0,
> +	HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
> +	HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
> +	HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
> +	HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
> +	HOST2GUC_ACTION_LIMIT
> +};
> +
> +/*
> + * The GuC sends its response to a command by overwriting the
> + * command in SS0. The response is distinguishable from a command
> + * by the fact that all the MASK bits are set. The remaining bits
> + * give more detail.
> + */
> +#define	GUC2HOST_RESPONSE_MASK		((u32)0xF0000000)
> +#define	GUC2HOST_IS_RESPONSE(x) 	((u32)(x) >= GUC2HOST_RESPONSE_MASK)
> +#define	GUC2HOST_STATUS(x)		(GUC2HOST_RESPONSE_MASK | (x))
> +
> +/* GUC will return status back to SOFT_SCRATCH_O_REG */
> +enum guc2host_status {
> +	GUC2HOST_STATUS_SUCCESS = GUC2HOST_STATUS(0x0),
> +	GUC2HOST_STATUS_ALLOCATE_DOORBELL_FAIL = GUC2HOST_STATUS(0x10),
> +	GUC2HOST_STATUS_DEALLOCATE_DOORBELL_FAIL = GUC2HOST_STATUS(0x20),
> +	GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0x0000F000)
> +};
> +
> +#endif
> -- 
> 1.9.1
> 
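The comment in intel_guc_fwif.h above says a GuC response overwrites the command in scratch register 0 and is recognisable because all the MASK bits are set. A standalone sketch of that check follows. The three `GUC2HOST_*` macros are copied from the header (with `u32` spelled as `uint32_t`); the `sketch_*` functions and their return convention are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GUC2HOST_RESPONSE_MASK	((uint32_t)0xF0000000)
#define GUC2HOST_IS_RESPONSE(x)	((uint32_t)(x) >= GUC2HOST_RESPONSE_MASK)
#define GUC2HOST_STATUS(x)	(GUC2HOST_RESPONSE_MASK | (x))

#define GUC2HOST_STATUS_SUCCESS	GUC2HOST_STATUS(0x0)

/*
 * A command must never look like a response, otherwise the host could
 * not tell "not answered yet" from "answered".
 */
static uint32_t sketch_make_command(uint32_t action)
{
	assert(!GUC2HOST_IS_RESPONSE(action));
	return action;
}

/*
 * Hypothetical classification of a value read back from SOFT_SCRATCH(0):
 *   0  still the command we wrote (no response yet)
 *   1  response with success status
 *  -1  response with some failure status
 */
static int sketch_classify_scratch0(uint32_t value)
{
	if (!GUC2HOST_IS_RESPONSE(value))
		return 0;
	return value == GUC2HOST_STATUS_SUCCESS ? 1 : -1;
}
```

Since the mask covers the top four bits, the `>=` comparison in `GUC2HOST_IS_RESPONSE` is equivalent to testing that all four are set.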

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 05/13 v4] drm/i915: Debugfs interface to read GuC load status
  2015-07-09 18:29 ` [PATCH 05/13 v4] drm/i915: Debugfs interface to read GuC load status Dave Gordon
@ 2015-07-18  0:39   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-18  0:39 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:06PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> The new node provides access to the status of the GuC-specific loader;
> also the scratch registers used for communication between the i915
> driver and the GuC firmware.
> 
> v2:
>     Changes to output formats per Chris Wilson's suggestions
> 
> v4:
>     Rebased
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>

>  drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 98fd3c9..9ff5f17 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2359,6 +2359,44 @@ static int i915_llc(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +static int i915_guc_load_status_info(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = m->private;
> +	struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
> +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> +	u32 tmp, i;
> +
> +	if (!HAS_GUC_UCODE(dev_priv->dev))
> +		return 0;
> +
> +	seq_printf(m, "GuC firmware status:\n");
> +	seq_printf(m, "\tpath: %s\n",
> +		guc_fw->guc_fw_path);
> +	seq_printf(m, "\tfetch: %s\n",
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
> +	seq_printf(m, "\tload: %s\n",
> +		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
> +	seq_printf(m, "\tversion wanted: %d.%d\n",
> +		guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
> +	seq_printf(m, "\tversion found: %d.%d\n",
> +		guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found);
> +
> +	tmp = I915_READ(GUC_STATUS);
> +
> +	seq_printf(m, "\nGuC status 0x%08x:\n", tmp);
> +	seq_printf(m, "\tBootrom status = 0x%x\n",
> +		(tmp & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT);
> +	seq_printf(m, "\tuKernel status = 0x%x\n",
> +		(tmp & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT);
> +	seq_printf(m, "\tMIA Core status = 0x%x\n",
> +		(tmp & GS_MIA_MASK) >> GS_MIA_SHIFT);
> +	seq_puts(m, "\nScratch registers:\n");
> +	for (i = 0; i < 16; i++)
> +		seq_printf(m, "\t%2d: \t0x%x\n", i, I915_READ(SOFT_SCRATCH(i)));
> +
> +	return 0;
> +}
> +
>  static int i915_edp_psr_status(struct seq_file *m, void *data)
>  {
>  	struct drm_info_node *node = m->private;
> @@ -5073,6 +5111,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
>  	{"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
>  	{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
>  	{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
> +	{"i915_guc_load_status", i915_guc_load_status_info, 0},
>  	{"i915_frequency_info", i915_frequency_info, 0},
>  	{"i915_hangcheck_info", i915_hangcheck_info, 0},
>  	{"i915_drpc_info", i915_drpc_info, 0},
> -- 
> 1.9.1
> 
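The debugfs node above prints three fields extracted from GUC_STATUS. The same decode can be tried outside the kernel with the sketch below. The `GS_*` values are copied from i915_guc_reg.h in patch 03; the struct and the `sketch_*` functions are hypothetical, and treating `GS_UKERNEL_READY` as "firmware is up" is an assumption based on the macro's name.

```c
#include <assert.h>
#include <stdint.h>

#define GS_BOOTROM_SHIFT	1
#define GS_BOOTROM_MASK		(0x7F << GS_BOOTROM_SHIFT)
#define GS_UKERNEL_SHIFT	8
#define GS_UKERNEL_MASK		(0xFF << GS_UKERNEL_SHIFT)
#define GS_UKERNEL_READY	(0xF0 << GS_UKERNEL_SHIFT)
#define GS_MIA_SHIFT		16
#define GS_MIA_MASK		(0x07 << GS_MIA_SHIFT)

/* Hypothetical container for the three decoded fields. */
struct sketch_guc_status {
	uint32_t bootrom;
	uint32_t ukernel;
	uint32_t mia;
};

/* Split a raw GUC_STATUS value the same way the debugfs code does. */
static struct sketch_guc_status sketch_decode_guc_status(uint32_t raw)
{
	struct sketch_guc_status s;

	s.bootrom = (raw & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT;
	s.ukernel = (raw & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT;
	s.mia = (raw & GS_MIA_MASK) >> GS_MIA_SHIFT;

	/* Each field must fit the width of its mask. */
	assert(s.bootrom <= 0x7F && s.ukernel <= 0xFF && s.mia <= 0x07);
	return s;
}

/* Assumption: the uKernel field equal to GS_UKERNEL_READY means "ready". */
static int sketch_ukernel_ready(uint32_t raw)
{
	return (raw & GS_UKERNEL_MASK) == GS_UKERNEL_READY;
}
```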

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 00/13 v4] Batch submission via GuC
  2015-07-09 18:29 [PATCH 00/13 v4] Batch submission via GuC Dave Gordon
                   ` (12 preceding siblings ...)
  2015-07-09 18:29 ` [PATCH 13/13 v4] drm/i915: Enable GuC submission, where supported Dave Gordon
@ 2015-07-18  0:45 ` O'Rourke, Tom
  13 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-18  0:45 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:01PM +0100, Dave Gordon wrote:
> This patch series enables command submission via the GuC. In this mode,
> instead of the host CPU driving the execlist port directly, it hands
> over work items to the GuC, using a doorbell mechanism to tell the GuC
> that new items have been added to its work queue. The GuC then dispatches
> contexts to the various GPU engines, and manages the resulting context-
> switch interrupts. Completion of a batch is however still signalled to
> the CPU; the GuC is not involved in handling user interrupts.
> 
> There are two subsequences within the patch series:
> 
>   drm/i915: Add i915_gem_object_create_from_data()
>   drm/i915: Add GuC-related module parameters
>   drm/i915: Add GuC-related header files
>   drm/i915: GuC-specific firmware loader
>   drm/i915: Debugfs interface to read GuC load status
> 
> These five patches make up the GuC loader and its prerequisites.  At this
> point in the sequence we can load and activate the GuC firmware, but not
> submit any batches through it. (This is nonetheless a potentially useful
> state, as the GuC could do other useful work even when not handling batch
> submissions).
> 
>   drm/i915: Expose two LRC functions for GuC submission mode
>   drm/i915: GuC submission setup, phase 1
>   drm/i915: Enable GuC firmware log
>   drm/i915: Implementation of GuC client
>   drm/i915: Interrupt routing for GuC submission
>   drm/i915: Integrate GuC-based command submission
>   drm/i915: Debugfs interface for GuC submission statistics
>   drm/i915: Enable GuC submission, where supported
> 
> In this second section, we implement the GuC submission mechanism, link
> it into the (execlist-based) submission path, and finally enable it
> (on supported platforms). On platforms where there is no GuC, or if
> GuC submission is explicitly disabled, batch submission will revert to
> using the execlist mechanism directly.
> 
> On the other hand, if the GuC firmware cannot be found or is invalid,
> the GPU will be unusable.
> 
> The GuC firmware itself is not included in this patchset; it is or will
> be available for download from https://01.org/linuxgraphics/downloads/
> This driver works with and requires GuC firmware revision 3.x. It will
> not work with any firmware version 1.x, as the GuC protocol in those
> revisions was incompatible and is no longer supported.

[TOR:] I finished reviewing the first 5 patches for GuC
firmware loading.  These patches look ready to go.
Should we wait until the GuC version 3 firmware is
available from 01.org before merging?

I am still working on the second section for GuC submission.

Thanks,
Tom
> 
> Ben Widawsky (0):
> Vinit Azad (0):
> Michael H. Nguyen (0):
>   created the original versions on which some of these patches are based.
> 
> Alex Dai (6):
>   drm/i915: Add GuC-related module parameters
>   drm/i915: GuC-specific firmware loader
>   drm/i915: Debugfs interface to read GuC load status
>   drm/i915: GuC submission setup, phase 1
>   drm/i915: Enable GuC firmware log
>   drm/i915: Integrate GuC-based command submission
> 
> Dave Gordon (7):
>   drm/i915: Add i915_gem_object_create_from_data()
>   drm/i915: Add GuC-related header files
>   drm/i915: Expose two LRC functions for GuC submission mode
>   drm/i915: Implementation of GuC client
>   drm/i915: Interrupt routing for GuC submission
>   drm/i915: Debugfs interface for GuC submission statistics
>   drm/i915: Enable GuC submission, where supported
> 
>  Documentation/DocBook/drm.tmpl             |  14 +
>  drivers/gpu/drm/i915/Makefile              |   4 +
>  drivers/gpu/drm/i915/i915_debugfs.c        | 110 +++-
>  drivers/gpu/drm/i915/i915_dma.c            |   4 +
>  drivers/gpu/drm/i915/i915_drv.h            |  15 +
>  drivers/gpu/drm/i915/i915_gem.c            |  53 ++
>  drivers/gpu/drm/i915/i915_guc_reg.h        | 102 ++++
>  drivers/gpu/drm/i915/i915_guc_submission.c | 853 +++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_params.c         |   9 +
>  drivers/gpu/drm/i915/i915_reg.h            |  15 +-
>  drivers/gpu/drm/i915/intel_guc.h           | 118 ++++
>  drivers/gpu/drm/i915/intel_guc_fwif.h      | 245 +++++++++
>  drivers/gpu/drm/i915/intel_guc_loader.c    | 618 +++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c           |  72 ++-
>  drivers/gpu/drm/i915/intel_lrc.h           |   9 +
>  15 files changed, 2211 insertions(+), 30 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_guc_reg.h
>  create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
>  create mode 100644 drivers/gpu/drm/i915/intel_guc.h
>  create mode 100644 drivers/gpu/drm/i915/intel_guc_fwif.h
>  create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
> 
> -- 
> 1.9.1
> 
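The fallback rules in the cover letter above can be summarised as a small decision function. This is only a reading of the prose, not code from the series: every name here is hypothetical, "invalid firmware" is approximated as "major version is not 3", and the order of the checks (execlist fallback first, firmware check second) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_GUC_FW_MAJOR_WANTED 3	/* cover letter: requires firmware 3.x */

enum sketch_submit_mode {
	SKETCH_SUBMIT_EXECLIST,		/* host drives the execlist port directly */
	SKETCH_SUBMIT_GUC,		/* work items handed to the GuC */
	SKETCH_SUBMIT_UNUSABLE,		/* firmware missing or invalid */
};

/* Hypothetical summary of the inputs the cover letter mentions. */
struct sketch_guc_state {
	bool has_guc;			/* platform has a GuC at all */
	bool enable_guc_submission;	/* module parameter */
	bool fw_found;			/* firmware blob was located */
	int fw_major_found;		/* major version of that blob */
};

static enum sketch_submit_mode
sketch_pick_mode(const struct sketch_guc_state *s)
{
	assert(s != NULL);

	/* No GuC, or submission explicitly disabled: use execlists. */
	if (!s->has_guc || !s->enable_guc_submission)
		return SKETCH_SUBMIT_EXECLIST;

	/* Firmware missing or of the wrong major version: GPU unusable. */
	if (!s->fw_found || s->fw_major_found != SKETCH_GUC_FW_MAJOR_WANTED)
		return SKETCH_SUBMIT_UNUSABLE;

	return SKETCH_SUBMIT_GUC;
}
```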

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 04/13 v4] drm/i915: GuC-specific firmware loader
  2015-07-18  0:35   ` O'Rourke, Tom
@ 2015-07-20 16:18     ` Yu Dai
  0 siblings, 0 replies; 42+ messages in thread
From: Yu Dai @ 2015-07-20 16:18 UTC (permalink / raw)
  To: O'Rourke, Tom, Dave Gordon; +Cc: intel-gfx



On 07/17/2015 05:35 PM, O'Rourke, Tom wrote:
> On Thu, Jul 09, 2015 at 07:29:05PM +0100, Dave Gordon wrote:
> > From: Alex Dai <yu.dai@intel.com>
> >
> > +static u32 get_core_family(struct drm_i915_private *dev_priv)
> > +{
> > +	switch (INTEL_INFO(dev_priv)->gen) {
> > +	case 8:
> > +		return GFXCORE_FAMILY_GEN8;
> [TOR:] Should Gen 8 case be included here if only Gen 9 is supported?

Yes, we can remove this. Gen8 is capable, but it is not supported by
this patch series anyway.

> > +
> > +
> > +	/* Set MMIO/WA for GuC init */
> > +	I915_WRITE(DRBMISC1, DOORBELL_ENABLE);
> [TOR:] Should this DOORBELL_ENABLE be dropped?  A note in
> the BSpec indicates this is not needed, but also it should
> be harmless.

Per response from firmware team / BSpec, we can remove this line.

> > +
> > +	/* Enable MIA caching. GuC clock gating is disabled. */
> > +	I915_WRITE(GUC_SHIM_CONTROL, GUC_SHIM_CONTROL_VALUE);
> [TOR:] Should guc clock gating be enabled?  A note in the
> BSpec indicates this should be disabled for certain
> pre-production steppings; this note may not apply to later
> steppings.  Normally, the driver would enable guc clock
> gating (bit 15, GUC_ENABLE_MIA_CLOCK_GATING).

There was a hang issue in the GuC when clock gating was enabled. This has
been resolved for a while, so we should enable this bit.
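
For reference, a standalone sketch of what enabling that bit would mean for the value written to GUC_SHIM_CONTROL. The bit definitions and `GUC_SHIM_CONTROL_VALUE` are copied from i915_guc_reg.h in patch 03; the `sketch_shim_control_value()` helper and its boolean parameter are hypothetical, and this is not the change that was eventually made to the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GUC_DISABLE_SRAM_INIT_TO_ZEROES		(1<<0)
#define GUC_ENABLE_READ_CACHE_LOGIC		(1<<1)
#define GUC_ENABLE_MIA_CACHING			(1<<2)
#define GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	(1<<9)
#define GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA	(1<<10)
#define GUC_ENABLE_MIA_CLOCK_GATING		(1<<15)

#define GUC_SHIM_CONTROL_VALUE	(GUC_DISABLE_SRAM_INIT_TO_ZEROES	| \
				 GUC_ENABLE_READ_CACHE_LOGIC		| \
				 GUC_ENABLE_MIA_CACHING			| \
				 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	| \
				 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA)

/* Hypothetical helper: base shim value, optionally with bit 15 set. */
static uint32_t sketch_shim_control_value(bool enable_clock_gating)
{
	uint32_t v = GUC_SHIM_CONTROL_VALUE;

	if (enable_clock_gating)
		v |= GUC_ENABLE_MIA_CLOCK_GATING;

	/* The base bits must survive whichever way the option is set. */
	assert((v & GUC_SHIM_CONTROL_VALUE) == GUC_SHIM_CONTROL_VALUE);
	return v;
}
```

With the definitions above the base value is 0x607, and 0x8607 once clock gating is enabled.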

> > +
> > +	/* WaC6DisallowByGfxPause*/
> > +	I915_WRITE(GEN6_GFXPAUSE, 0x30FFF);
> > +
> > +	if (IS_SKYLAKE(dev))
> > +		I915_WRITE(GEN9_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
> > +	else
> > +		I915_WRITE(GEN8_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
> [TOR:] Would a comment be helpful here?  This line is correct
> for Broxton (Gen 9 and not Skylake) but the constants are
> reused from Gen 8.
>
> > +
>

BXT is Gen9 LP, which uses the same MMIO register as Gen8 in this case.
My suggestion:
s/GEN8_GT_DOORBELL_ENABLE/GT_DOORBELL_ENABLE/g
Also add the definition below and use it here to avoid confusion:
#define GEN9LP_GT_PM_CONFIG 0x138140

s/I915_WRITE(GEN8_GT_PM_CONFIG, GEN8_GT_DOORBELL_ENABLE);
/I915_WRITE(GEN9LP_GT_PM_CONFIG, GT_DOORBELL_ENABLE);/g

Thanks,
Alex
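
A standalone sketch of the register choice with the rename suggested above applied. `GEN9_GT_PM_CONFIG` comes from patch 03; `GEN9LP_GT_PM_CONFIG` and `GT_DOORBELL_ENABLE` are the names proposed in this mail; the `sketch_gt_pm_config_reg()` helper and its boolean parameter (standing in for IS_SKYLAKE()) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GEN9_GT_PM_CONFIG	0x13816c	/* Skylake */
#define GEN9LP_GT_PM_CONFIG	0x138140	/* proposed name; same offset as GEN8_GT_PM_CONFIG */
#define GT_DOORBELL_ENABLE	(1<<0)		/* proposed name for GEN8_GT_DOORBELL_ENABLE */

/* Pick the doorbell-enable register for the platform. */
static uint32_t sketch_gt_pm_config_reg(bool is_skylake)
{
	uint32_t reg = is_skylake ? GEN9_GT_PM_CONFIG : GEN9LP_GT_PM_CONFIG;

	/* Both candidates are dword-aligned MMIO offsets. */
	assert((reg & 3) == 0);
	return reg;
}
```

The driver would then write `GT_DOORBELL_ENABLE` to the returned offset, so the Broxton path no longer mentions Gen8 by name.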

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 03/13 v4] drm/i915: Add GuC-related header files
  2015-07-18  0:38   ` O'Rourke, Tom
@ 2015-07-21  6:38     ` Daniel Vetter
  2015-07-24 22:08       ` O'Rourke, Tom
  0 siblings, 1 reply; 42+ messages in thread
From: Daniel Vetter @ 2015-07-21  6:38 UTC (permalink / raw)
  To: O'Rourke, Tom; +Cc: intel-gfx

On Fri, Jul 17, 2015 at 05:38:05PM -0700, O'Rourke, Tom wrote:
> On Thu, Jul 09, 2015 at 07:29:04PM +0100, Dave Gordon wrote:
> > intel_guc_fwif.h contains the subset of the GuC interface that we
> > will need for submission of commands through the GuC. These MUST
> > be kept in sync with the definitions used by the GuC firmware, and
> > updates to this file will (or should) be autogenerated from the
> > source files used to build the firmware. Editing this file is
> > therefore not recommended.
> > 
> > i915_guc_reg.h contains definitions of GuC-related hardware:
> > registers, bitmasks, etc. These should match the BSpec.
> > 
> > v2:
> >     Files renamed & resliced per review comments by Chris Wilson
> > 
> > v4:
> >     Added DON'T-EDIT-ME warning [Tom O'Rourke]
> > 
> > Issue: VIZ-4884
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > ---
> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>

Merged up to this patch, thanks.
-Daniel

> 
> >  drivers/gpu/drm/i915/i915_guc_reg.h   | 102 ++++++++++++++
> >  drivers/gpu/drm/i915/intel_guc_fwif.h | 245 ++++++++++++++++++++++++++++++++++
> >  2 files changed, 347 insertions(+)
> >  create mode 100644 drivers/gpu/drm/i915/i915_guc_reg.h
> >  create mode 100644 drivers/gpu/drm/i915/intel_guc_fwif.h
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
> > new file mode 100644
> > index 0000000..ccdc6c8
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/i915_guc_reg.h
> > @@ -0,0 +1,102 @@
> > +/*
> > + * Copyright © 2014 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + *
> > + */
> > +#ifndef _I915_GUC_REG_H_
> > +#define _I915_GUC_REG_H_
> > +
> > +/* Definitions of GuC H/W registers, bits, etc */
> > +
> > +#define GUC_STATUS			0xc000
> > +#define   GS_BOOTROM_SHIFT		1
> > +#define   GS_BOOTROM_MASK		  (0x7F << GS_BOOTROM_SHIFT)
> > +#define   GS_BOOTROM_RSA_FAILED		  (0x50 << GS_BOOTROM_SHIFT)
> > +#define   GS_UKERNEL_SHIFT		8
> > +#define   GS_UKERNEL_MASK		  (0xFF << GS_UKERNEL_SHIFT)
> > +#define   GS_UKERNEL_LAPIC_DONE		  (0x30 << GS_UKERNEL_SHIFT)
> > +#define   GS_UKERNEL_DPC_ERROR		  (0x60 << GS_UKERNEL_SHIFT)
> > +#define   GS_UKERNEL_READY		  (0xF0 << GS_UKERNEL_SHIFT)
> > +#define   GS_MIA_SHIFT			16
> > +#define   GS_MIA_MASK			  (0x07 << GS_MIA_SHIFT)
> > +
> > +#define GUC_WOPCM_SIZE			0xc050
> > +#define   GUC_WOPCM_SIZE_VALUE  	  (0x80 << 12)	/* 512KB */
> > +#define GUC_WOPCM_OFFSET		0x80000		/* 512KB */
> > +
> > +#define SOFT_SCRATCH(n)			(0xc180 + ((n) * 4))
> > +
> > +#define UOS_RSA_SCRATCH_0		0xc200
> > +#define DMA_ADDR_0_LOW			0xc300
> > +#define DMA_ADDR_0_HIGH			0xc304
> > +#define DMA_ADDR_1_LOW			0xc308
> > +#define DMA_ADDR_1_HIGH			0xc30c
> > +#define   DMA_ADDRESS_SPACE_WOPCM	  (7 << 16)
> > +#define   DMA_ADDRESS_SPACE_GTT		  (8 << 16)
> > +#define DMA_COPY_SIZE			0xc310
> > +#define DMA_CTRL			0xc314
> > +#define   UOS_MOVE			  (1<<4)
> > +#define   START_DMA			  (1<<0)
> > +#define DMA_GUC_WOPCM_OFFSET		0xc340
> > +
> > +#define GEN8_GT_PM_CONFIG		0x138140
> > +#define GEN9_GT_PM_CONFIG		0x13816c
> > +#define   GEN8_GT_DOORBELL_ENABLE	  (1<<0)
> > +
> > +#define GEN8_GTCR			0x4274
> > +#define   GEN8_GTCR_INVALIDATE		  (1<<0)
> > +
> > +#define GUC_ARAT_C6DIS			0xA178
> > +
> > +#define GUC_SHIM_CONTROL		0xc064
> > +#define   GUC_DISABLE_SRAM_INIT_TO_ZEROES	(1<<0)
> > +#define   GUC_ENABLE_READ_CACHE_LOGIC		(1<<1)
> > +#define   GUC_ENABLE_MIA_CACHING		(1<<2)
> > +#define   GUC_GEN10_MSGCH_ENABLE		(1<<4)
> > +#define   GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	(1<<9)
> > +#define   GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA	(1<<10)
> > +#define   GUC_ENABLE_MIA_CLOCK_GATING		(1<<15)
> > +#define   GUC_GEN10_SHIM_WC_ENABLE		(1<<21)
> > +
> > +#define GUC_SHIM_CONTROL_VALUE	(GUC_DISABLE_SRAM_INIT_TO_ZEROES	| \
> > +				 GUC_ENABLE_READ_CACHE_LOGIC		| \
> > +				 GUC_ENABLE_MIA_CACHING			| \
> > +				 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA	| \
> > +				 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA)
> > +
> > +#define HOST2GUC_INTERRUPT		0xc4c8
> > +#define   HOST2GUC_TRIGGER		  (1<<0)
> > +
> > +#define DRBMISC1			0x1984
> > +#define   DOORBELL_ENABLE		  (1<<0)
> > +
> > +#define GEN8_DRBREGL(x)			(0x1000 + (x) * 8)
> > +#define   GEN8_DRB_VALID		  (1<<0)
> > +#define GEN8_DRBREGU(x)			(GEN8_DRBREGL(x) + 4)
> > +
> > +#define DE_GUCRMR			0x44054
> > +
> > +#define GUC_BCS_RCS_IER			0xC550
> > +#define GUC_VCS2_VCS1_IER		0xC554
> > +#define GUC_WD_VECS_IER			0xC558
> > +#define GUC_PM_P24C_IER			0xC55C
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> > new file mode 100644
> > index 0000000..18d7f20
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> > @@ -0,0 +1,245 @@
> > +/*
> > + * Copyright © 2014 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + */
> > +#ifndef _INTEL_GUC_FWIF_H
> > +#define _INTEL_GUC_FWIF_H
> > +
> > +/*
> > + * This file is partially autogenerated, although currently with some manual
> > + * fixups afterwards. In future, it should be entirely autogenerated, in order
> > + * to ensure that the definitions herein remain in sync with those used by the
> > + * GuC's own firmware.
> > + *
> > + * EDITING THIS FILE IS THEREFORE NOT RECOMMENDED - YOUR CHANGES MAY BE LOST.
> > + */
> > +
> > +#define GFXCORE_FAMILY_GEN8		11
> > +#define GFXCORE_FAMILY_GEN9		12
> > +#define GFXCORE_FAMILY_FORCE_ULONG	0x7fffffff
> > +
> > +#define GUC_CTX_PRIORITY_CRITICAL	0
> > +#define GUC_CTX_PRIORITY_HIGH		1
> > +#define GUC_CTX_PRIORITY_NORMAL		2
> > +#define GUC_CTX_PRIORITY_LOW		3
> > +
> > +#define GUC_MAX_GPU_CONTEXTS		1024
> > +#define	GUC_INVALID_CTX_ID		(GUC_MAX_GPU_CONTEXTS + 1)
> > +
> > +/* Work queue item header definitions */
> > +#define WQ_STATUS_ACTIVE		1
> > +#define WQ_STATUS_SUSPENDED		2
> > +#define WQ_STATUS_CMD_ERROR		3
> > +#define WQ_STATUS_ENGINE_ID_NOT_USED	4
> > +#define WQ_STATUS_SUSPENDED_FROM_RESET	5
> > +#define WQ_TYPE_SHIFT			0
> > +#define   WQ_TYPE_BATCH_BUF		(0x1 << WQ_TYPE_SHIFT)
> > +#define   WQ_TYPE_PSEUDO		(0x2 << WQ_TYPE_SHIFT)
> > +#define   WQ_TYPE_INORDER		(0x3 << WQ_TYPE_SHIFT)
> > +#define WQ_TARGET_SHIFT			10
> > +#define WQ_LEN_SHIFT			16
> > +#define WQ_NO_WCFLUSH_WAIT		(1 << 27)
> > +#define WQ_PRESENT_WORKLOAD		(1 << 28)
> > +#define WQ_WORKLOAD_SHIFT		29
> > +#define   WQ_WORKLOAD_GENERAL		(0 << WQ_WORKLOAD_SHIFT)
> > +#define   WQ_WORKLOAD_GPGPU		(1 << WQ_WORKLOAD_SHIFT)
> > +#define   WQ_WORKLOAD_TOUCH		(2 << WQ_WORKLOAD_SHIFT)
> > +
> > +#define WQ_RING_TAIL_SHIFT		20
> > +#define WQ_RING_TAIL_MASK		(0x7FF << WQ_RING_TAIL_SHIFT)
> > +
> > +#define GUC_DOORBELL_ENABLED		1
> > +#define GUC_DOORBELL_DISABLED		0
> > +
> > +#define GUC_CTX_DESC_ATTR_ACTIVE	(1 << 0)
> > +#define GUC_CTX_DESC_ATTR_PENDING_DB	(1 << 1)
> > +#define GUC_CTX_DESC_ATTR_KERNEL	(1 << 2)
> > +#define GUC_CTX_DESC_ATTR_PREEMPT	(1 << 3)
> > +#define GUC_CTX_DESC_ATTR_RESET		(1 << 4)
> > +#define GUC_CTX_DESC_ATTR_WQLOCKED	(1 << 5)
> > +#define GUC_CTX_DESC_ATTR_PCH		(1 << 6)
> > +
> > +/* The guc control data is 10 DWORDs */
> > +#define GUC_CTL_CTXINFO			0
> > +#define   GUC_CTL_CTXNUM_IN16_SHIFT	0
> > +#define   GUC_CTL_BASE_ADDR_SHIFT	12
> > +#define GUC_CTL_ARAT_HIGH		1
> > +#define GUC_CTL_ARAT_LOW		2
> > +#define GUC_CTL_DEVICE_INFO		3
> > +#define   GUC_CTL_GTTYPE_SHIFT		0
> > +#define   GUC_CTL_COREFAMILY_SHIFT	7
> > +#define GUC_CTL_LOG_PARAMS		4
> > +#define   GUC_LOG_VALID			(1 << 0)
> > +#define   GUC_LOG_NOTIFY_ON_HALF_FULL	(1 << 1)
> > +#define   GUC_LOG_ALLOC_IN_MEGABYTE	(1 << 3)
> > +#define   GUC_LOG_CRASH_PAGES		1
> > +#define   GUC_LOG_CRASH_SHIFT		4
> > +#define   GUC_LOG_DPC_PAGES		3
> > +#define   GUC_LOG_DPC_SHIFT		6
> > +#define   GUC_LOG_ISR_PAGES		3
> > +#define   GUC_LOG_ISR_SHIFT		9
> > +#define   GUC_LOG_BUF_ADDR_SHIFT	12
> > +#define GUC_CTL_PAGE_FAULT_CONTROL	5
> > +#define GUC_CTL_WA			6
> > +#define   GUC_CTL_WA_UK_BY_DRIVER	(1 << 3)
> > +#define GUC_CTL_FEATURE			7
> > +#define   GUC_CTL_VCS2_ENABLED		(1 << 0)
> > +#define   GUC_CTL_KERNEL_SUBMISSIONS	(1 << 1)
> > +#define   GUC_CTL_FEATURE2		(1 << 2)
> > +#define   GUC_CTL_POWER_GATING		(1 << 3)
> > +#define   GUC_CTL_DISABLE_SCHEDULER	(1 << 4)
> > +#define   GUC_CTL_PREEMPTION_LOG	(1 << 5)
> > +#define   GUC_CTL_ENABLE_SLPC		(1 << 7)
> > +#define GUC_CTL_DEBUG			8
> > +#define   GUC_LOG_VERBOSITY_SHIFT	0
> > +#define   GUC_LOG_VERBOSITY_LOW		(0 << GUC_LOG_VERBOSITY_SHIFT)
> > +#define   GUC_LOG_VERBOSITY_MED		(1 << GUC_LOG_VERBOSITY_SHIFT)
> > +#define   GUC_LOG_VERBOSITY_HIGH	(2 << GUC_LOG_VERBOSITY_SHIFT)
> > +#define   GUC_LOG_VERBOSITY_ULTRA	(3 << GUC_LOG_VERBOSITY_SHIFT)
> > +/* Verbosity range-check limits, without the shift */
> > +#define	  GUC_LOG_VERBOSITY_MIN		0
> > +#define	  GUC_LOG_VERBOSITY_MAX		3
> > +
> > +#define GUC_CTL_MAX_DWORDS		(GUC_CTL_DEBUG + 1)
> > +
> > +struct guc_doorbell_info {
> > +	u32 db_status;
> > +	u32 cookie;
> > +	u32 reserved[14];
> > +} __packed;
> > +
> > +union guc_doorbell_qw {
> > +	struct {
> > +		u32 db_status;
> > +		u32 cookie;
> > +	};
> > +	u64 value_qw;
> > +} __packed;
> > +
> > +#define GUC_MAX_DOORBELLS		256
> > +#define GUC_INVALID_DOORBELL_ID		(GUC_MAX_DOORBELLS)
> > +
> > +#define GUC_DB_SIZE			(PAGE_SIZE)
> > +#define GUC_WQ_SIZE			(PAGE_SIZE * 2)
> > +
> > +/* Work item for submitting workloads into work queue of GuC. */
> > +struct guc_wq_item {
> > +	u32 header;
> > +	u32 context_desc;
> > +	u32 ring_tail;
> > +	u32 fence_id;
> > +} __packed;
> > +
> > +struct guc_process_desc {
> > +	u32 context_id;
> > +	u64 db_base_addr;
> > +	u32 head;
> > +	u32 tail;
> > +	u32 error_offset;
> > +	u64 wq_base_addr;
> > +	u32 wq_size_bytes;
> > +	u32 wq_status;
> > +	u32 engine_presence;
> > +	u32 priority;
> > +	u32 reserved[30];
> > +} __packed;
> > +
> > +/* engine id and context id is packed into guc_execlist_context.context_id*/
> > +#define GUC_ELC_CTXID_OFFSET		0
> > +#define GUC_ELC_ENGINE_OFFSET		29
> > +
> > +/* The execlist context including software and HW information */
> > +struct guc_execlist_context {
> > +	u32 context_desc;
> > +	u32 context_id;
> > +	u32 ring_status;
> > +	u32 ring_lcra;
> > +	u32 ring_begin;
> > +	u32 ring_end;
> > +	u32 ring_next_free_location;
> > +	u32 ring_current_tail_pointer_value;
> > +	u8 engine_state_submit_value;
> > +	u8 engine_state_wait_value;
> > +	u16 pagefault_count;
> > +	u16 engine_submit_queue_count;
> > +} __packed;
> > +
> > +/*Context descriptor for communicating between uKernel and Driver*/
> > +struct guc_context_desc {
> > +	u32 sched_common_area;
> > +	u32 context_id;
> > +	u32 pas_id;
> > +	u8 engines_used;
> > +	u64 db_trigger_cpu;
> > +	u32 db_trigger_uk;
> > +	u64 db_trigger_phy;
> > +	u16 db_id;
> > +
> > +	struct guc_execlist_context lrc[I915_NUM_RINGS];
> > +
> > +	u8 attribute;
> > +
> > +	u32 priority;
> > +
> > +	u32 wq_sampled_tail_offset;
> > +	u32 wq_total_submit_enqueues;
> > +
> > +	u32 process_desc;
> > +	u32 wq_addr;
> > +	u32 wq_size;
> > +
> > +	u32 engine_presence;
> > +
> > +	u32 reserved0[1];
> > +	u64 reserved1[1];
> > +
> > +	u64 desc_private;
> > +} __packed;
> > +
> > +/* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
> > +enum host2guc_action {
> > +	HOST2GUC_ACTION_DEFAULT = 0x0,
> > +	HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
> > +	HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
> > +	HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
> > +	HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
> > +	HOST2GUC_ACTION_LIMIT
> > +};
> > +
> > +/*
> > + * The GuC sends its response to a command by overwriting the
> > + * command in SS0. The response is distinguishable from a command
> > + * by the fact that all the MASK bits are set. The remaining bits
> > + * give more detail.
> > + */
> > +#define	GUC2HOST_RESPONSE_MASK		((u32)0xF0000000)
> > +#define	GUC2HOST_IS_RESPONSE(x) 	((u32)(x) >= GUC2HOST_RESPONSE_MASK)
> > +#define	GUC2HOST_STATUS(x)		(GUC2HOST_RESPONSE_MASK | (x))
> > +
> > +/* GUC will return status back to SOFT_SCRATCH_O_REG */
> > +enum guc2host_status {
> > +	GUC2HOST_STATUS_SUCCESS = GUC2HOST_STATUS(0x0),
> > +	GUC2HOST_STATUS_ALLOCATE_DOORBELL_FAIL = GUC2HOST_STATUS(0x10),
> > +	GUC2HOST_STATUS_DEALLOCATE_DOORBELL_FAIL = GUC2HOST_STATUS(0x20),
> > +	GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0x0000F000)
> > +};
> > +
> > +#endif
> > -- 
> > 1.9.1
> > 
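For readers following the response convention described in the comment
above GUC2HOST_RESPONSE_MASK, here is a minimal standalone sketch of how
a value read back from SOFT_SCRATCH(0) could be classified. The macro
values are copied from the quoted header; the enum ss0_state and the
helper classify_ss0() are hypothetical names used only for illustration
and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted intel_guc_fwif.h (u32 spelled uint32_t). */
#define GUC2HOST_RESPONSE_MASK		((uint32_t)0xF0000000)
#define GUC2HOST_IS_RESPONSE(x)		((uint32_t)(x) >= GUC2HOST_RESPONSE_MASK)
#define GUC2HOST_STATUS(x)		(GUC2HOST_RESPONSE_MASK | (x))
#define GUC2HOST_STATUS_SUCCESS		GUC2HOST_STATUS(0x0)
#define HOST2GUC_ACTION_ALLOCATE_DOORBELL	0x10

/* A failure status keeps all four MASK bits set and adds detail bits. */
static_assert(GUC2HOST_STATUS(0x10) == 0xF0000010u, "status encoding");

/* Hypothetical classification of the SOFT_SCRATCH(0) contents. */
enum ss0_state { SS0_PENDING, SS0_OK, SS0_FAILED };

static enum ss0_state classify_ss0(uint32_t ss0)
{
	/* Top nibble not all ones: this is still the command we wrote. */
	if (!GUC2HOST_IS_RESPONSE(ss0))
		return SS0_PENDING;

	/* All MASK bits set: the GuC has replied; the rest is the detail. */
	return ss0 == GUC2HOST_STATUS_SUCCESS ? SS0_OK : SS0_FAILED;
}
```

Because every action code in enum host2guc_action is well below
0xF0000000, the ">=" comparison is equivalent to testing that the top
four bits are all set.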

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [PATCH 03/13 v4] drm/i915: Add GuC-related header files
  2015-07-21  6:38     ` Daniel Vetter
@ 2015-07-24 22:08       ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-24 22:08 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, Jul 21, 2015 at 08:38:35AM +0200, Daniel Vetter wrote:
> On Fri, Jul 17, 2015 at 05:38:05PM -0700, O'Rourke, Tom wrote:
> > On Thu, Jul 09, 2015 at 07:29:04PM +0100, Dave Gordon wrote:
> > > intel_guc_fwif.h contains the subset of the GuC interface that we
> > > will need for submission of commands through the GuC. These MUST
> > > be kept in sync with the definitions used by the GuC firmware, and
> > > updates to this file will (or should) be autogenerated from the
> > > source files used to build the firmware. Editing this file is
> > > therefore not recommended.
> > > 
> > > i915_guc_reg.h contains definitions of GuC-related hardware:
> > > registers, bitmasks, etc. These should match the BSpec.
> > > 
> > > v2:
> > >     Files renamed & resliced per review comments by Chris Wilson
> > > 
> > > v4:
> > >     Added DON'T-EDIT-ME warning [Tom O'Rourke]
> > > 
> > > Issue: VIZ-4884
> > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > > ---
> > Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
> 
> Merged up to this patch, thanks.
> -Daniel
>
[TOR:] Thank you. Please hold off merging the remaining patches in
this series until GuC firmware v3 is available on 01.org.
Tom

* Re: [PATCH 06/13 v4] drm/i915: Expose two LRC functions for GuC submission mode
  2015-07-09 18:29 ` [PATCH 06/13 v4] drm/i915: Expose two LRC functions for GuC submission mode Dave Gordon
@ 2015-07-24 22:12   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-24 22:12 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:07PM +0100, Dave Gordon wrote:
> GuC submission is basically execlist submission, but with the GuC
> handling the actual writes to the ELSP and the resulting context
> switch interrupts. So to prepare a context for submission via the
> GuC, we need some of the same functions used in execlist mode.
> This commit exposes two such functions, changing their names to
> better describe what they do (they're related to logical ring
> contexts rather than to execlists per se).
> 
> v2:
>     Replaces previous "drm/i915: Move execlists defines from .c to .h"
> 
> v3:
>     Incorporates a change to one of the functions exposed here that was
>     previously part of an internal patch, but which was omitted from
>     the version recently committed to drm-intel-nightly:
> 	7a01a0a drm/i915/lrc: Update PDPx registers with lri commands
>     So we reinstate this change here.
> 
> v4:
>     Drop v3 change, update function parameters due to collision with
>     8ee3615 drm/i915: Convert execlists_ctx_descriptor() for requests
> 
> Issue: VIZ-4884
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>


Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++-----------
>  drivers/gpu/drm/i915/intel_lrc.h |  3 +++
>  2 files changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d4f8b43..9e121d3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -261,11 +261,11 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
>  	return lrca >> 12;
>  }
>  
> -static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq)
> +uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
> +				     struct intel_engine_cs *ring)
>  {
> -	struct intel_engine_cs *ring = rq->ring;
>  	struct drm_device *dev = ring->dev;
> -	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>  	uint64_t desc;
>  	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
>  
> @@ -303,13 +303,13 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>  	uint64_t desc[2];
>  
>  	if (rq1) {
> -		desc[1] = execlists_ctx_descriptor(rq1);
> +		desc[1] = intel_lr_context_descriptor(rq1->ctx, rq1->ring);
>  		rq1->elsp_submitted++;
>  	} else {
>  		desc[1] = 0;
>  	}
>  
> -	desc[0] = execlists_ctx_descriptor(rq0);
> +	desc[0] = intel_lr_context_descriptor(rq0->ctx, rq0->ring);
>  	rq0->elsp_submitted++;
>  
>  	/* You must always write both descriptors in the order below. */
> @@ -328,7 +328,8 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>  	spin_unlock(&dev_priv->uncore.lock);
>  }
>  
> -static int execlists_update_context(struct drm_i915_gem_request *rq)
> +/* Update the ringbuffer pointer and tail offset in a saved context image */
> +void intel_lr_context_update(struct drm_i915_gem_request *rq)
>  {
>  	struct intel_engine_cs *ring = rq->ring;
>  	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
> @@ -358,17 +359,15 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
>  	}
>  
>  	kunmap_atomic(reg_state);
> -
> -	return 0;
>  }
>  
>  static void execlists_submit_requests(struct drm_i915_gem_request *rq0,
>  				      struct drm_i915_gem_request *rq1)
>  {
> -	execlists_update_context(rq0);
> +	intel_lr_context_update(rq0);
>  
>  	if (rq1)
> -		execlists_update_context(rq1);
> +		intel_lr_context_update(rq1);
>  
>  	execlists_elsp_write(rq0, rq1);
>  }
> @@ -2051,7 +2050,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
>  	reg_state[CTX_RING_TAIL+1] = 0;
>  	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
>  	/* Ring buffer start address is not known until the buffer is pinned.
> -	 * It is written to the context image in execlists_update_context()
> +	 * It is written to the context image in intel_lr_context_update()
>  	 */
>  	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
>  	reg_state[CTX_RING_BUFFER_CONTROL+1] =
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index e0299fb..6ecc0b3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -73,6 +73,9 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
>  void intel_lr_context_unpin(struct drm_i915_gem_request *req);
>  void intel_lr_context_reset(struct drm_device *dev,
>  			struct intel_context *ctx);
> +void intel_lr_context_update(struct drm_i915_gem_request *rq);
> +uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
> +				     struct intel_engine_cs *ring);
>  
>  /* Execlists */
>  int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
> -- 
> 1.9.1
> 

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-09 18:29 ` [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1 Dave Gordon
@ 2015-07-24 22:31   ` O'Rourke, Tom
  2015-07-27 22:41     ` Yu Dai
  0 siblings, 1 reply; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-24 22:31 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

[TOR:] When I see "phase 1" I also look for "phase 2".  
A subject that better describes the change in this patch 
would help.

On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> This adds the first of the data structures used to communicate with the
> GuC (the pool of guc_context structures).
> 
> We create a GuC-specific wrapper round the GEM object allocator as all
> GEM objects shared with the GuC must be pinned into GGTT space at an
> address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
> addresses is not accessible to the GuC (from the GuC's point of view,
> it's permanently reserved for other objects such as the BootROM & SRAM).
[TOR:] I would like a clarification on the excluded range.
The excluded range should be 0 to "size for guc within
WOPCM area" and not 0 to "size of WOPCM area".
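
To make the constraint concrete, here is a small standalone sketch of
the rule as the patch currently implements it: objects are pinned with
PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE, so any GGTT offset below that
value is excluded. The constant is copied from i915_guc_reg.h; the
helper guc_can_address() is a hypothetical name for illustration only,
and whether GUC_WOPCM_SIZE_VALUE is the right bound is exactly the
open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied from the quoted i915_guc_reg.h: 0x80 pages of 4KB. */
#define GUC_WOPCM_SIZE_VALUE	(0x80 << 12)	/* 512KB */

static_assert(GUC_WOPCM_SIZE_VALUE == 512 * 1024, "bias is 512KB");

/*
 * Hypothetical check: an object pinned at ggtt_offset is usable by the
 * GuC only if it starts at or above the bias, i.e. outside the range
 * [0, GUC_WOPCM_SIZE_VALUE).
 */
static bool guc_can_address(uint64_t ggtt_offset)
{
	return ggtt_offset >= GUC_WOPCM_SIZE_VALUE;
}
```

Note that the range here is half-open: an object starting exactly at
offset 0x80000 is acceptable under this rule.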

> 
> Later, we will need to allocate additional GuC-sharable objects for the
> submission client(s) and the GuC's debug log.
> 
> v2:
>     Remove redundant initialisation [Chris Wilson]
>     Defer adding struct members until needed [Chris Wilson]
>     Local functions should pass dev_priv rather than dev [Chris Wilson]
> 
> v4:
>     Rebased
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile              |   3 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c | 114 +++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_guc.h           |   7 ++
>  drivers/gpu/drm/i915/intel_guc_loader.c    |  19 +++++
>  4 files changed, 142 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index e604cfe..23f5612 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -40,7 +40,8 @@ i915-y += i915_cmd_parser.o \
>  	  intel_uncore.o
>  
>  # general-purpose microcontroller (GuC) support
> -i915-y += intel_guc_loader.o
> +i915-y += intel_guc_loader.o \
> +	  i915_guc_submission.o
>  
>  # autogenerated null render state
>  i915-y += intel_renderstate_gen6.o \
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> new file mode 100644
> index 0000000..70a0405
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -0,0 +1,114 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +#include <linux/firmware.h>
> +#include <linux/circ_buf.h>
> +#include "i915_drv.h"
> +#include "intel_guc.h"
> +
> +/**
> + * gem_allocate_guc_obj() - Allocate gem object for GuC usage
> + * @dev:	drm device
> + * @size:	size of object
> + *
> + * This is a wrapper to create a gem obj. In order to use it inside GuC, the
> + * object needs to be pinned lifetime. Also we must pin it to gtt space other
> + * than [0, GUC_WOPCM_SIZE] because this range is reserved inside GuC.
> + *
> + * Return:	A drm_i915_gem_object if successful, otherwise NULL.
> + */
> +static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
> +							u32 size)
> +{
> +	struct drm_i915_gem_object *obj;
> +
> +	obj = i915_gem_alloc_object(dev, size);
> +	if (!obj)
> +		return NULL;
> +
> +	if (i915_gem_object_get_pages(obj)) {
> +		drm_gem_object_unreference(&obj->base);
> +		return NULL;
> +	}
> +
> +	if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
> +			PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) {
> +		drm_gem_object_unreference(&obj->base);
> +		return NULL;
> +	}
> +
> +	return obj;
> +}
> +
> +/**
> + * gem_release_guc_obj() - Release gem object allocated for GuC usage
> + * @obj:	gem obj to be released
> +  */
> +static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
> +{
> +	if (!obj)
> +		return;
> +
> +	if (i915_gem_obj_is_pinned(obj))
> +		i915_gem_object_ggtt_unpin(obj);
> +
> +	drm_gem_object_unreference(&obj->base);
> +}
> +
> +/*
> + * Set up the memory resources to be shared with the GuC.  At this point,
> + * we require just one object that can be mapped through the GGTT.
> + */
> +int i915_guc_submission_init(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	const size_t ctxsize = sizeof(struct guc_context_desc);
> +	const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
> +	const size_t gemsize = round_up(poolsize, PAGE_SIZE);
> +	struct intel_guc *guc = &dev_priv->guc;
> +
> +	if (!i915.enable_guc_submission)
> +		return 0; /* not enabled  */
> +
> +	if (guc->ctx_pool_obj)
> +		return 0; /* already allocated */
> +
> +	guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
> +	if (!guc->ctx_pool_obj)
> +		return -ENOMEM;
> +
> +	ida_init(&guc->ctx_ids);
> +
> +	return 0;
> +}
> +
> +void i915_guc_submission_fini(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc *guc = &dev_priv->guc;
> +
> +	if (guc->ctx_pool_obj)
> +		ida_destroy(&guc->ctx_ids);
> +	gem_release_guc_obj(guc->ctx_pool_obj);
> +	guc->ctx_pool_obj = NULL;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 2846b6d..be3cad8 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -56,6 +56,9 @@ struct intel_guc {
>  	struct intel_guc_fw guc_fw;
>  
>  	uint32_t log_flags;
> +
> +	struct drm_i915_gem_object *ctx_pool_obj;
> +	struct ida ctx_ids;
>  };
>  
>  /* intel_guc_loader.c */
> @@ -64,4 +67,8 @@ extern int intel_guc_ucode_load(struct drm_device *dev);
>  extern void intel_guc_ucode_fini(struct drm_device *dev);
>  extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
>  
> +/* i915_guc_submission.c */
> +int i915_guc_submission_init(struct drm_device *dev);
> +void i915_guc_submission_fini(struct drm_device *dev);
> +
[TOR:] i915_guc_submission_init gets used in this patch.
Unexpectedly, i915_guc_submission_fini does not get used
in this patch.

A later patch in this series adds the call to
i915_guc_submission_fini in intel_guc_ucode_fini.
Should that call be added in this patch?

>  #endif
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> index 2080bca..e5d7136 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -128,6 +128,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
>  			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
>  	}
>  
> +	/* If GuC scheduling is enabled, setup params here. */
[TOR:] I assume from this that "GuC scheduling" == "GuC submission".
This is a little confusing.  I recommend either rewording
"GuC scheduling" or adding comment text to explain.

> +	if (i915.enable_guc_submission) {
> +		u32 pgs = i915_gem_obj_ggtt_offset(dev_priv->guc.ctx_pool_obj);
> +		u32 ctx_in_16 = GUC_MAX_GPU_CONTEXTS / 16;
> +
> +		pgs >>= PAGE_SHIFT;
> +		params[GUC_CTL_CTXINFO] = (pgs << GUC_CTL_BASE_ADDR_SHIFT) |
> +			(ctx_in_16 << GUC_CTL_CTXNUM_IN16_SHIFT);
> +
> +		params[GUC_CTL_FEATURE] |= GUC_CTL_KERNEL_SUBMISSIONS;
> +
> +		/* Unmask this bit to enable GuC scheduler */
> +		params[GUC_CTL_FEATURE] &= ~GUC_CTL_DISABLE_SCHEDULER;
> +	}
> +
>  	I915_WRITE(SOFT_SCRATCH(0), 0);
>  
>  	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
> @@ -450,6 +465,10 @@ int intel_guc_ucode_load(struct drm_device *dev)
>  		break;
>  	}
>  
> +	err = i915_guc_submission_init(dev);
> +	if (err)
> +		goto fail;
> +
>  	err = guc_ucode_xfer(dev_priv);
>  	if (err)
>  		goto fail;
> -- 
> 1.9.1
> 
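As a worked example of the GUC_CTL_CTXINFO packing in
set_guc_init_params() above, the following standalone sketch reproduces
the arithmetic. The shift values and GUC_MAX_GPU_CONTEXTS are copied
from intel_guc_fwif.h; PAGE_SHIFT is assumed to be 12 (4KB pages), and
guc_ctxinfo_param() is a hypothetical helper name, not something the
patch defines.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT			12	/* assumption: 4KB pages */
#define GUC_MAX_GPU_CONTEXTS		1024
#define GUC_CTL_CTXNUM_IN16_SHIFT	0
#define GUC_CTL_BASE_ADDR_SHIFT		12

/* The context count is passed to the GuC in units of 16 contexts. */
static_assert(GUC_MAX_GPU_CONTEXTS % 16 == 0, "pool size is a multiple of 16");

/* Hypothetical helper mirroring the packing done in the patch. */
static uint32_t guc_ctxinfo_param(uint32_t pool_ggtt_offset)
{
	uint32_t pgs = pool_ggtt_offset >> PAGE_SHIFT;	/* offset in pages */
	uint32_t ctx_in_16 = GUC_MAX_GPU_CONTEXTS / 16;	/* 64 */

	return (pgs << GUC_CTL_BASE_ADDR_SHIFT) |
	       (ctx_in_16 << GUC_CTL_CTXNUM_IN16_SHIFT);
}
```

With a pool pinned at GGTT offset 0x00100000, for instance, the page
number is 0x100 and the resulting parameter is 0x00100040: the
page-aligned address in the upper 20 bits and the count of 64 in the
low bits.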

* Re: [PATCH 08/13 v4] drm/i915: Enable GuC firmware log
  2015-07-09 18:29 ` [PATCH 08/13 v4] drm/i915: Enable GuC firmware log Dave Gordon
@ 2015-07-24 22:40   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-24 22:40 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:09PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> Allocate a GEM object to hold GuC log data. A debugfs interface
> (i915_guc_log_dump) is provided to print out the log content.
> 
> v2:
>     Add struct members at point of use [Chris Wilson]
> 
> v4:
>     Rebased
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>

Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        | 29 +++++++++++++++++++
>  drivers/gpu/drm/i915/i915_guc_submission.c | 46 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>  3 files changed, 76 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 9ff5f17..13e37d1 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2397,6 +2397,34 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +static int i915_guc_log_dump(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = m->private;
> +	struct drm_device *dev = node->minor->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj;
> +	u32 *log;
> +	int i = 0, pg;
> +
> +	if (!log_obj)
> +		return 0;
> +
> +	for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) {
> +		log = kmap_atomic(i915_gem_object_get_page(log_obj, pg));
> +
> +		for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4)
> +			seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
> +				   *(log + i), *(log + i + 1),
> +				   *(log + i + 2), *(log + i + 3));
> +
> +		kunmap_atomic(log);
> +	}
> +
> +	seq_putc(m, '\n');
> +
> +	return 0;
> +}
> +
>  static int i915_edp_psr_status(struct seq_file *m, void *data)
>  {
>  	struct drm_info_node *node = m->private;
> @@ -5112,6 +5140,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
>  	{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
>  	{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
>  	{"i915_guc_load_status", i915_guc_load_status_info, 0},
> +	{"i915_guc_log_dump", i915_guc_log_dump, 0},
>  	{"i915_frequency_info", i915_frequency_info, 0},
>  	{"i915_hangcheck_info", i915_hangcheck_info, 0},
>  	{"i915_drpc_info", i915_drpc_info, 0},
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 70a0405..e9d46d6 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -75,6 +75,47 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
>  	drm_gem_object_unreference(&obj->base);
>  }
>  
> +static void guc_create_log(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct drm_i915_gem_object *obj;
> +	unsigned long offset;
> +	uint32_t size, flags;
> +
> +	if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
> +		return;
> +
> +	if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
> +		i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
> +
> +	/* The first page holds the log buffer state. Allocate one
> +	 * extra page for each of the others in case of overlap */
> +	size = (1 + GUC_LOG_DPC_PAGES + 1 +
> +		GUC_LOG_ISR_PAGES + 1 +
> +		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
> +
> +	obj = guc->log_obj;
> +	if (!obj) {
> +		obj = gem_allocate_guc_obj(dev_priv->dev, size);
> +		if (!obj) {
> +			/* logging will be off */
> +			i915.guc_log_level = -1;
> +			return;
> +		}
> +
> +		guc->log_obj = obj;
> +	}
> +
> +	/* each allocated unit is a page */
> +	flags = GUC_LOG_VALID | GUC_LOG_NOTIFY_ON_HALF_FULL |
> +		(GUC_LOG_DPC_PAGES << GUC_LOG_DPC_SHIFT) |
> +		(GUC_LOG_ISR_PAGES << GUC_LOG_ISR_SHIFT) |
> +		(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
[TOR:] How does the "log half full" interrupt get handled?

> +
> +	offset = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; /* in pages */
> +	guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
> +}
> +
>  /*
>   * Set up the memory resources to be shared with the GuC.  At this point,
>   * we require just one object that can be mapped through the GGTT.
> @@ -99,6 +140,8 @@ int i915_guc_submission_init(struct drm_device *dev)
>  
>  	ida_init(&guc->ctx_ids);
>  
> +	guc_create_log(guc);
> +
>  	return 0;
>  }
>  
> @@ -107,6 +150,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_guc *guc = &dev_priv->guc;
>  
> +	gem_release_guc_obj(dev_priv->guc.log_obj);
> +	guc->log_obj = NULL;
> +
>  	if (guc->ctx_pool_obj)
>  		ida_destroy(&guc->ctx_ids);
>  	gem_release_guc_obj(guc->ctx_pool_obj);
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index be3cad8..5b51b05 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -56,6 +56,7 @@ struct intel_guc {
>  	struct intel_guc_fw guc_fw;
>  
>  	uint32_t log_flags;
> +	struct drm_i915_gem_object *log_obj;
>  
>  	struct drm_i915_gem_object *ctx_pool_obj;
>  	struct ida ctx_ids;
> -- 
> 1.9.1
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 09/13 v4] drm/i915: Implementation of GuC client
  2015-07-09 18:29 ` [PATCH 09/13 v4] drm/i915: Implementation of GuC client Dave Gordon
@ 2015-07-25  2:31   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-25  2:31 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:10PM +0100, Dave Gordon wrote:
> A GuC client has its own doorbell and workqueue. It maintains the
> doorbell cache line, process description object and work queue item.
> 
> A default guc_client is created for the i915 driver to use for
> normal-priority in-order submission.
> 
> Note that the created client is not yet ready for use; doorbell
> allocation will fail as we haven't yet linked the GuC's context
> descriptor to the default contexts for each ring (see later patch).
> 
> v2:
>     Defer adding structure members until needed [Chris Wilson]
>     Rationalise type declarations [Chris Wilson]
> 
> v4:
>     Rebased
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>

[TOR:] I had some non-critical questions below.

Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 649 +++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_guc.h           |  42 ++
>  drivers/gpu/drm/i915/intel_guc_loader.c    |  12 +
>  3 files changed, 703 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index e9d46d6..25d8807 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -27,6 +27,512 @@
>  #include "intel_guc.h"
>  
>  /**
> + * DOC: GuC Client
> + *
> + * i915_guc_client:
> + * We use the term client to avoid confusion with contexts. An i915_guc_client
> + * is equivalent to the GuC object guc_context_desc. This context descriptor is
> + * allocated from a pool of 1024 entries. The kernel driver allocates a doorbell
> + * and a workqueue for it, along with the process descriptor
> + * (guc_process_desc), which is mapped into client space, so that the client
> + * can write a Work Item and then ring the doorbell.
> + *
> + * To simplify the implementation, we allocate one gem object that contains all
> + * pages for doorbell, process descriptor and workqueue.
> + *
> + * The Scratch registers:
> + * There are 16 MMIO-based registers starting at 0xC180. The kernel driver
> + * writes a value to the action register (SOFT_SCRATCH_0) along with any data.
> + * It then triggers an interrupt on the GuC via another register write (0xC4C8).
> + * The firmware writes a success/fail code back to the action register after
> + * it processes the request. The kernel driver polls for this update and then
> + * proceeds.
> + * See host2guc_action()
> + *
> + * Doorbells:
> + * Doorbells are interrupts to uKernel. A doorbell is a single cache line (QW)
> + * mapped into process space.
> + *
> + * Work Items:
> + * There are several types of work items that the host may place into a
> + * workqueue, each with its own requirements and limitations. Currently only
> + * WQ_TYPE_INORDER is needed to support legacy submission via GuC, which
> + * represents an in-order queue. The kernel driver packs the ring tail pointer
> + * and an ELSP context descriptor dword into the Work Item.
> + * See guc_add_workqueue_item()
> + *
> + */
> +
> +/*
> + * Read GuC command/status register (SOFT_SCRATCH_0)
> + * Return true if it contains a response rather than a command
> + */
> +static inline bool host2guc_action_response(struct drm_i915_private *dev_priv,
> +					    u32 *status)
> +{
> +	u32 val = I915_READ(SOFT_SCRATCH(0));
> +	*status = val;
> +	return GUC2HOST_IS_RESPONSE(val);
> +}
> +
> +static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	u32 status;
> +	int i;
> +	int ret;
> +
> +	if (WARN_ON(len < 1 || len > 15))
> +		return -EINVAL;
> +

[TOR:] Would it be good for host2guc_action to take a 
forcewake?  There are several writes and polling reads 
for completion.  Taking a forcewake could avoid surplus 
forcewakes for each register access.
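
As a rough illustration of the trade-off raised here, the following
user-space toy model counts how often the hardware would have to be woken
for one action. It is only a sketch: struct fw_model, fw_get(), fw_put()
and action_wake_cycles() are hypothetical names invented for this example,
not i915 interfaces, and the model ignores details of the real driver such
as any deferred release of the wake reference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference-counted wake state (not an i915 structure). */
struct fw_model {
	unsigned int refcount;		/* outstanding wake references */
	unsigned int wake_cycles;	/* number of 0 -> 1 transitions */
};

static void fw_get(struct fw_model *fw)
{
	if (fw->refcount++ == 0)
		fw->wake_cycles++;
}

static void fw_put(struct fw_model *fw)
{
	assert(fw->refcount > 0);
	fw->refcount--;
}

/* One register access: wakes the hardware only if nobody holds it awake. */
static void reg_access(struct fw_model *fw)
{
	fw_get(fw);
	/* the MMIO read or write would happen here */
	fw_put(fw);
}

/*
 * Model one action: 'writes' scratch-register writes followed by 'polls'
 * status reads. With 'hold' set, a single reference brackets the sequence.
 */
static unsigned int action_wake_cycles(unsigned int writes,
				       unsigned int polls, int hold)
{
	struct fw_model fw = { 0, 0 };
	unsigned int i;

	if (hold)
		fw_get(&fw);
	for (i = 0; i < writes + polls; i++)
		reg_access(&fw);
	if (hold)
		fw_put(&fw);

	return fw.wake_cycles;
}
```

In this model, two writes and five polls cost seven wake cycles when each
access wakes the hardware on its own, and one when the whole sequence is
bracketed by a single reference.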

> +	spin_lock(&dev_priv->guc.host2guc_lock);
> +
> +	dev_priv->guc.action_count += 1;
> +	dev_priv->guc.action_cmd = data[0];
> +
> +	for (i = 0; i < len; i++)
> +		I915_WRITE(SOFT_SCRATCH(i), data[i]);
> +
> +	POSTING_READ(SOFT_SCRATCH(i - 1));
> +
> +	I915_WRITE(HOST2GUC_INTERRUPT, HOST2GUC_TRIGGER);
> +
> +	ret = wait_for_atomic(host2guc_action_response(dev_priv, &status), 10);
[TOR:] Why 10?

> +	if (status != GUC2HOST_STATUS_SUCCESS) {
> +		/* either GuC doesn't respond, which is a TIMEOUT,
> +		 * or a failure code is returned. */
> +		if (ret != -ETIMEDOUT)
> +			ret = -EIO;
> +
> +		DRM_ERROR("GUC: host2guc action 0x%X failed. ret=%d "
> +				"status=0x%08X response=0x%08X\n",
> +				data[0], ret, status,
> +				I915_READ(SOFT_SCRATCH(15)));
> +
> +		dev_priv->guc.action_fail += 1;
> +		dev_priv->guc.action_err = ret;
> +	}
> +	dev_priv->guc.action_status = status;
> +
> +	spin_unlock(&dev_priv->guc.host2guc_lock);
> +
> +	return ret;
> +}
> +
> +/*
> + * Tell the GuC to allocate or deallocate a specific doorbell
> + */
> +
> +static int host2guc_allocate_doorbell(struct intel_guc *guc,
> +				      struct i915_guc_client *client)
> +{
> +	u32 data[2];
> +
> +	data[0] = HOST2GUC_ACTION_ALLOCATE_DOORBELL;
> +	data[1] = client->ctx_index;
> +
> +	return host2guc_action(guc, data, 2);
> +}
> +
> +static int host2guc_release_doorbell(struct intel_guc *guc,
> +				     struct i915_guc_client *client)
> +{
> +	u32 data[2];
> +
> +	data[0] = HOST2GUC_ACTION_DEALLOCATE_DOORBELL;
> +	data[1] = client->ctx_index;
> +
> +	return host2guc_action(guc, data, 2);
> +}
> +
> +/*
> + * Initialise, update, or clear doorbell data shared with the GuC
> + *
> + * These functions modify shared data and so need access to the mapped
> + * client object which contains the page being used for the doorbell
> + */
> +
> +static void guc_init_doorbell(struct intel_guc *guc,
> +			      struct i915_guc_client *client)
> +{
> +	struct guc_doorbell_info *doorbell;
> +	void *base;
> +
> +	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
> +	doorbell = base + client->doorbell_offset;
> +
> +	doorbell->db_status = 1;
> +	doorbell->cookie = 0;
> +
> +	kunmap_atomic(base);
> +}
> +
> +static int guc_ring_doorbell(struct i915_guc_client *gc)
> +{
> +	struct guc_process_desc *desc;
> +	union guc_doorbell_qw db_cmp, db_exc, db_ret;
> +	union guc_doorbell_qw *db;
> +	void *base;
> +	int attempt = 2, ret = -EAGAIN;
> +
> +	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> +	desc = base + gc->proc_desc_offset;
> +
> +	/* Update the tail so it is visible to GuC */
> +	desc->tail = gc->wq_tail;
> +
> +	/* current cookie */
> +	db_cmp.db_status = GUC_DOORBELL_ENABLED;
> +	db_cmp.cookie = gc->cookie;
> +
> +	/* cookie to be updated */
> +	db_exc.db_status = GUC_DOORBELL_ENABLED;
> +	db_exc.cookie = gc->cookie + 1;
> +	if (db_exc.cookie == 0)
> +		db_exc.cookie = 1;
> +
> +	/* pointer of current doorbell cacheline */
> +	db = base + gc->doorbell_offset;
> +
> +	while (attempt--) {
> +		/* lets ring the doorbell */
> +		db_ret.value_qw = atomic64_cmpxchg((atomic64_t *)db,
> +			db_cmp.value_qw, db_exc.value_qw);
> +
> +		/* if the exchange was successfully executed */
> +		if (db_ret.value_qw == db_cmp.value_qw) {
> +			/* db was successfully rung */
> +			gc->cookie = db_exc.cookie;
> +			ret = 0;
> +			break;
> +		}
> +
> +		/* XXX: doorbell was lost and need to acquire it again */
> +		if (db_ret.db_status == GUC_DOORBELL_DISABLED)
> +			break;
> +
> +		DRM_ERROR("Cookie mismatch. Expected %d, returned %d\n",
> +			  db_cmp.cookie, db_ret.cookie);
> +
> +		/* update the cookie to newly read cookie from GuC */
> +		db_cmp.cookie = db_ret.cookie;
> +		db_exc.cookie = db_ret.cookie + 1;
> +		if (db_exc.cookie == 0)
> +			db_exc.cookie = 1;
> +	}
> +
> +	kunmap_atomic(base);
> +	return ret;
> +}
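
The cookie handling in guc_ring_doorbell() above can be exercised in user
space with C11 atomics. This is a minimal sketch under stated assumptions:
the quadword layout (status in the low dword, cookie in the high dword) and
the names db_pack(), next_cookie() and ring_doorbell() are made up for the
example, and atomic_compare_exchange_strong() stands in for the kernel's
atomic64_cmpxchg().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define DB_DISABLED	0u
#define DB_ENABLED	1u

/* Assumed layout: status in the low dword, cookie in the high dword. */
static uint64_t db_pack(uint32_t status, uint32_t cookie)
{
	return (uint64_t)cookie << 32 | status;
}

/* Advance the cookie, skipping 0 on wrap-around as the patch does. */
static uint32_t next_cookie(uint32_t cookie)
{
	cookie++;
	return cookie ? cookie : 1;
}

/*
 * Try to ring the doorbell. *cookie is the value we believe is current and
 * is advanced on success. Returns 0 on success, -EAGAIN otherwise.
 */
static int ring_doorbell(_Atomic uint64_t *db, uint32_t *cookie)
{
	uint32_t cur = *cookie;
	int attempt;

	for (attempt = 0; attempt < 2; attempt++) {
		uint32_t next = next_cookie(cur);
		uint64_t expected = db_pack(DB_ENABLED, cur);

		if (atomic_compare_exchange_strong(db, &expected,
						   db_pack(DB_ENABLED, next))) {
			*cookie = next;
			return 0;
		}

		/* 'expected' now holds what was actually in the doorbell */
		if ((uint32_t)expected == DB_DISABLED)
			return -EAGAIN;

		/* cookie mismatch: resynchronise and retry once */
		cur = (uint32_t)(expected >> 32);
	}

	return -EAGAIN;
}
```

With a stale cookie the first exchange fails, the sketch picks up the value
found in the doorbell, and the second attempt succeeds; a disabled doorbell
is reported straight away.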
> +
> +static void guc_disable_doorbell(struct intel_guc *guc,
> +				 struct i915_guc_client *client)
> +{
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct guc_doorbell_info *doorbell;
> +	void *base;
> +	int drbreg = GEN8_DRBREGL(client->doorbell_id);
> +	int value;
> +
> +	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
> +	doorbell = base + client->doorbell_offset;
> +
> +	doorbell->db_status = 0;
> +
> +	kunmap_atomic(base);
> +
> +	I915_WRITE(drbreg, I915_READ(drbreg) & ~GEN8_DRB_VALID);
> +
> +	value = I915_READ(drbreg);
> +	WARN_ON((value & GEN8_DRB_VALID) != 0);
> +
> +	I915_WRITE(GEN8_DRBREGU(client->doorbell_id), 0);
> +	I915_WRITE(drbreg, 0);
> +
> +	/* XXX: wait for any interrupts */
> +	/* XXX: wait for workqueue to drain */
> +}
> +
> +/*
> + * Select, assign and release doorbell cachelines
> + *
> + * These functions track which doorbell cachelines are in use.
> + * The data they manipulate is protected by the host2guc lock.
> + */
> +
> +static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
> +{
> +	const uint32_t cacheline_size = boot_cpu_data.x86_clflush_size;
> +	uint32_t offset;
> +
> +	spin_lock(&guc->host2guc_lock);
> +
> +	/* Doorbell uses a single cache line within a page */
> +	offset = guc->db_cacheline & PAGE_MASK;
> +
> +	/* Moving to next cache line to reduce contention */
> +	guc->db_cacheline += cacheline_size;
> +
> +	spin_unlock(&guc->host2guc_lock);
> +
> +	return offset;
> +}
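
One detail in select_doorbell_cacheline() may be worth double-checking. In
the kernel, PAGE_MASK is defined as ~(PAGE_SIZE - 1), so
"db_cacheline & PAGE_MASK" keeps the page-aligned part of the value rather
than the offset within the page. If the intent is "a single cache line
within a page", as the comment says, the in-page offset would be
"& ~PAGE_MASK". The user-space sketch below shows the difference; the
SKETCH_* macros and helper names are invented for the example and assume
4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u				/* assumes 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))	/* same shape as PAGE_MASK */

/* Page-aligned part of x: what "x & PAGE_MASK" yields. */
static uint32_t page_base(uint32_t x)
{
	return x & SKETCH_PAGE_MASK;
}

/* Offset of x within its page: what "x & ~PAGE_MASK" yields. */
static uint32_t page_offset(uint32_t x)
{
	return x & ~SKETCH_PAGE_MASK;
}
```

For a running counter of 64 (one cache line in), page_base() gives 0 while
page_offset() gives 64; for 4160 they give 4096 and 64 respectively.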
> +
> +static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
> +{
> +	/* The bitmap is split into two halves - high and normal priority. */
> +	const uint16_t half = GUC_MAX_DOORBELLS / 2;
> +	const uint16_t start = (priority <= GUC_CTX_PRIORITY_HIGH) ? half : 0;
> +	const uint16_t end = start + half;
> +	uint16_t id;
> +
> +	spin_lock(&guc->host2guc_lock);
> +	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
> +	if (id == end)
> +		id = GUC_INVALID_DOORBELL_ID;
> +	else
> +		bitmap_set(guc->doorbell_bitmap, id, 1);
> +	spin_unlock(&guc->host2guc_lock);
> +
> +	return id;
> +}
> +
> +static void release_doorbell(struct intel_guc *guc, uint16_t id)
> +{
> +	spin_lock(&guc->host2guc_lock);
> +	bitmap_clear(guc->doorbell_bitmap, id, 1);
> +	spin_unlock(&guc->host2guc_lock);
> +}
> +
> +/*
> + * Initialise the process descriptor shared with the GuC firmware.
> + */
> +static void guc_init_proc_desc(struct intel_guc *guc,
> +			       struct i915_guc_client *client)
> +{
> +	struct guc_process_desc *desc;
> +	void *base;
> +
> +	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
> +	desc = base + client->proc_desc_offset;
> +
> +	memset(desc, 0, sizeof(*desc));
> +
> +	/*
> +	 * XXX: pDoorbell and WQVBaseAddress are pointers in process address
> +	 * space for ring3 clients (set them as in mmap_ioctl) or kernel
> +	 * space for kernel clients (map on demand instead? May make debug
> +	 * easier to have it mapped).
> +	 */
> +	desc->wq_base_addr = 0;
> +	desc->db_base_addr = 0;
> +
> +	desc->context_id = client->ctx_index;
> +	desc->wq_size_bytes = client->wq_size;
> +	desc->wq_status = WQ_STATUS_ACTIVE;
> +	desc->priority = client->priority;
> +
> +	kunmap_atomic(base);
> +}
> +
> +/*
> + * Initialise/clear the context descriptor shared with the GuC firmware.
> + *
> + * This descriptor tells the GuC where (in GGTT space) to find the important
> + * data structures relating to this client (doorbell, process descriptor,
> + * write queue, etc).
> + */
> +
> +static void guc_init_ctx_desc(struct intel_guc *guc,
> +			      struct i915_guc_client *client)
> +{
> +	struct guc_context_desc desc;
> +	struct sg_table *sg;
> +
> +	memset(&desc, 0, sizeof(desc));
> +
> +	desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
> +	desc.context_id = client->ctx_index;
> +	desc.priority = client->priority;
> +	desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
> +			    (1 << VECS) | (1 << VCS2); /* all engines */
> +	desc.db_id = client->doorbell_id;
> +
> +	/*
> +	 * The CPU address is only needed at certain points, so kmap_atomic on
> +	 * demand instead of storing it in the ctx descriptor.
> +	 * XXX: May make debug easier to have it mapped
> +	 */
> +	desc.db_trigger_cpu = 0;
> +	desc.db_trigger_uk = client->doorbell_offset +
> +		i915_gem_obj_ggtt_offset(client->client_obj);
> +	desc.db_trigger_phy = client->doorbell_offset +
> +		sg_dma_address(client->client_obj->pages->sgl);
> +
> +	desc.process_desc = client->proc_desc_offset +
> +		i915_gem_obj_ggtt_offset(client->client_obj);
> +
> +	desc.wq_addr = client->wq_offset +
> +		i915_gem_obj_ggtt_offset(client->client_obj);
> +
> +	desc.wq_size = client->wq_size;
> +
> +	/*
> +	 * XXX: Take LRCs from an existing intel_context if this is not an
> +	 * IsKMDCreatedContext client
> +	 */
> +	desc.desc_private = (uintptr_t)client;
> +
> +	/* Pool context is pinned already */
> +	sg = guc->ctx_pool_obj->pages;
> +	sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
> +			     sizeof(desc) * client->ctx_index);
> +}
> +
> +static void guc_fini_ctx_desc(struct intel_guc *guc,
> +			      struct i915_guc_client *client)
> +{
> +	struct guc_context_desc desc;
> +	struct sg_table *sg;
> +
> +	memset(&desc, 0, sizeof(desc));
> +
> +	sg = guc->ctx_pool_obj->pages;
> +	sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
> +			     sizeof(desc) * client->ctx_index);
> +}
> +
> +/* Reserve space for one workqueue item and return its offset via @offset */
> +static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +{
> +	struct guc_process_desc *desc;
> +	void *base;
> +	u32 size = sizeof(struct guc_wq_item);
> +	int ret = 0, timeout_counter = 200;
> +	unsigned long flags;
> +
> +	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> +	desc = base + gc->proc_desc_offset;
> +
> +	while (timeout_counter-- > 0) {
> +		spin_lock_irqsave(&gc->wq_lock, flags);
> +
> +		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> +				gc->wq_size) >= size, 1);
> +
> +		if (!ret) {
> +			*offset = gc->wq_tail;
> +
> +			/* advance the tail for next workqueue item */
> +			gc->wq_tail += size;
> +			gc->wq_tail &= gc->wq_size - 1;
> +
> +			/* this will break the loop */
> +			timeout_counter = 0;
> +		}
> +
> +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> +	};
> +
> +	kunmap_atomic(base);
> +
> +	return ret;
> +}
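
The space check and tail advance in guc_get_workqueue_space() can be tried
out in isolation. This is a minimal sketch, assuming a power-of-two ring as
the masking in the patch implies; ring_space() follows the arithmetic of
the kernel's CIRC_SPACE() macro, and wq_reserve() is a hypothetical helper
that leaves out the locking and the timed retry.

```c
#include <assert.h>
#include <stdint.h>

/* Free bytes in a power-of-two ring, same arithmetic as CIRC_SPACE(). */
static uint32_t ring_space(uint32_t producer, uint32_t consumer, uint32_t size)
{
	return (consumer - (producer + 1)) & (size - 1);
}

/*
 * Reserve item_size bytes at the tail. On success, store the old tail in
 * *offset, advance *tail with wrap-around and return 0. Return -1 if the
 * ring does not have enough room.
 */
static int wq_reserve(uint32_t *tail, uint32_t head, uint32_t wq_size,
		      uint32_t item_size, uint32_t *offset)
{
	assert((wq_size & (wq_size - 1)) == 0);	/* must be a power of two */

	if (ring_space(*tail, head, wq_size) < item_size)
		return -1;

	*offset = *tail;
	*tail = (*tail + item_size) & (wq_size - 1);
	return 0;
}
```

With the sizes mentioned in the patch (a two-page, 8192-byte queue and
16-byte items), a tail of 8176 wraps to 0 after one reservation, and a tail
sitting 16 bytes behind the head is treated as full because one byte is
always kept free.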
> +
> +static int guc_add_workqueue_item(struct i915_guc_client *gc,
> +				  struct drm_i915_gem_request *rq)
> +{
> +	enum intel_ring_id ring_id = rq->ring->id;
> +	struct guc_wq_item *wqi;
> +	void *base;
> +	u32 tail, wq_len, wq_off = 0;
> +	int ret;
> +
> +	ret = guc_get_workqueue_space(gc, &wq_off);
> +	if (ret)
> +		return ret;
> +
> +	/* For now a workqueue item is 4 DWs and the workqueue buffer is 2
> +	 * pages, so a wqi should never straddle a page boundary or wrap to
> +	 * the beginning. This simplifies the implementation below.
> +	 *
> +	 * XXX: if that ever changes, we need to save the data to a temp wqi
> +	 * and copy it to the workqueue buffer dw by dw.
> +	 */
> +	WARN_ON(sizeof(struct guc_wq_item) != 16);
> +	WARN_ON(wq_off & 3);
> +
> +	/* wq starts from the page after doorbell / process_desc */
> +	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj,
> +			(wq_off + GUC_DB_SIZE) >> PAGE_SHIFT));
> +	wq_off &= PAGE_SIZE - 1;
> +	wqi = (struct guc_wq_item *)((char *)base + wq_off);
> +
> +	/* len does not include the header */
> +	wq_len = sizeof(struct guc_wq_item) / sizeof(u32) - 1;
> +	wqi->header = WQ_TYPE_INORDER |
> +			(wq_len << WQ_LEN_SHIFT) |
> +			(ring_id << WQ_TARGET_SHIFT) |
> +			WQ_NO_WCFLUSH_WAIT;
> +
> +	/* The GuC wants only the low-order word of the context descriptor */
> +	wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx, rq->ring);
> +
> +	/* The GuC firmware wants the tail index in QWords, not bytes */
> +	tail = rq->ringbuf->tail >> 3;
> +	wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
> +	wqi->fence_id = 0; /*XXX: what fence to be here */
> +
> +	kunmap_atomic(base);
> +
> +	return 0;
> +}
> +
> +/**
> + * i915_guc_submit() - Submit commands through GuC
> + * @client:	the guc client where commands will go through
> + * @ctx:	LRC where commands come from
> + * @ring:	HW engine that will execute the commands
> + *
> + * Return:	0 on success, otherwise a negative error code
> + */
> +int i915_guc_submit(struct i915_guc_client *client,
> +		    struct drm_i915_gem_request *rq)
> +{
> +	unsigned long flags;
> +	int q_ret, b_ret;
> +
> +	/* Need this because of the deferred pin ctx and ring */
> +	/* Shall we move this right after ring is pinned? */
> +	intel_lr_context_update(rq);
> +
> +	q_ret = guc_add_workqueue_item(client, rq);
> +	if (q_ret == 0)
> +		b_ret = guc_ring_doorbell(client);
> +
> +	spin_lock_irqsave(&client->wq_lock, flags);
> +	client->submissions += 1;
> +	if (q_ret) {
> +		client->q_fail += 1;
> +		client->retcode = q_ret;
> +	} else if (b_ret) {
> +		client->b_fail += 1;
> +		client->retcode = q_ret = b_ret;
> +	} else {
> +		client->retcode = 0;
> +	}
> +	spin_unlock_irqrestore(&client->wq_lock, flags);
> +
> +	return q_ret;
> +}
> +
> +/*
> + * Everything below here is concerned with setup & teardown, and is
> + * therefore not part of the somewhat time-critical batch-submission
> + * path of i915_guc_submit() above.
> + */
> +
> +/**
>   * gem_allocate_guc_obj() - Allocate gem object for GuC usage
>   * @dev:	drm device
>   * @size:	size of object
> @@ -75,6 +581,121 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
>  	drm_gem_object_unreference(&obj->base);
>  }
>  
> +static void guc_client_free(struct drm_device *dev,
> +			    struct i915_guc_client *client)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc *guc = &dev_priv->guc;
> +
> +	if (!client)
> +		return;
> +
> +	if (client->doorbell_id != GUC_INVALID_DOORBELL_ID) {
> +		/*
> +		 * First disable the doorbell, then tell the GuC we've
> +		 * finished with it, finally deallocate it in our bitmap
> +		 */
> +		guc_disable_doorbell(guc, client);
> +		host2guc_release_doorbell(guc, client);
> +		release_doorbell(guc, client->doorbell_id);
> +	}
> +
> +	/*
> +	 * XXX: wait for any outstanding submissions before freeing memory.
> +	 * Be sure to drop any locks
> +	 */
> +
> +	gem_release_guc_obj(client->client_obj);
> +
> +	if (client->ctx_index != GUC_INVALID_CTX_ID) {
> +		guc_fini_ctx_desc(guc, client);
> +		ida_simple_remove(&guc->ctx_ids, client->ctx_index);
> +	}
> +
> +	kfree(client);
> +}
> +
> +/**
> + * guc_client_alloc() - Allocate an i915_guc_client
> + * @dev:	drm device
> + * @priority:	four levels priority _CRITICAL, _HIGH, _NORMAL and _LOW
> + * 		The kernel client to replace ExecList submission is created with
> + * 		NORMAL priority. Priority of a client for scheduler can be HIGH,
> + * 		while a preemption context can use CRITICAL.
> + *
> + * Return:	An i915_guc_client object if success.
> + */
> +static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
> +						uint32_t priority)
> +{
> +	struct i915_guc_client *client;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc *guc = &dev_priv->guc;
> +	struct drm_i915_gem_object *obj;
> +
> +	client = kzalloc(sizeof(*client), GFP_KERNEL);
> +	if (!client)
> +		return NULL;
> +
> +	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
> +	client->priority = priority;
> +
> +	client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
> +			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
> +	if (client->ctx_index >= GUC_MAX_GPU_CONTEXTS) {
> +		client->ctx_index = GUC_INVALID_CTX_ID;
> +		goto err;
> +	}
> +
> +	/* The first page is doorbell/proc_desc. The two following pages are wq. */
> +	obj = gem_allocate_guc_obj(dev, GUC_DB_SIZE + GUC_WQ_SIZE);
> +	if (!obj)
> +		goto err;
> +
> +	client->client_obj = obj;
> +	client->wq_offset = GUC_DB_SIZE;
> +	client->wq_size = GUC_WQ_SIZE;
> +	spin_lock_init(&client->wq_lock);
> +
> +	client->doorbell_offset = select_doorbell_cacheline(guc);
> +
> +	/*
> +	 * Since the doorbell only requires a single cacheline, we can save
> +	 * space by putting the application process descriptor in the same
> +	 * page. Use the half of the page that doesn't include the doorbell.
> +	 */
> +	if (client->doorbell_offset >= (GUC_DB_SIZE / 2))
> +		client->proc_desc_offset = 0;
> +	else
> +		client->proc_desc_offset = (GUC_DB_SIZE / 2);
> +
> +	client->doorbell_id = assign_doorbell(guc, client->priority);
> +	if (client->doorbell_id == GUC_INVALID_DOORBELL_ID)
> +		/* XXX: evict a doorbell instead */
> +		goto err;
> +
> +	guc_init_proc_desc(guc, client);
> +	guc_init_ctx_desc(guc, client);
> +	guc_init_doorbell(guc, client);
> +
> +	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
> +	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> +
> +	/* XXX: Any cache flushes needed? General domain mgmt calls? */
> +
> +	if (host2guc_allocate_doorbell(guc, client))
> +		goto err;
> +
> +	DRM_DEBUG_DRIVER("new priority %u client %p: ctx_index %u db_id %u\n",
> +		priority, client, client->ctx_index, client->doorbell_id);
> +
> +	return client;
> +
> +err:
> +	guc_client_free(dev, client);
> +	return NULL;
> +}
> +
>  static void guc_create_log(struct intel_guc *guc)
>  {
>  	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> @@ -138,6 +759,8 @@ int i915_guc_submission_init(struct drm_device *dev)
>  	if (!guc->ctx_pool_obj)
>  		return -ENOMEM;
>  
> +	spin_lock_init(&dev_priv->guc.host2guc_lock);
> +
>  	ida_init(&guc->ctx_ids);
>  
>  	guc_create_log(guc);
> @@ -145,6 +768,32 @@ int i915_guc_submission_init(struct drm_device *dev)
>  	return 0;
>  }
>  
> +int i915_guc_submission_enable(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc *guc = &dev_priv->guc;
> +	struct i915_guc_client *client;
> +
> +	/* client for execbuf submission */
> +	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL);
> +	if (!client) {
> +		DRM_ERROR("Failed to create execbuf guc_client\n");
> +		return -ENOMEM;
> +	}
> +
> +	guc->execbuf_client = client;
> +	return 0;
> +}
> +
> +void i915_guc_submission_disable(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc *guc = &dev_priv->guc;
> +
> +	guc_client_free(dev, guc->execbuf_client);
> +	guc->execbuf_client = NULL;
> +}
> +
>  void i915_guc_submission_fini(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 5b51b05..d249326 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -27,6 +27,30 @@
>  #include "intel_guc_fwif.h"
>  #include "i915_guc_reg.h"
>  
> +struct i915_guc_client {
> +	struct drm_i915_gem_object *client_obj;
> +	uint32_t priority;
> +	uint32_t ctx_index;
> +
> +	uint32_t proc_desc_offset;
> +	uint32_t doorbell_offset;
> +	uint32_t cookie;
> +	uint16_t doorbell_id;
> +	uint16_t padding;		/* Maintain alignment		*/
> +
> +	uint32_t wq_offset;
> +	uint32_t wq_size;
> +
> +	spinlock_t wq_lock;		/* Protects all data below	*/
> +	uint32_t wq_tail;
> +
> +	/* GuC submission statistics & status */
> +	uint64_t submissions;
> +	uint32_t q_fail;
> +	uint32_t b_fail;
> +	int retcode;
> +};
> +
>  enum intel_guc_fw_status {
>  	GUC_FIRMWARE_FAIL = -1,
>  	GUC_FIRMWARE_NONE = 0,
> @@ -60,6 +84,20 @@ struct intel_guc {
>  
>  	struct drm_i915_gem_object *ctx_pool_obj;
>  	struct ida ctx_ids;
> +
> +	struct i915_guc_client *execbuf_client;
> +
> +	spinlock_t host2guc_lock;	/* Protects all data below	*/
> +
> +	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
> +	int db_cacheline;
> +
> +	/* Action status & statistics */
> +	uint64_t action_count;		/* Total commands issued	*/
> +	uint32_t action_cmd;		/* Last command word		*/
> +	uint32_t action_status;		/* Last return status		*/
> +	uint32_t action_fail;		/* Total number of failures	*/
> +	int32_t action_err;		/* Last error code		*/
>  };
>  
>  /* intel_guc_loader.c */
> @@ -70,6 +108,10 @@ extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
>  
>  /* i915_guc_submission.c */
>  int i915_guc_submission_init(struct drm_device *dev);
> +int i915_guc_submission_enable(struct drm_device *dev);
> +int i915_guc_submit(struct i915_guc_client *client,
> +		    struct drm_i915_gem_request *rq);
> +void i915_guc_submission_disable(struct drm_device *dev);
>  void i915_guc_submission_fini(struct drm_device *dev);
>  
>  #endif
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> index e5d7136..25ba29f 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -427,6 +427,8 @@ int intel_guc_ucode_load(struct drm_device *dev)
>  		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
>  		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
>  
> +	i915_guc_submission_disable(dev);
> +
>  	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
>  		return 0;
>  
> @@ -479,12 +481,20 @@ int intel_guc_ucode_load(struct drm_device *dev)
>  		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
>  		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
>  
> +	if (i915.enable_guc_submission) {
> +		err = i915_guc_submission_enable(dev);
> +		if (err)
> +			goto fail;
> +	}
> +
>  	return 0;
>  
>  fail:
>  	if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
>  		guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
>  
> +	i915_guc_submission_disable(dev);
> +
>  	DRM_ERROR("Failed to initialize GuC, error %d\n", err);
>  
>  	return err;
> @@ -547,6 +557,8 @@ void intel_guc_ucode_fini(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
>  
> +	i915_guc_submission_fini(dev);
> +
>  	if (guc_fw->guc_fw_obj)
>  		drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
>  	guc_fw->guc_fw_obj = NULL;
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/13 v4] drm/i915: Interrupt routing for GuC submission
  2015-07-09 18:29 ` [PATCH 10/13 v4] drm/i915: Interrupt routing for GuC submission Dave Gordon
@ 2015-07-27 15:33   ` O'Rourke, Tom
  2015-07-28 11:29     ` Dave Gordon
  0 siblings, 1 reply; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-27 15:33 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:11PM +0100, Dave Gordon wrote:
> Turn on interrupt steering to route necessary interrupts to GuC.
> 
> v4:
>     Rebased
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         | 11 +++++--
>  drivers/gpu/drm/i915/intel_guc_loader.c | 51 +++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 63728c1..1c2072b 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1664,12 +1664,18 @@ enum skl_disp_power_wells {
>  #define GFX_MODE_GEN7	0x0229c
>  #define RING_MODE_GEN7(ring)	((ring)->mmio_base+0x29c)
>  #define   GFX_RUN_LIST_ENABLE		(1<<15)
> +#define   GFX_INTERRUPT_STEERING	(1<<14)
>  #define   GFX_TLB_INVALIDATE_EXPLICIT	(1<<13)
>  #define   GFX_SURFACE_FAULT_ENABLE	(1<<12)
>  #define   GFX_REPLAY_MODE		(1<<11)
>  #define   GFX_PSMI_GRANULARITY		(1<<10)
>  #define   GFX_PPGTT_ENABLE		(1<<9)
>  
> +#define   GFX_FORWARD_VBLANK_MASK	(3<<5)
> +#define   GFX_FORWARD_VBLANK_NEVER	(0<<5)
> +#define   GFX_FORWARD_VBLANK_ALWAYS	(1<<5)
> +#define   GFX_FORWARD_VBLANK_COND	(2<<5)
> +
>  #define VLV_DISPLAY_BASE 0x180000
>  #define VLV_MIPI_BASE VLV_DISPLAY_BASE
>  
> @@ -5683,11 +5689,12 @@ enum skl_disp_power_wells {
>  #define GEN8_GT_IIR(which) (0x44308 + (0x10 * (which)))
>  #define GEN8_GT_IER(which) (0x4430c + (0x10 * (which)))
>  
> -#define GEN8_BCS_IRQ_SHIFT 16
>  #define GEN8_RCS_IRQ_SHIFT 0
> -#define GEN8_VCS2_IRQ_SHIFT 16
> +#define GEN8_BCS_IRQ_SHIFT 16
>  #define GEN8_VCS1_IRQ_SHIFT 0
> +#define GEN8_VCS2_IRQ_SHIFT 16
>  #define GEN8_VECS_IRQ_SHIFT 0
> +#define GEN8_WD_IRQ_SHIFT 16
>  
>  #define GEN8_DE_PIPE_ISR(pipe) (0x44400 + (0x10 * (pipe)))
>  #define GEN8_DE_PIPE_IMR(pipe) (0x44404 + (0x10 * (pipe)))
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> index 25ba29f..2aa9227 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -79,6 +79,53 @@ const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
>  	}
>  };
>  
> +static void direct_interrupts_to_host(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_engine_cs *ring;
> +	int i, irqs;
> +
> +	/* tell all command streamers NOT to forward interrupts and vblank to GuC */
> +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_NEVER);
> +	irqs |= _MASKED_BIT_DISABLE(GFX_INTERRUPT_STEERING);
> +	for_each_ring(ring, dev_priv, i)
> +		I915_WRITE(RING_MODE_GEN7(ring), irqs);
> +
> +	/* tell DE to send nothing to GuC */
> +	I915_WRITE(DE_GUCRMR, ~0);
> +
> +	/* route all GT interrupts to the host */
> +	I915_WRITE(GUC_BCS_RCS_IER, 0);
> +	I915_WRITE(GUC_VCS2_VCS1_IER, 0);
> +	I915_WRITE(GUC_WD_VECS_IER, 0);
> +}
> +
> +static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_engine_cs *ring;
> +	int i, irqs;
> +
> +	/* tell all command streamers to forward interrupts and vblank to GuC */
> +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
> +	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> +	for_each_ring(ring, dev_priv, i)
> +		I915_WRITE(RING_MODE_GEN7(ring), irqs);
> +
> +	/* tell DE to send (all) flip_done to GuC */
> +	irqs = DERRMR_PIPEA_PRI_FLIP_DONE | DERRMR_PIPEA_SPR_FLIP_DONE |
> +	       DERRMR_PIPEB_PRI_FLIP_DONE | DERRMR_PIPEB_SPR_FLIP_DONE |
> +	       DERRMR_PIPEC_PRI_FLIP_DONE | DERRMR_PIPEC_SPR_FLIP_DONE;
> +	/* Unmasked bits will cause GuC response message to be sent */
> +	I915_WRITE(DE_GUCRMR, ~irqs);
> +
> +	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
> +	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
> +	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> +	/* These three registers have the same bit definitions */
[TOR:] Relying on the three registers having the same bit
definitions does not seem safe.  Each of the three
registers has its own shift constants defined, so I would
expect the shift constants for the second and third
registers to be used when writing those registers.

Also, GT_RENDER_USER_INTERRUPT seems to have been defined
for use with a different register than this set.

On the other hand, this code does actually write the
correct values.
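
A minimal sketch of what using the per-register shift constants
could look like, as a standalone userspace model rather than
driver code.  The shift values are the ones from the quoted
i915_reg.h hunk; the value of GT_RENDER_USER_INTERRUPT (bit 0) and
all helper names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GT_RENDER_USER_INTERRUPT is bit 0 of each 16-bit half. */
#define GT_RENDER_USER_INTERRUPT (1u << 0)

/* Shift constants as they appear in the quoted i915_reg.h hunk. */
#define GEN8_RCS_IRQ_SHIFT   0
#define GEN8_BCS_IRQ_SHIFT  16
#define GEN8_VCS1_IRQ_SHIFT  0
#define GEN8_VCS2_IRQ_SHIFT 16
#define GEN8_VECS_IRQ_SHIFT  0
#define GEN8_WD_IRQ_SHIFT   16

/*
 * Hypothetical helper: build the value for one GUC_*_IER register.
 * The user interrupt of both engines sharing the register stays with
 * the host (bit cleared); every other bit is routed to the GuC.
 */
static uint32_t guc_ier_value(unsigned int lo_shift, unsigned int hi_shift)
{
	uint32_t to_host = (GT_RENDER_USER_INTERRUPT << lo_shift) |
			   (GT_RENDER_USER_INTERRUPT << hi_shift);

	return ~to_host;
}

/* One value per register, each built from that register's own shifts. */
static uint32_t guc_bcs_rcs_ier(void)
{
	return guc_ier_value(GEN8_RCS_IRQ_SHIFT, GEN8_BCS_IRQ_SHIFT);
}

static uint32_t guc_vcs2_vcs1_ier(void)
{
	return guc_ier_value(GEN8_VCS1_IRQ_SHIFT, GEN8_VCS2_IRQ_SHIFT);
}

static uint32_t guc_wd_vecs_ier(void)
{
	return guc_ier_value(GEN8_VECS_IRQ_SHIFT, GEN8_WD_IRQ_SHIFT);
}

/* Documents (and checks) that today all three values coincide. */
static void check_guc_ier_values(void)
{
	assert(guc_bcs_rcs_ier() == guc_vcs2_vcs1_ier());
	assert(guc_bcs_rcs_ier() == guc_wd_vecs_ier());
}
```

With the shifts as currently defined, all three helpers return the
same value, which is why the patch as posted writes correct data;
the per-register form would simply keep working if one of the
shift constants ever changed.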

> +	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
> +	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
> +	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
> +}
> +
>  static u32 get_gttype(struct drm_i915_private *dev_priv)
>  {
>  	/* XXX: GT type based on PCI device ID? field seems unused by fw */
> @@ -427,6 +474,7 @@ int intel_guc_ucode_load(struct drm_device *dev)
>  		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
>  		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
>  
> +	direct_interrupts_to_host(dev_priv);
>  	i915_guc_submission_disable(dev);
>  
>  	if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
> @@ -485,6 +533,7 @@ int intel_guc_ucode_load(struct drm_device *dev)
>  		err = i915_guc_submission_enable(dev);
>  		if (err)
>  			goto fail;
> +		direct_interrupts_to_guc(dev_priv);
>  	}
>  
>  	return 0;
> @@ -493,6 +542,7 @@ fail:
>  	if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
>  		guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
>  
> +	direct_interrupts_to_host(dev_priv);
>  	i915_guc_submission_disable(dev);
>  
>  	DRM_ERROR("Failed to initialize GuC, error %d\n", err);
> @@ -557,6 +607,7 @@ void intel_guc_ucode_fini(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
>  
> +	direct_interrupts_to_host(dev_priv);
>  	i915_guc_submission_fini(dev);
>  
>  	if (guc_fw->guc_fw_obj)
> -- 
> 1.9.1
> 
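
For readers unfamiliar with the RING_MODE writes above: they use
the masked-register convention, where the upper 16 bits of the
written value select which of the lower 16 bits are updated.  The
sketch below models that in plain C.  The GFX_FORWARD_VBLANK_*
values come from the quoted hunk; masked_field() mirrors the
encoding of _MASKED_FIELD(), and apply_masked_write() is a
hypothetical model of the hardware side, not a driver function.

```c
#include <stdint.h>

/* Values taken from the quoted i915_reg.h hunk. */
#define GFX_FORWARD_VBLANK_MASK   (3u << 5)
#define GFX_FORWARD_VBLANK_NEVER  (0u << 5)
#define GFX_FORWARD_VBLANK_ALWAYS (1u << 5)

/*
 * Same encoding as _MASKED_FIELD(): write-enable mask in bits 31:16,
 * new field value in bits 15:0.
 */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	return (mask << 16) | value;
}

/*
 * Hypothetical model of how a masked register reacts to a write:
 * only bits whose mask bit is set change, the rest keep their old value.
 */
static uint32_t apply_masked_write(uint32_t old, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (old & ~mask) | (value & mask);
}
```

This is why direct_interrupts_to_host() and
direct_interrupts_to_guc() can write RING_MODE without a
read-modify-write cycle: bits outside the mask, such as
GFX_PPGTT_ENABLE, are left untouched.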
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 12/13 v4] drm/i915: Debugfs interface for GuC submission statistics
  2015-07-09 18:29 ` [PATCH 12/13 v4] drm/i915: Debugfs interface for GuC submission statistics Dave Gordon
@ 2015-07-27 15:36   ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-27 15:36 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:13PM +0100, Dave Gordon wrote:
> This provides a means of reading status and counts relating
> to GuC actions and submissions.
> 
> v2:
>     Remove surplus blank line in output [Chris Wilson]
> 
> v4:
>     Rebased
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
>  drivers/gpu/drm/i915/i915_debugfs.c | 40 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index d93732a..cebb93c 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2397,6 +2397,45 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +static int i915_guc_info(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = m->private;
> +	struct drm_device *dev = node->minor->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_guc guc;
> +	struct i915_guc_client client = { .client_obj = 0 };
> +
> +	if (!HAS_GUC_SCHED(dev_priv->dev))
> +		return 0;
> +
> +	/* Take a local copy of the GuC data, so we can dump it at leisure */
> +	spin_lock(&dev_priv->guc.host2guc_lock);
> +	guc = dev_priv->guc;
> +	if (guc.execbuf_client) {
> +		spin_lock(&guc.execbuf_client->wq_lock);
> +		client = *guc.execbuf_client;
> +		spin_unlock(&guc.execbuf_client->wq_lock);
> +	}
> +	spin_unlock(&dev_priv->guc.host2guc_lock);
> +
> +	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
> +	seq_printf(m, "GuC last action command: 0x%x\n", guc.action_cmd);
> +	seq_printf(m, "GuC last action status: 0x%x\n", guc.action_status);
> +
> +	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> +	seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
> +
> +	seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
> +	seq_printf(m, "\tTotal submissions: %llu\n", client.submissions);
> +	seq_printf(m, "\tFailed to queue: %u\n", client.q_fail);
> +	seq_printf(m, "\tFailed doorbell: %u\n", client.b_fail);
> +	seq_printf(m, "\tLast submission result: %d\n", client.retcode);
> +
> +	/* Add more as required ... */
> +
> +	return 0;
> +}
> +
>  static int i915_guc_log_dump(struct seq_file *m, void *data)
>  {
>  	struct drm_info_node *node = m->private;
> @@ -5139,6 +5178,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
>  	{"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
>  	{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
>  	{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
> +	{"i915_guc_info", i915_guc_info, 0},
>  	{"i915_guc_load_status", i915_guc_load_status_info, 0},
>  	{"i915_guc_log_dump", i915_guc_log_dump, 0},
>  	{"i915_frequency_info", i915_frequency_info, 0},
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission
  2015-07-09 18:29 ` [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission Dave Gordon
@ 2015-07-27 15:57   ` O'Rourke, Tom
  2015-07-27 19:33     ` Yu Dai
  2015-07-28 13:59     ` Dave Gordon
  0 siblings, 2 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-27 15:57 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Jul 09, 2015 at 07:29:12PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> GuC-based submission is mostly the same as execlist mode, up to
> intel_logical_ring_advance_and_submit(), where the context being
> dispatched would be added to the execlist queue; at this point
> we submit the context to the GuC backend instead.
> 
> There are, however, a few other changes also required, notably:
> 1.  Contexts must be pinned at GGTT addresses accessible by the GuC
>     i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
>     PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.
> 
> 2.  The GuC's TLB must be invalidated after a context is pinned at
>     a new GGTT address.
> 
> 3.  GuC firmware uses the one page before Ring Context as shared data.
>     Therefore, whenever driver wants to get base address of LRC, we
>     will offset one page for it. LRC_PPHWSP_PN is defined as the page
>     number of LRCA.
> 
> 4.  In the work queue used to pass requests to the GuC, the GuC
>     firmware requires the ring-tail-offset to be represented as an
>     11-bit value, expressed in QWords. Therefore, the ringbuffer
>     size must be reduced to the representable range (4 pages).
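
The arithmetic behind point 4 can be checked with a short
standalone sketch.  The 11-bit width and the QWord unit are from
the commit message; the 4096-byte page size and all names below
are assumptions for illustration.

```c
#include <stdint.h>

#define WQ_TAIL_BITS  11u    /* width of the ring-tail-offset field */
#define QWORD_BYTES   8u     /* the field counts 8-byte QWords */
#define PAGE_SIZE_4K  4096u  /* assumed page size */

/* Largest ringbuffer whose tail offsets all fit in the field. */
static uint32_t max_ring_bytes(void)
{
	return (1u << WQ_TAIL_BITS) * QWORD_BYTES;
}

static uint32_t max_ring_pages(void)
{
	return max_ring_bytes() / PAGE_SIZE_4K;
}

/* Non-zero if a byte offset into the ring is representable. */
static int tail_fits(uint32_t tail_bytes)
{
	return (tail_bytes / QWORD_BYTES) < (1u << WQ_TAIL_BITS);
}
```

2^11 QWords of 8 bytes each is 16384 bytes, i.e. 4 pages, so a
tail near the end of the previous 32-page ringbuffer would not be
representable.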
> 
> v2:
>     Defer adding #defines until needed [Chris Wilson]
>     Rationalise type declarations [Chris Wilson]
> 
> v4:
>     Squashed kerneldoc patch into here [Daniel Vetter]
> 
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> ---
>  Documentation/DocBook/drm.tmpl             | 14 ++++++++
>  drivers/gpu/drm/i915/i915_debugfs.c        |  2 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c | 52 +++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>  drivers/gpu/drm/i915/intel_lrc.c           | 51 ++++++++++++++++++++---------
>  drivers/gpu/drm/i915/intel_lrc.h           |  6 ++++
>  6 files changed, 106 insertions(+), 20 deletions(-)
> 
> diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
> index 596b11d..0ff5fd7 100644
> --- a/Documentation/DocBook/drm.tmpl
> +++ b/Documentation/DocBook/drm.tmpl
> @@ -4223,6 +4223,20 @@ int num_ioctls;</synopsis>
>        </sect2>
>      </sect1>
>      <sect1>
> +      <title>GuC-based Command Submission</title>
> +      <sect2>
> +        <title>GuC</title>
> +!Pdrivers/gpu/drm/i915/intel_guc_loader.c GuC-specific firmware loader
> +!Idrivers/gpu/drm/i915/intel_guc_loader.c
> +      </sect2>
> +      <sect2>
> +        <title>GuC Client</title>
> +!Pdrivers/gpu/drm/i915/intel_guc_submission.c GuC-based command submissison
> +!Idrivers/gpu/drm/i915/intel_guc_submission.c
> +      </sect2>
> +    </sect1>
> +
> +    <sect1>
>        <title> Tracing </title>
>        <para>
>      This sections covers all things related to the tracepoints implemented in
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 13e37d1..d93732a 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1982,7 +1982,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
>  		return;
>  	}
>  
> -	page = i915_gem_object_get_page(ctx_obj, 1);
> +	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
>  	if (!WARN_ON(page == NULL)) {
>  		reg_state = kmap_atomic(page);
>  
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 25d8807..c5c9fbf 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -346,18 +346,58 @@ static void guc_init_proc_desc(struct intel_guc *guc,
>  static void guc_init_ctx_desc(struct intel_guc *guc,
>  			      struct i915_guc_client *client)
>  {
> +	struct intel_context *ctx = client->owner;
>  	struct guc_context_desc desc;
>  	struct sg_table *sg;
> +	int i;
>  
>  	memset(&desc, 0, sizeof(desc));
>  
>  	desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
>  	desc.context_id = client->ctx_index;
>  	desc.priority = client->priority;
> -	desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
> -			    (1 << VECS) | (1 << VCS2); /* all engines */
>  	desc.db_id = client->doorbell_id;
>  
> +	for (i = 0; i < I915_NUM_RINGS; i++) {
> +		struct guc_execlist_context *lrc = &desc.lrc[i];
> +		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
> +		struct intel_engine_cs *ring;
> +		struct drm_i915_gem_object *obj;
> +		uint64_t ctx_desc;
> +
> +		/* TODO: We have a design issue to be solved here. Only when we
> +		 * receive the first batch, we know which engine is used by the
> +		 * user. But here GuC expects the lrc and ring to be pinned. It
> +		 * is not an issue for default context, which is the only one
> +		 * for now who owns a GuC client. But for future owner of GuC
> +		 * client, need to make sure lrc is pinned prior to enter here.
> +		 */
> +		obj = ctx->engine[i].state;
> +		if (!obj)
> +			break;
> +
> +		ring = ringbuf->ring;
> +		ctx_desc = intel_lr_context_descriptor(ctx, ring);
> +		lrc->context_desc = (u32)ctx_desc;
> +
> +		/* The state page is after PPHWSP */
> +		lrc->ring_lcra = i915_gem_obj_ggtt_offset(obj) +
> +				LRC_STATE_PN * PAGE_SIZE;
> +		lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
> +				(ring->id << GUC_ELC_ENGINE_OFFSET);
> +
> +		obj = ringbuf->obj;
> +
> +		lrc->ring_begin = i915_gem_obj_ggtt_offset(obj);
> +		lrc->ring_end = lrc->ring_begin + obj->base.size - 1;
> +		lrc->ring_next_free_location = lrc->ring_begin;
> +		lrc->ring_current_tail_pointer_value = 0;
> +
> +		desc.engines_used |= (1 << ring->id);
> +	}
> +
> +	WARN_ON(desc.engines_used == 0);
> +
>  	/*
>  	 * The CPU address is only needed at certain points, so kmap_atomic on
>  	 * demand instead of storing it in the ctx descriptor.
> @@ -622,11 +662,13 @@ static void guc_client_free(struct drm_device *dev,
>   * 		The kernel client to replace ExecList submission is created with
>   * 		NORMAL priority. Priority of a client for scheduler can be HIGH,
>   * 		while a preemption context can use CRITICAL.
> + * @ctx		the context to own the client (we use the default render context)
>   *
>   * Return:	An i915_guc_client object if success.
>   */
>  static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
> -						uint32_t priority)
> +						uint32_t priority,
> +						struct intel_context *ctx)
>  {
>  	struct i915_guc_client *client;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -639,6 +681,7 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
>  
>  	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
>  	client->priority = priority;
> +	client->owner = ctx;
>  
>  	client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
>  			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
> @@ -772,10 +815,11 @@ int i915_guc_submission_enable(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_guc *guc = &dev_priv->guc;
> +	struct intel_context *ctx = dev_priv->ring[RCS].default_context;
>  	struct i915_guc_client *client;
>  
>  	/* client for execbuf submission */
> -	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL);
> +	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL, ctx);
>  	if (!client) {
>  		DRM_ERROR("Failed to create execbuf guc_client\n");
>  		return -ENOMEM;
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index d249326..9571b56 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -29,6 +29,7 @@
>  
>  struct i915_guc_client {
>  	struct drm_i915_gem_object *client_obj;
> +	struct intel_context *owner;
>  	uint32_t priority;
>  	uint32_t ctx_index;
>  
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 9e121d3..8294462 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -254,7 +254,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
>   */
>  u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
>  {
> -	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> +	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
> +			LRC_PPHWSP_PN * PAGE_SIZE;
>  
>  	/* LRCA is required to be 4K aligned so the more significant 20 bits
>  	 * are globally unique */
> @@ -267,7 +268,8 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>  	uint64_t desc;
> -	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> +	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
> +			LRC_PPHWSP_PN * PAGE_SIZE;
>  
>  	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
>  
> @@ -342,7 +344,7 @@ void intel_lr_context_update(struct drm_i915_gem_request *rq)
>  	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
>  	WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
>  
> -	page = i915_gem_object_get_page(ctx_obj, 1);
> +	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
>  	reg_state = kmap_atomic(page);
>  
>  	reg_state[CTX_RING_TAIL+1] = rq->tail;
> @@ -687,13 +689,17 @@ static void
>  intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
>  {
>  	struct intel_engine_cs *ring = request->ring;
> +	struct drm_i915_private *dev_priv = request->i915;
>  
>  	intel_logical_ring_advance(request->ringbuf);
>  
>  	if (intel_ring_stopped(ring))
>  		return;
>  
> -	execlists_context_queue(request);
> +	if (dev_priv->guc.execbuf_client)
> +		i915_guc_submit(dev_priv->guc.execbuf_client, request);
> +	else
> +		execlists_context_queue(request);
>  }
>  
>  static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
> @@ -984,6 +990,7 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
>  
>  static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
>  {
> +	struct drm_i915_private *dev_priv = rq->i915;
>  	struct intel_engine_cs *ring = rq->ring;
>  	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
>  	struct intel_ringbuffer *ringbuf = rq->ringbuf;
> @@ -991,14 +998,18 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
>  
>  	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
>  	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
> -		ret = i915_gem_obj_ggtt_pin(ctx_obj,
> -				GEN8_LR_CONTEXT_ALIGN, 0);
> +		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
> +				PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE);
>  		if (ret)
>  			goto reset_pin_count;
>  
>  		ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
>  		if (ret)
>  			goto unpin_ctx_obj;
> +
> +		/* Invalidate GuC TLB. */
> +		if (i915.enable_guc_submission)
> +			I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
>  	}
>  
>  	return ret;
> @@ -1668,8 +1679,13 @@ out:
>  
>  static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
>  {
> +	struct drm_i915_private *dev_priv = req->i915;
>  	int ret;
>  
> +	/* Invalidate GuC TLB. */
[TOR:] This invalidation is in the init_context for the render
ring but not for the other rings.  Is it needed for the other
rings?  Or should this invalidation happen at a different
level?  It seems this may depend on the render ring being
initialized first.

Thanks,
Tom

> +	if (i915.enable_guc_submission)
> +		I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> +
>  	ret = intel_logical_ring_workarounds_emit(req);
>  	if (ret)
>  		return ret;
> @@ -2026,7 +2042,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
>  
>  	/* The second page of the context object contains some fields which must
>  	 * be set up prior to the first execution. */
> -	page = i915_gem_object_get_page(ctx_obj, 1);
> +	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
>  	reg_state = kmap_atomic(page);
>  
>  	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
> @@ -2185,12 +2201,13 @@ static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
>  		struct drm_i915_gem_object *default_ctx_obj)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct page *page;
>  
> -	/* The status page is offset 0 from the default context object
> -	 * in LRC mode. */
> -	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj);
> -	ring->status_page.page_addr =
> -			kmap(sg_page(default_ctx_obj->pages->sgl));
> +	/* The HWSP is part of the default context object in LRC mode. */
> +	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj)
> +			+ LRC_PPHWSP_PN * PAGE_SIZE;
> +	page = i915_gem_object_get_page(default_ctx_obj, LRC_PPHWSP_PN);
> +	ring->status_page.page_addr = kmap(page);
>  	ring->status_page.obj = default_ctx_obj;
>  
>  	I915_WRITE(RING_HWS_PGA(ring->mmio_base),
> @@ -2226,6 +2243,9 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
>  
>  	context_size = round_up(get_lr_context_size(ring), 4096);
>  
> +	/* One extra page as the sharing data between driver and GuC */
> +	context_size += PAGE_SIZE * LRC_PPHWSP_PN;
> +
>  	ctx_obj = i915_gem_alloc_object(dev, context_size);
>  	if (!ctx_obj) {
>  		DRM_DEBUG_DRIVER("Alloc LRC backing obj failed.\n");
> @@ -2233,7 +2253,8 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
>  	}
>  
>  	if (is_global_default_ctx) {
> -		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
> +		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
> +				PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE);
>  		if (ret) {
>  			DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n",
>  					ret);
> @@ -2252,7 +2273,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
>  
>  	ringbuf->ring = ring;
>  
> -	ringbuf->size = 32 * PAGE_SIZE;
> +	ringbuf->size = 4 * PAGE_SIZE;
>  	ringbuf->effective_size = ringbuf->size;
>  	ringbuf->head = 0;
>  	ringbuf->tail = 0;
> @@ -2352,7 +2373,7 @@ void intel_lr_context_reset(struct drm_device *dev,
>  			WARN(1, "Failed get_pages for context obj\n");
>  			continue;
>  		}
> -		page = i915_gem_object_get_page(ctx_obj, 1);
> +		page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
>  		reg_state = kmap_atomic(page);
>  
>  		reg_state[CTX_RING_HEAD+1] = 0;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 6ecc0b3..e04b5c2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -67,6 +67,12 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
>  }
>  
>  /* Logical Ring Contexts */
> +
> +/* One extra page is added before LRC for GuC as shared data */
> +#define LRC_GUCSHR_PN	(0)
> +#define LRC_PPHWSP_PN	(LRC_GUCSHR_PN + 1)
> +#define LRC_STATE_PN	(LRC_PPHWSP_PN + 1)
> +
>  void intel_lr_context_free(struct intel_context *ctx);
>  int intel_lr_context_deferred_create(struct intel_context *ctx,
>  				     struct intel_engine_cs *ring);
> -- 
> 1.9.1
> 
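
As a reading aid for the LRC_*_PN defines in the intel_lrc.h hunk
above, here is a small standalone sketch of the resulting context
object layout.  The page numbers are the ones from the patch; the
4096-byte page size and the helper names are assumptions, and the
validity check mirrors the WARN_ON() mask in
intel_lr_context_descriptor().

```c
#include <stdint.h>

#define PAGE_SIZE_4K  4096u  /* assumed page size */

/* Page numbers within the context object, as defined by the patch. */
#define LRC_GUCSHR_PN 0                   /* page shared with the GuC */
#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1) /* per-process HWSP, i.e. the LRCA */
#define LRC_STATE_PN  (LRC_PPHWSP_PN + 1) /* register state page */

/* Hypothetical helper: LRCA for a context object pinned at ggtt_base. */
static uint32_t lrca_from_base(uint32_t ggtt_base)
{
	return ggtt_base + LRC_PPHWSP_PN * PAGE_SIZE_4K;
}

/* Hypothetical helper: GGTT address of the register state page. */
static uint32_t state_from_base(uint32_t ggtt_base)
{
	return ggtt_base + LRC_STATE_PN * PAGE_SIZE_4K;
}

/* The LRCA must be 4K aligned and fit in 32 bits. */
static int lrca_is_valid(uint64_t lrca)
{
	return (lrca & 0xFFFFFFFF00000FFFULL) == 0;
}
```

Since the added offset is a whole page, a context object pinned at
a 4K-aligned GGTT address still yields a 4K-aligned LRCA.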

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission
  2015-07-27 15:57   ` O'Rourke, Tom
@ 2015-07-27 19:33     ` Yu Dai
  2015-07-28 13:59     ` Dave Gordon
  1 sibling, 0 replies; 42+ messages in thread
From: Yu Dai @ 2015-07-27 19:33 UTC (permalink / raw)
  To: O'Rourke, Tom, Dave Gordon; +Cc: intel-gfx



On 07/27/2015 08:57 AM, O'Rourke, Tom wrote:
> On Thu, Jul 09, 2015 at 07:29:12PM +0100, Dave Gordon wrote:
> > From: Alex Dai <yu.dai@intel.com>
> >
> > GuC-based submission is mostly the same as execlist mode, up to
> > intel_logical_ring_advance_and_submit(), where the context being
> > dispatched would be added to the execlist queue; at this point
> > we submit the context to the GuC backend instead.
> >
> > There are, however, a few other changes also required, notably:
> > 1.  Contexts must be pinned at GGTT addresses accessible by the GuC
> >     i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
> >     PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.
> >
> > 2.  The GuC's TLB must be invalidated after a context is pinned at
> >     a new GGTT address.
> >
> > 3.  GuC firmware uses the one page before Ring Context as shared data.
> >     Therefore, whenever driver wants to get base address of LRC, we
> >     will offset one page for it. LRC_PPHWSP_PN is defined as the page
> >     number of LRCA.
> >
> > 4.  In the work queue used to pass requests to the GuC, the GuC
> >     firmware requires the ring-tail-offset to be represented as an
> >     11-bit value, expressed in QWords. Therefore, the ringbuffer
> >     size must be reduced to the representable range (4 pages).
> >
> > v2:
> >     Defer adding #defines until needed [Chris Wilson]
> >     Rationalise type declarations [Chris Wilson]
> >
> > v4:
> >     Squashed kerneldoc patch into here [Daniel Vetter]
> >
> > Issue: VIZ-4884
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > ---
> >  Documentation/DocBook/drm.tmpl             | 14 ++++++++
> >  drivers/gpu/drm/i915/i915_debugfs.c        |  2 +-
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 52 +++++++++++++++++++++++++++---
> >  drivers/gpu/drm/i915/intel_guc.h           |  1 +
> >  drivers/gpu/drm/i915/intel_lrc.c           | 51 ++++++++++++++++++++---------
> >  drivers/gpu/drm/i915/intel_lrc.h           |  6 ++++
> >  6 files changed, 106 insertions(+), 20 deletions(-)
> >
> > diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
> > index 596b11d..0ff5fd7 100644
> > --- a/Documentation/DocBook/drm.tmpl
> > +++ b/Documentation/DocBook/drm.tmpl
> > @@ -4223,6 +4223,20 @@ int num_ioctls;</synopsis>
> >        </sect2>
> >      </sect1>
> >      <sect1>
> > +      <title>GuC-based Command Submission</title>
> > +      <sect2>
> > +        <title>GuC</title>
> > +!Pdrivers/gpu/drm/i915/intel_guc_loader.c GuC-specific firmware loader
> > +!Idrivers/gpu/drm/i915/intel_guc_loader.c
> > +      </sect2>
> > +      <sect2>
> > +        <title>GuC Client</title>
> > +!Pdrivers/gpu/drm/i915/intel_guc_submission.c GuC-based command submissison
> > +!Idrivers/gpu/drm/i915/intel_guc_submission.c
> > +      </sect2>
> > +    </sect1>
> > +
> > +    <sect1>
> >        <title> Tracing </title>
> >        <para>
> >      This sections covers all things related to the tracepoints implemented in
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 13e37d1..d93732a 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -1982,7 +1982,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
> >  		return;
> >  	}
> >
> > -	page = i915_gem_object_get_page(ctx_obj, 1);
> > +	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
> >  	if (!WARN_ON(page == NULL)) {
> >  		reg_state = kmap_atomic(page);
> >
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index 25d8807..c5c9fbf 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -346,18 +346,58 @@ static void guc_init_proc_desc(struct intel_guc *guc,
> >  static void guc_init_ctx_desc(struct intel_guc *guc,
> >  			      struct i915_guc_client *client)
> >  {
> > +	struct intel_context *ctx = client->owner;
> >  	struct guc_context_desc desc;
> >  	struct sg_table *sg;
> > +	int i;
> >
> >  	memset(&desc, 0, sizeof(desc));
> >
> >  	desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
> >  	desc.context_id = client->ctx_index;
> >  	desc.priority = client->priority;
> > -	desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
> > -			    (1 << VECS) | (1 << VCS2); /* all engines */
> >  	desc.db_id = client->doorbell_id;
> >
> > +	for (i = 0; i < I915_NUM_RINGS; i++) {
> > +		struct guc_execlist_context *lrc = &desc.lrc[i];
> > +		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
> > +		struct intel_engine_cs *ring;
> > +		struct drm_i915_gem_object *obj;
> > +		uint64_t ctx_desc;
> > +
> > +		/* TODO: We have a design issue to be solved here. Only when we
> > +		 * receive the first batch, we know which engine is used by the
> > +		 * user. But here GuC expects the lrc and ring to be pinned. It
> > +		 * is not an issue for default context, which is the only one
> > +		 * for now who owns a GuC client. But for future owner of GuC
> > +		 * client, need to make sure lrc is pinned prior to enter here.
> > +		 */
> > +		obj = ctx->engine[i].state;
> > +		if (!obj)
> > +			break;
> > +
> > +		ring = ringbuf->ring;
> > +		ctx_desc = intel_lr_context_descriptor(ctx, ring);
> > +		lrc->context_desc = (u32)ctx_desc;
> > +
> > +		/* The state page is after PPHWSP */
> > +		lrc->ring_lcra = i915_gem_obj_ggtt_offset(obj) +
> > +				LRC_STATE_PN * PAGE_SIZE;
> > +		lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
> > +				(ring->id << GUC_ELC_ENGINE_OFFSET);
> > +
> > +		obj = ringbuf->obj;
> > +
> > +		lrc->ring_begin = i915_gem_obj_ggtt_offset(obj);
> > +		lrc->ring_end = lrc->ring_begin + obj->base.size - 1;
> > +		lrc->ring_next_free_location = lrc->ring_begin;
> > +		lrc->ring_current_tail_pointer_value = 0;
> > +
> > +		desc.engines_used |= (1 << ring->id);
> > +	}
> > +
> > +	WARN_ON(desc.engines_used == 0);
> > +
> >  	/*
> >  	 * The CPU address is only needed at certain points, so kmap_atomic on
> >  	 * demand instead of storing it in the ctx descriptor.
> > @@ -622,11 +662,13 @@ static void guc_client_free(struct drm_device *dev,
> >   * 		The kernel client to replace ExecList submission is created with
> >   * 		NORMAL priority. Priority of a client for scheduler can be HIGH,
> >   * 		while a preemption context can use CRITICAL.
> > + * @ctx		the context to own the client (we use the default render context)
> >   *
> >   * Return:	An i915_guc_client object if success.
> >   */
> >  static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
> > -						uint32_t priority)
> > +						uint32_t priority,
> > +						struct intel_context *ctx)
> >  {
> >  	struct i915_guc_client *client;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > @@ -639,6 +681,7 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
> >
> >  	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
> >  	client->priority = priority;
> > +	client->owner = ctx;
> >
> >  	client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
> >  			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
> > @@ -772,10 +815,11 @@ int i915_guc_submission_enable(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct intel_guc *guc = &dev_priv->guc;
> > +	struct intel_context *ctx = dev_priv->ring[RCS].default_context;
> >  	struct i915_guc_client *client;
> >
> >  	/* client for execbuf submission */
> > -	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL);
> > +	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_NORMAL, ctx);
> >  	if (!client) {
> >  		DRM_ERROR("Failed to create execbuf guc_client\n");
> >  		return -ENOMEM;
> > diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> > index d249326..9571b56 100644
> > --- a/drivers/gpu/drm/i915/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/intel_guc.h
> > @@ -29,6 +29,7 @@
> >
> >  struct i915_guc_client {
> >  	struct drm_i915_gem_object *client_obj;
> > +	struct intel_context *owner;
> >  	uint32_t priority;
> >  	uint32_t ctx_index;
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 9e121d3..8294462 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -254,7 +254,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
> >   */
> >  u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
> >  {
> > -	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> > +	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
> > +			LRC_PPHWSP_PN * PAGE_SIZE;
> >
> >  	/* LRCA is required to be 4K aligned so the more significant 20 bits
> >  	 * are globally unique */
> > @@ -267,7 +268,8 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
> >  	struct drm_device *dev = ring->dev;
> >  	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> >  	uint64_t desc;
> > -	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> > +	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
> > +			LRC_PPHWSP_PN * PAGE_SIZE;
> >
> >  	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
> >
> > @@ -342,7 +344,7 @@ void intel_lr_context_update(struct drm_i915_gem_request *rq)
> >  	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
> >  	WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
> >
> > -	page = i915_gem_object_get_page(ctx_obj, 1);
> > +	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
> >  	reg_state = kmap_atomic(page);
> >
> >  	reg_state[CTX_RING_TAIL+1] = rq->tail;
> > @@ -687,13 +689,17 @@ static void
> >  intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
> >  {
> >  	struct intel_engine_cs *ring = request->ring;
> > +	struct drm_i915_private *dev_priv = request->i915;
> >
> >  	intel_logical_ring_advance(request->ringbuf);
> >
> >  	if (intel_ring_stopped(ring))
> >  		return;
> >
> > -	execlists_context_queue(request);
> > +	if (dev_priv->guc.execbuf_client)
> > +		i915_guc_submit(dev_priv->guc.execbuf_client, request);
> > +	else
> > +		execlists_context_queue(request);
> >  }
> >
> >  static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
> > @@ -984,6 +990,7 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
> >
> >  static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
> >  {
> > +	struct drm_i915_private *dev_priv = rq->i915;
> >  	struct intel_engine_cs *ring = rq->ring;
> >  	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
> >  	struct intel_ringbuffer *ringbuf = rq->ringbuf;
> > @@ -991,14 +998,18 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
> >
> >  	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
> >  	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
> > -		ret = i915_gem_obj_ggtt_pin(ctx_obj,
> > -				GEN8_LR_CONTEXT_ALIGN, 0);
> > +		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
> > +				PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE);
> >  		if (ret)
> >  			goto reset_pin_count;
> >
> >  		ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
> >  		if (ret)
> >  			goto unpin_ctx_obj;
> > +
> > +		/* Invalidate GuC TLB. */
> > +		if (i915.enable_guc_submission)
> > +			I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> >  	}
> >
> >  	return ret;
> > @@ -1668,8 +1679,13 @@ out:
> >
> >  static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
> >  {
> > +	struct drm_i915_private *dev_priv = req->i915;
> >  	int ret;
> >
> > +	/* Invalidate GuC TLB. */
> [TOR:] This invalidation is in the init_context for render
> ring but not the other rings.  Is this needed for other
> rings?  Or, should this invalidation happen at a different
> level?  It seems this may depend on the render ring being
> initialized first.

When an LRC is newly mapped, we should invalidate the GuC TLB before any
submission. This is needed for the golden context init. For the other rings,
it can be deferred until a user LRC is mapped into the GGTT.
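
To illustrate the rule above (a minimal standalone sketch, not code from
this patch): only the 0 -> 1 pin transition creates a new GGTT mapping, so
only that transition needs an invalidation. The names model_ctx,
model_ctx_pin and guc_tlb_invalidations are hypothetical; in the driver the
corresponding step is the I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE) call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a context; only the pin count matters here. */
struct model_ctx {
	unsigned int pin_count;
};

/* Counts how often the model "wrote" the invalidate bit. */
static unsigned int guc_tlb_invalidations;

/* Stand-in for I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE). */
static void model_invalidate_guc_tlb(void)
{
	guc_tlb_invalidations++;
}

/*
 * Pin a context. Only the first pin (0 -> 1) maps the object anew, so
 * only that transition invalidates the GuC TLB, and only when GuC
 * submission is enabled.
 */
static void model_ctx_pin(struct model_ctx *ctx, bool guc_submission)
{
	if (ctx->pin_count++ == 0 && guc_submission)
		model_invalidate_guc_tlb();
}

/* Drop one pin; unpinning an unpinned context is a caller bug. */
static void model_ctx_unpin(struct model_ctx *ctx)
{
	assert(ctx->pin_count > 0);
	ctx->pin_count--;
}
```

A nested pin leaves the mapping untouched and so causes no extra
invalidation; a pin after the count has dropped back to zero does.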

Alex

> Thanks,
> Tom
>
> > +	if (i915.enable_guc_submission)
> > +		I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> > +
> >  	ret = intel_logical_ring_workarounds_emit(req);
> >  	if (ret)
> >  		return ret;
> > @@ -2026,7 +2042,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
> >
> >  	/* The second page of the context object contains some fields which must
> >  	 * be set up prior to the first execution. */
> > -	page = i915_gem_object_get_page(ctx_obj, 1);
> > +	page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
> >  	reg_state = kmap_atomic(page);
> >
> >  	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
> > @@ -2185,12 +2201,13 @@ static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
> >  		struct drm_i915_gem_object *default_ctx_obj)
> >  {
> >  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > +	struct page *page;
> >
> > -	/* The status page is offset 0 from the default context object
> > -	 * in LRC mode. */
> > -	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj);
> > -	ring->status_page.page_addr =
> > -			kmap(sg_page(default_ctx_obj->pages->sgl));
> > +	/* The HWSP is part of the default context object in LRC mode. */
> > +	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj)
> > +			+ LRC_PPHWSP_PN * PAGE_SIZE;
> > +	page = i915_gem_object_get_page(default_ctx_obj, LRC_PPHWSP_PN);
> > +	ring->status_page.page_addr = kmap(page);
> >  	ring->status_page.obj = default_ctx_obj;
> >
> >  	I915_WRITE(RING_HWS_PGA(ring->mmio_base),
> > @@ -2226,6 +2243,9 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
> >
> >  	context_size = round_up(get_lr_context_size(ring), 4096);
> >
> > +	/* One extra page as the sharing data between driver and GuC */
> > +	context_size += PAGE_SIZE * LRC_PPHWSP_PN;
> > +
> >  	ctx_obj = i915_gem_alloc_object(dev, context_size);
> >  	if (!ctx_obj) {
> >  		DRM_DEBUG_DRIVER("Alloc LRC backing obj failed.\n");
> > @@ -2233,7 +2253,8 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
> >  	}
> >
> >  	if (is_global_default_ctx) {
> > -		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
> > +		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
> > +				PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE);
> >  		if (ret) {
> >  			DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n",
> >  					ret);
> > @@ -2252,7 +2273,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
> >
> >  	ringbuf->ring = ring;
> >
> > -	ringbuf->size = 32 * PAGE_SIZE;
> > +	ringbuf->size = 4 * PAGE_SIZE;
> >  	ringbuf->effective_size = ringbuf->size;
> >  	ringbuf->head = 0;
> >  	ringbuf->tail = 0;
> > @@ -2352,7 +2373,7 @@ void intel_lr_context_reset(struct drm_device *dev,
> >  			WARN(1, "Failed get_pages for context obj\n");
> >  			continue;
> >  		}
> > -		page = i915_gem_object_get_page(ctx_obj, 1);
> > +		page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
> >  		reg_state = kmap_atomic(page);
> >
> >  		reg_state[CTX_RING_HEAD+1] = 0;
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> > index 6ecc0b3..e04b5c2 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.h
> > +++ b/drivers/gpu/drm/i915/intel_lrc.h
> > @@ -67,6 +67,12 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
> >  }
> >
> >  /* Logical Ring Contexts */
> > +
> > +/* One extra page is added before LRC for GuC as shared data */
> > +#define LRC_GUCSHR_PN	(0)
> > +#define LRC_PPHWSP_PN	(LRC_GUCSHR_PN + 1)
> > +#define LRC_STATE_PN	(LRC_PPHWSP_PN + 1)
> > +
> >  void intel_lr_context_free(struct intel_context *ctx);
> >  int intel_lr_context_deferred_create(struct intel_context *ctx,
> >  				     struct intel_engine_cs *ring);
> > --
> > 1.9.1
> >

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-24 22:31   ` O'Rourke, Tom
@ 2015-07-27 22:41     ` Yu Dai
  2015-07-27 23:12       ` O'Rourke, Tom
  0 siblings, 1 reply; 42+ messages in thread
From: Yu Dai @ 2015-07-27 22:41 UTC (permalink / raw)
  To: O'Rourke, Tom, Dave Gordon; +Cc: intel-gfx



On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
> [TOR:] When I see "phase 1" I also look for "phase 2".
> A subject that better describes the change in this patch
> would help.
>
> On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
> > From: Alex Dai <yu.dai@intel.com>
> >
> > This adds the first of the data structures used to communicate with the
> > GuC (the pool of guc_context structures).
> >
> > We create a GuC-specific wrapper round the GEM object allocator as all
> > GEM objects shared with the GuC must be pinned into GGTT space at an
> > address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
> > addresses is not accessible to the GuC (from the GuC's point of view,
> > it's permanently reserved for other objects such as the BootROM & SRAM).
> [TOR:] I would like a clarification on the excluded range.
> The excluded range should be 0 to "size for guc within
> WOPCM area" and not 0 to "size of WOPCM area".

Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
BSpec clearly says that 0 to WOPCM_TOP-1 is for BootROM, SRAM and
WOPCM; WOPCM_TOP and above is GFX DRAM. Note that this GGTT
space is still available to any gfx obj as long as it is not accessed by
GuC (OK to pass through GuC).
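
As a rough standalone sketch of that constraint (an illustration, not the
driver's allocator): an object shared with the GuC has to start at or above
the excluded range, which is what passing a bias with PIN_OFFSET_BIAS asks
for. MODEL_WOPCM_TOP is an assumed value of 1 MiB chosen only for the
example, and model_guc_can_access / model_place are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only; the real limit comes from BSpec. */
#define MODEL_WOPCM_TOP (1u << 20)

/* True if a GGTT offset lies outside [0, MODEL_WOPCM_TOP). */
static bool model_guc_can_access(uint32_t offset)
{
	return offset >= MODEL_WOPCM_TOP;
}

/*
 * Simplified placement: take the candidate offset, raise it to the bias
 * if it is below it, then round up to the alignment (a power of two).
 */
static uint32_t model_place(uint32_t candidate, uint32_t bias, uint32_t align)
{
	uint32_t offset = candidate < bias ? bias : candidate;

	assert(align != 0 && (align & (align - 1)) == 0);
	return (offset + align - 1) & ~(align - 1);
}
```

With the bias set to the top of the excluded range, every offset returned by
model_place() satisfies model_guc_can_access().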

> >
> > Later, we will need to allocate additional GuC-sharable objects for the
> > submission client(s) and the GuC's debug log.
> >
> > v2:
> >     Remove redundant initialisation [Chris Wilson]
> >     Defer adding struct members until needed [Chris Wilson]
> >     Local functions should pass dev_priv rather than dev [Chris Wilson]
> >
> > v4:
> >     Rebased
> >
> > Issue: VIZ-4884
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > ---
> >  drivers/gpu/drm/i915/Makefile              |   3 +-
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 114 +++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_guc.h           |   7 ++
> >  drivers/gpu/drm/i915/intel_guc_loader.c    |  19 +++++
> >  4 files changed, 142 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> >
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index e604cfe..23f5612 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -40,7 +40,8 @@ i915-y += i915_cmd_parser.o \
> >  	  intel_uncore.o
> >
> >  # general-purpose microcontroller (GuC) support
> > -i915-y += intel_guc_loader.o
> > +i915-y += intel_guc_loader.o \
> > +	  i915_guc_submission.o
> >
> >  # autogenerated null render state
> >  i915-y += intel_renderstate_gen6.o \
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > new file mode 100644
> > index 0000000..70a0405
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -0,0 +1,114 @@
> > +/*
> > + * Copyright © 2014 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + *
> > + */
> > +#include <linux/firmware.h>
> > +#include <linux/circ_buf.h>
> > +#include "i915_drv.h"
> > +#include "intel_guc.h"
> > +
> > +/**
> > + * gem_allocate_guc_obj() - Allocate gem object for GuC usage
> > + * @dev:	drm device
> > + * @size:	size of object
> > + *
> > + * This is a wrapper to create a gem obj. In order to use it inside GuC, the
> > + * object needs to be pinned lifetime. Also we must pin it to gtt space other
> > + * than [0, GUC_WOPCM_SIZE] because this range is reserved inside GuC.
> > + *
> > + * Return:	A drm_i915_gem_object if successful, otherwise NULL.
> > + */
> > +static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
> > +							u32 size)
> > +{
> > +	struct drm_i915_gem_object *obj;
> > +
> > +	obj = i915_gem_alloc_object(dev, size);
> > +	if (!obj)
> > +		return NULL;
> > +
> > +	if (i915_gem_object_get_pages(obj)) {
> > +		drm_gem_object_unreference(&obj->base);
> > +		return NULL;
> > +	}
> > +
> > +	if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
> > +			PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) {
> > +		drm_gem_object_unreference(&obj->base);
> > +		return NULL;
> > +	}
> > +
> > +	return obj;
> > +}
> > +
> > +/**
> > + * gem_release_guc_obj() - Release gem object allocated for GuC usage
> > + * @obj:	gem obj to be released
> > +  */
> > +static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
> > +{
> > +	if (!obj)
> > +		return;
> > +
> > +	if (i915_gem_obj_is_pinned(obj))
> > +		i915_gem_object_ggtt_unpin(obj);
> > +
> > +	drm_gem_object_unreference(&obj->base);
> > +}
> > +
> > +/*
> > + * Set up the memory resources to be shared with the GuC.  At this point,
> > + * we require just one object that can be mapped through the GGTT.
> > + */
> > +int i915_guc_submission_init(struct drm_device *dev)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	const size_t ctxsize = sizeof(struct guc_context_desc);
> > +	const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
> > +	const size_t gemsize = round_up(poolsize, PAGE_SIZE);
> > +	struct intel_guc *guc = &dev_priv->guc;
> > +
> > +	if (!i915.enable_guc_submission)
> > +		return 0; /* not enabled  */
> > +
> > +	if (guc->ctx_pool_obj)
> > +		return 0; /* already allocated */
> > +
> > +	guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
> > +	if (!guc->ctx_pool_obj)
> > +		return -ENOMEM;
> > +
> > +	ida_init(&guc->ctx_ids);
> > +
> > +	return 0;
> > +}
> > +
> > +void i915_guc_submission_fini(struct drm_device *dev)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_guc *guc = &dev_priv->guc;
> > +
> > +	if (guc->ctx_pool_obj)
> > +		ida_destroy(&guc->ctx_ids);
> > +	gem_release_guc_obj(guc->ctx_pool_obj);
> > +	guc->ctx_pool_obj = NULL;
> > +}
> > diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> > index 2846b6d..be3cad8 100644
> > --- a/drivers/gpu/drm/i915/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/intel_guc.h
> > @@ -56,6 +56,9 @@ struct intel_guc {
> >  	struct intel_guc_fw guc_fw;
> >
> >  	uint32_t log_flags;
> > +
> > +	struct drm_i915_gem_object *ctx_pool_obj;
> > +	struct ida ctx_ids;
> >  };
> >
> >  /* intel_guc_loader.c */
> > @@ -64,4 +67,8 @@ extern int intel_guc_ucode_load(struct drm_device *dev);
> >  extern void intel_guc_ucode_fini(struct drm_device *dev);
> >  extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
> >
> > +/* i915_guc_submission.c */
> > +int i915_guc_submission_init(struct drm_device *dev);
> > +void i915_guc_submission_fini(struct drm_device *dev);
> > +
> [TOR:] i915_guc_submission_init gets used in this patch.
> Unexpectedly, i915_guc_submission_fini does not get used
> in this patch.
>
> A later patch in this series adds the call to
> i915_guc_submission_fini in intel_guc_ucode_fini.
> Should that call be added in this patch?
>
> >  #endif
> > diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> > index 2080bca..e5d7136 100644
> > --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> > +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> > @@ -128,6 +128,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
> >  			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
> >  	}
> >
> > +	/* If GuC scheduling is enabled, setup params here. */
> [TOR:] I assume from this "GuC scheduling" == "GuC submission".
> This is a little confusing.  I recommend either reword
> "GuC scheduling" or add comment text to explain.

For now, yes: GuC scheduling only does the 'submission' work
because of the current in-order kernel queue. Once client direct
submission is enabled, there WILL BE GuC scheduling inside the firmware
according to the priority of each queue etc.
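
Purely as an illustration of what "scheduling by queue priority" means in
general (this is NOT the firmware's algorithm, which is not described in
this thread): pick the most urgent queue that has work pending. The struct
and function names are hypothetical, and "lower value = more urgent" is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-client work queue. */
struct model_queue {
	unsigned int priority;	/* assumption: lower value = more urgent */
	unsigned int pending;	/* number of queued work items */
};

/*
 * Return the index of the most urgent non-empty queue, or -1 if every
 * queue is empty. Ties go to the lowest index.
 */
static int model_pick_queue(const struct model_queue *q, size_t n)
{
	int best = -1;

	assert(q != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (q[i].pending == 0)
			continue;
		if (best < 0 || q[i].priority < q[best].priority)
			best = (int)i;
	}
	return best;
}
```

With a single in-order kernel queue, as today, there is only one candidate,
so the choice degenerates to plain submission.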

Thanks,
Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-27 22:41     ` Yu Dai
@ 2015-07-27 23:12       ` O'Rourke, Tom
  2015-07-28  0:18         ` Yu Dai
  2015-07-28 15:16         ` Dave Gordon
  0 siblings, 2 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-27 23:12 UTC (permalink / raw)
  To: Yu Dai; +Cc: intel-gfx

On Mon, Jul 27, 2015 at 03:41:31PM -0700, Yu Dai wrote:
> 
> 
> On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
> >[TOR:] When I see "phase 1" I also look for "phase 2".
> >A subject that better describes the change in this patch
> >would help.
> >
> >On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
> >> From: Alex Dai <yu.dai@intel.com>
> >>
> >> This adds the first of the data structures used to communicate with the
> >> GuC (the pool of guc_context structures).
> >>
> >> We create a GuC-specific wrapper round the GEM object allocator as all
> >> GEM objects shared with the GuC must be pinned into GGTT space at an
> >> address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
> >> addresses is not accessible to the GuC (from the GuC's point of view,
> >> it's permanently reserved for other objects such as the BootROM & SRAM).
> >[TOR:] I would like a clarification on the excluded range.
> >The excluded range should be 0 to "size for guc within
> >WOPCM area" and not 0 to "size of WOPCM area".
> 
> Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
> BSpec clearly says that 0 to WOPCM_TOP-1 is for BootROM, SRAM and
> WOPCM; WOPCM_TOP and above is GFX DRAM. Note that this GGTT
> space is still available to any gfx obj as long as it is not
> accessed by GuC (OK to pass through GuC).
>
[TOR:] Should we take a closer look at the pin offset bias
for guc objects?  GUC_WOPCM_SIZE_VALUE is not the full size
of WOPCM area.
 
> >>
> >> Later, we will need to allocate additional GuC-sharable objects for the
> >> submission client(s) and the GuC's debug log.
> >>
> >> v2:
> >>     Remove redundant initialisation [Chris Wilson]
> >>     Defer adding struct members until needed [Chris Wilson]
> >>     Local functions should pass dev_priv rather than dev [Chris Wilson]
> >>
> >> v4:
> >>     Rebased
> >>
> >> Issue: VIZ-4884
> >> Signed-off-by: Alex Dai <yu.dai@intel.com>
> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/Makefile              |   3 +-
> >>  drivers/gpu/drm/i915/i915_guc_submission.c | 114 +++++++++++++++++++++++++++++
> >>  drivers/gpu/drm/i915/intel_guc.h           |   7 ++
> >>  drivers/gpu/drm/i915/intel_guc_loader.c    |  19 +++++
> >>  4 files changed, 142 insertions(+), 1 deletion(-)
> >>  create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> >>
> >> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> >> index e604cfe..23f5612 100644
> >> --- a/drivers/gpu/drm/i915/Makefile
> >> +++ b/drivers/gpu/drm/i915/Makefile
> >> @@ -40,7 +40,8 @@ i915-y += i915_cmd_parser.o \
> >>  	  intel_uncore.o
> >>
> >>  # general-purpose microcontroller (GuC) support
> >> -i915-y += intel_guc_loader.o
> >> +i915-y += intel_guc_loader.o \
> >> +	  i915_guc_submission.o
> >>
> >>  # autogenerated null render state
> >>  i915-y += intel_renderstate_gen6.o \
> >> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> new file mode 100644
> >> index 0000000..70a0405
> >> --- /dev/null
> >> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> @@ -0,0 +1,114 @@
> >> +/*
> >> + * Copyright © 2014 Intel Corporation
> >> + *
> >> + * Permission is hereby granted, free of charge, to any person obtaining a
> >> + * copy of this software and associated documentation files (the "Software"),
> >> + * to deal in the Software without restriction, including without limitation
> >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> >> + * and/or sell copies of the Software, and to permit persons to whom the
> >> + * Software is furnished to do so, subject to the following conditions:
> >> + *
> >> + * The above copyright notice and this permission notice (including the next
> >> + * paragraph) shall be included in all copies or substantial portions of the
> >> + * Software.
> >> + *
> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> >> + * IN THE SOFTWARE.
> >> + *
> >> + */
> >> +#include <linux/firmware.h>
> >> +#include <linux/circ_buf.h>
> >> +#include "i915_drv.h"
> >> +#include "intel_guc.h"
> >> +
> >> +/**
> >> + * gem_allocate_guc_obj() - Allocate gem object for GuC usage
> >> + * @dev:	drm device
> >> + * @size:	size of object
> >> + *
> >> + * This is a wrapper to create a gem obj. In order to use it inside GuC, the
> >> + * object needs to be pinned lifetime. Also we must pin it to gtt space other
> >> + * than [0, GUC_WOPCM_SIZE] because this range is reserved inside GuC.
> >> + *
> >> + * Return:	A drm_i915_gem_object if successful, otherwise NULL.
> >> + */
> >> +static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
> >> +							u32 size)
> >> +{
> >> +	struct drm_i915_gem_object *obj;
> >> +
> >> +	obj = i915_gem_alloc_object(dev, size);
> >> +	if (!obj)
> >> +		return NULL;
> >> +
> >> +	if (i915_gem_object_get_pages(obj)) {
> >> +		drm_gem_object_unreference(&obj->base);
> >> +		return NULL;
> >> +	}
> >> +
> >> +	if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
> >> +			PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) {
> >> +		drm_gem_object_unreference(&obj->base);
> >> +		return NULL;
> >> +	}
> >> +
> >> +	return obj;
> >> +}
> >> +
> >> +/**
> >> + * gem_release_guc_obj() - Release gem object allocated for GuC usage
> >> + * @obj:	gem obj to be released
> >> +  */
> >> +static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
> >> +{
> >> +	if (!obj)
> >> +		return;
> >> +
> >> +	if (i915_gem_obj_is_pinned(obj))
> >> +		i915_gem_object_ggtt_unpin(obj);
> >> +
> >> +	drm_gem_object_unreference(&obj->base);
> >> +}
> >> +
> >> +/*
> >> + * Set up the memory resources to be shared with the GuC.  At this point,
> >> + * we require just one object that can be mapped through the GGTT.
> >> + */
> >> +int i915_guc_submission_init(struct drm_device *dev)
> >> +{
> >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> >> +	const size_t ctxsize = sizeof(struct guc_context_desc);
> >> +	const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
> >> +	const size_t gemsize = round_up(poolsize, PAGE_SIZE);
> >> +	struct intel_guc *guc = &dev_priv->guc;
> >> +
> >> +	if (!i915.enable_guc_submission)
> >> +		return 0; /* not enabled  */
> >> +
> >> +	if (guc->ctx_pool_obj)
> >> +		return 0; /* already allocated */
> >> +
> >> +	guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
> >> +	if (!guc->ctx_pool_obj)
> >> +		return -ENOMEM;
> >> +
> >> +	ida_init(&guc->ctx_ids);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +void i915_guc_submission_fini(struct drm_device *dev)
> >> +{
> >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> >> +	struct intel_guc *guc = &dev_priv->guc;
> >> +
> >> +	if (guc->ctx_pool_obj)
> >> +		ida_destroy(&guc->ctx_ids);
> >> +	gem_release_guc_obj(guc->ctx_pool_obj);
> >> +	guc->ctx_pool_obj = NULL;
> >> +}
> >> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> >> index 2846b6d..be3cad8 100644
> >> --- a/drivers/gpu/drm/i915/intel_guc.h
> >> +++ b/drivers/gpu/drm/i915/intel_guc.h
> >> @@ -56,6 +56,9 @@ struct intel_guc {
> >>  	struct intel_guc_fw guc_fw;
> >>
> >>  	uint32_t log_flags;
> >> +
> >> +	struct drm_i915_gem_object *ctx_pool_obj;
> >> +	struct ida ctx_ids;
> >>  };
> >>
> >>  /* intel_guc_loader.c */
> >> @@ -64,4 +67,8 @@ extern int intel_guc_ucode_load(struct drm_device *dev);
> >>  extern void intel_guc_ucode_fini(struct drm_device *dev);
> >>  extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
> >>
> >> +/* i915_guc_submission.c */
> >> +int i915_guc_submission_init(struct drm_device *dev);
> >> +void i915_guc_submission_fini(struct drm_device *dev);
> >> +
> >[TOR:] i915_guc_submission_init gets used in this patch.
> >Unexpectedly, i915_guc_submission_fini does not get used
> >in this patch.
> >
> >A later patch in this series adds the call to
> >i915_guc_submission_fini in intel_guc_ucode_fini.
> >Should that call be added in this patch?
> >
> >>  #endif
> >> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> >> index 2080bca..e5d7136 100644
> >> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> >> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> >> @@ -128,6 +128,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
> >>  			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
> >>  	}
> >>
> >> +	/* If GuC scheduling is enabled, setup params here. */
> >[TOR:] I assume from this "GuC scheduling" == "GuC submission".
> >This is a little confusing.  I recommend either reword
> >"GuC scheduling" or add comment text to explain.
> 
> For now, yes: GuC scheduling only does the 'submission' work
> because of the current in-order kernel queue. Once client
> direct submission is enabled, there WILL BE GuC scheduling inside
> the firmware according to the priority of each queue etc.
> 
> Thanks,
> Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-27 23:12       ` O'Rourke, Tom
@ 2015-07-28  0:18         ` Yu Dai
  2015-07-28 15:16         ` Dave Gordon
  1 sibling, 0 replies; 42+ messages in thread
From: Yu Dai @ 2015-07-28  0:18 UTC (permalink / raw)
  To: O'Rourke, Tom; +Cc: intel-gfx


On 07/27/2015 04:12 PM, O'Rourke, Tom wrote:
> On Mon, Jul 27, 2015 at 03:41:31PM -0700, Yu Dai wrote:
> >
> >
> > On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
> > >[TOR:] When I see "phase 1" I also look for "phase 2".
> > >A subject that better describes the change in this patch
> > >would help.
> > >
> > >On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
> > >> From: Alex Dai <yu.dai@intel.com>
> > >>
> > >> This adds the first of the data structures used to communicate with the
> > >> GuC (the pool of guc_context structures).
> > >>
> > >> We create a GuC-specific wrapper round the GEM object allocator as all
> > >> GEM objects shared with the GuC must be pinned into GGTT space at an
> > >> address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
> > >> addresses is not accessible to the GuC (from the GuC's point of view,
> > >> it's permanently reserved for other objects such as the BootROM & SRAM).
> > >[TOR:] I would like a clarification on the excluded range.
> > >The excluded range should be 0 to "size for guc within
> > >WOPCM area" and not 0 to "size of WOPCM area".
> >
> > Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
> > BSpec clearly says that 0 to WOPCM_TOP-1 is for BootROM, SRAM and
> > WOPCM; WOPCM_TOP and above is GFX DRAM. Note that this GGTT
> > space is still available to any gfx obj as long as it is not
> > accessed by GuC (OK to pass through GuC).
> >
> [TOR:] Should we take a closer look at the pin offset bias
> for guc objects?  GUC_WOPCM_SIZE_VALUE is not the full size
> of WOPCM area.
>   

Yes, GUC_WOPCM_SIZE_VALUE is not the full size of the WOPCM area. You can
blame BSpec, because it asks SW to program this 'WOPCM Size' register to
(Physical WOPCM allocated size - GuC WOPCM Base). So if you offset the
range [GuC WOPCM Base, WOPCM_TOP] by GuC WOPCM Base, it becomes [0,
GUC_WOPCM_SIZE]. This is the GuC 'accessible' range we would like to
exclude here.
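
The arithmetic above, worked through as a small standalone sketch. The
numbers in the usage note (1 MiB allocated WOPCM, GuC WOPCM base at 512 KiB)
are assumptions picked for the example, not values quoted from BSpec, and
the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A closed address range [start, end]. */
struct model_range {
	uint32_t start;
	uint32_t end;
};

/* 'WOPCM Size' as described: physical WOPCM allocated size - GuC WOPCM base. */
static uint32_t model_guc_wopcm_size(uint32_t wopcm_allocated,
				     uint32_t guc_wopcm_base)
{
	assert(guc_wopcm_base <= wopcm_allocated);
	return wopcm_allocated - guc_wopcm_base;
}

/* Shift a range down by 'base', i.e. express it relative to the GuC base. */
static struct model_range model_rebase(struct model_range r, uint32_t base)
{
	struct model_range out;

	assert(base <= r.start && r.start <= r.end);
	out.start = r.start - base;
	out.end = r.end - base;
	return out;
}
```

With 0x100000 allocated and a base of 0x80000, the size value is 0x80000,
and rebasing [0x80000, 0x100000] by the base gives [0, 0x80000], i.e.
[0, GUC_WOPCM_SIZE] in the terms used above.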

Alex

> > >>
> > >> Later, we will need to allocate additional GuC-sharable objects for the
> > >> submission client(s) and the GuC's debug log.
> > >>
> > >> v2:
> > >>     Remove redundant initialisation [Chris Wilson]
> > >>     Defer adding struct members until needed [Chris Wilson]
> > >>     Local functions should pass dev_priv rather than dev [Chris Wilson]
> > >>
> > >> v4:
> > >>     Rebased
> > >>
> > >> Issue: VIZ-4884
> > >> Signed-off-by: Alex Dai <yu.dai@intel.com>
> > >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> > >> ---
> > >>  drivers/gpu/drm/i915/Makefile              |   3 +-
> > >>  drivers/gpu/drm/i915/i915_guc_submission.c | 114 +++++++++++++++++++++++++++++
> > >>  drivers/gpu/drm/i915/intel_guc.h           |   7 ++
> > >>  drivers/gpu/drm/i915/intel_guc_loader.c    |  19 +++++
> > >>  4 files changed, 142 insertions(+), 1 deletion(-)
> > >>  create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> > >>
> > >> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > >> index e604cfe..23f5612 100644
> > >> --- a/drivers/gpu/drm/i915/Makefile
> > >> +++ b/drivers/gpu/drm/i915/Makefile
> > >> @@ -40,7 +40,8 @@ i915-y += i915_cmd_parser.o \
> > >>  	  intel_uncore.o
> > >>
> > >>  # general-purpose microcontroller (GuC) support
> > >> -i915-y += intel_guc_loader.o
> > >> +i915-y += intel_guc_loader.o \
> > >> +	  i915_guc_submission.o
> > >>
> > >>  # autogenerated null render state
> > >>  i915-y += intel_renderstate_gen6.o \
> > >> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > >> new file mode 100644
> > >> index 0000000..70a0405
> > >> --- /dev/null
> > >> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > >> @@ -0,0 +1,114 @@
> > >> +/*
> > >> + * Copyright © 2014 Intel Corporation
> > >> + *
> > >> + * Permission is hereby granted, free of charge, to any person obtaining a
> > >> + * copy of this software and associated documentation files (the "Software"),
> > >> + * to deal in the Software without restriction, including without limitation
> > >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > >> + * and/or sell copies of the Software, and to permit persons to whom the
> > >> + * Software is furnished to do so, subject to the following conditions:
> > >> + *
> > >> + * The above copyright notice and this permission notice (including the next
> > >> + * paragraph) shall be included in all copies or substantial portions of the
> > >> + * Software.
> > >> + *
> > >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > >> + * IN THE SOFTWARE.
> > >> + *
> > >> + */
> > >> +#include <linux/firmware.h>
> > >> +#include <linux/circ_buf.h>
> > >> +#include "i915_drv.h"
> > >> +#include "intel_guc.h"
> > >> +
> > >> +/**
> > >> + * gem_allocate_guc_obj() - Allocate gem object for GuC usage
> > >> + * @dev:	drm device
> > >> + * @size:	size of object
> > >> + *
> > >> + * This is a wrapper to create a gem obj. In order to use it inside GuC, the
> > >> + * object needs to be pinned lifetime. Also we must pin it to gtt space other
> > >> + * than [0, GUC_WOPCM_SIZE] because this range is reserved inside GuC.
> > >> + *
> > >> + * Return:	A drm_i915_gem_object if successful, otherwise NULL.
> > >> + */
> > >> +static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
> > >> +							u32 size)
> > >> +{
> > >> +	struct drm_i915_gem_object *obj;
> > >> +
> > >> +	obj = i915_gem_alloc_object(dev, size);
> > >> +	if (!obj)
> > >> +		return NULL;
> > >> +
> > >> +	if (i915_gem_object_get_pages(obj)) {
> > >> +		drm_gem_object_unreference(&obj->base);
> > >> +		return NULL;
> > >> +	}
> > >> +
> > >> +	if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
> > >> +			PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE)) {
> > >> +		drm_gem_object_unreference(&obj->base);
> > >> +		return NULL;
> > >> +	}
> > >> +
> > >> +	return obj;
> > >> +}
> > >> +
> > >> +/**
> > >> + * gem_release_guc_obj() - Release gem object allocated for GuC usage
> > >> + * @obj:	gem obj to be released
> > >> +  */
> > >> +static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
> > >> +{
> > >> +	if (!obj)
> > >> +		return;
> > >> +
> > >> +	if (i915_gem_obj_is_pinned(obj))
> > >> +		i915_gem_object_ggtt_unpin(obj);
> > >> +
> > >> +	drm_gem_object_unreference(&obj->base);
> > >> +}
> > >> +
> > >> +/*
> > >> + * Set up the memory resources to be shared with the GuC.  At this point,
> > >> + * we require just one object that can be mapped through the GGTT.
> > >> + */
> > >> +int i915_guc_submission_init(struct drm_device *dev)
> > >> +{
> > >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >> +	const size_t ctxsize = sizeof(struct guc_context_desc);
> > >> +	const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
> > >> +	const size_t gemsize = round_up(poolsize, PAGE_SIZE);
> > >> +	struct intel_guc *guc = &dev_priv->guc;
> > >> +
> > >> +	if (!i915.enable_guc_submission)
> > >> +		return 0; /* not enabled  */
> > >> +
> > >> +	if (guc->ctx_pool_obj)
> > >> +		return 0; /* already allocated */
> > >> +
> > >> +	guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
> > >> +	if (!guc->ctx_pool_obj)
> > >> +		return -ENOMEM;
> > >> +
> > >> +	ida_init(&guc->ctx_ids);
> > >> +
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +void i915_guc_submission_fini(struct drm_device *dev)
> > >> +{
> > >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >> +	struct intel_guc *guc = &dev_priv->guc;
> > >> +
> > >> +	if (guc->ctx_pool_obj)
> > >> +		ida_destroy(&guc->ctx_ids);
> > >> +	gem_release_guc_obj(guc->ctx_pool_obj);
> > >> +	guc->ctx_pool_obj = NULL;
> > >> +}
> > >> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> > >> index 2846b6d..be3cad8 100644
> > >> --- a/drivers/gpu/drm/i915/intel_guc.h
> > >> +++ b/drivers/gpu/drm/i915/intel_guc.h
> > >> @@ -56,6 +56,9 @@ struct intel_guc {
> > >>  	struct intel_guc_fw guc_fw;
> > >>
> > >>  	uint32_t log_flags;
> > >> +
> > >> +	struct drm_i915_gem_object *ctx_pool_obj;
> > >> +	struct ida ctx_ids;
> > >>  };
> > >>
> > >>  /* intel_guc_loader.c */
> > >> @@ -64,4 +67,8 @@ extern int intel_guc_ucode_load(struct drm_device *dev);
> > >>  extern void intel_guc_ucode_fini(struct drm_device *dev);
> > >>  extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
> > >>
> > >> +/* i915_guc_submission.c */
> > >> +int i915_guc_submission_init(struct drm_device *dev);
> > >> +void i915_guc_submission_fini(struct drm_device *dev);
> > >> +
> > >[TOR:] i915_guc_submission_init gets used in this patch.
> > >Unexpectedly, i915_guc_submission_fini does not get used
> > >in this patch.
> > >
> > >A later patch in this series adds the call to
> > >i915_guc_submission_fini in intel_guc_ucode_fini.
> > >Should that call be added in this patch?
> > >
> > >>  #endif
> > >> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> > >> index 2080bca..e5d7136 100644
> > >> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> > >> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> > >> @@ -128,6 +128,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
> > >>  			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
> > >>  	}
> > >>
> > >> +	/* If GuC scheduling is enabled, setup params here. */
> > >[TOR:] I assume from this "GuC scheduling" == "GuC submission".
> > >This is a little confusing.  I recommend either reword
> > >"GuC scheduling" or add comment text to explain.
> >
> > For now, yes the GuC scheduling is only doing the 'submission' work
> > because of the current kernel in-order queue. If we have client
> > direct submission enabled, there WILL BE GuC scheduling inside
> > firmware according to the priority of each queue etc.
> >
> > Thanks,
> > Alex


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/13 v4] drm/i915: Interrupt routing for GuC submission
  2015-07-27 15:33   ` O'Rourke, Tom
@ 2015-07-28 11:29     ` Dave Gordon
  0 siblings, 0 replies; 42+ messages in thread
From: Dave Gordon @ 2015-07-28 11:29 UTC (permalink / raw)
  To: O'Rourke, Tom; +Cc: intel-gfx

On 27/07/15 16:33, O'Rourke, Tom wrote:
> On Thu, Jul 09, 2015 at 07:29:11PM +0100, Dave Gordon wrote:
>> Turn on interrupt steering to route necessary interrupts to GuC.
>>
>> v4:
>>      Rebased
>>
>> Issue: VIZ-4884
>> Signed-off-by: Alex Dai <yu.dai@intel.com>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_reg.h         | 11 +++++--
>>   drivers/gpu/drm/i915/intel_guc_loader.c | 51 +++++++++++++++++++++++++++++++++
>>   2 files changed, 60 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 63728c1..1c2072b 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1664,12 +1664,18 @@ enum skl_disp_power_wells {
>>   #define GFX_MODE_GEN7	0x0229c
>>   #define RING_MODE_GEN7(ring)	((ring)->mmio_base+0x29c)
>>   #define   GFX_RUN_LIST_ENABLE		(1<<15)
>> +#define   GFX_INTERRUPT_STEERING	(1<<14)
>>   #define   GFX_TLB_INVALIDATE_EXPLICIT	(1<<13)
>>   #define   GFX_SURFACE_FAULT_ENABLE	(1<<12)
>>   #define   GFX_REPLAY_MODE		(1<<11)
>>   #define   GFX_PSMI_GRANULARITY		(1<<10)
>>   #define   GFX_PPGTT_ENABLE		(1<<9)
>>
>> +#define   GFX_FORWARD_VBLANK_MASK	(3<<5)
>> +#define   GFX_FORWARD_VBLANK_NEVER	(0<<5)
>> +#define   GFX_FORWARD_VBLANK_ALWAYS	(1<<5)
>> +#define   GFX_FORWARD_VBLANK_COND	(2<<5)
>> +
>>   #define VLV_DISPLAY_BASE 0x180000
>>   #define VLV_MIPI_BASE VLV_DISPLAY_BASE
>>
>> @@ -5683,11 +5689,12 @@ enum skl_disp_power_wells {
>>   #define GEN8_GT_IIR(which) (0x44308 + (0x10 * (which)))
>>   #define GEN8_GT_IER(which) (0x4430c + (0x10 * (which)))
>>
>> -#define GEN8_BCS_IRQ_SHIFT 16
>>   #define GEN8_RCS_IRQ_SHIFT 0
>> -#define GEN8_VCS2_IRQ_SHIFT 16
>> +#define GEN8_BCS_IRQ_SHIFT 16
>>   #define GEN8_VCS1_IRQ_SHIFT 0
>> +#define GEN8_VCS2_IRQ_SHIFT 16
>>   #define GEN8_VECS_IRQ_SHIFT 0
>> +#define GEN8_WD_IRQ_SHIFT 16
>>
>>   #define GEN8_DE_PIPE_ISR(pipe) (0x44400 + (0x10 * (pipe)))
>>   #define GEN8_DE_PIPE_IMR(pipe) (0x44404 + (0x10 * (pipe)))
>> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
>> index 25ba29f..2aa9227 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
>> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
>> @@ -79,6 +79,53 @@ const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
>>   	}
>>   };
>>
>> +static void direct_interrupts_to_host(struct drm_i915_private *dev_priv)
>> +{
>> +	struct intel_engine_cs *ring;
>> +	int i, irqs;
>> +
>> +	/* tell all command streamers NOT to forward interrupts and vblank to GuC */
>> +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_NEVER);
>> +	irqs |= _MASKED_BIT_DISABLE(GFX_INTERRUPT_STEERING);
>> +	for_each_ring(ring, dev_priv, i)
>> +		I915_WRITE(RING_MODE_GEN7(ring), irqs);
>> +
>> +	/* tell DE to send nothing to GuC */
>> +	I915_WRITE(DE_GUCRMR, ~0);
>> +
>> +	/* route all GT interrupts to the host */
>> +	I915_WRITE(GUC_BCS_RCS_IER, 0);
>> +	I915_WRITE(GUC_VCS2_VCS1_IER, 0);
>> +	I915_WRITE(GUC_WD_VECS_IER, 0);
>> +}
>> +
>> +static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
>> +{
>> +	struct intel_engine_cs *ring;
>> +	int i, irqs;
>> +
>> +	/* tell all command streamers to forward interrupts and vblank to GuC */
>> +	irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
>> +	irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
>> +	for_each_ring(ring, dev_priv, i)
>> +		I915_WRITE(RING_MODE_GEN7(ring), irqs);
>> +
>> +	/* tell DE to send (all) flip_done to GuC */
>> +	irqs = DERRMR_PIPEA_PRI_FLIP_DONE | DERRMR_PIPEA_SPR_FLIP_DONE |
>> +	       DERRMR_PIPEB_PRI_FLIP_DONE | DERRMR_PIPEB_SPR_FLIP_DONE |
>> +	       DERRMR_PIPEC_PRI_FLIP_DONE | DERRMR_PIPEC_SPR_FLIP_DONE;
>> +	/* Unmasked bits will cause GuC response message to be sent */
>> +	I915_WRITE(DE_GUCRMR, ~irqs);
>> +
>> +	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
>> +	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
>> +	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
>> +	/* These three registers have the same bit definitions */
>> +	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
>> +	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
>> +	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
>> +}
>> +
>
> [TOR:] Reliance on the registers having the same bit
> definitions does not seem safe.  Each of the three
> registers has shift constants defined.  I would expect
> the shift constants for the second and third registers
> to be used when writing those registers.
>
> Also, GT_RENDER_USER_INTERRUPT seems to have been defined
> for use with a different register than this set.
>
> On the other hand, this code does actually write the
> correct values.

The header file that defines GT_RENDER_USER_INTERRUPT, 
GEN8_RCS_IRQ_SHIFT et al. contains this comment:

/* On modern GEN architectures interrupt control consists of two sets
  * of registers. The first set pertains to the ring generating the
  * interrupt. The second control is for the functional block generating
  * the interrupt. These are PM, GT, DE, etc.
  *
  * Luckily *knocks on wood* all the ring interrupt bits match up with
  * the GT interrupt bits, so we don't need to duplicate the defines.
  *
  * These defines should cover us well from SNB->HSW with minor
  * exceptions it can also work on ILK.
  */

Per the second paragraph, we use these defines even though they appear 
to relate to a different (older) set of registers.

Also the BSpec doesn't actually contain separate definitions for these 
GuC-related interrupt control registers, but instead they all refer to 
the existing "GT Interrupt 0 Definition" page. So I think we're safe 
enough assuming for now that the H/W will continue to use the existing 
layout for all ISR/IMR/IIR/IER registers; that is, two 16-bit fields 
packed into each 32-bit register, with each field relating to a specific 
interrupt source, and corresponding bits in each of these fields 
controlling the same type of interrupt. But I did consider it an 
assumption that required a comment :)
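
As a rough illustration of that layout (a sketch only, not code from the patch): assuming GT_RENDER_USER_INTERRUPT is bit 0, and taking the shift constants from the i915_reg.h hunk quoted above, building the mask with each register's own pair of shifts gives the same 32-bit value for all three registers. The helper names user_irq_mask() and guc_ier_value() are made up for this example.

```c
#include <stdint.h>

/* Assumed to be bit 0; check i915_reg.h before relying on this. */
#define GT_RENDER_USER_INTERRUPT	(1u << 0)

/* Shift constants as they appear in the quoted i915_reg.h hunk. */
#define GEN8_RCS_IRQ_SHIFT	0
#define GEN8_BCS_IRQ_SHIFT	16
#define GEN8_VCS1_IRQ_SHIFT	0
#define GEN8_VCS2_IRQ_SHIFT	16
#define GEN8_VECS_IRQ_SHIFT	0
#define GEN8_WD_IRQ_SHIFT	16

/* Hypothetical helper: USER_INTERRUPT bit in both 16-bit fields. */
static uint32_t user_irq_mask(unsigned int lo_shift, unsigned int hi_shift)
{
	return (GT_RENDER_USER_INTERRUPT << lo_shift) |
	       (GT_RENDER_USER_INTERRUPT << hi_shift);
}

/* Hypothetical helper: value written to a GuC IER register, i.e.
 * everything except USER_INTERRUPT is routed to the GuC. */
static uint32_t guc_ier_value(unsigned int lo_shift, unsigned int hi_shift)
{
	return ~user_irq_mask(lo_shift, hi_shift);
}
```

Under these assumptions the value is 0xfffefffe whichever pair of shift constants is used, which is why a single mask can serve all three writes as long as the layout stays the same.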

If a future chip has a different set of interrupt registers or even just 
additional engines this code will need to be updated anyway, so the 
comment should alert anyone modifying this to check that the assumption 
is still valid.

So if you don't mind, I'm going to leave this unchanged.

.Dave.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission
  2015-07-27 15:57   ` O'Rourke, Tom
  2015-07-27 19:33     ` Yu Dai
@ 2015-07-28 13:59     ` Dave Gordon
  2015-07-28 16:47       ` Yu Dai
  1 sibling, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-28 13:59 UTC (permalink / raw)
  To: O'Rourke, Tom; +Cc: intel-gfx

On 27/07/15 16:57, O'Rourke, Tom wrote:
> On Thu, Jul 09, 2015 at 07:29:12PM +0100, Dave Gordon wrote:
>> From: Alex Dai <yu.dai@intel.com>
>>
>> GuC-based submission is mostly the same as execlist mode, up to
>> intel_logical_ring_advance_and_submit(), where the context being
>> dispatched would be added to the execlist queue; at this point
>> we submit the context to the GuC backend instead.
>>
>> There are, however, a few other changes also required, notably:
>> 1.  Contexts must be pinned at GGTT addresses accessible by the GuC
>>      i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
>>      PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.
>>
>> 2.  The GuC's TLB must be invalidated after a context is pinned at
>>      a new GGTT address.

[snip]

>>   static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
>>   {
>> +	struct drm_i915_private *dev_priv = req->i915;
>>   	int ret;
>>
>> +	/* Invalidate GuC TLB. */
>> +	if (i915.enable_guc_submission)
>> +		I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
>> +
>
> [TOR:] This invalidation is in the init_context for render
> ring but not the other rings.  Is this needed for other
> rings?  Or, should this invalidation happen at a different
> level?  It seems this may depend on the render ring being
> initialized first.
>
> Thanks,
> Tom

Hi Tom,

It looks like this is redundant here in the case where it's called from
the non-default-context case of intel_lr_context_deferred_create(); but 
when called from i915_gem_init_hw() [via i915_gem_context_enable()] it 
wouldn't be, because the GuC TLB wouldn't have been flushed since the 
default context was pinned [which is in a completely different path 
through intel_lr_context_deferred_create()!].

However, if we add a TLB flush just after that point, we can remove this 
one here, with several advantages:
* firstly, that path is taken just once, rather than every time a new 
render context is created, and
* secondly, each flush can be specifically associated with a pin-to-gtt 
call that includes the (PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE) flags, 
showing that the pinned object is of interest to the GuC.

I'll also move the existing TLB flushes in guc_submission.c and 
guc_loader.c so that they're also just after the relevant 'pin' calls.
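
The pairing described above could look roughly like the sketch below. It is a stand-alone model, not driver code: struct fake_ggtt, pin_for_guc() and the flush counter are hypothetical stand-ins for the real pin call and for the I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE) write, and the bias value is assumed.

```c
#include <stdint.h>

/* Hypothetical bias; stands in for GUC_WOPCM_SIZE_VALUE. */
#define FAKE_GUC_BIAS	0x80000u

/* Toy model of the GGTT allocator plus a count of GuC TLB flushes. */
struct fake_ggtt {
	uint32_t next_free;	/* next unallocated GGTT offset */
	unsigned int flushes;	/* GuC TLB invalidations issued so far */
};

/*
 * Pin 'size' bytes at or above 'bias' and flush the GuC TLB straight
 * afterwards, so every biased pin has exactly one associated flush.
 * Returns 0 on success, -1 if nothing was pinned (and nothing flushed).
 */
static int pin_for_guc(struct fake_ggtt *ggtt, uint32_t size, uint32_t bias,
		       uint32_t *offset)
{
	if (size == 0)
		return -1;

	/* Honour the bias: never hand out an offset below it. */
	if (ggtt->next_free < bias)
		ggtt->next_free = bias;

	*offset = ggtt->next_free;
	ggtt->next_free += size;

	/* Models I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE). */
	ggtt->flushes++;
	return 0;
}
```

Keeping the flush inside the same helper as the biased pin means a reader can see at a glance which objects matter to the GuC, and a failed pin never triggers a flush.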

Thanks,
.Dave.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-27 23:12       ` O'Rourke, Tom
  2015-07-28  0:18         ` Yu Dai
@ 2015-07-28 15:16         ` Dave Gordon
  2015-07-28 19:40           ` Dave Gordon
  2015-07-28 21:38           ` O'Rourke, Tom
  1 sibling, 2 replies; 42+ messages in thread
From: Dave Gordon @ 2015-07-28 15:16 UTC (permalink / raw)
  To: O'Rourke, Tom, Yu Dai; +Cc: intel-gfx

On 28/07/15 00:12, O'Rourke, Tom wrote:
> On Mon, Jul 27, 2015 at 03:41:31PM -0700, Yu Dai wrote:
>>
>> On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
>>> [TOR:] When I see "phase 1" I also look for "phase 2".
>>> A subject that better describes the change in this patch
>>> would help.
>>>
>>> On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
>>>> From: Alex Dai <yu.dai@intel.com>
>>>>
>>>> This adds the first of the data structures used to communicate with the
>>>> GuC (the pool of guc_context structures).
>>>>
>>>> We create a GuC-specific wrapper round the GEM object allocator as all
>>>> GEM objects shared with the GuC must be pinned into GGTT space at an
>>>> address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
>>>> addresses is not accessible to the GuC (from the GuC's point of view,
>>>> it's permanently reserved for other objects such as the BootROM & SRAM).
>>> [TOR:] I would like a clarification on the excluded range.
>>> The excluded range should be 0 to "size for guc within
>>> WOPCM area" and not 0 to "size of WOPCM area".
>>
>> Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
>> BSpec clearly says, from 0 to WOPCM_TOP-1 is for BootROM, SRAM and
>> WOPCM. From WOPCM_TOP and above is GFX DRAM. Note that this GGTT
>> space is still available to any gfx obj as long as it is not
>> accessed by GuC (OK to pass through GuC).
>>
> [TOR:] Should we take a closer look at the pin offset bias
> for guc objects?  GUC_WOPCM_SIZE_VALUE is not the full size
> of WOPCM area.

I'm inclined to set the bias to GUC_WOPCM_TOP, and then define that as 
the sum of GUC_WOPCM_OFFSET_VALUE and GUC_WOPCM_SIZE_VALUE. That seems 
to be what the BSpec pages "WriteOnceProtectedContentMemory (WOPCM) 
Management" and "WOPCM Memory Map" suggest, although I think they're 
pretty unclear on the details :(

Do you (both) agree this would be the right value?

[snip]

>>>> +	/* If GuC scheduling is enabled, setup params here. */
>>> [TOR:] I assume from this "GuC scheduling" == "GuC submission".
>>> This is a little confusing.  I recommend either reword
>>> "GuC scheduling" or add comment text to explain.
>>
>> For now, yes the GuC scheduling is only doing the 'submission' work
>> because of the current kernel in-order queue. If we have client
>> direct submission enabled, there WILL BE GuC scheduling inside
>> firmware according to the priority of each queue etc.
>>
>> Thanks,
>> Alex

I changed the line above to "GuC submission", while the one a few lines 
further down now says:

	/* Unmask this bit to enable the GuC's internal scheduler */

to make it quite clear that we're not referring to the host-based GPU 
scheduler currently in review.

.Dave.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/13 v4] drm/i915: Integrate GuC-based command submission
  2015-07-28 13:59     ` Dave Gordon
@ 2015-07-28 16:47       ` Yu Dai
  0 siblings, 0 replies; 42+ messages in thread
From: Yu Dai @ 2015-07-28 16:47 UTC (permalink / raw)
  To: Dave Gordon, O'Rourke, Tom; +Cc: intel-gfx



On 07/28/2015 06:59 AM, Dave Gordon wrote:
> On 27/07/15 16:57, O'Rourke, Tom wrote:
> > On Thu, Jul 09, 2015 at 07:29:12PM +0100, Dave Gordon wrote:
> >> From: Alex Dai <yu.dai@intel.com>
> >>
> >> GuC-based submission is mostly the same as execlist mode, up to
> >> intel_logical_ring_advance_and_submit(), where the context being
> >> dispatched would be added to the execlist queue; at this point
> >> we submit the context to the GuC backend instead.
> >>
> >> There are, however, a few other changes also required, notably:
> >> 1.  Contexts must be pinned at GGTT addresses accessible by the GuC
> >>      i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
> >>      PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.
> >>
> >> 2.  The GuC's TLB must be invalidated after a context is pinned at
> >>      a new GGTT address.
>
> [snip]
>
> >>   static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
> >>   {
> >> +	struct drm_i915_private *dev_priv = req->i915;
> >>   	int ret;
> >>
> >> +	/* Invalidate GuC TLB. */
> >> +	if (i915.enable_guc_submission)
> >> +		I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> >> +
> >
> > [TOR:] This invalidation is in the init_context for render
> > ring but not the other rings.  Is this needed for other
> > rings?  Or, should this invalidation happen at a different
> > level?  It seems this may depend on the render ring being
> > initialized first.
> >
> > Thanks,
> > Tom
>
> Hi Tom,
>
> It looks like this is redundant here in the case where it's called from
> the non-default-context case of intel_lr_context_deferred_create(); but
> when called from i915_gem_init_hw() [via i915_gem_context_enable()] it
> wouldn't be, because the GuC TLB wouldn't have been flushed since the
> default context was pinned [which is in a completely different path
> through intel_lr_context_deferred_create()!].
>
> However, if we add a TLB flush just after that point, we can remove this
> one here, with several advantages:
> * firstly, that path is taken just once, rather than every time a new
> render context is created, and
> * secondly, each flush can be specifically associated with a pin-to-gtt
> call that includes the (PIN_OFFSET_BIAS | GUC_WOPCM_SIZE_VALUE) flags,
> showing that the pinned object is of interest to the GuC.
>
> I'll also move the existing TLB flushes in guc_submission.c and
> guc_loader.c so that they're also just after the relevant 'pin' calls.
>

Aye. -Alex

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-28 15:16         ` Dave Gordon
@ 2015-07-28 19:40           ` Dave Gordon
  2015-07-28 22:42             ` O'Rourke, Tom
  2015-07-28 21:38           ` O'Rourke, Tom
  1 sibling, 1 reply; 42+ messages in thread
From: Dave Gordon @ 2015-07-28 19:40 UTC (permalink / raw)
  To: intel-gfx, Koston, Joseph, O'Rourke, Tom, Dai, Yu

On 28/07/15 16:16, Dave Gordon wrote:
> On 28/07/15 00:12, O'Rourke, Tom wrote:
>> On Mon, Jul 27, 2015 at 03:41:31PM -0700, Yu Dai wrote:
>>>
>>> On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
>>>> [TOR:] When I see "phase 1" I also look for "phase 2".
>>>> A subject that better describes the change in this patch
>>>> would help.
>>>>
>>>> On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
>>>>> From: Alex Dai <yu.dai@intel.com>
>>>>>
>>>>> This adds the first of the data structures used to communicate with
>>>>> the GuC (the pool of guc_context structures).
>>>>>
>>>>> We create a GuC-specific wrapper round the GEM object allocator as all
>>>>> GEM objects shared with the GuC must be pinned into GGTT space at an
>>>>> address that is NOT in the range [0..WOPCM_SIZE), as that range of
>>>>> GGTT
>>>>> addresses is not accessible to the GuC (from the GuC's point of view,
>>>>> it's permanently reserved for other objects such as the BootROM &
>>>>> SRAM).
>>>>
>>>> [TOR:] I would like a clarification on the excluded range.
>>>> The excluded range should be 0 to "size for guc within
>>>> WOPCM area" and not 0 to "size of WOPCM area".
>>>
>>> Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
>>> BSpec clearly says, from 0 to WOPCM_TOP-1 is for BootROM, SRAM and
>>> WOPCM. From WOPCM_TOP and above is GFX DRAM. Note that this GGTT
>>> space is still available to any gfx obj as long as it is not
>>> accessed by GuC (OK to pass through GuC).
>>>
>> [TOR:] Should we take a closer look at the pin offset bias
>> for guc objects?  GUC_WOPCM_SIZE_VALUE is not the full size
>> of WOPCM area.
>
> I'm inclined to set the bias to GUC_WOPCM_TOP, and then define that as
> the sum of GUC_WOPCM_OFFSET_VALUE and GUC_WOPCM_SIZE_VALUE. That seems
> to be what the BSpec pages "WriteOnceProtectedContentMemory (WOPCM)
> Management" and "WOPCM Memory Map" suggest, although I think they're
> pretty unclear on the details :(
>
> Do you (both) agree this would be the right value?

Actually I've changed my mind (again). On rereading this stuff, I now 
think that GUC_WOPCM_TOP is the same as the value put into the SIZE 
register. The (physical) range between the (real) WOPCM BASE and that 
plus the GUC WOPCM OFFSET isn't part of the GuC address space at all, so 
GuC address 0 maps (would map) to (real WOPCM BASE+GUC WOPCM OFFSET) in 
physical addresses, except that the bottom 80k is shadowed by the 
bootrom and SRAM; and then the SIZE register defines the size of the 
range from (GuC address) 0 to GUC_WOPCM_TOP; and then higher addresses 
map through the GTT as expected.

Or so I think. Does anyone know for sure? Please let me know ASAP as I 
want to submit an updated patchset tomorrow!

Thanks,
.Dave.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-28 15:16         ` Dave Gordon
  2015-07-28 19:40           ` Dave Gordon
@ 2015-07-28 21:38           ` O'Rourke, Tom
  1 sibling, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-28 21:38 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Jul 28, 2015 at 04:16:03PM +0100, Dave Gordon wrote:
> On 28/07/15 00:12, O'Rourke, Tom wrote:
> >On Mon, Jul 27, 2015 at 03:41:31PM -0700, Yu Dai wrote:
> >>
> >>On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
> >>>[TOR:] When I see "phase 1" I also look for "phase 2".
> >>>A subject that better describes the change in this patch
> >>>would help.
> >>>
> >>>On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
> >>>>From: Alex Dai <yu.dai@intel.com>
> >>>>
> >>>>This adds the first of the data structures used to communicate with the
> >>>>GuC (the pool of guc_context structures).
> >>>>
> >>>>We create a GuC-specific wrapper round the GEM object allocator as all
> >>>>GEM objects shared with the GuC must be pinned into GGTT space at an
> >>>>address that is NOT in the range [0..WOPCM_SIZE), as that range of GGTT
> >>>>addresses is not accessible to the GuC (from the GuC's point of view,
> >>>>it's permanently reserved for other objects such as the BootROM & SRAM).
> >>>[TOR:] I would like a clarification on the excluded range.
> >>>The excluded range should be 0 to "size for guc within
> >>>WOPCM area" and not 0 to "size of WOPCM area".
> >>
> >>Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
> >>BSpec clearly says, from 0 to WOPCM_TOP-1 is for BootROM, SRAM and
> >>WOPCM. From WOPCM_TOP and above is GFX DRAM. Note that this GGTT
> >>space is still available to any gfx obj as long as it is not
> >>accessed by GuC (OK to pass through GuC).
> >>
> >[TOR:] Should we take a closer look at the pin offset bias
> >for guc objects?  GUC_WOPCM_SIZE_VALUE is not the full size
> >of WOPCM area.
> 
> I'm inclined to set the bias to GUC_WOPCM_TOP, and then define that
> as the sum of GUC_WOPCM_OFFSET_VALUE and GUC_WOPCM_SIZE_VALUE. That
> seems to be what the BSpec pages "WriteOnceProtectedContentMemory
> (WOPCM) Management" and "WOPCM Memory Map" suggest, although I think
> they're pretty unclear on the details :(
> 
> Do you (both) agree this would be the right value?

[TOR:] No, I do not think that is the right value.

I think the excluded range should be [0 ... GUC_WOPCM_SIZE_VALUE)
and that GUC_WOPCM_SIZE_VALUE should be used as the bias (as it
is now) for objects used by GuC.

The term "WOPCM_SIZE" is ambiguous since it could mean
GUC_WOPCM_SIZE (as in 0xc050) or it could mean "size of
WOPCM area" (as in 0x1082C0).  It gets used both ways
in the BSpec.
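
To make the proposed exclusion concrete, here is a minimal sketch, not taken from the patch. The numeric value given to GUC_WOPCM_SIZE_VALUE is an assumption for illustration (this thread does not state it), and guc_can_address() and guc_first_usable_offset() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only (512 KiB). */
#define GUC_WOPCM_SIZE_VALUE	(0x80u << 12)

/*
 * True if a GGTT offset lies outside the excluded range
 * [0, GUC_WOPCM_SIZE_VALUE), i.e. the GuC could reach an object
 * pinned there.
 */
static bool guc_can_address(uint32_t ggtt_offset)
{
	return ggtt_offset >= GUC_WOPCM_SIZE_VALUE;
}

/*
 * Lowest offset at or above the bias that is aligned to 'page_size',
 * which must be a power of two.
 */
static uint32_t guc_first_usable_offset(uint32_t page_size)
{
	return (GUC_WOPCM_SIZE_VALUE + page_size - 1) & ~(page_size - 1);
}
```

With the assumed value, offsets 0 to 0x7ffff are off limits to GuC objects and the first usable 4 KiB page starts at 0x80000. Objects the GuC never touches could still be placed below that boundary.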

> 
> [snip]
[snip]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/13 v4] drm/i915: GuC submission setup, phase 1
  2015-07-28 19:40           ` Dave Gordon
@ 2015-07-28 22:42             ` O'Rourke, Tom
  0 siblings, 0 replies; 42+ messages in thread
From: O'Rourke, Tom @ 2015-07-28 22:42 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx, Koston, Joseph

On Tue, Jul 28, 2015 at 08:40:58PM +0100, Dave Gordon wrote:
> On 28/07/15 16:16, Dave Gordon wrote:
> >On 28/07/15 00:12, O'Rourke, Tom wrote:
> >>On Mon, Jul 27, 2015 at 03:41:31PM -0700, Yu Dai wrote:
> >>>
> >>>On 07/24/2015 03:31 PM, O'Rourke, Tom wrote:
> >>>>[TOR:] When I see "phase 1" I also look for "phase 2".
> >>>>A subject that better describes the change in this patch
> >>>>would help.
> >>>>
> >>>>On Thu, Jul 09, 2015 at 07:29:08PM +0100, Dave Gordon wrote:
> >>>>>From: Alex Dai <yu.dai@intel.com>
> >>>>>
> >>>>>This adds the first of the data structures used to communicate with
> >>>>>the GuC (the pool of guc_context structures).
> >>>>>
> >>>>>We create a GuC-specific wrapper round the GEM object allocator as all
> >>>>>GEM objects shared with the GuC must be pinned into GGTT space at an
> >>>>>address that is NOT in the range [0..WOPCM_SIZE), as that range of
> >>>>>GGTT
> >>>>>addresses is not accessible to the GuC (from the GuC's point of view,
> >>>>>it's permanently reserved for other objects such as the BootROM &
> >>>>>SRAM).
> >>>>
> >>>>[TOR:] I would like a clarification on the excluded range.
> >>>>The excluded range should be 0 to "size for guc within
> >>>>WOPCM area" and not 0 to "size of WOPCM area".
> >>>
> >>>Nope, GGTT range [0..WOPCM_SIZE) should be excluded from GuC usage.
> >>>BSpec clearly says, from 0 to WOPCM_TOP-1 is for BootROM, SRAM and
> >>>WOPCM. From WOPCM_TOP and above is GFX DRAM. Note that this GGTT
> >>>space is still available to any gfx obj as long as it is not
> >>>accessed by GuC (OK to pass through GuC).
> >>>
> >>[TOR:] Should we take a closer look at the pin offset bias
> >>for guc objects?  GUC_WOPCM_SIZE_VALUE is not the full size
> >>of WOPCM area.
> >
> >I'm inclined to set the bias to GUC_WOPCM_TOP, and then define that as
> >the sum of GUC_WOPCM_OFFSET_VALUE and GUC_WOPCM_SIZE_VALUE. That seems
> >to be what the BSpec pages "WriteOnceProtectedContentMemory (WOPCM)
> >Management" and "WOPCM Memory Map" suggest, although I think they're
> >pretty unclear on the details :(
> >
> >Do you (both) agree this would be the right value?
> 
> Actually I've changed my mind (again). On rereading this stuff, I
> now think that GUC_WOPCM_TOP is the same as the value put into the
> SIZE register. The (physical) range between the (real) WOPCM BASE
> and that plus the GUC WOPCM OFFSET isn't part of the GuC address
> space at all, so GuC address 0 maps (would map) to (real WOPCM
> BASE+GUC WOPCM OFFSET) in physical addresses, except that the bottom
> 80k is shadowed by the bootrom and SRAM; and then the SIZE register
> defines the size of the range from (GuC address) 0 to GUC_WOPCM_TOP;
> and then higher addresses map through the GTT as expected.
> 
> Or so I think. Does anyone know for sure? Please let me know ASAP as
> I want to submit an updated patchset tomorrow!
> 
> Thanks,
> .Dave.

[TOR:] Hi Dave,  Sorry, I did not see your message earlier.
Please see my other reply on this thread.  I think you are
right here, but to be clear, I think by "SIZE" you mean
"GUC_WOPCM_SIZE_VALUE".

Also, this should not matter here, but the SKL guc SRAM
shadow is 128k, not 80k.

Thanks,
Tom

