All of lore.kernel.org
 help / color / mirror / Atom feed
* Premature unpinning at last
@ 2016-04-20 18:42 Chris Wilson
  2016-04-20 18:42 ` [PATCH 01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Chris Wilson
                   ` (22 more replies)
  0 siblings, 23 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Pulled in the few missing details to put everything together in the same
series. Still missing a few r-b through:

[02/19] io-mapping: Specify mapping size for
[05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer
[10/19] drm/i915: Rearrange switch_context to load the aliasing
[14/19] drm/i915: Move context initialisation to first-use
[17/19] drm/i915: Track the previous pinned context inside the

02/19 requires an external ack (or Daniel's blessing), but it would
still be good for one of us to r-b it.

Hopefully CI is happy as well,
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42   ` Chris Wilson
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

When setting up the overlay page, we pin it into the GGTT (when using
virtual addresses) and store the offset as overlay->flip_addr. Rather
than doing a lookup of the GGTT address everytime, we can use the known
address instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_overlay.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index bcc3b6a016d8..9746b9841c13 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,7 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
 	else
 		regs = io_mapping_map_wc(ggtt->mappable,
-					 i915_gem_obj_ggtt_offset(overlay->reg_bo));
+					 overlay->flip_addr);
 
 	return regs;
 }
@@ -1493,7 +1493,7 @@ intel_overlay_map_regs_atomic(struct intel_overlay *overlay)
 			overlay->reg_bo->phys_handle->vaddr;
 	else
 		regs = io_mapping_map_atomic_wc(ggtt->mappable,
-						i915_gem_obj_ggtt_offset(overlay->reg_bo));
+						overlay->flip_addr);
 
 	return regs;
 }
@@ -1523,10 +1523,7 @@ intel_overlay_capture_error_state(struct drm_device *dev)
 
 	error->dovsta = I915_READ(DOVSTA);
 	error->isr = I915_READ(ISR);
-	if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
-		error->base = (__force long)overlay->reg_bo->phys_handle->vaddr;
-	else
-		error->base = i915_gem_obj_ggtt_offset(overlay->reg_bo);
+	error->base = overlay->flip_addr;
 
 	regs = intel_overlay_map_regs_atomic(overlay);
 	if (!regs)
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
  2016-04-20 18:42 ` [PATCH 01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Chris Wilson
@ 2016-04-20 18:42   ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 03/19] drm/i915: Introduce i915_vm_to_ggtt() Chris Wilson
                     ` (20 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx
  Cc: Tvrtko Ursulin, Tvrtko Ursulin, Mika Kuoppala, Yishai Hadas,
	linux-kernel, Peter Zijlstra (Intel),
	Luis R . Rodriguez, dri-devel, netdev, linux-rdma, Daniel Vetter,
	Dan Williams, Ingo Molnar, David Hildenbrand

The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_overlay.c    |  3 ++-
 drivers/net/ethernet/mellanox/mlx4/pd.c |  4 +++-
 include/linux/io-mapping.h              | 10 +++++++---
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 9746b9841c13..0d5a376878d3 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,8 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
 	else
 		regs = io_mapping_map_wc(ggtt->mappable,
-					 overlay->flip_addr);
+					 overlay->flip_addr,
+					 PAGE_SIZE);
 
 	return regs;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx4/pd.c b/drivers/net/ethernet/mellanox/mlx4/pd.c
index b3cc3ab63799..6fc156a3918d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/pd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/pd.c
@@ -205,7 +205,9 @@ int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node)
 			goto free_uar;
 		}
 
-		uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
+		uar->bf_map = io_mapping_map_wc(priv->bf_mapping,
+						uar->index << PAGE_SHIFT,
+						PAGE_SIZE);
 		if (!uar->bf_map) {
 			err = -ENOMEM;
 			goto unamp_uar;
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index e399029b68c5..645ad06b5d52 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -100,14 +100,16 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
 }
 
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+		  unsigned long offset,
+		  unsigned long size)
 {
 	resource_size_t phys_addr;
 
 	BUG_ON(offset >= mapping->size);
 	phys_addr = mapping->base + offset;
 
-	return ioremap_wc(phys_addr, PAGE_SIZE);
+	return ioremap_wc(phys_addr, size);
 }
 
 static inline void
@@ -155,7 +157,9 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
 
 /* Non-atomic map/unmap */
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+		  unsigned long offset,
+		  unsigned long size)
 {
 	return ((char __force __iomem *) mapping) + offset;
 }
-- 
2.8.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 18:42   ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx
  Cc: Tvrtko Ursulin, Mika Kuoppala, Joonas Lahtinen, Chris Wilson,
	Tvrtko Ursulin, Daniel Vetter, Jani Nikula, David Airlie,
	Yishai Hadas, Dan Williams, Ingo Molnar, Peter Zijlstra (Intel),
	David Hildenbrand, Luis R . Rodriguez, dri-devel, netdev,
	linux-rdma, linux-kernel

The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_overlay.c    |  3 ++-
 drivers/net/ethernet/mellanox/mlx4/pd.c |  4 +++-
 include/linux/io-mapping.h              | 10 +++++++---
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 9746b9841c13..0d5a376878d3 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,8 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
 	else
 		regs = io_mapping_map_wc(ggtt->mappable,
-					 overlay->flip_addr);
+					 overlay->flip_addr,
+					 PAGE_SIZE);
 
 	return regs;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx4/pd.c b/drivers/net/ethernet/mellanox/mlx4/pd.c
index b3cc3ab63799..6fc156a3918d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/pd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/pd.c
@@ -205,7 +205,9 @@ int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node)
 			goto free_uar;
 		}
 
-		uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
+		uar->bf_map = io_mapping_map_wc(priv->bf_mapping,
+						uar->index << PAGE_SHIFT,
+						PAGE_SIZE);
 		if (!uar->bf_map) {
 			err = -ENOMEM;
 			goto unamp_uar;
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index e399029b68c5..645ad06b5d52 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -100,14 +100,16 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
 }
 
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+		  unsigned long offset,
+		  unsigned long size)
 {
 	resource_size_t phys_addr;
 
 	BUG_ON(offset >= mapping->size);
 	phys_addr = mapping->base + offset;
 
-	return ioremap_wc(phys_addr, PAGE_SIZE);
+	return ioremap_wc(phys_addr, size);
 }
 
 static inline void
@@ -155,7 +157,9 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
 
 /* Non-atomic map/unmap */
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+		  unsigned long offset,
+		  unsigned long size)
 {
 	return ((char __force __iomem *) mapping) + offset;
 }
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 18:42   ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx
  Cc: Tvrtko Ursulin, Tvrtko Ursulin, Mika Kuoppala, Yishai Hadas,
	linux-kernel, Peter Zijlstra (Intel),
	Luis R . Rodriguez, dri-devel, netdev, linux-rdma, Daniel Vetter,
	Dan Williams, Ingo Molnar, David Hildenbrand

The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_overlay.c    |  3 ++-
 drivers/net/ethernet/mellanox/mlx4/pd.c |  4 +++-
 include/linux/io-mapping.h              | 10 +++++++---
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 9746b9841c13..0d5a376878d3 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,8 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
 	else
 		regs = io_mapping_map_wc(ggtt->mappable,
-					 overlay->flip_addr);
+					 overlay->flip_addr,
+					 PAGE_SIZE);
 
 	return regs;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx4/pd.c b/drivers/net/ethernet/mellanox/mlx4/pd.c
index b3cc3ab63799..6fc156a3918d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/pd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/pd.c
@@ -205,7 +205,9 @@ int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node)
 			goto free_uar;
 		}
 
-		uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
+		uar->bf_map = io_mapping_map_wc(priv->bf_mapping,
+						uar->index << PAGE_SHIFT,
+						PAGE_SIZE);
 		if (!uar->bf_map) {
 			err = -ENOMEM;
 			goto unamp_uar;
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index e399029b68c5..645ad06b5d52 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -100,14 +100,16 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
 }
 
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+		  unsigned long offset,
+		  unsigned long size)
 {
 	resource_size_t phys_addr;
 
 	BUG_ON(offset >= mapping->size);
 	phys_addr = mapping->base + offset;
 
-	return ioremap_wc(phys_addr, PAGE_SIZE);
+	return ioremap_wc(phys_addr, size);
 }
 
 static inline void
@@ -155,7 +157,9 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
 
 /* Non-atomic map/unmap */
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+		  unsigned long offset,
+		  unsigned long size)
 {
 	return ((char __force __iomem *) mapping) + offset;
 }
-- 
2.8.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 03/19] drm/i915: Introduce i915_vm_to_ggtt()
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
  2016-04-20 18:42 ` [PATCH 01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Chris Wilson
  2016-04-20 18:42   ` Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 04/19] drm/i915: Move ioremap_wc tracking onto VMA Chris Wilson
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

In a couple of places, we have an i915_address_space that we know is
really an i915_ggtt that we want to use. Create an inline helper to
convert from the i915_address_space subclass into its container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index eebdb28acfa9..1b696a13a727 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -93,6 +93,13 @@
  *
  */
 
+static inline struct i915_ggtt *
+i915_vm_to_ggtt(struct i915_address_space *vm)
+{
+	GEM_BUG_ON(!i915_is_ggtt(vm));
+	return container_of(vm, struct i915_ggtt, base);
+}
+
 static int
 i915_get_ggtt_vma_pages(struct i915_vma *vma);
 
@@ -2359,7 +2366,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     enum i915_cache_level level, u32 unused)
 {
 	struct drm_i915_private *dev_priv = to_i915(vm->dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	unsigned first_entry = start >> PAGE_SHIFT;
 	gen8_pte_t __iomem *gtt_entries =
 		(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
@@ -2437,7 +2444,7 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     enum i915_cache_level level, u32 flags)
 {
 	struct drm_i915_private *dev_priv = to_i915(vm->dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	unsigned first_entry = start >> PAGE_SHIFT;
 	gen6_pte_t __iomem *gtt_entries =
 		(gen6_pte_t __iomem *)ggtt->gsm + first_entry;
@@ -2481,7 +2488,7 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 				  bool use_scratch)
 {
 	struct drm_i915_private *dev_priv = to_i915(vm->dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	unsigned first_entry = start >> PAGE_SHIFT;
 	unsigned num_entries = length >> PAGE_SHIFT;
 	gen8_pte_t scratch_pte, __iomem *gtt_base =
@@ -2513,7 +2520,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 				  bool use_scratch)
 {
 	struct drm_i915_private *dev_priv = to_i915(vm->dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	unsigned first_entry = start >> PAGE_SHIFT;
 	unsigned num_entries = length >> PAGE_SHIFT;
 	gen6_pte_t scratch_pte, __iomem *gtt_base =
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 04/19] drm/i915: Move ioremap_wc tracking onto VMA
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (2 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 03/19] drm/i915: Introduce i915_vm_to_ggtt() Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).

Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.

v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c          | 13 ++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c      | 26 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h      | 35 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 28 +++++++++++++++++++------
 drivers/gpu/drm/i915/intel_fbdev.c       | 22 +++++++++++---------
 5 files changed, 108 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 261a3ef72828..8f0577886997 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3338,6 +3338,17 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
 					    old_write_domain);
 }
 
+static void __i915_vma_iounmap(struct i915_vma *vma)
+{
+	GEM_BUG_ON(vma->pin_count);
+
+	if (vma->iomap == NULL)
+		return;
+
+	io_mapping_unmap(vma->iomap);
+	vma->iomap = NULL;
+}
+
 static int __i915_vma_unbind(struct i915_vma *vma, bool wait)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -3370,6 +3381,8 @@ static int __i915_vma_unbind(struct i915_vma *vma, bool wait)
 		ret = i915_gem_object_put_fence(obj);
 		if (ret)
 			return ret;
+
+		__i915_vma_iounmap(vma);
 	}
 
 	trace_i915_vma_unbind(vma);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1b696a13a727..69191ae2cd6a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3634,3 +3634,29 @@ i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		return obj->base.size;
 	}
 }
+
+void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
+{
+	void __iomem *ptr;
+
+	lockdep_assert_held(&vma->vm->dev->struct_mutex);
+	if (WARN_ON(!vma->obj->map_and_fenceable))
+		return ERR_PTR(-ENODEV);
+
+	GEM_BUG_ON(!vma->is_ggtt);
+	GEM_BUG_ON((vma->bound & GLOBAL_BIND) == 0);
+
+	ptr = vma->iomap;
+	if (ptr == NULL) {
+		ptr = io_mapping_map_wc(i915_vm_to_ggtt(vma->vm)->mappable,
+					vma->node.start,
+					vma->node.size);
+		if (ptr == NULL)
+			return ERR_PTR(-ENOMEM);
+
+		vma->iomap = ptr;
+	}
+
+	vma->pin_count++;
+	return ptr;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d7dd3d8a8758..b0ae6632c01a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -34,6 +34,8 @@
 #ifndef __I915_GEM_GTT_H__
 #define __I915_GEM_GTT_H__
 
+#include <linux/io-mapping.h>
+
 struct drm_i915_file_private;
 
 typedef uint32_t gen6_pte_t;
@@ -175,6 +177,7 @@ struct i915_vma {
 	struct drm_mm_node node;
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
+	void __iomem *iomap;
 
 	/** Flags and address space this VMA is bound to */
 #define GLOBAL_BIND	(1<<0)
@@ -559,4 +562,36 @@ size_t
 i915_ggtt_view_size(struct drm_i915_gem_object *obj,
 		    const struct i915_ggtt_view *view);
 
+/**
+ * i915_vma_pin_iomap - calls ioremap_wc to map the GGTT VMA via the aperture
+ * @vma: VMA to iomap
+ *
+ * The passed in VMA has to be pinned in the global GTT mappable region.
+ * An extra pinning of the VMA is acquired for the return iomapping,
+ * the caller must call i915_vma_unpin_iomap to relinquish the pinning
+ * after the iomapping is no longer required.
+ *
+ * Callers must hold the struct_mutex.
+ *
+ * Returns a valid iomapped pointer or ERR_PTR.
+ */
+void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
+
+/**
+ * i915_vma_unpin_iomap - unpins the mapping returned from i915_vma_iomap
+ * @vma: VMA to unpin
+ *
+ * Unpins the previously iomapped VMA from i915_vma_pin_iomap().
+ *
+ * Callers must hold the struct_mutex. This function is only valid to be
+ * called on a VMA previously iomapped by the caller with i915_vma_pin_iomap().
+ */
+static inline void i915_vma_unpin_iomap(struct i915_vma *vma)
+{
+	lockdep_assert_held(&vma->vm->dev->struct_mutex);
+	GEM_BUG_ON(vma->pin_count == 0);
+	GEM_BUG_ON(vma->iomap == NULL);
+	vma->pin_count--;
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 425e721aac58..2643d18e41f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -386,17 +386,33 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr
 	struct drm_i915_private *dev_priv =
 		container_of(nb, struct drm_i915_private, mm.vmap_notifier);
 	struct shrinker_lock_uninterruptible slu;
-	unsigned long freed_pages;
+	struct i915_vma *vma, *next;
+	unsigned long freed_pages = 0;
+	int ret;
 
 	if (!i915_gem_shrinker_lock_uninterruptible(dev_priv, &slu, 5000))
 		return NOTIFY_DONE;
 
-	freed_pages = i915_gem_shrink(dev_priv, -1UL,
-				      I915_SHRINK_BOUND |
-				      I915_SHRINK_UNBOUND |
-				      I915_SHRINK_ACTIVE |
-				      I915_SHRINK_VMAPS);
+	/* Force everything onto the inactive lists */
+	ret = i915_gpu_idle(dev_priv->dev);
+	if (ret)
+		goto out;
+
+	freed_pages += i915_gem_shrink(dev_priv, -1UL,
+				       I915_SHRINK_BOUND |
+				       I915_SHRINK_UNBOUND |
+				       I915_SHRINK_ACTIVE |
+				       I915_SHRINK_VMAPS);
+
+	/* We also want to clear any cached iomaps as they wrap vmap */
+	list_for_each_entry_safe(vma, next,
+				 &dev_priv->ggtt.base.inactive_list, vm_link) {
+		unsigned long count = vma->node.size >> PAGE_SHIFT;
+		if (vma->iomap && i915_vma_unbind(vma) == 0)
+			freed_pages += count;
+	}
 
+out:
 	i915_gem_shrinker_unlock_uninterruptible(dev_priv, &slu);
 
 	*(unsigned long *)ptr += freed_pages;
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 79ac202f3870..93f54a10042f 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -186,9 +186,11 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	struct fb_info *info;
 	struct drm_framebuffer *fb;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
-	int size, ret;
 	bool prealloc = false;
+	void *vaddr;
+	int ret;
 
 	if (intel_fb &&
 	    (sizes->fb_width > intel_fb->base.width ||
@@ -214,7 +216,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	}
 
 	obj = intel_fb->obj;
-	size = obj->base.size;
 
 	mutex_lock(&dev->struct_mutex);
 
@@ -244,22 +245,23 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &intelfb_ops;
 
+	vma = i915_gem_obj_to_ggtt(obj);
+
 	/* setup aperture base/size for vesafb takeover */
 	info->apertures->ranges[0].base = dev->mode_config.fb_base;
 	info->apertures->ranges[0].size = ggtt->mappable_end;
 
-	info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_ggtt_offset(obj);
-	info->fix.smem_len = size;
+	info->fix.smem_start = dev->mode_config.fb_base + vma->node.start;
+	info->fix.smem_len = vma->node.size;
 
-	info->screen_base =
-		ioremap_wc(ggtt->mappable_base + i915_gem_obj_ggtt_offset(obj),
-			   size);
-	if (!info->screen_base) {
+	vaddr = i915_vma_pin_iomap(vma);
+	if (IS_ERR(vaddr)) {
 		DRM_ERROR("Failed to remap framebuffer into virtual memory\n");
-		ret = -ENOSPC;
+		ret = PTR_ERR(vaddr);
 		goto out_destroy_fbi;
 	}
-	info->screen_size = size;
+	info->screen_base = vaddr;
+	info->screen_size = vma->node.size;
 
 	/* This driver doesn't need a VT switch to restore the mode on resume */
 	info->skip_vt_switch = true;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (3 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 04/19] drm/i915: Move ioremap_wc tracking onto VMA Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-21  7:19   ` Joonas Lahtinen
  2016-04-20 18:42 ` [PATCH 06/19] drm/i915: Mark the current context as lost on suspend Chris Wilson
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Similarly to i915_gem_object_pin_map on LLC platforms, we can
use the new VMA based io mapping on !LLC to amoritize the cost
of ringbuffer pinning and unpinning.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 245386e20c52..27ba15acae1e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2084,20 +2084,23 @@ static int init_phys_status_page(struct intel_engine_cs *engine)
 
 void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
 {
+	GEM_BUG_ON(ringbuf->vma == NULL);
+	GEM_BUG_ON(ringbuf->virtual_start == NULL);
+
 	if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
 		i915_gem_object_unpin_map(ringbuf->obj);
 	else
-		iounmap(ringbuf->virtual_start);
+		i915_vma_unpin_iomap(ringbuf->vma);
 	ringbuf->virtual_start = NULL;
-	ringbuf->vma = NULL;
+
 	i915_gem_object_ggtt_unpin(ringbuf->obj);
+	ringbuf->vma = NULL;
 }
 
 int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 				     struct intel_ringbuffer *ringbuf)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	struct drm_i915_gem_object *obj = ringbuf->obj;
 	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
 	unsigned flags = PIN_OFFSET_BIAS | 4096;
@@ -2131,10 +2134,9 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 		/* Access through the GTT requires the device to be awake. */
 		assert_rpm_wakelock_held(dev_priv);
 
-		addr = ioremap_wc(ggtt->mappable_base +
-				  i915_gem_obj_ggtt_offset(obj), ringbuf->size);
-		if (addr == NULL) {
-			ret = -ENOMEM;
+		addr = i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj));
+		if (IS_ERR(addr)) {
+			ret = PTR_ERR(addr);
 			goto err_unpin;
 		}
 	}
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 06/19] drm/i915: Mark the current context as lost on suspend
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (4 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 07/19] drm/i915: L3 cache remapping is part of context switching Chris Wilson
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

In order to force a reload of the context image upon resume, we first
need to mark its absence on suspend. Currently we are failing to restore
the golden context state and any context w/a to the default context
after resume.

One oversight corrected, is that we had forgotten to reapply the L3
remapping when restoring the lost default context.

v2: Remove deprecated WARN.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem.c         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c | 54 ++++++++++++++-------------------
 3 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6f1e0f127c0a..e35e190722fb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3301,6 +3301,7 @@ void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj);
 
 /* i915_gem_context.c */
 int __must_check i915_gem_context_init(struct drm_device *dev);
+void i915_gem_context_lost(struct drm_i915_private *dev_priv);
 void i915_gem_context_fini(struct drm_device *dev);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8f0577886997..7c4630656d7a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4715,6 +4715,7 @@ i915_gem_suspend(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
 	i915_gem_stop_engines(dev);
+	i915_gem_context_lost(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
 
 	cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e5acc3916f75..bf31ee1ed914 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,8 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 
+#define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
+
 /* This is a HW constraint. The value below is the largest known requirement
  * I've seen in a spec to date, and that was a workaround for a non-shipping
  * part. It should be safe to decrease this, but it's more future proof as is.
@@ -249,7 +251,7 @@ __create_hw_context(struct drm_device *dev,
 	/* NB: Mark all slices as needing a remap so that when the context first
 	 * loads it will restore whatever remap state already exists. If there
 	 * is no remap info, it will be a NOP. */
-	ctx->remap_slice = (1 << NUM_L3_SLICES(dev)) - 1;
+	ctx->remap_slice = ALL_L3_SLICES(dev_priv);
 
 	ctx->hang_stats.ban_period_seconds = DRM_I915_CTX_BAN_PERIOD;
 
@@ -336,7 +338,6 @@ static void i915_gem_context_unpin(struct intel_context *ctx,
 void i915_gem_context_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int i;
 
 	if (i915.enable_execlists) {
 		struct intel_context *ctx;
@@ -345,17 +346,7 @@ void i915_gem_context_reset(struct drm_device *dev)
 			intel_lr_context_reset(dev_priv, ctx);
 	}
 
-	for (i = 0; i < I915_NUM_ENGINES; i++) {
-		struct intel_engine_cs *engine = &dev_priv->engine[i];
-
-		if (engine->last_context) {
-			i915_gem_context_unpin(engine->last_context, engine);
-			engine->last_context = NULL;
-		}
-	}
-
-	/* Force the GPU state to be reinitialised on enabling */
-	dev_priv->kernel_context->legacy_hw_ctx.initialized = false;
+	i915_gem_context_lost(dev_priv);
 }
 
 int i915_gem_context_init(struct drm_device *dev)
@@ -403,11 +394,29 @@ int i915_gem_context_init(struct drm_device *dev)
 	return 0;
 }
 
+void i915_gem_context_lost(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+
+	for_each_engine(engine, dev_priv) {
+		if (engine->last_context == NULL)
+			continue;
+
+		i915_gem_context_unpin(engine->last_context, engine);
+		engine->last_context = NULL;
+	}
+
+	/* Force the GPU state to be reinitialised on enabling */
+	dev_priv->kernel_context->legacy_hw_ctx.initialized = false;
+	dev_priv->kernel_context->remap_slice = ALL_L3_SLICES(dev_priv);
+}
+
 void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_context *dctx = dev_priv->kernel_context;
-	int i;
+
+	i915_gem_context_lost(dev_priv);
 
 	if (dctx->legacy_hw_ctx.rcs_state) {
 		/* The only known way to stop the gpu from accessing the hw context is
@@ -415,26 +424,9 @@ void i915_gem_context_fini(struct drm_device *dev)
 		 * other code, leading to spurious errors. */
 		intel_gpu_reset(dev, ALL_ENGINES);
 
-		/* When default context is created and switched to, base object refcount
-		 * will be 2 (+1 from object creation and +1 from do_switch()).
-		 * i915_gem_context_fini() will be called after gpu_idle() has switched
-		 * to default context. So we need to unreference the base object once
-		 * to offset the do_switch part, so that i915_gem_context_unreference()
-		 * can then free the base object correctly. */
-		WARN_ON(!dev_priv->engine[RCS].last_context);
-
 		i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
 	}
 
-	for (i = I915_NUM_ENGINES; --i >= 0;) {
-		struct intel_engine_cs *engine = &dev_priv->engine[i];
-
-		if (engine->last_context) {
-			i915_gem_context_unpin(engine->last_context, engine);
-			engine->last_context = NULL;
-		}
-	}
-
 	i915_gem_context_unreference(dctx);
 	dev_priv->kernel_context = NULL;
 }
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 07/19] drm/i915: L3 cache remapping is part of context switching
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (5 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 06/19] drm/i915: Mark the current context as lost on suspend Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 08/19] drm/i915: Consolidate L3 remapping LRI Chris Wilson
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Move the i915_gem_l3_remap function such that it next to the context
switching, which is where we perform the L3 remap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 31 -------------------------------
 drivers/gpu/drm/i915/i915_gem_context.c | 31 +++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7c4630656d7a..d3e226b714fa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4734,37 +4734,6 @@ err:
 	return ret;
 }
 
-int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
-{
-	struct intel_engine_cs *engine = req->engine;
-	struct drm_device *dev = engine->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
-	int i, ret;
-
-	if (!HAS_L3_DPF(dev) || !remap_info)
-		return 0;
-
-	ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
-	if (ret)
-		return ret;
-
-	/*
-	 * Note: We do not worry about the concurrent register cacheline hang
-	 * here because no other code should access these registers other than
-	 * at initialization time.
-	 */
-	for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
-		intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
-		intel_ring_emit_reg(engine, GEN7_L3LOG(slice, i));
-		intel_ring_emit(engine, remap_info[i]);
-	}
-
-	intel_ring_advance(engine);
-
-	return ret;
-}
-
 void i915_gem_init_swizzling(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index bf31ee1ed914..7b0a82c0d6e0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -601,6 +601,37 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	return ret;
 }
 
+int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
+{
+	struct intel_engine_cs *engine = req->engine;
+	struct drm_device *dev = engine->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
+	int i, ret;
+
+	if (!HAS_L3_DPF(dev) || !remap_info)
+		return 0;
+
+	ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
+	if (ret)
+		return ret;
+
+	/*
+	 * Note: We do not worry about the concurrent register cacheline hang
+	 * here because no other code should access these registers other than
+	 * at initialization time.
+	 */
+	for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
+		intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
+		intel_ring_emit_reg(engine, GEN7_L3LOG(slice, i));
+		intel_ring_emit(engine, remap_info[i]);
+	}
+
+	intel_ring_advance(engine);
+
+	return ret;
+}
+
 static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
 				   struct intel_context *to)
 {
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 08/19] drm/i915: Consolidate L3 remapping LRI
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (6 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 07/19] drm/i915: L3 cache remapping is part of context switching Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 09/19] drm/i915: Remove early l3-remap Chris Wilson
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

We can use a single MI_LOAD_REGISTER_IMM command packet to write all the
L3 remapping registers, shrinking the number of bytes required to emit
the context switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 7b0a82c0d6e0..1c63bea32211 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -603,16 +603,14 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 
 int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
 {
+	u32 *remap_info = req->i915->l3_parity.remap_info[slice];
 	struct intel_engine_cs *engine = req->engine;
-	struct drm_device *dev = engine->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
 	int i, ret;
 
-	if (!HAS_L3_DPF(dev) || !remap_info)
+	if (!remap_info)
 		return 0;
 
-	ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3);
+	ret = intel_ring_begin(req, GEN7_L3LOG_SIZE/4 * 2 + 2);
 	if (ret)
 		return ret;
 
@@ -621,15 +619,15 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
 	 * here because no other code should access these registers other than
 	 * at initialization time.
 	 */
-	for (i = 0; i < GEN7_L3LOG_SIZE / 4; i++) {
-		intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(GEN7_L3LOG_SIZE/4));
+	for (i = 0; i < GEN7_L3LOG_SIZE/4; i++) {
 		intel_ring_emit_reg(engine, GEN7_L3LOG(slice, i));
 		intel_ring_emit(engine, remap_info[i]);
 	}
-
+	intel_ring_emit(engine, MI_NOOP);
 	intel_ring_advance(engine);
 
-	return ret;
+	return 0;
 }
 
 static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 09/19] drm/i915: Remove early l3-remap
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (7 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 08/19] drm/i915: Consolidate L3 remapping LRI Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Since we do the l3-remap on context switch, we can remove the redundant
early call to set the mapping prior to performing the first context
switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 -
 drivers/gpu/drm/i915/i915_gem.c         | 10 +---------
 drivers/gpu/drm/i915/i915_gem_context.c |  4 ++--
 3 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e35e190722fb..7dd79c9a8ce3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3147,7 +3147,6 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_init(struct drm_device *dev);
 int i915_gem_init_engines(struct drm_device *dev);
 int __must_check i915_gem_init_hw(struct drm_device *dev);
-int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_engines(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d3e226b714fa..aafcb4942acf 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4838,7 +4838,7 @@ i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *engine;
-	int ret, j;
+	int ret;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -4920,14 +4920,6 @@ i915_gem_init_hw(struct drm_device *dev)
 			break;
 		}
 
-		if (engine->id == RCS) {
-			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
-				ret = i915_gem_l3_remap(req, j);
-				if (ret)
-					goto err_request;
-			}
-		}
-
 		ret = i915_ppgtt_init_ring(req);
 		if (ret)
 			goto err_request;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1c63bea32211..e5b3d74f8222 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -601,7 +601,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	return ret;
 }
 
-int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice)
+static int remap_l3(struct drm_i915_gem_request *req, int slice)
 {
 	u32 *remap_info = req->i915->l3_parity.remap_info[slice];
 	struct intel_engine_cs *engine = req->engine;
@@ -799,7 +799,7 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 		if (!(to->remap_slice & (1<<i)))
 			continue;
 
-		ret = i915_gem_l3_remap(req, i);
+		ret = remap_l3(req, i);
 		if (ret)
 			return ret;
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (8 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 09/19] drm/i915: Remove early l3-remap Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-21  6:48   ` Joonas Lahtinen
  2016-04-20 18:42 ` [PATCH 11/19] drm/i915: Assign every HW context a unique ID Chris Wilson
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 -
 drivers/gpu/drm/i915/i915_gem.c         | 28 ------------
 drivers/gpu/drm/i915/i915_gem_context.c | 77 ++++++++++++++-------------------
 drivers/gpu/drm/i915/i915_gem_gtt.c     | 14 ------
 drivers/gpu/drm/i915/i915_gem_gtt.h     |  1 -
 5 files changed, 33 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7dd79c9a8ce3..7adc7c83a116 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3304,7 +3304,6 @@ void i915_gem_context_lost(struct drm_i915_private *dev_priv);
 void i915_gem_context_fini(struct drm_device *dev);
 void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
-int i915_gem_context_enable(struct drm_i915_gem_request *req);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
 int i915_switch_context(struct drm_i915_gem_request *req);
 struct intel_context *
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index aafcb4942acf..14c9b29294c5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4910,34 +4910,6 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (ret)
 		goto out;
 
-	/* Now it is safe to go back round and do everything else: */
-	for_each_engine(engine, dev_priv) {
-		struct drm_i915_gem_request *req;
-
-		req = i915_gem_request_alloc(engine, NULL);
-		if (IS_ERR(req)) {
-			ret = PTR_ERR(req);
-			break;
-		}
-
-		ret = i915_ppgtt_init_ring(req);
-		if (ret)
-			goto err_request;
-
-		ret = i915_gem_context_enable(req);
-		if (ret)
-			goto err_request;
-
-err_request:
-		i915_add_request_no_flush(req);
-		if (ret) {
-			DRM_ERROR("Failed to enable %s, error=%d\n",
-				  engine->name, ret);
-			i915_gem_cleanup_engines(dev);
-			break;
-		}
-	}
-
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e5b3d74f8222..4d376f984a8c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -431,27 +431,6 @@ void i915_gem_context_fini(struct drm_device *dev)
 	dev_priv->kernel_context = NULL;
 }
 
-int i915_gem_context_enable(struct drm_i915_gem_request *req)
-{
-	struct intel_engine_cs *engine = req->engine;
-	int ret;
-
-	if (i915.enable_execlists) {
-		if (engine->init_context == NULL)
-			return 0;
-
-		ret = engine->init_context(req);
-	} else
-		ret = i915_switch_context(req);
-
-	if (ret) {
-		DRM_ERROR("ring init context: %d\n", ret);
-		return ret;
-	}
-
-	return 0;
-}
-
 static int context_idr_cleanup(int id, void *p, void *data)
 {
 	struct intel_context *ctx = p;
@@ -630,7 +609,8 @@ static int remap_l3(struct drm_i915_gem_request *req, int slice)
 	return 0;
 }
 
-static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
+static inline bool skip_rcs_switch(struct i915_hw_ppgtt *ppgtt,
+				   struct intel_engine_cs *engine,
 				   struct intel_context *to)
 {
 	if (to->remap_slice)
@@ -639,21 +619,27 @@ static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
 	if (!to->legacy_hw_ctx.initialized)
 		return false;
 
-	if (to->ppgtt &&
-	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
+	if (ppgtt && (intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
 		return false;
 
 	return to == engine->last_context;
 }
 
 static bool
-needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
+needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
+		  struct intel_engine_cs *engine,
+		  struct intel_context *to)
 {
-	if (!to->ppgtt)
+	if (ppgtt == NULL)
 		return false;
 
+	/* Always load the ppgtt on first use */
+	if (engine->last_context == NULL)
+		return true;
+
+	/* Same context without new entries, skip */
 	if (engine->last_context == to &&
-	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
+	    !(intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
 		return false;
 
 	if (engine->id != RCS)
@@ -666,9 +652,11 @@ needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
 }
 
 static bool
-needs_pd_load_post(struct intel_context *to, u32 hw_flags)
+needs_pd_load_post(struct i915_hw_ppgtt *ppgtt,
+		   struct intel_context *to,
+		   u32 hw_flags)
 {
-	if (!to->ppgtt)
+	if (ppgtt == NULL)
 		return false;
 
 	if (!IS_GEN8(to->i915))
@@ -684,11 +672,12 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 {
 	struct intel_context *to = req->ctx;
 	struct intel_engine_cs *engine = req->engine;
+	struct i915_hw_ppgtt *ppgtt = to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
 	struct intel_context *from;
 	u32 hw_flags;
 	int ret, i;
 
-	if (skip_rcs_switch(engine, to))
+	if (skip_rcs_switch(ppgtt, engine, to))
 		return 0;
 
 	/* Trying to pin first makes error handling easier. */
@@ -719,13 +708,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 	if (ret)
 		goto unpin_out;
 
-	if (needs_pd_load_pre(engine, to)) {
+	if (needs_pd_load_pre(ppgtt, engine, to)) {
 		/* Older GENs and non render rings still want the load first,
 		 * "PP_DCLV followed by PP_DIR_BASE register through Load
 		 * Register Immediate commands in Ring Buffer before submitting
 		 * a context."*/
 		trace_switch_mm(engine, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, req);
+		ret = ppgtt->switch_mm(ppgtt, req);
 		if (ret)
 			goto unpin_out;
 	}
@@ -736,15 +725,14 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 		 * space. This means we must enforce that a page table load
 		 * occur when this occurs. */
 		hw_flags = MI_RESTORE_INHIBIT;
-	else if (to->ppgtt &&
-		 intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)
+	else if (ppgtt && intel_engine_flag(engine) & ppgtt->pd_dirty_rings)
 		hw_flags = MI_FORCE_RESTORE;
 	else
 		hw_flags = 0;
 
 	/* We should never emit switch_mm more than once */
-	WARN_ON(needs_pd_load_pre(engine, to) &&
-		needs_pd_load_post(to, hw_flags));
+	WARN_ON(needs_pd_load_pre(ppgtt, engine, to) &&
+		needs_pd_load_post(ppgtt, to, hw_flags));
 
 	if (to != from || (hw_flags & MI_FORCE_RESTORE)) {
 		ret = mi_set_context(req, hw_flags);
@@ -780,9 +768,9 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 	/* GEN8 does *not* require an explicit reload if the PDPs have been
 	 * setup, and we do not wish to move them.
 	 */
-	if (needs_pd_load_post(to, hw_flags)) {
+	if (needs_pd_load_post(ppgtt, to, hw_flags)) {
 		trace_switch_mm(engine, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, req);
+		ret = ppgtt->switch_mm(ppgtt, req);
 		/* The hardware context switch is emitted, but we haven't
 		 * actually changed the state - so it's probably safe to bail
 		 * here. Still, let the user know something dangerous has
@@ -792,8 +780,8 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 			return ret;
 	}
 
-	if (to->ppgtt)
-		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+	if (ppgtt)
+		ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
 
 	for (i = 0; i < MAX_L3_SLICES; i++) {
 		if (!(to->remap_slice & (1<<i)))
@@ -846,17 +834,18 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 	if (engine->id != RCS ||
 	    req->ctx->legacy_hw_ctx.rcs_state == NULL) {
 		struct intel_context *to = req->ctx;
+		struct i915_hw_ppgtt *ppgtt =
+			to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
 
-		if (needs_pd_load_pre(engine, to)) {
+		if (needs_pd_load_pre(ppgtt, engine, to)) {
 			int ret;
 
 			trace_switch_mm(engine, to);
-			ret = to->ppgtt->switch_mm(to->ppgtt, req);
+			ret = ppgtt->switch_mm(ppgtt, req);
 			if (ret)
 				return ret;
 
-			/* Doing a PD load always reloads the page dirs */
-			to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+			ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
 		}
 
 		if (to != engine->last_context) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 69191ae2cd6a..45a17a9bf3ce 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2187,20 +2187,6 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
 	return 0;
 }
 
-int i915_ppgtt_init_ring(struct drm_i915_gem_request *req)
-{
-	struct drm_i915_private *dev_priv = req->i915;
-	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-
-	if (i915.enable_execlists)
-		return 0;
-
-	if (!ppgtt)
-		return 0;
-
-	return ppgtt->switch_mm(ppgtt, req);
-}
-
 struct i915_hw_ppgtt *
 i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b0ae6632c01a..114850368bca 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -522,7 +522,6 @@ void i915_ggtt_cleanup_hw(struct drm_device *dev);
 
 int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
 int i915_ppgtt_init_hw(struct drm_device *dev);
-int i915_ppgtt_init_ring(struct drm_i915_gem_request *req);
 void i915_ppgtt_release(struct kref *kref);
 struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev,
 					struct drm_i915_file_private *fpriv);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 11/19] drm/i915: Assign every HW context a unique ID
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (9 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 12/19] drm/i915: Replace the pinned context address with its " Chris Wilson
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

The hardware tracks contexts and expects all live contexts (those active
on the hardware) to have a unique identifier. This is used by the
hardware to assign pagefaults and the like to a particular context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.h         | 10 +++++++++
 drivers/gpu/drm/i915/i915_gem_context.c | 36 +++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 931dc6086f3b..d46413969daa 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2001,7 +2001,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 		    ctx->legacy_hw_ctx.rcs_state == NULL)
 			continue;
 
-		seq_puts(m, "HW context ");
+		seq_printf(m, "HW context %u ", ctx->hw_id);
 		describe_ctx(m, ctx);
 		if (ctx == dev_priv->kernel_context)
 			seq_printf(m, "(kernel context) ");
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7adc7c83a116..2bf3a8f97d52 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -851,6 +851,9 @@ struct intel_context {
 	struct i915_ctx_hang_stats hang_stats;
 	struct i915_hw_ppgtt *ppgtt;
 
+	/* Unique identifier for this context, used by the hw for tracking */
+	unsigned hw_id;
+
 	/* Legacy ring buffer submission */
 	struct {
 		struct drm_i915_gem_object *rcs_state;
@@ -1838,6 +1841,13 @@ struct drm_i915_private {
 	DECLARE_HASHTABLE(mm_structs, 7);
 	struct mutex mm_lock;
 
+	/* The hw wants to have a stable context identifier for the lifetime
+	 * of the context (for OA, PASID, faults, etc). This is limited
+	 * in execlists to 20 bits.
+	 */
+	struct ida context_hw_ida;
+#define MAX_CONTEXT_HW_ID (1<<20)
+
 	/* Kernel Modesetting */
 
 	struct drm_crtc *plane_to_crtc_mapping[I915_MAX_PIPES];
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4d376f984a8c..47cc8c27e150 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -171,6 +171,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	if (ctx->legacy_hw_ctx.rcs_state)
 		drm_gem_object_unreference(&ctx->legacy_hw_ctx.rcs_state->base);
 	list_del(&ctx->link);
+
+	ida_simple_remove(&ctx->i915->context_hw_ida, ctx->hw_id);
 	kfree(ctx);
 }
 
@@ -211,6 +213,28 @@ i915_gem_alloc_context_obj(struct drm_device *dev, size_t size)
 	return obj;
 }
 
+static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
+{
+	int ret;
+
+	ret = ida_simple_get(&dev_priv->context_hw_ida,
+			     0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
+	if (ret < 0) {
+		/* Contexts are only released when no longer active.
+		 * Flush any pending retires to hopefully release some
+		 * stale contexts and try again.
+		 */
+		i915_gem_retire_requests(dev_priv->dev);
+		ret = ida_simple_get(&dev_priv->context_hw_ida,
+				     0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
+		if (ret < 0)
+			return ret;
+	}
+
+	*out = ret;
+	return 0;
+}
+
 static struct intel_context *
 __create_hw_context(struct drm_device *dev,
 		    struct drm_i915_file_private *file_priv)
@@ -227,6 +251,12 @@ __create_hw_context(struct drm_device *dev,
 	list_add_tail(&ctx->link, &dev_priv->context_list);
 	ctx->i915 = dev_priv;
 
+	ret = assign_hw_id(dev_priv, &ctx->hw_id);
+	if (ret) {
+		kfree(ctx);
+		return ERR_PTR(ret);
+	}
+
 	if (dev_priv->hw_context_size) {
 		struct drm_i915_gem_object *obj =
 				i915_gem_alloc_context_obj(dev, dev_priv->hw_context_size);
@@ -366,6 +396,10 @@ int i915_gem_context_init(struct drm_device *dev)
 		}
 	}
 
+	/* Using the simple ida interface, the max is limited by sizeof(int) */
+	BUILD_BUG_ON(MAX_CONTEXT_HW_ID > INT_MAX);
+	ida_init(&dev_priv->context_hw_ida);
+
 	if (i915.enable_execlists) {
 		/* NB: intentionally left blank. We will allocate our own
 		 * backing objects as we need them, thank you very much */
@@ -429,6 +463,8 @@ void i915_gem_context_fini(struct drm_device *dev)
 
 	i915_gem_context_unreference(dctx);
 	dev_priv->kernel_context = NULL;
+
+	ida_destroy(&dev_priv->context_hw_ida);
 }
 
 static int context_idr_cleanup(int id, void *p, void *data)
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 12/19] drm/i915: Replace the pinned context address with its unique ID
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (10 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 11/19] drm/i915: Assign every HW context a unique ID Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 13/19] drm/i915: Refactor execlists default context pinning Chris Wilson
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Rather than reuse the current location of the context in the global GTT
for its hardware identifier, use the context's unique ID assigned to it
for its whole lifetime.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 12 +++++-------
 drivers/gpu/drm/i915/intel_lrc.c    | 36 ++++++------------------------------
 drivers/gpu/drm/i915/intel_lrc.h    |  3 ---
 3 files changed, 11 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d46413969daa..f775451bd0b6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2043,15 +2043,13 @@ static void i915_dump_lrc_obj(struct seq_file *m,
 	struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
 	unsigned long ggtt_offset = 0;
 
+	seq_printf(m, "CONTEXT: %s %u\n", engine->name, ctx->hw_id);
+
 	if (ctx_obj == NULL) {
-		seq_printf(m, "Context on %s with no gem object\n",
-			   engine->name);
+		seq_puts(m, "\tNot allocated\n");
 		return;
 	}
 
-	seq_printf(m, "CONTEXT: %s %u\n", engine->name,
-		   intel_execlists_ctx_id(ctx, engine));
-
 	if (!i915_gem_obj_ggtt_bound(ctx_obj))
 		seq_puts(m, "\tNot bound in GGTT\n");
 	else
@@ -2170,8 +2168,8 @@ static int i915_execlists(struct seq_file *m, void *data)
 
 		seq_printf(m, "\t%d requests in queue\n", count);
 		if (head_req) {
-			seq_printf(m, "\tHead request id: %u\n",
-				   intel_execlists_ctx_id(head_req->ctx, engine));
+			seq_printf(m, "\tHead request context: %u\n",
+				   head_req->ctx->hw_id);
 			seq_printf(m, "\tHead request tail: %u\n",
 				   head_req->tail);
 		}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6179b591ee84..2ed7363f76ea 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -314,14 +314,12 @@ static void
 intel_lr_context_descriptor_update(struct intel_context *ctx,
 				   struct intel_engine_cs *engine)
 {
-	uint64_t lrca, desc;
+	u64 desc;
 
-	lrca = ctx->engine[engine->id].lrc_vma->node.start +
-	       LRC_PPHWSP_PN * PAGE_SIZE;
-
-	desc = engine->ctx_desc_template;			   /* bits  0-11 */
-	desc |= lrca;					   /* bits 12-31 */
-	desc |= (lrca >> PAGE_SHIFT) << GEN8_CTX_ID_SHIFT; /* bits 32-51 */
+	desc = engine->ctx_desc_template; /* bits  0-11 */
+	desc |= ctx->engine[engine->id].lrc_vma->node.start +
+	       LRC_PPHWSP_PN * PAGE_SIZE; /* bits 12-31 */
+	desc |= (u64)ctx->hw_id << GEN8_CTX_ID_SHIFT; /* bits 32-51 */
 
 	ctx->engine[engine->id].lrc_desc = desc;
 }
@@ -332,28 +330,6 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 	return ctx->engine[engine->id].lrc_desc;
 }
 
-/**
- * intel_execlists_ctx_id() - get the Execlists Context ID
- * @ctx: Context to get the ID for
- * @ring: Engine to get the ID for
- *
- * Do not confuse with ctx->id! Unfortunately we have a name overload
- * here: the old context ID we pass to userspace as a handler so that
- * they can refer to a context, and the new context ID we pass to the
- * ELSP so that the GPU can inform us of the context status via
- * interrupts.
- *
- * The context ID is a portion of the context descriptor, so we can
- * just extract the required part from the cached descriptor.
- *
- * Return: 20-bits globally unique context ID.
- */
-u32 intel_execlists_ctx_id(struct intel_context *ctx,
-			   struct intel_engine_cs *engine)
-{
-	return intel_lr_context_descriptor(ctx, engine) >> GEN8_CTX_ID_SHIFT;
-}
-
 static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
 				 struct drm_i915_gem_request *rq1)
 {
@@ -499,7 +475,7 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
 	if (!head_req)
 		return 0;
 
-	if (unlikely(intel_execlists_ctx_id(head_req->ctx, engine) != request_id))
+	if (unlikely(head_req->ctx->hw_id != request_id))
 		return 0;
 
 	WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 461f1ef9b5c1..b17ab79333aa 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -114,9 +114,6 @@ void intel_lr_context_reset(struct drm_i915_private *dev_priv,
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *engine);
 
-u32 intel_execlists_ctx_id(struct intel_context *ctx,
-			   struct intel_engine_cs *engine);
-
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
 struct i915_execbuffer_params;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 13/19] drm/i915: Refactor execlists default context pinning
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (11 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 12/19] drm/i915: Replace the pinned context address with its " Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 14/19] drm/i915: Move context initialisation to first-use Chris Wilson
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Refactor pinning and unpinning of contexts, such that the default
context for an engine is pinned during initialisation and unpinned
during teardown (pinning of the context handles the reference counting).
Thus we can eliminate the special case handling of the default context
that was required to mask that it was not being pinned normally.

v2: Rebalance context_queue after rebasing.
v3: Rebase to -nightly (not 40 patches in)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   5 +-
 drivers/gpu/drm/i915/i915_gem.c     |   2 +-
 drivers/gpu/drm/i915/intel_lrc.c    | 107 ++++++++++++++----------------------
 3 files changed, 43 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f775451bd0b6..e81a7504656e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2095,9 +2095,8 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
 		return ret;
 
 	list_for_each_entry(ctx, &dev_priv->context_list, link)
-		if (ctx != dev_priv->kernel_context)
-			for_each_engine(engine, dev_priv)
-				i915_dump_lrc_obj(m, ctx, engine);
+		for_each_engine(engine, dev_priv)
+			i915_dump_lrc_obj(m, ctx, engine);
 
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 14c9b29294c5..8c523166e3e0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2711,7 +2711,7 @@ void i915_gem_request_free(struct kref *req_ref)
 		i915_gem_request_remove_from_client(req);
 
 	if (ctx) {
-		if (i915.enable_execlists && ctx != req->i915->kernel_context)
+		if (i915.enable_execlists)
 			intel_lr_context_unpin(ctx, req->engine);
 
 		i915_gem_context_unreference(ctx);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2ed7363f76ea..838abd4b42a3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -588,9 +588,7 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	if (request->ctx != request->i915->kernel_context)
-		intel_lr_context_pin(request->ctx, engine);
-
+	intel_lr_context_pin(request->ctx, request->engine);
 	i915_gem_request_reference(request);
 
 	spin_lock_bh(&engine->execlist_lock);
@@ -691,10 +689,7 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 			return ret;
 	}
 
-	if (request->ctx != request->i915->kernel_context)
-		ret = intel_lr_context_pin(request->ctx, request->engine);
-
-	return ret;
+	return intel_lr_context_pin(request->ctx, request->engine);
 }
 
 static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
@@ -774,12 +769,8 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 	if (engine->last_context != request->ctx) {
 		if (engine->last_context)
 			intel_lr_context_unpin(engine->last_context, engine);
-		if (request->ctx != request->i915->kernel_context) {
-			intel_lr_context_pin(request->ctx, engine);
-			engine->last_context = request->ctx;
-		} else {
-			engine->last_context = NULL;
-		}
+		intel_lr_context_pin(request->ctx, engine);
+		engine->last_context = request->ctx;
 	}
 
 	if (dev_priv->guc.execbuf_client)
@@ -1000,12 +991,7 @@ void intel_execlists_retire_requests(struct intel_engine_cs *engine)
 	spin_unlock_bh(&engine->execlist_lock);
 
 	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
-		struct intel_context *ctx = req->ctx;
-		struct drm_i915_gem_object *ctx_obj =
-				ctx->engine[engine->id].state;
-
-		if (ctx_obj && (ctx != req->i915->kernel_context))
-			intel_lr_context_unpin(ctx, engine);
+		intel_lr_context_unpin(req->ctx, engine);
 
 		list_del(&req->execlist_link);
 		i915_gem_request_unreference(req);
@@ -1050,23 +1036,26 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static int intel_lr_context_do_pin(struct intel_context *ctx,
-				   struct intel_engine_cs *engine)
+static int intel_lr_context_pin(struct intel_context *ctx,
+				struct intel_engine_cs *engine)
 {
-	struct drm_device *dev = engine->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
-	struct intel_ringbuffer *ringbuf = ctx->engine[engine->id].ringbuf;
+	struct drm_i915_private *dev_priv = ctx->i915;
+	struct drm_i915_gem_object *ctx_obj;
+	struct intel_ringbuffer *ringbuf;
 	void *vaddr;
 	u32 *lrc_reg_state;
 	int ret;
 
-	WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
+	lockdep_assert_held(&ctx->i915->dev->struct_mutex);
 
+	if (ctx->engine[engine->id].pin_count++)
+		return 0;
+
+	ctx_obj = ctx->engine[engine->id].state;
 	ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
 			PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
 	if (ret)
-		return ret;
+		goto err;
 
 	vaddr = i915_gem_object_pin_map(ctx_obj);
 	if (IS_ERR(vaddr)) {
@@ -1076,10 +1065,12 @@ static int intel_lr_context_do_pin(struct intel_context *ctx,
 
 	lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
 
+	ringbuf = ctx->engine[engine->id].ringbuf;
 	ret = intel_pin_and_map_ringbuffer_obj(engine->dev, ringbuf);
 	if (ret)
 		goto unpin_map;
 
+	i915_gem_context_reference(ctx);
 	ctx->engine[engine->id].lrc_vma = i915_gem_obj_to_ggtt(ctx_obj);
 	intel_lr_context_descriptor_update(ctx, engine);
 	lrc_reg_state[CTX_RING_BUFFER_START+1] = ringbuf->vma->node.start;
@@ -1090,51 +1081,39 @@ static int intel_lr_context_do_pin(struct intel_context *ctx,
 	if (i915.enable_guc_submission)
 		I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
 
-	return ret;
+	return 0;
 
 unpin_map:
 	i915_gem_object_unpin_map(ctx_obj);
 unpin_ctx_obj:
 	i915_gem_object_ggtt_unpin(ctx_obj);
-
+err:
+	ctx->engine[engine->id].pin_count = 0;
 	return ret;
 }
 
-static int intel_lr_context_pin(struct intel_context *ctx,
-				struct intel_engine_cs *engine)
+void intel_lr_context_unpin(struct intel_context *ctx,
+			    struct intel_engine_cs *engine)
 {
-	int ret = 0;
+	struct drm_i915_gem_object *ctx_obj;
 
-	if (ctx->engine[engine->id].pin_count++ == 0) {
-		ret = intel_lr_context_do_pin(ctx, engine);
-		if (ret)
-			goto reset_pin_count;
+	lockdep_assert_held(&ctx->i915->dev->struct_mutex);
+	GEM_BUG_ON(ctx->engine[engine->id].pin_count == 0);
 
-		i915_gem_context_reference(ctx);
-	}
-	return ret;
+	if (--ctx->engine[engine->id].pin_count)
+		return;
 
-reset_pin_count:
-	ctx->engine[engine->id].pin_count = 0;
-	return ret;
-}
+	intel_unpin_ringbuffer_obj(ctx->engine[engine->id].ringbuf);
 
-void intel_lr_context_unpin(struct intel_context *ctx,
-			    struct intel_engine_cs *engine)
-{
-	struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
+	ctx_obj = ctx->engine[engine->id].state;
+	i915_gem_object_unpin_map(ctx_obj);
+	i915_gem_object_ggtt_unpin(ctx_obj);
 
-	WARN_ON(!mutex_is_locked(&ctx->i915->dev->struct_mutex));
-	if (--ctx->engine[engine->id].pin_count == 0) {
-		i915_gem_object_unpin_map(ctx_obj);
-		intel_unpin_ringbuffer_obj(ctx->engine[engine->id].ringbuf);
-		i915_gem_object_ggtt_unpin(ctx_obj);
-		ctx->engine[engine->id].lrc_vma = NULL;
-		ctx->engine[engine->id].lrc_desc = 0;
-		ctx->engine[engine->id].lrc_reg_state = NULL;
+	ctx->engine[engine->id].lrc_vma = NULL;
+	ctx->engine[engine->id].lrc_desc = 0;
+	ctx->engine[engine->id].lrc_reg_state = NULL;
 
-		i915_gem_context_unreference(ctx);
-	}
+	i915_gem_context_unreference(ctx);
 }
 
 static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
@@ -2032,6 +2011,7 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
 		i915_gem_object_unpin_map(engine->status_page.obj);
 		engine->status_page.obj = NULL;
 	}
+	intel_lr_context_unpin(dev_priv->kernel_context, engine);
 
 	engine->idle_lite_restore_wa = 0;
 	engine->disable_lite_restore_wa = false;
@@ -2135,11 +2115,10 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
 		goto error;
 
 	/* As this is the default context, always pin it */
-	ret = intel_lr_context_do_pin(dctx, engine);
+	ret = intel_lr_context_pin(dctx, engine);
 	if (ret) {
-		DRM_ERROR(
-			"Failed to pin and map ringbuffer %s: %d\n",
-			engine->name, ret);
+		DRM_ERROR("Failed to pin context for %s: %d\n",
+			  engine->name, ret);
 		goto error;
 	}
 
@@ -2560,12 +2539,6 @@ void intel_lr_context_free(struct intel_context *ctx)
 		if (!ctx_obj)
 			continue;
 
-		if (ctx == ctx->i915->kernel_context) {
-			intel_unpin_ringbuffer_obj(ringbuf);
-			i915_gem_object_ggtt_unpin(ctx_obj);
-			i915_gem_object_unpin_map(ctx_obj);
-		}
-
 		WARN_ON(ctx->engine[i].pin_count);
 		intel_ringbuffer_free(ringbuf);
 		drm_gem_object_unreference(&ctx_obj->base);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 14/19] drm/i915: Move context initialisation to first-use
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (12 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 13/19] drm/i915: Refactor execlists default context pinning Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-21  6:57   ` Joonas Lahtinen
  2016-04-20 18:42 ` [PATCH 15/19] drm/i915: Move the magical deferred context allocation into the request Chris Wilson
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

Instead of allocating a new request when allocating a context, use the
request that initiated the allocation to emit the context
initialisation. This serves two purposes, it makes the initialisation
atomic with first use (simplifying scheduling and our own error
handling). Secondly, it enables us to remove the explicit context
allocation required by higher levels of GEM and make that property of
execlists opaque (in the next patch). There is also a minor step
forwards towards convergence of legacy/execlist contexts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/intel_lrc.c | 39 ++++++++++++++++++---------------------
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2bf3a8f97d52..e4b510bcee62 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -868,6 +868,7 @@ struct intel_context {
 		struct i915_vma *lrc_vma;
 		u64 lrc_desc;
 		uint32_t *lrc_reg_state;
+		bool initialised;
 	} engine[I915_NUM_ENGINES];
 
 	struct list_head link;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 838abd4b42a3..d765267153c5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -672,9 +672,10 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
 
 int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
 {
-	int ret = 0;
+	struct intel_engine_cs *engine = request->engine;
+	int ret;
 
-	request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
+	request->ringbuf = request->ctx->engine[engine->id].ringbuf;
 
 	if (i915.enable_guc_submission) {
 		/*
@@ -689,7 +690,20 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 			return ret;
 	}
 
-	return intel_lr_context_pin(request->ctx, request->engine);
+	ret = intel_lr_context_pin(request->ctx, engine);
+	if (ret)
+		return ret;
+
+	if (!request->ctx->engine[engine->id].initialised) {
+		ret = engine->init_context(request);
+		if (ret) {
+			intel_lr_context_unpin(request->ctx, engine);
+			return ret;
+		}
+		request->ctx->engine[engine->id].initialised = true;
+	}
+
+	return 0;
 }
 
 static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
@@ -2634,25 +2648,8 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
 
 	ctx->engine[engine->id].ringbuf = ringbuf;
 	ctx->engine[engine->id].state = ctx_obj;
+	ctx->engine[engine->id].initialised = engine->init_context == NULL;
 
-	if (ctx != ctx->i915->kernel_context && engine->init_context) {
-		struct drm_i915_gem_request *req;
-
-		req = i915_gem_request_alloc(engine, ctx);
-		if (IS_ERR(req)) {
-			ret = PTR_ERR(req);
-			DRM_ERROR("ring create req: %d\n", ret);
-			goto error_ringbuf;
-		}
-
-		ret = engine->init_context(req);
-		i915_add_request_no_flush(req);
-		if (ret) {
-			DRM_ERROR("ring init context: %d\n",
-				ret);
-			goto error_ringbuf;
-		}
-	}
 	return 0;
 
 error_ringbuf:
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 15/19] drm/i915: Move the magical deferred context allocation into the request
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (13 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 14/19] drm/i915: Move context initialisation to first-use Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 16/19] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

We can hide more details of execlists from higher level code by removing
the explicit call to create an execlist context from execbuffer and
into its first use by execlists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 --------
 drivers/gpu/drm/i915/intel_lrc.c           | 23 +++++++++++++++--------
 drivers/gpu/drm/i915/intel_lrc.h           |  2 --
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6f4f2a6cdf93..e0ee5d1ac372 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1085,14 +1085,6 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
 		return ERR_PTR(-EIO);
 	}
 
-	if (i915.enable_execlists && !ctx->engine[engine->id].state) {
-		int ret = intel_lr_context_deferred_alloc(ctx, engine);
-		if (ret) {
-			DRM_DEBUG("Could not create LRC %u: %d\n", ctx_id, ret);
-			return ERR_PTR(ret);
-		}
-	}
-
 	return ctx;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d765267153c5..8a544e2286f9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -227,6 +227,8 @@ enum {
 #define GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT	0x17
 #define GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT	0x26
 
+static int execlists_context_deferred_alloc(struct intel_context *ctx,
+					    struct intel_engine_cs *engine);
 static int intel_lr_context_pin(struct intel_context *ctx,
 				struct intel_engine_cs *engine);
 
@@ -675,8 +677,6 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 	struct intel_engine_cs *engine = request->engine;
 	int ret;
 
-	request->ringbuf = request->ctx->engine[engine->id].ringbuf;
-
 	if (i915.enable_guc_submission) {
 		/*
 		 * Check that the GuC has space for the request before
@@ -690,6 +690,14 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 			return ret;
 	}
 
+	if (request->ctx->engine[engine->id].state == NULL) {
+		ret = execlists_context_deferred_alloc(request->ctx, engine);
+		if (ret)
+			return ret;
+	}
+
+	request->ringbuf = request->ctx->engine[engine->id].ringbuf;
+
 	ret = intel_lr_context_pin(request->ctx, engine);
 	if (ret)
 		return ret;
@@ -2124,7 +2132,7 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
 	if (ret)
 		goto error;
 
-	ret = intel_lr_context_deferred_alloc(dctx, engine);
+	ret = execlists_context_deferred_alloc(dctx, engine);
 	if (ret)
 		goto error;
 
@@ -2598,9 +2606,9 @@ uint32_t intel_lr_context_size(struct intel_engine_cs *engine)
 }
 
 /**
- * intel_lr_context_deferred_alloc() - create the LRC specific bits of a context
+ * execlists_context_deferred_alloc() - create the LRC specific bits of a context
  * @ctx: LR context to create.
- * @ring: engine to be used with the context.
+ * @engine: engine to be used with the context.
  *
  * This function can be called more than once, with different engines, if we plan
  * to use the context with them. The context backing objects and the ringbuffers
@@ -2610,9 +2618,8 @@ uint32_t intel_lr_context_size(struct intel_engine_cs *engine)
  *
  * Return: non-zero on error.
  */
-
-int intel_lr_context_deferred_alloc(struct intel_context *ctx,
-				    struct intel_engine_cs *engine)
+static int execlists_context_deferred_alloc(struct intel_context *ctx,
+					    struct intel_engine_cs *engine)
 {
 	struct drm_device *dev = engine->dev;
 	struct drm_i915_gem_object *ctx_obj;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index b17ab79333aa..8bea937973f6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -102,8 +102,6 @@ static inline void intel_logical_ring_emit_reg(struct intel_ringbuffer *ringbuf,
 
 void intel_lr_context_free(struct intel_context *ctx);
 uint32_t intel_lr_context_size(struct intel_engine_cs *engine);
-int intel_lr_context_deferred_alloc(struct intel_context *ctx,
-				    struct intel_engine_cs *engine);
 void intel_lr_context_unpin(struct intel_context *ctx,
 			    struct intel_engine_cs *engine);
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 16/19] drm/i915: Move releasing of the GEM request from free to retire/cancel
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (14 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 15/19] drm/i915: Move the magical deferred context allocation into the request Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-20 18:42 ` [PATCH 17/19] drm/i915: Track the previous pinned context inside the request Chris Wilson
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM request need to be under the struct_mutex.

v2,v3: Rebalance execlists by moving the context unpinning.
v4: Rebase onto -nightly
v5: Avoid trying to rebalance execlist/GuC context pinning, leave that
to the next step

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      | 14 --------------
 drivers/gpu/drm/i915/i915_gem.c      | 23 +++++++++--------------
 drivers/gpu/drm/i915/intel_display.c |  2 +-
 drivers/gpu/drm/i915/intel_pm.c      |  2 +-
 4 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e4b510bcee62..a26a026ef8e2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2370,23 +2370,9 @@ i915_gem_request_reference(struct drm_i915_gem_request *req)
 static inline void
 i915_gem_request_unreference(struct drm_i915_gem_request *req)
 {
-	WARN_ON(!mutex_is_locked(&req->engine->dev->struct_mutex));
 	kref_put(&req->ref, i915_gem_request_free);
 }
 
-static inline void
-i915_gem_request_unreference__unlocked(struct drm_i915_gem_request *req)
-{
-	struct drm_device *dev;
-
-	if (!req)
-		return;
-
-	dev = req->engine->dev;
-	if (kref_put_mutex(&req->ref, i915_gem_request_free, &dev->struct_mutex))
-		mutex_unlock(&dev->struct_mutex);
-}
-
 static inline void i915_gem_request_assign(struct drm_i915_gem_request **pdst,
 					   struct drm_i915_gem_request *src)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8c523166e3e0..eaf1d13c943f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,6 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->list);
 	i915_gem_request_remove_from_client(request);
 
+	if (request->ctx) {
+		if (i915.enable_execlists)
+			intel_lr_context_unpin(request->ctx, request->engine);
+
+		i915_gem_context_unreference(request->ctx);
+	}
+
 	i915_gem_request_unreference(request);
 }
 
@@ -2705,18 +2712,6 @@ void i915_gem_request_free(struct kref *req_ref)
 {
 	struct drm_i915_gem_request *req = container_of(req_ref,
 						 typeof(*req), ref);
-	struct intel_context *ctx = req->ctx;
-
-	if (req->file_priv)
-		i915_gem_request_remove_from_client(req);
-
-	if (ctx) {
-		if (i915.enable_execlists)
-			intel_lr_context_unpin(ctx, req->engine);
-
-		i915_gem_context_unreference(ctx);
-	}
-
 	kmem_cache_free(req->i915->requests, req);
 }
 
@@ -3178,7 +3173,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 			ret = __i915_wait_request(req[i], true,
 						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
 						  to_rps_client(file));
-		i915_gem_request_unreference__unlocked(req[i]);
+		i915_gem_request_unreference(req[i]);
 	}
 	return ret;
 
@@ -4204,7 +4199,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (ret == 0)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
 
-	i915_gem_request_unreference__unlocked(target);
+	i915_gem_request_unreference(target);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ff60241b1f76..f5bf46f99cc2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11399,7 +11399,7 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 		WARN_ON(__i915_wait_request(mmio_flip->req,
 					    false, NULL,
 					    &mmio_flip->i915->rps.mmioflips));
-		i915_gem_request_unreference__unlocked(mmio_flip->req);
+		i915_gem_request_unreference(mmio_flip->req);
 	}
 
 	/* For framebuffer backed by dmabuf, wait for fence */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b7c218602c6e..ed3797bf41aa 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7366,7 +7366,7 @@ static void __intel_rps_boost_work(struct work_struct *work)
 		gen6_rps_boost(to_i915(req->engine->dev), NULL,
 			       req->emitted_jiffies);
 
-	i915_gem_request_unreference__unlocked(req);
+	i915_gem_request_unreference(req);
 	kfree(boost);
 }
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 17/19] drm/i915: Track the previous pinned context inside the request
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (15 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 16/19] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-21  7:07   ` Joonas Lahtinen
  2016-04-20 18:42 ` [PATCH 18/19] drm/i915: Store LRC hardware id in " Chris Wilson
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

As the contexts are accessed by the hardware until the switch is completed
to a new context, the hardware may still be writing to the context object
after the breadcrumb is visible. We must not unpin/unbind/prune that
object whilst still active and so we keep the previous context pinned until
the following request. We can generalise the tracking we already do via
the engine->last_context and move it to the request so that it works
equally for execlists and GuC.

v2: Drop the execlists double pin as that exposes a race inside the lrc
irq handler as it tries to access the context after it may be retired.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem.c  |  8 ++++----
 drivers/gpu/drm/i915/intel_lrc.c | 14 ++++++++------
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a26a026ef8e2..515b8badce61 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2302,6 +2302,17 @@ struct drm_i915_gem_request {
 	struct intel_context *ctx;
 	struct intel_ringbuffer *ringbuf;
 
+	/**
+	 * Context related to the previous request.
+	 * As the contexts are accessed by the hardware until the switch is
+	 * completed to a new context, the hardware may still be writing
+	 * to the context object after the breadcrumb is visible. We must
+	 * not unpin/unbind/prune that object whilst still active and so
+	 * we keep the previous context pinned until the following (this)
+	 * request is retired.
+	 */
+	struct intel_context *previous_context;
+
 	/** Batch buffer related to this request if any (used for
 	    error state dump only) */
 	struct drm_i915_gem_object *batch_obj;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index eaf1d13c943f..dbfc38f91f7d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,13 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->list);
 	i915_gem_request_remove_from_client(request);
 
-	if (request->ctx) {
+	if (request->previous_context) {
 		if (i915.enable_execlists)
-			intel_lr_context_unpin(request->ctx, request->engine);
-
-		i915_gem_context_unreference(request->ctx);
+			intel_lr_context_unpin(request->previous_context,
+					       request->engine);
 	}
 
+	i915_gem_context_unreference(request->ctx);
 	i915_gem_request_unreference(request);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8a544e2286f9..67c369ae649b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -788,12 +788,14 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 	if (intel_engine_stopped(engine))
 		return 0;
 
-	if (engine->last_context != request->ctx) {
-		if (engine->last_context)
-			intel_lr_context_unpin(engine->last_context, engine);
-		intel_lr_context_pin(request->ctx, engine);
-		engine->last_context = request->ctx;
-	}
+	/* We keep the previous context alive until we retire the following
+	 * request. This ensures that any the context object is still pinned
+	 * for any residual writes the HW makes into it on the context switch
+	 * into the next object following the breadcrumb. Otherwise, we may
+	 * retire the context too early.
+	 */
+	request->previous_context = engine->last_context;
+	engine->last_context = request->ctx;
 
 	if (dev_priv->guc.execbuf_client)
 		i915_guc_submit(dev_priv->guc.execbuf_client, request);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 18/19] drm/i915: Store LRC hardware id in the request
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (16 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 17/19] drm/i915: Track the previous pinned context inside the request Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-21  7:58   ` Tvrtko Ursulin
  2016-04-20 18:42 ` [PATCH 19/19] drm/i915: Stop tracking execlists retired requests Chris Wilson
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This way in the following patch we can disconnect requests
from contexts.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h  | 2 ++
 drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 515b8badce61..0efbe6c4634f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2349,6 +2349,8 @@ struct drm_i915_gem_request {
 	/** Execlists no. of times this request has been sent to the ELSP */
 	int elsp_submitted;
 
+	/** Execlists context hardware id. */
+	unsigned ctx_hw_id;
 };
 
 struct drm_i915_gem_request * __must_check
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 67c369ae649b..833d8fd3343f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -477,7 +477,7 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
 	if (!head_req)
 		return 0;
 
-	if (unlikely(head_req->ctx->hw_id != request_id))
+	if (unlikely(head_req->ctx_hw_id != request_id))
 		return 0;
 
 	WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
@@ -615,6 +615,7 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 	}
 
 	list_add_tail(&request->execlist_link, &engine->execlist_queue);
+	request->ctx_hw_id = request->ctx->hw_id;
 	if (num_elements == 0)
 		execlists_context_unqueue(engine);
 
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 19/19] drm/i915: Stop tracking execlists retired requests
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (17 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 18/19] drm/i915: Store LRC hardware id in " Chris Wilson
@ 2016-04-20 18:42 ` Chris Wilson
  2016-04-21 11:27 ` ✓ Fi.CI.BAT: success for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Patchwork
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 18:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko@ursulin.net>

With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of execlist retired queue entirely.

This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
request on the execlists retired queue.

The latter was a source of significant scalability issues in
the driver causing performance hits on some tests. Most
dramatical of which was igt/gem_close_race which had run time
in tens of minutes which is now reduced to tens of seconds.

Signed-off-by: Tvrtko Ursulin <tvrtko@ursulin.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         | 10 +--------
 drivers/gpu/drm/i915/intel_lrc.c        | 39 ++++++++++++---------------------
 drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
 4 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dbfc38f91f7d..045d8369e24a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2876,13 +2876,7 @@ static void i915_gem_reset_engine_cleanup(struct drm_i915_private *dev_priv,
 		/* Ensure irq handler finishes or is cancelled. */
 		tasklet_kill(&engine->irq_tasklet);
 
-		spin_lock_bh(&engine->execlist_lock);
-		/* list_splice_tail_init checks for empty lists */
-		list_splice_tail_init(&engine->execlist_queue,
-				      &engine->execlist_retired_req_list);
-		spin_unlock_bh(&engine->execlist_lock);
-
-		intel_execlists_retire_requests(engine);
+		intel_execlists_cancel_requests(engine);
 	}
 
 	/*
@@ -3006,8 +3000,6 @@ i915_gem_retire_requests(struct drm_device *dev)
 			spin_lock_bh(&engine->execlist_lock);
 			idle &= list_empty(&engine->execlist_queue);
 			spin_unlock_bh(&engine->execlist_lock);
-
-			intel_execlists_retire_requests(engine);
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 833d8fd3343f..37c557d3fb4a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -431,8 +431,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
 			/* Same ctx: ignore first request, as second request
 			 * will update tail past first request's workload */
 			cursor->elsp_submitted = req0->elsp_submitted;
-			list_move_tail(&req0->execlist_link,
-				       &engine->execlist_retired_req_list);
+			list_del(&req0->execlist_link);
+			i915_gem_request_unreference(req0);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -464,7 +464,7 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
 }
 
 static unsigned int
-execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
+execlists_check_remove_request(struct intel_engine_cs *engine, u32 ctx_id)
 {
 	struct drm_i915_gem_request *head_req;
 
@@ -474,19 +474,16 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
 					    struct drm_i915_gem_request,
 					    execlist_link);
 
-	if (!head_req)
-		return 0;
-
-	if (unlikely(head_req->ctx_hw_id != request_id))
-		return 0;
+	if (WARN_ON(!head_req || (head_req->ctx_hw_id != ctx_id)))
+               return 0;
 
 	WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
 
 	if (--head_req->elsp_submitted > 0)
 		return 0;
 
-	list_move_tail(&head_req->execlist_link,
-		       &engine->execlist_retired_req_list);
+	list_del(&head_req->execlist_link);
+	i915_gem_request_unreference(head_req);
 
 	return 1;
 }
@@ -590,9 +587,6 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	intel_lr_context_pin(request->ctx, request->engine);
-	i915_gem_request_reference(request);
-
 	spin_lock_bh(&engine->execlist_lock);
 
 	list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
@@ -609,11 +603,12 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 		if (request->ctx == tail_req->ctx) {
 			WARN(tail_req->elsp_submitted != 0,
 				"More than 2 already-submitted reqs queued\n");
-			list_move_tail(&tail_req->execlist_link,
-				       &engine->execlist_retired_req_list);
+			list_del(&tail_req->execlist_link);
+			i915_gem_request_unreference(tail_req);
 		}
 	}
 
+	i915_gem_request_reference(request);
 	list_add_tail(&request->execlist_link, &engine->execlist_queue);
 	request->ctx_hw_id = request->ctx->hw_id;
 	if (num_elements == 0)
@@ -1001,23 +996,18 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	return 0;
 }
 
-void intel_execlists_retire_requests(struct intel_engine_cs *engine)
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *req, *tmp;
-	struct list_head retired_list;
+	LIST_HEAD(cancel_list);
 
 	WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
-	if (list_empty(&engine->execlist_retired_req_list))
-		return;
 
-	INIT_LIST_HEAD(&retired_list);
 	spin_lock_bh(&engine->execlist_lock);
-	list_replace_init(&engine->execlist_retired_req_list, &retired_list);
+	list_replace_init(&engine->execlist_queue, &cancel_list);
 	spin_unlock_bh(&engine->execlist_lock);
 
-	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
-		intel_lr_context_unpin(req->ctx, engine);
-
+	list_for_each_entry_safe(req, tmp, &cancel_list, execlist_link) {
 		list_del(&req->execlist_link);
 		i915_gem_request_unreference(req);
 	}
@@ -2109,7 +2099,6 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
 
 	INIT_LIST_HEAD(&engine->buffers);
 	INIT_LIST_HEAD(&engine->execlist_queue);
-	INIT_LIST_HEAD(&engine->execlist_retired_req_list);
 	spin_lock_init(&engine->execlist_lock);
 
 	tasklet_init(&engine->irq_tasklet,
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 8bea937973f6..4b1c896b5019 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -119,6 +119,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
 			       struct list_head *vmas);
 
-void intel_execlists_retire_requests(struct intel_engine_cs *engine);
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2ade194bbea9..527549dbeb3c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -269,7 +269,6 @@ struct  intel_engine_cs {
 	struct tasklet_struct irq_tasklet;
 	spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */
 	struct list_head execlist_queue;
-	struct list_head execlist_retired_req_list;
 	unsigned int fw_domains;
 	unsigned int next_context_status_buffer;
 	unsigned int idle_lite_restore_wa;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
  2016-04-20 18:42   ` Chris Wilson
  (?)
@ 2016-04-20 18:58     ` Luis R. Rodriguez
  -1 siblings, 0 replies; 52+ messages in thread
From: Luis R. Rodriguez @ 2016-04-20 18:58 UTC (permalink / raw)
  To: Chris Wilson
  Cc: David Airlie, intel-gfx, linux-kernel, Ingo Molnar,
	Peter Zijlstra (Intel),
	Luis R . Rodriguez, dri-devel, netdev, linux-rdma, Daniel Vetter,
	Dan Williams, Yishai Hadas, David Hildenbrand

On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> can be used for remapping multiple pages. Extend the helper so that
> future callers can use it for larger ranges.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Yishai Hadas <yishaih@mellanox.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> Cc: intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: netdev@vger.kernel.org
> Cc: linux-rdma@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

We have 2 callers today, in the future, can you envision
this API getting more options? If so, in order to avoid the
pain of collateral evolutions I can suggest a descriptor
being passed with the required settings / options. This lets
you evolve the API without needing to go in and modify
old users. If you choose not to that's fine too, just
figured I'd chime in with that as I've seen the pain
with other APIs, and I'm putting an end to the needless
set of collateral evolutions this way.

Other than that possible API optimization:

Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>

  Luis
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 18:58     ` Luis R. Rodriguez
  0 siblings, 0 replies; 52+ messages in thread
From: Luis R. Rodriguez @ 2016-04-20 18:58 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Tvrtko Ursulin, Mika Kuoppala, Joonas Lahtinen,
	Tvrtko Ursulin, Daniel Vetter, Jani Nikula, David Airlie,
	Yishai Hadas, Dan Williams, Ingo Molnar, Peter Zijlstra (Intel),
	David Hildenbrand, Luis R . Rodriguez, dri-devel, netdev,
	linux-rdma, linux-kernel

On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> can be used for remapping multiple pages. Extend the helper so that
> future callers can use it for larger ranges.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Yishai Hadas <yishaih@mellanox.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> Cc: intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: netdev@vger.kernel.org
> Cc: linux-rdma@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

We have 2 callers today, in the future, can you envision
this API getting more options? If so, in order to avoid the
pain of collateral evolutions I can suggest a descriptor
being passed with the required settings / options. This lets
you evolve the API without needing to go in and modify
old users. If you choose not to that's fine too, just
figured I'd chime in with that as I've seen the pain
with other APIs, and I'm putting an end to the needless
set of collateral evolutions this way.

Other than that possible API optimization:

Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>

  Luis

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 18:58     ` Luis R. Rodriguez
  0 siblings, 0 replies; 52+ messages in thread
From: Luis R. Rodriguez @ 2016-04-20 18:58 UTC (permalink / raw)
  To: Chris Wilson
  Cc: David Airlie, intel-gfx, linux-kernel, Ingo Molnar,
	Peter Zijlstra (Intel),
	Luis R . Rodriguez, dri-devel, netdev, linux-rdma, Daniel Vetter,
	Dan Williams, Yishai Hadas, David Hildenbrand

On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> can be used for remapping multiple pages. Extend the helper so that
> future callers can use it for larger ranges.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Yishai Hadas <yishaih@mellanox.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> Cc: intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: netdev@vger.kernel.org
> Cc: linux-rdma@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

We have 2 callers today, in the future, can you envision
this API getting more options? If so, in order to avoid the
pain of collateral evolutions I can suggest a descriptor
being passed with the required settings / options. This lets
you evolve the API without needing to go in and modify
old users. If you choose not to that's fine too, just
figured I'd chime in with that as I've seen the pain
with other APIs, and I'm putting an end to the needless
set of collateral evolutions this way.

Other than that possible API optimization:

Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>

  Luis
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
  2016-04-20 18:58     ` Luis R. Rodriguez
  (?)
@ 2016-04-20 19:14       ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 19:14 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: David Airlie, intel-gfx, linux-kernel, Ingo Molnar,
	Peter Zijlstra (Intel),
	dri-devel, netdev, linux-rdma, Daniel Vetter, Dan Williams,
	Yishai Hadas, David Hildenbrand

On Wed, Apr 20, 2016 at 08:58:44PM +0200, Luis R. Rodriguez wrote:
> On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> > The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> > can be used for remapping multiple pages. Extend the helper so that
> > future callers can use it for larger ranges.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Yishai Hadas <yishaih@mellanox.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> > Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: netdev@vger.kernel.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> 
> We have 2 callers today, in the future, can you envision
> this API getting more options? If so, in order to avoid the
> pain of collateral evolutions I can suggest a descriptor
> being passed with the required settings / options. This lets
> you evolve the API without needing to go in and modify
> old users. If you choose not to that's fine too, just
> figured I'd chime in with that as I've seen the pain
> with other APIs, and I'm putting an end to the needless
> set of collateral evolutions this way.

Do you have a good example in mind? I've one more patch to try and take
advantage of the io-mapping (that may or not be such a good idea in
practice) but I may as well see if I can make io_mapping more useful
when I do.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 19:14       ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 19:14 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: intel-gfx, Tvrtko Ursulin, Mika Kuoppala, Joonas Lahtinen,
	Tvrtko Ursulin, Daniel Vetter, Jani Nikula, David Airlie,
	Yishai Hadas, Dan Williams, Ingo Molnar, Peter Zijlstra (Intel),
	David Hildenbrand, dri-devel, netdev, linux-rdma, linux-kernel

On Wed, Apr 20, 2016 at 08:58:44PM +0200, Luis R. Rodriguez wrote:
> On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> > The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> > can be used for remapping multiple pages. Extend the helper so that
> > future callers can use it for larger ranges.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Yishai Hadas <yishaih@mellanox.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> > Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: netdev@vger.kernel.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> 
> We have 2 callers today, in the future, can you envision
> this API getting more options? If so, in order to avoid the
> pain of collateral evolutions I can suggest a descriptor
> being passed with the required settings / options. This lets
> you evolve the API without needing to go in and modify
> old users. If you choose not to that's fine too, just
> figured I'd chime in with that as I've seen the pain
> with other APIs, and I'm putting an end to the needless
> set of collateral evolutions this way.

Do you have a good example in mind? I've one more patch to try and take
advantage of the io-mapping (that may or not be such a good idea in
practice) but I may as well see if I can make io_mapping more useful
when I do.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 19:14       ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-20 19:14 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: David Airlie, intel-gfx, linux-kernel, Ingo Molnar,
	Peter Zijlstra (Intel),
	dri-devel, netdev, linux-rdma, Daniel Vetter, Dan Williams,
	Yishai Hadas, David Hildenbrand

On Wed, Apr 20, 2016 at 08:58:44PM +0200, Luis R. Rodriguez wrote:
> On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> > The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> > can be used for remapping multiple pages. Extend the helper so that
> > future callers can use it for larger ranges.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Yishai Hadas <yishaih@mellanox.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> > Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: netdev@vger.kernel.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> 
> We have 2 callers today, in the future, can you envision
> this API getting more options? If so, in order to avoid the
> pain of collateral evolutions I can suggest a descriptor
> being passed with the required settings / options. This lets
> you evolve the API without needing to go in and modify
> old users. If you choose not to that's fine too, just
> figured I'd chime in with that as I've seen the pain
> with other APIs, and I'm putting an end to the needless
> set of collateral evolutions this way.

Do you have a good example in mind? I've one more patch to try and take
advantage of the io-mapping (that may or not be such a good idea in
practice) but I may as well see if I can make io_mapping more useful
when I do.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
  2016-04-20 19:14       ` Chris Wilson
@ 2016-04-20 21:23         ` Luis R. Rodriguez
  -1 siblings, 0 replies; 52+ messages in thread
From: Luis R. Rodriguez @ 2016-04-20 21:23 UTC (permalink / raw)
  To: Chris Wilson, Luis R. Rodriguez, intel-gfx, Tvrtko Ursulin,
	Mika Kuoppala, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
	Jani Nikula, David Airlie, Yishai Hadas, Dan Williams,
	Ingo Molnar, Peter Zijlstra (Intel),
	David Hildenbrand, dri-devel, netdev, linux-rdma, linux-kernel

On Wed, Apr 20, 2016 at 08:14:32PM +0100, Chris Wilson wrote:
> On Wed, Apr 20, 2016 at 08:58:44PM +0200, Luis R. Rodriguez wrote:
> > On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> > > The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> > > can be used for remapping multiple pages. Extend the helper so that
> > > future callers can use it for larger ranges.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > > Cc: David Airlie <airlied@linux.ie>
> > > Cc: Yishai Hadas <yishaih@mellanox.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > > Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> > > Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> > > Cc: intel-gfx@lists.freedesktop.org
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: netdev@vger.kernel.org
> > > Cc: linux-rdma@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > 
> > We have 2 callers today, in the future, can you envision
> > this API getting more options? If so, in order to avoid the
> > pain of collateral evolutions I can suggest a descriptor
> > being passed with the required settings / options. This lets
> > you evolve the API without needing to go in and modify
> > old users. If you choose not to that's fine too, just
> > figured I'd chime in with that as I've seen the pain
> > with other APIs, and I'm putting an end to the needless
> > set of collateral evolutions this way.
> 
> Do you have a good example in mind? I've one more patch to try and take
> advantage of the io-mapping (that may or not be such a good idea in
> practice) but I may as well see if I can make io_mapping more useful
> when I do.

Sure, here's my current version of the revamp of the firmware API
to a more flexible API, which lets us compartamentalize the
usermode helper, and through the new API avoids the issues with further
future collateral evolutions. It is still being baked, I'm fine tuning
the SmPL to folks automatically do conversion if they want:

https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git/log/?h=20160417-sysdata-api-v1

It also has a test driver (which I'd also recommend if you can pull off).
It would be kind of hard to do something like a lib/io-mapping_test.c
given there is no real device to ioremap -- _but_ perhaps regular
RAM can be used for fake a device MMIO. I am not sure if its even
possible... but if so it would not only be useful for something
like your API but also for testing ioremap() and friends, and
any possible aliasing bombs we may want to vet for. It also hints
how we may in the future be able to automatically write test drivers
for APIs for us through inference, but that needs a lot of more love
to make it tangible.

  Luis
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
@ 2016-04-20 21:23         ` Luis R. Rodriguez
  0 siblings, 0 replies; 52+ messages in thread
From: Luis R. Rodriguez @ 2016-04-20 21:23 UTC (permalink / raw)
  To: Chris Wilson, Luis R. Rodriguez, intel-gfx, Tvrtko Ursulin,
	Mika Kuoppala, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
	Jani Nikula, David Airlie, Yishai Hadas, Dan Williams,
	Ingo Molnar, Peter Zijlstra (Intel),
	David Hildenbrand, dri-devel, netdev, linux-rdma, linux-kernel

On Wed, Apr 20, 2016 at 08:14:32PM +0100, Chris Wilson wrote:
> On Wed, Apr 20, 2016 at 08:58:44PM +0200, Luis R. Rodriguez wrote:
> > On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> > > The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> > > can be used for remapping multiple pages. Extend the helper so that
> > > future callers can use it for larger ranges.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > > Cc: David Airlie <airlied@linux.ie>
> > > Cc: Yishai Hadas <yishaih@mellanox.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > > Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> > > Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> > > Cc: intel-gfx@lists.freedesktop.org
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: netdev@vger.kernel.org
> > > Cc: linux-rdma@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > 
> > We have 2 callers today, in the future, can you envision
> > this API getting more options? If so, in order to avoid the
> > pain of collateral evolutions I can suggest a descriptor
> > being passed with the required settings / options. This lets
> > you evolve the API without needing to go in and modify
> > old users. If you choose not to that's fine too, just
> > figured I'd chime in with that as I've seen the pain
> > with other APIs, and I'm putting an end to the needless
> > set of collateral evolutions this way.
> 
> Do you have a good example in mind? I've one more patch to try and take
> advantage of the io-mapping (that may or not be such a good idea in
> practice) but I may as well see if I can make io_mapping more useful
> when I do.

Sure, here's my current version of the revamp of the firmware API
to a more flexible API, which lets us compartamentalize the
usermode helper, and through the new API avoids the issues with further
future collateral evolutions. It is still being baked, I'm fine tuning
the SmPL to folks automatically do conversion if they want:

https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git/log/?h=20160417-sysdata-api-v1

It also has a test driver (which I'd also recommend if you can pull off).
It would be kind of hard to do something like a lib/io-mapping_test.c
given there is no real device to ioremap -- _but_ perhaps regular
RAM can be used for fake a device MMIO. I am not sure if its even
possible... but if so it would not only be useful for something
like your API but also for testing ioremap() and friends, and
any possible aliasing bombs we may want to vet for. It also hints
how we may in the future be able to automatically write test drivers
for APIs for us through inference, but that needs a lot of more love
to make it tangible.

  Luis

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
  2016-04-20 18:42 ` [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
@ 2016-04-21  6:48   ` Joonas Lahtinen
  2016-04-21  6:58     ` Chris Wilson
  2016-04-21  7:01     ` Chris Wilson
  0 siblings, 2 replies; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  6:48 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Mika Kuoppala

On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> The code to switch_mm() is already handled by i915_switch_context(), the
> only difference required to setup the aliasing ppgtt is that we need to
> emit te switch_mm() on the first context, i.e. when transitioning from
> engine->last_context == NULL. This allows us to defer the
> initialisation of the GPU from early device initialisation to first use,
> which should marginally speed up both. The caveat is that we then defer
> the context initialisation until first use - i.e. we cannot assume that
> the GPU engines are initialised. For example, this means that power
> contexts for rc6 (Ironlake) need to explicitly loaded, as they are.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>

Some comments below.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index aafcb4942acf..14c9b29294c5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4910,34 +4910,6 @@ i915_gem_init_hw(struct drm_device *dev)
>  	if (ret)
>  		goto out;

This whole if (ret) goto out can be removed as out is right after the
removed code block.

>  
> -	/* Now it is safe to go back round and do everything else: */
> -	for_each_engine(engine, dev_priv) {
> -		struct drm_i915_gem_request *req;
> -
> -		req = i915_gem_request_alloc(engine, NULL);
> -		if (IS_ERR(req)) {
> -			ret = PTR_ERR(req);
> -			break;
> -		}
> -
> -		ret = i915_ppgtt_init_ring(req);
> -		if (ret)
> -			goto err_request;
> -
> -		ret = i915_gem_context_enable(req);
> -		if (ret)
> -			goto err_request;
> -
> -err_request:
> -		i915_add_request_no_flush(req);
> -		if (ret) {
> -			DRM_ERROR("Failed to enable %s, error=%d\n",
> -				  engine->name, ret);
> -			i915_gem_cleanup_engines(dev);
> -			break;
> -		}
> -	}
> -
>  out:
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>  	return ret;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index e5b3d74f8222..4d376f984a8c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -431,27 +431,6 @@ void i915_gem_context_fini(struct drm_device *dev)
>  	dev_priv->kernel_context = NULL;
>  }
> 
<SNIP>
>  
>  static bool
> -needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
> +needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
> +		  struct intel_engine_cs *engine,
> +		  struct intel_context *to)
>  {
> -	if (!to->ppgtt)
> +	if (ppgtt == NULL)

Code checker will scream and ask for !ppgtt. And I'm pretty sure we
should not keep flip-flopping between two styles. Also, you're not
doing != NULL either to check if pointer is not NULL, but just if
(foo), so I do not see why to avoid if (!foo).

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 14/19] drm/i915: Move context initialisation to first-use
  2016-04-20 18:42 ` [PATCH 14/19] drm/i915: Move context initialisation to first-use Chris Wilson
@ 2016-04-21  6:57   ` Joonas Lahtinen
  2016-04-21  7:08     ` Chris Wilson
  0 siblings, 1 reply; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  6:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> Instead of allocating a new request when allocating a context, use the
> request that initiated the allocation to emit the context
> initialisation. This serves two purposes, it makes the initialisation
> atomic with first use (simplifying scheduling and our own error
> handling). Secondly, it enables us to remove the explicit context
> allocation required by higher levels of GEM and make that property of
> execlists opaque (in the next patch). There is also a minor step
> forwards towards convergence of legacy/execlist contexts.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Nitpick below.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_drv.h  |  1 +
>  drivers/gpu/drm/i915/intel_lrc.c | 39 ++++++++++++++++++---------------------
>  2 files changed, 19 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2bf3a8f97d52..e4b510bcee62 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -868,6 +868,7 @@ struct intel_context {
>  		struct i915_vma *lrc_vma;
>  		u64 lrc_desc;
>  		uint32_t *lrc_reg_state;
> +		bool initialised;
>  	} engine[I915_NUM_ENGINES];
>  
>  	struct list_head link;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 838abd4b42a3..d765267153c5 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -672,9 +672,10 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
>  
>  int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>  {
> -	int ret = 0;
> +	struct intel_engine_cs *engine = request->engine;
> +	int ret;
>  
> -	request->ringbuf = request->ctx->engine[request->engine->id].ringbuf;
> +	request->ringbuf = request->ctx->engine[engine->id].ringbuf;
>  
>  	if (i915.enable_guc_submission) {
>  		/*
> @@ -689,7 +690,20 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
>  			return ret;
>  	}
>  
> -	return intel_lr_context_pin(request->ctx, request->engine);
> +	ret = intel_lr_context_pin(request->ctx, engine);
> +	if (ret)
> +		return ret;
> +
> +	if (!request->ctx->engine[engine->id].initialised) {
> +		ret = engine->init_context(request);
> +		if (ret) {
> +			intel_lr_context_unpin(request->ctx, engine);

I prefer the goto teardown path, it's easy to read and modify later on.

> +			return ret;
> +		}
> +		request->ctx->engine[engine->id].initialised = true;
> +	}
> +
> +	return 0;
>  }
>  
>  static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
> @@ -2634,25 +2648,8 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>  
>  	ctx->engine[engine->id].ringbuf = ringbuf;
>  	ctx->engine[engine->id].state = ctx_obj;
> +	ctx->engine[engine->id].initialised = engine->init_context == NULL;
>  
> -	if (ctx != ctx->i915->kernel_context && engine->init_context) {
> -		struct drm_i915_gem_request *req;
> -
> -		req = i915_gem_request_alloc(engine, ctx);
> -		if (IS_ERR(req)) {
> -			ret = PTR_ERR(req);
> -			DRM_ERROR("ring create req: %d\n", ret);
> -			goto error_ringbuf;
> -		}
> -
> -		ret = engine->init_context(req);
> -		i915_add_request_no_flush(req);
> -		if (ret) {
> -			DRM_ERROR("ring init context: %d\n",
> -				ret);
> -			goto error_ringbuf;
> -		}
> -	}
>  	return 0;
>  
>  error_ringbuf:
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
  2016-04-21  6:48   ` Joonas Lahtinen
@ 2016-04-21  6:58     ` Chris Wilson
  2016-04-21  7:01     ` Chris Wilson
  1 sibling, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  6:58 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx, Mika Kuoppala

On Thu, Apr 21, 2016 at 09:48:36AM +0300, Joonas Lahtinen wrote:
> On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> >  static bool
> > -needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
> > +needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
> > +		  struct intel_engine_cs *engine,
> > +		  struct intel_context *to)
> >  {
> > -	if (!to->ppgtt)
> > +	if (ppgtt == NULL)
> 
> Code checker will scream and ask for !ppgtt. And I'm pretty sure we
> should not keep flip-flopping between two styles. Also, you're not
> doing != NULL either to check if pointer is not NULL, but just if
> (foo), so I do not see why to avoid if (!foo).

I normally do do != NULL :(

Oh, well. I give in. Can I also stop writing NULL everywhere and just
use 0 in pointer contexts?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
  2016-04-21  6:48   ` Joonas Lahtinen
  2016-04-21  6:58     ` Chris Wilson
@ 2016-04-21  7:01     ` Chris Wilson
  2016-04-21  7:24       ` Chris Wilson
  2016-04-21  7:32       ` Joonas Lahtinen
  1 sibling, 2 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  7:01 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx, Mika Kuoppala

On Thu, Apr 21, 2016 at 09:48:36AM +0300, Joonas Lahtinen wrote:
> On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > The code to switch_mm() is already handled by i915_switch_context(), the
> > only difference required to setup the aliasing ppgtt is that we need to
> > emit te switch_mm() on the first context, i.e. when transitioning from
> > engine->last_context == NULL. This allows us to defer the
> > initialisation of the GPU from early device initialisation to first use,
> > which should marginally speed up both. The caveat is that we then defer
> > the context initialisation until first use - i.e. we cannot assume that
> > the GPU engines are initialised. For example, this means that power
> > contexts for rc6 (Ironlake) need to explicitly loaded, as they are.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> 
> Some comments below.
> 
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index aafcb4942acf..14c9b29294c5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4910,34 +4910,6 @@ i915_gem_init_hw(struct drm_device *dev)
> >  	if (ret)
> >  		goto out;
> 
> This whole if (ret) goto out can be removed as out is right after the
> removed code block.

Oh, this piece of code is earmarked to be removed and now I even have a
bugzilla for it!

https://bugs.freedesktop.org/show_bug.cgi?id=95023
is an example where we are doing the wrong thing with the current
i915_gem_init_seqno.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 17/19] drm/i915: Track the previous pinned context inside the request
  2016-04-20 18:42 ` [PATCH 17/19] drm/i915: Track the previous pinned context inside the request Chris Wilson
@ 2016-04-21  7:07   ` Joonas Lahtinen
  2016-04-21  7:22     ` Chris Wilson
  0 siblings, 1 reply; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  7:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> As the contexts are accessed by the hardware until the switch is completed
> to a new context, the hardware may still be writing to the context object
> after the breadcrumb is visible. We must not unpin/unbind/prune that
> object whilst still active and so we keep the previous context pinned until
> the following request. We can generalise the tracking we already do via
> the engine->last_context and move it to the request so that it works
> equally for execlists and GuC.
> 
> v2: Drop the execlists double pin as that exposes a race inside the lrc
> irq handler as it tries to access the context after it may be retired.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

Below comments addressed,

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_drv.h  | 11 +++++++++++
>  drivers/gpu/drm/i915/i915_gem.c  |  8 ++++----
>  drivers/gpu/drm/i915/intel_lrc.c | 14 ++++++++------
>  3 files changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a26a026ef8e2..515b8badce61 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2302,6 +2302,17 @@ struct drm_i915_gem_request {
>  	struct intel_context *ctx;
>  	struct intel_ringbuffer *ringbuf;
>  
> +	/**
> +	 * Context related to the previous request.
> +	 * As the contexts are accessed by the hardware until the switch is
> +	 * completed to a new context, the hardware may still be writing
> +	 * to the context object after the breadcrumb is visible. We must
> +	 * not unpin/unbind/prune that object whilst still active and so
> +	 * we keep the previous context pinned until the following (this)
> +	 * request is retired.
> +	 */
> +	struct intel_context *previous_context;
> +
>  	/** Batch buffer related to this request if any (used for
>  	    error state dump only) */
>  	struct drm_i915_gem_object *batch_obj;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index eaf1d13c943f..dbfc38f91f7d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1413,13 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>  	list_del_init(&request->list);
>  	i915_gem_request_remove_from_client(request);
>  
> -	if (request->ctx) {
> +	if (request->previous_context) {

Could be if (i915.enable_execlists && request->previous_context) now.

>  		if (i915.enable_execlists)
> -			intel_lr_context_unpin(request->ctx, request->engine);
> -
> -		i915_gem_context_unreference(request->ctx);
> +			intel_lr_context_unpin(request->previous_context,
> +					       request->engine);
>  	}
>  
> +	i915_gem_context_unreference(request->ctx);

Obviously some previous patch changed it so that request->ctx is never
NULL at this point, as no more testing is done.

>  	i915_gem_request_unreference(request);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 8a544e2286f9..67c369ae649b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -788,12 +788,14 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
>  	if (intel_engine_stopped(engine))
>  		return 0;
>  
> -	if (engine->last_context != request->ctx) {
> -		if (engine->last_context)
> -			intel_lr_context_unpin(engine->last_context, engine);
> -		intel_lr_context_pin(request->ctx, engine);
> -		engine->last_context = request->ctx;
> -	}
> +	/* We keep the previous context alive until we retire the following
> +	 * request. This ensures that any the context object is still pinned
> +	 * for any residual writes the HW makes into it on the context switch
> +	 * into the next object following the breadcrumb. Otherwise, we may
> +	 * retire the context too early.
> +	 */
> +	request->previous_context = engine->last_context;
> +	engine->last_context = request->ctx;
>  
>  	if (dev_priv->guc.execbuf_client)
>  		i915_guc_submit(dev_priv->guc.execbuf_client, request);
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 14/19] drm/i915: Move context initialisation to first-use
  2016-04-21  6:57   ` Joonas Lahtinen
@ 2016-04-21  7:08     ` Chris Wilson
  2016-04-21  7:47       ` Joonas Lahtinen
  0 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  7:08 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Thu, Apr 21, 2016 at 09:57:03AM +0300, Joonas Lahtinen wrote:
> On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > +	if (!request->ctx->engine[engine->id].initialised) {
> > +		ret = engine->init_context(request);
> > +		if (ret) {
> > +			intel_lr_context_unpin(request->ctx, engine);
> 
> I prefer the goto teardown path, it's easy to read and modify later on.

Ah, that would lead to bugs here. After we emit init_context on this
request, the request must run to completion as the request itself tracks
modification to global data, e.g. the ctx->initialised flag here and the
golden render state object's liveness tracking.

Well that deserves a comment!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
  2016-04-20 18:42 ` [PATCH 05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
@ 2016-04-21  7:19   ` Joonas Lahtinen
  0 siblings, 0 replies; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  7:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> Similarly to i915_gem_object_pin_map on LLC platforms, we can
> use the new VMA based io mapping on !LLC to amoritize the cost
> of ringbuffer pinning and unpinning.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 245386e20c52..27ba15acae1e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2084,20 +2084,23 @@ static int init_phys_status_page(struct intel_engine_cs *engine)
>  
>  void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>  {
> +	GEM_BUG_ON(ringbuf->vma == NULL);
> +	GEM_BUG_ON(ringbuf->virtual_start == NULL);
> +
>  	if (HAS_LLC(ringbuf->obj->base.dev) && !ringbuf->obj->stolen)
>  		i915_gem_object_unpin_map(ringbuf->obj);
>  	else
> -		iounmap(ringbuf->virtual_start);
> +		i915_vma_unpin_iomap(ringbuf->vma);
>  	ringbuf->virtual_start = NULL;
> -	ringbuf->vma = NULL;
> +
>  	i915_gem_object_ggtt_unpin(ringbuf->obj);
> +	ringbuf->vma = NULL;
>  }
>  
>  int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  				     struct intel_ringbuffer *ringbuf)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct i915_ggtt *ggtt = &dev_priv->ggtt;
>  	struct drm_i915_gem_object *obj = ringbuf->obj;
>  	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
>  	unsigned flags = PIN_OFFSET_BIAS | 4096;
> @@ -2131,10 +2134,9 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  		/* Access through the GTT requires the device to be awake. */
>  		assert_rpm_wakelock_held(dev_priv);
>  
> -		addr = ioremap_wc(ggtt->mappable_base +
> -				  i915_gem_obj_ggtt_offset(obj), ringbuf->size);
> -		if (addr == NULL) {
> -			ret = -ENOMEM;
> +		addr = i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj));
> +		if (IS_ERR(addr)) {
> +			ret = PTR_ERR(addr);
>  			goto err_unpin;
>  		}
>  	}
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 17/19] drm/i915: Track the previous pinned context inside the request
  2016-04-21  7:07   ` Joonas Lahtinen
@ 2016-04-21  7:22     ` Chris Wilson
  2016-04-21  7:35       ` Joonas Lahtinen
  0 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  7:22 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Thu, Apr 21, 2016 at 10:07:50AM +0300, Joonas Lahtinen wrote:
> On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > As the contexts are accessed by the hardware until the switch is completed
> > to a new context, the hardware may still be writing to the context object
> > after the breadcrumb is visible. We must not unpin/unbind/prune that
> > object whilst still active and so we keep the previous context pinned until
> > the following request. We can generalise the tracking we already do via
> > the engine->last_context and move it to the request so that it works
> > equally for execlists and GuC.
> > 
> > v2: Drop the execlists double pin as that exposes a race inside the lrc
> > irq handler as it tries to access the context after it may be retired.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> 
> Below comments addressed,
> 
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h  | 11 +++++++++++
> >  drivers/gpu/drm/i915/i915_gem.c  |  8 ++++----
> >  drivers/gpu/drm/i915/intel_lrc.c | 14 ++++++++------
> >  3 files changed, 23 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index a26a026ef8e2..515b8badce61 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2302,6 +2302,17 @@ struct drm_i915_gem_request {
> >  	struct intel_context *ctx;
> >  	struct intel_ringbuffer *ringbuf;
> >  
> > +	/**
> > +	 * Context related to the previous request.
> > +	 * As the contexts are accessed by the hardware until the switch is
> > +	 * completed to a new context, the hardware may still be writing
> > +	 * to the context object after the breadcrumb is visible. We must
> > +	 * not unpin/unbind/prune that object whilst still active and so
> > +	 * we keep the previous context pinned until the following (this)
> > +	 * request is retired.
> > +	 */
> > +	struct intel_context *previous_context;
> > +
> >  	/** Batch buffer related to this request if any (used for
> >  	    error state dump only) */
> >  	struct drm_i915_gem_object *batch_obj;
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index eaf1d13c943f..dbfc38f91f7d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1413,13 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
> >  	list_del_init(&request->list);
> >  	i915_gem_request_remove_from_client(request);
> >  
> > -	if (request->ctx) {
> > +	if (request->previous_context) {
> 
> Could be if (i915.enable_execlists && request->previous_context) now.

I didn't do this because I intend (been trying!) to remove the execlists
special case. So this will be

if (request->previous_context)
	request->engine->unpin_context(request->engine,
				       request->previous_context);

Though the plan I actually have is to move the context over to activity
tracking so that it finally behaves like legacy. And one step closer to
removing the explicit ordering requirement between contexts.

> >  		if (i915.enable_execlists)
> > -			intel_lr_context_unpin(request->ctx, request->engine);
> > -
> > -		i915_gem_context_unreference(request->ctx);
> > +			intel_lr_context_unpin(request->previous_context,
> > +					       request->engine);
> >  	}
> >  
> > +	i915_gem_context_unreference(request->ctx);
> 
> Obviously some previous patch changed it so that request->ctx is never
> NULL at this point, as no more testing is done.

Since requests.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
  2016-04-21  7:01     ` Chris Wilson
@ 2016-04-21  7:24       ` Chris Wilson
  2016-04-21  7:32       ` Joonas Lahtinen
  1 sibling, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  7:24 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx, Tvrtko Ursulin, Mika Kuoppala, Mika Kuoppala

On Thu, Apr 21, 2016 at 08:01:37AM +0100, Chris Wilson wrote:
> Oh, this piece of code is earmarked to be removed and now I even have a
> bugzilla for it!
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=95023
> is an example where we are doing the wrong thing with the current
> i915_gem_init_seqno.

Too excited this morning. Thought I could explain some of the
discrepancy in the error state by convincing myself we currently didn't
reset all register state for seqno on init.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
  2016-04-21  7:01     ` Chris Wilson
  2016-04-21  7:24       ` Chris Wilson
@ 2016-04-21  7:32       ` Joonas Lahtinen
  1 sibling, 0 replies; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  7:32 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Mika Kuoppala

On to, 2016-04-21 at 08:01 +0100, Chris Wilson wrote:
> On Thu, Apr 21, 2016 at 09:48:36AM +0300, Joonas Lahtinen wrote:
> > 
> > On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > > 
> > > The code to switch_mm() is already handled by i915_switch_context(), the
> > > only difference required to setup the aliasing ppgtt is that we need to
> > > emit te switch_mm() on the first context, i.e. when transitioning from
> > > engine->last_context == NULL. This allows us to defer the
> > > initialisation of the GPU from early device initialisation to first use,
> > > which should marginally speed up both. The caveat is that we then defer
> > > the context initialisation until first use - i.e. we cannot assume that
> > > the GPU engines are initialised. For example, this means that power
> > > contexts for rc6 (Ironlake) need to explicitly loaded, as they are.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Some comments below.
> > 
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > 
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index aafcb4942acf..14c9b29294c5 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -4910,34 +4910,6 @@ i915_gem_init_hw(struct drm_device *dev)
> > >  	if (ret)
> > >  		goto out;
> > This whole if (ret) goto out can be removed as out is right after the
> > removed code block.
> Oh, this piece of code is earmarked to be removed and now I even have a
> bugzilla for it!
> 

Huh? I was commeting that you just literally removed everything between
"goto out;" and "out:" with this commit. So you can just drop the whole
if (ret) goto out;

> https://bugs.freedesktop.org/show_bug.cgi?id=95023
> is an example where we are doing the wrong thing with the current
> i915_gem_init_seqno.
> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 17/19] drm/i915: Track the previous pinned context inside the request
  2016-04-21  7:22     ` Chris Wilson
@ 2016-04-21  7:35       ` Joonas Lahtinen
  0 siblings, 0 replies; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  7:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On to, 2016-04-21 at 08:22 +0100, Chris Wilson wrote:
> On Thu, Apr 21, 2016 at 10:07:50AM +0300, Joonas Lahtinen wrote:
> > 
> > On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > > 
> > > As the contexts are accessed by the hardware until the switch is completed
> > > to a new context, the hardware may still be writing to the context object
> > > after the breadcrumb is visible. We must not unpin/unbind/prune that
> > > object whilst still active and so we keep the previous context pinned until
> > > the following request. We can generalise the tracking we already do via
> > > the engine->last_context and move it to the request so that it works
> > > equally for execlists and GuC.
> > > 
> > > v2: Drop the execlists double pin as that exposes a race inside the lrc
> > > irq handler as it tries to access the context after it may be retired.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > Below comments addressed,
> > 
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > 
<SNIP>
> > > index eaf1d13c943f..dbfc38f91f7d 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1413,13 +1413,13 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
> > >  	list_del_init(&request->list);
> > >  	i915_gem_request_remove_from_client(request);
> > >  
> > > -	if (request->ctx) {
> > > +	if (request->previous_context) {
> > Could be if (i915.enable_execlists && request->previous_context) now.
> I didn't do this because I intend (been trying!) to remove the execlists
> special case. So this will be
> 
> if (request->previous_context)
> 	request->engine->unpin_context(request->engine,
> 				       request->previous_context);
> 
> Though the plan I actually have is to move the context over to activity
> tracking so that it finally behaves like legacy. And one step closer to
> removing the explicit ordering requirement between contexts.
> 
> > 
> > > 
> > >  		if (i915.enable_execlists)
> > > -			intel_lr_context_unpin(request->ctx, request->engine);
> > > -
> > > -		i915_gem_context_unreference(request->ctx);
> > > +			intel_lr_context_unpin(request->previous_context,
> > > +					       request->engine);
> > >  	}
> > >  
> > > +	i915_gem_context_unreference(request->ctx);
> > Obviously some previous patch changed it so that request->ctx is never
> > NULL at this point, as no more testing is done.
> Since requests.

Might be worthy a note in the commit message. "Remove redundand check
for request->ctx being NULL."

Regards, Joonas

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 14/19] drm/i915: Move context initialisation to first-use
  2016-04-21  7:08     ` Chris Wilson
@ 2016-04-21  7:47       ` Joonas Lahtinen
  2016-04-21  7:56         ` Chris Wilson
  0 siblings, 1 reply; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  7:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On to, 2016-04-21 at 08:08 +0100, Chris Wilson wrote:
> On Thu, Apr 21, 2016 at 09:57:03AM +0300, Joonas Lahtinen wrote:
> > 
> > On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > > 
> > > +	if (!request->ctx->engine[engine->id].initialised) {
> > > +		ret = engine->init_context(request);
> > > +		if (ret) {
> > > +			intel_lr_context_unpin(request->ctx, engine);
> > I prefer the goto teardown path, it's easy to read and modify later on.

Meant something like this which is functionally the same but less
nesting;

	if (request->ctx->engine[engine->id].initialised)
		return 0;

	ret = engine->init_context(request);
	if (ret)
		goto out_unpin;

	request->ctx->engine[engine->id].initialised = true;

	return 0;

out_unpin:
	intel_lr_context_unpin(request->ctx, engine);

	return ret;
}

> Ah, that would lead to bugs here. After we emit init_context on this
> request, the request must run to completion as the request itself tracks
> modification to global data, e.g. the ctx->initialised flag here and the
> golden render state object's liveness tracking.
> 
> Well that deserves a comment!
> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 14/19] drm/i915: Move context initialisation to first-use
  2016-04-21  7:47       ` Joonas Lahtinen
@ 2016-04-21  7:56         ` Chris Wilson
  2016-04-21  8:37           ` Joonas Lahtinen
  0 siblings, 1 reply; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  7:56 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Thu, Apr 21, 2016 at 10:47:30AM +0300, Joonas Lahtinen wrote:
> On to, 2016-04-21 at 08:08 +0100, Chris Wilson wrote:
> > On Thu, Apr 21, 2016 at 09:57:03AM +0300, Joonas Lahtinen wrote:
> > > 
> > > On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > > > 
> > > > +	if (!request->ctx->engine[engine->id].initialised) {
> > > > +		ret = engine->init_context(request);
> > > > +		if (ret) {
> > > > +			intel_lr_context_unpin(request->ctx, engine);
> > > I prefer the goto teardown path, it's easy to read and modify later on.
> 
> Meant something like this which is functionally the same but less
> nesting;
> 
> 	if (request->ctx->engine[engine->id].initialised)
> 		return 0;
> 
> 	ret = engine->init_context(request);
> 	if (ret)
> 		goto out_unpin;
> 
> 	request->ctx->engine[engine->id].initialised = true;
> 
> 	return 0;
> 
> out_unpin:
> 	intel_lr_context_unpin(request->ctx, engine);
> 
> 	return ret;

Yes, I am arguing that this leaves the next person thinking it will be
safe to extend the unwind. And so by not conforming to the idiom, it
should be more of a danger signal.

Compromise?

        if (!request->ctx->engine[engine->id].initialised) {
                ret = engine->init_context(request);
                if (ret)
                        goto err_unpin;

                request->ctx->engine[engine->id].initialised = true;
        }

        /* Note that after this point, we have committed to using
         * this request as it is being used to both track the
         * state of engine initialisation and liveness of the
         * golden renderstate above. Think twice before you try
         * to cancel/unwind this request now.
         */

        return 0;

err_unpin:
        intel_lr_context_unpin(request->ctx, engine);
        return ret;
}


-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 18/19] drm/i915: Store LRC hardware id in the request
  2016-04-20 18:42 ` [PATCH 18/19] drm/i915: Store LRC hardware id in " Chris Wilson
@ 2016-04-21  7:58   ` Tvrtko Ursulin
  2016-04-21  9:02     ` Chris Wilson
  0 siblings, 1 reply; 52+ messages in thread
From: Tvrtko Ursulin @ 2016-04-21  7:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 20/04/16 19:42, Chris Wilson wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> This way in the following patch we can disconnect requests
> from contexts.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Surely I didn't review my own patch? :)

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_drv.h  | 2 ++
>   drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>   2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 515b8badce61..0efbe6c4634f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2349,6 +2349,8 @@ struct drm_i915_gem_request {
>   	/** Execlists no. of times this request has been sent to the ELSP */
>   	int elsp_submitted;
>
> +	/** Execlists context hardware id. */
> +	unsigned ctx_hw_id;
>   };
>
>   struct drm_i915_gem_request * __must_check
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 67c369ae649b..833d8fd3343f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -477,7 +477,7 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
>   	if (!head_req)
>   		return 0;
>
> -	if (unlikely(head_req->ctx->hw_id != request_id))
> +	if (unlikely(head_req->ctx_hw_id != request_id))
>   		return 0;
>
>   	WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
> @@ -615,6 +615,7 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
>   	}
>
>   	list_add_tail(&request->execlist_link, &engine->execlist_queue);
> +	request->ctx_hw_id = request->ctx->hw_id;
>   	if (num_elements == 0)
>   		execlists_context_unqueue(engine);
>
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 14/19] drm/i915: Move context initialisation to first-use
  2016-04-21  7:56         ` Chris Wilson
@ 2016-04-21  8:37           ` Joonas Lahtinen
  0 siblings, 0 replies; 52+ messages in thread
From: Joonas Lahtinen @ 2016-04-21  8:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On to, 2016-04-21 at 08:56 +0100, Chris Wilson wrote:
> On Thu, Apr 21, 2016 at 10:47:30AM +0300, Joonas Lahtinen wrote:
> > 
> > On to, 2016-04-21 at 08:08 +0100, Chris Wilson wrote:
> > > 
> > > On Thu, Apr 21, 2016 at 09:57:03AM +0300, Joonas Lahtinen wrote:
> > > > 
> > > > 
> > > > On ke, 2016-04-20 at 19:42 +0100, Chris Wilson wrote:
> > > > > 
> > > > > 
> > > > > +	if (!request->ctx->engine[engine->id].initialised) {
> > > > > +		ret = engine->init_context(request);
> > > > > +		if (ret) {
> > > > > +			intel_lr_context_unpin(request->ctx, engine);
> > > > I prefer the goto teardown path, it's easy to read and modify later on.
> > Meant something like this which is functionally the same but less
> > nesting;
> > 
> > 	if (request->ctx->engine[engine->id].initialised)
> > 		return 0;
> > 
> > 	ret = engine->init_context(request);
> > 	if (ret)
> > 		goto out_unpin;
> > 
> > 	request->ctx->engine[engine->id].initialised = true;
> > 
> > 	return 0;
> > 
> > out_unpin:
> > 	intel_lr_context_unpin(request->ctx, engine);
> > 
> > 	return ret;
> Yes, I am arguing that this leaves the next person thinking it will be
> safe to extend the unwind. And so by not conforming to the idiom, it
> should be more of a danger signal.
> 
> Compromise?

Fair enough.

> 
>         if (!request->ctx->engine[engine->id].initialised) {
>                 ret = engine->init_context(request);
>                 if (ret)
>                         goto err_unpin;
> 
>                 request->ctx->engine[engine->id].initialised = true;
>         }
> 
>         /* Note that after this point, we have committed to using
>          * this request as it is being used to both track the
>          * state of engine initialisation and liveness of the
>          * golden renderstate above. Think twice before you try
>          * to cancel/unwind this request now.
>          */
> 
>         return 0;
> 
> err_unpin:
>         intel_lr_context_unpin(request->ctx, engine);
>         return ret;
> }
> 
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 18/19] drm/i915: Store LRC hardware id in the request
  2016-04-21  7:58   ` Tvrtko Ursulin
@ 2016-04-21  9:02     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  9:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Thu, Apr 21, 2016 at 08:58:20AM +0100, Tvrtko Ursulin wrote:
> 
> On 20/04/16 19:42, Chris Wilson wrote:
> >From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> >This way in the following patch we can disconnect requests
> >from contexts.
> >
> >Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Surely I didn't review my own patch? :)

I hope you at least read it once!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (18 preceding siblings ...)
  2016-04-20 18:42 ` [PATCH 19/19] drm/i915: Stop tracking execlists retired requests Chris Wilson
@ 2016-04-21 11:27 ` Patchwork
  2016-04-21 12:05 ` ✗ Fi.CI.BAT: failure " Patchwork
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 52+ messages in thread
From: Patchwork @ 2016-04-21 11:27 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
URL   : https://patchwork.freedesktop.org/series/6017/
State : success

== Summary ==

Series 6017v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/6017/revisions/1/mbox/


bdw-nuci7        total:194  pass:182  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:194  pass:169  dwarn:1   dfail:0   fail:1   skip:23 
bsw-nuc-2        total:193  pass:154  dwarn:0   dfail:0   fail:0   skip:39 
byt-nuc          total:193  pass:154  dwarn:0   dfail:0   fail:1   skip:38 
hsw-brixbox      total:194  pass:169  dwarn:1   dfail:0   fail:0   skip:24 
hsw-gt2          total:194  pass:175  dwarn:0   dfail:0   fail:0   skip:19 
ilk-hp8440p      total:194  pass:137  dwarn:0   dfail:0   fail:0   skip:57 
ivb-t430s        total:194  pass:166  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:194  pass:167  dwarn:1   dfail:0   fail:1   skip:25 
skl-nuci5        total:194  pass:182  dwarn:1   dfail:0   fail:0   skip:11 
snb-dellxps      total:43   pass:35   dwarn:0   dfail:0   fail:0   skip:7  

Results at /archive/results/CI_IGT_test/Patchwork_1968/

9dabb0053b63bc32ab6ad5d13209d1e43395313f drm-intel-nightly: 2016y-04m-21d-09h-27m-12s UTC integration manifest
e4106b1 drm/i915: Stop tracking execlists retired requests
30f6ba3 drm/i915: Store LRC hardware id in the request
91e51f7 drm/i915: Track the previous pinned context inside the request
3990af2 drm/i915: Move releasing of the GEM request from free to retire/cancel
360c4ec drm/i915: Move the magical deferred context allocation into the request
b8dbd2d drm/i915: Move context initialisation to first-use
9f8b875 drm/i915: Refactor execlists default context pinning
8bcbfbb drm/i915: Replace the pinned context address with its unique ID
a8edba0 drm/i915: Assign every HW context a unique ID
2d74d18 drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
973eca0 drm/i915: Remove early l3-remap
0706faa drm/i915: Consolidate L3 remapping LRI
159dfbc drm/i915: L3 cache remapping is part of context switching
1c086a2 drm/i915: Mark the current context as lost on suspend
73c4f54 drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
b087003 drm/i915: Move ioremap_wc tracking onto VMA
33d9834 drm/i915: Introduce i915_vm_to_ggtt()
b9e931c io-mapping: Specify mapping size for io_mapping_map_wc()
bb9e0f2 drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (19 preceding siblings ...)
  2016-04-21 11:27 ` ✓ Fi.CI.BAT: success for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Patchwork
@ 2016-04-21 12:05 ` Patchwork
  2016-04-23 10:53 ` ✗ Fi.CI.BAT: warning " Patchwork
  2016-04-25  6:51 ` Patchwork
  22 siblings, 0 replies; 52+ messages in thread
From: Patchwork @ 2016-04-21 12:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
URL   : https://patchwork.freedesktop.org/series/6017/
State : failure

== Summary ==

Series 6017v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/6017/revisions/1/mbox/

Test drv_module_reload_basic:
                pass       -> DMESG-WARN (skl-nuci5)
                pass       -> DMESG-WARN (hsw-brixbox)
                pass       -> DMESG-WARN (bdw-ultra)
                pass       -> DMESG-WARN (skl-i7k-2)
Test gem_busy:
        Subgroup basic-blt:
                skip       -> PASS       (bsw-nuc-2)
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> FAIL       (byt-nuc)

bdw-nuci7        total:194  pass:182  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:194  pass:169  dwarn:1   dfail:0   fail:1   skip:23 
bsw-nuc-2        total:193  pass:154  dwarn:0   dfail:0   fail:0   skip:39 
byt-nuc          total:193  pass:154  dwarn:0   dfail:0   fail:1   skip:38 
hsw-brixbox      total:194  pass:169  dwarn:1   dfail:0   fail:0   skip:24 
hsw-gt2          total:194  pass:175  dwarn:0   dfail:0   fail:0   skip:19 
ilk-hp8440p      total:194  pass:137  dwarn:0   dfail:0   fail:0   skip:57 
ivb-t430s        total:194  pass:166  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:194  pass:167  dwarn:1   dfail:0   fail:1   skip:25 
skl-nuci5        total:194  pass:182  dwarn:1   dfail:0   fail:0   skip:11 
snb-dellxps      total:43   pass:35   dwarn:0   dfail:0   fail:0   skip:7  

Results at /archive/results/CI_IGT_test/Patchwork_1968/

9dabb0053b63bc32ab6ad5d13209d1e43395313f drm-intel-nightly: 2016y-04m-21d-09h-27m-12s UTC integration manifest
e4106b1 drm/i915: Stop tracking execlists retired requests
30f6ba3 drm/i915: Store LRC hardware id in the request
91e51f7 drm/i915: Track the previous pinned context inside the request
3990af2 drm/i915: Move releasing of the GEM request from free to retire/cancel
360c4ec drm/i915: Move the magical deferred context allocation into the request
b8dbd2d drm/i915: Move context initialisation to first-use
9f8b875 drm/i915: Refactor execlists default context pinning
8bcbfbb drm/i915: Replace the pinned context address with its unique ID
a8edba0 drm/i915: Assign every HW context a unique ID
2d74d18 drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
973eca0 drm/i915: Remove early l3-remap
0706faa drm/i915: Consolidate L3 remapping LRI
159dfbc drm/i915: L3 cache remapping is part of context switching
1c086a2 drm/i915: Mark the current context as lost on suspend
73c4f54 drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
b087003 drm/i915: Move ioremap_wc tracking onto VMA
33d9834 drm/i915: Introduce i915_vm_to_ggtt()
b9e931c io-mapping: Specify mapping size for io_mapping_map_wc()
bb9e0f2 drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* ✗ Fi.CI.BAT: warning for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (20 preceding siblings ...)
  2016-04-21 12:05 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2016-04-23 10:53 ` Patchwork
  2016-04-25  6:51 ` Patchwork
  22 siblings, 0 replies; 52+ messages in thread
From: Patchwork @ 2016-04-23 10:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
URL   : https://patchwork.freedesktop.org/series/6017/
State : warning

== Summary ==

Series 6017v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/6017/revisions/1/mbox/

Test drv_module_reload_basic:
                pass       -> DMESG-WARN (skl-nuci5)
                pass       -> DMESG-WARN (skl-i7k-2)
Test kms_force_connector_basic:
        Subgroup force-connector-state:
                pass       -> SKIP       (snb-x220t)
                pass       -> SKIP       (ilk-hp8440p)
        Subgroup force-load-detect:
                pass       -> SKIP       (ilk-hp8440p)

bdw-nuci7        total:193  pass:181  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:193  pass:170  dwarn:0   dfail:0   fail:0   skip:23 
bsw-nuc-2        total:192  pass:153  dwarn:0   dfail:0   fail:0   skip:39 
byt-nuc          total:192  pass:154  dwarn:0   dfail:0   fail:0   skip:38 
ilk-hp8440p      total:193  pass:134  dwarn:0   dfail:0   fail:0   skip:59 
ivb-t430s        total:193  pass:165  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:193  pass:167  dwarn:1   dfail:0   fail:0   skip:25 
skl-nuci5        total:193  pass:181  dwarn:1   dfail:0   fail:0   skip:11 
snb-dellxps      total:193  pass:155  dwarn:0   dfail:0   fail:0   skip:38 
snb-x220t        total:193  pass:154  dwarn:0   dfail:0   fail:1   skip:38 

Results at /archive/results/CI_IGT_test/Patchwork_2011/

340c485ad98d0ec0369a3b18d4a09938f3f5537d drm-intel-nightly: 2016y-04m-22d-17h-32m-25s UTC integration manifest
ae6845f drm/i915: Stop tracking execlists retired requests
5c84802 drm/i915: Store LRC hardware id in the request
3723061 drm/i915: Track the previous pinned context inside the request
635c06e drm/i915: Move releasing of the GEM request from free to retire/cancel
8515ecd drm/i915: Move the magical deferred context allocation into the request
50ba33c drm/i915: Move context initialisation to first-use
804e8d3 drm/i915: Refactor execlists default context pinning
1de02b6 drm/i915: Replace the pinned context address with its unique ID
28c6fd0 drm/i915: Assign every HW context a unique ID
da8c2b8 drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
67e55ed drm/i915: Remove early l3-remap
bfbecf5 drm/i915: Consolidate L3 remapping LRI
2510aa7 drm/i915: L3 cache remapping is part of context switching
8c547fe drm/i915: Mark the current context as lost on suspend
1909bdb drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
2870bb5 drm/i915: Move ioremap_wc tracking onto VMA
80b6012 drm/i915: Introduce i915_vm_to_ggtt()
dbdaaeb io-mapping: Specify mapping size for io_mapping_map_wc()
7282438 drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* ✗ Fi.CI.BAT: warning for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
  2016-04-20 18:42 Premature unpinning at last Chris Wilson
                   ` (21 preceding siblings ...)
  2016-04-23 10:53 ` ✗ Fi.CI.BAT: warning " Patchwork
@ 2016-04-25  6:51 ` Patchwork
  22 siblings, 0 replies; 52+ messages in thread
From: Patchwork @ 2016-04-25  6:51 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr
URL   : https://patchwork.freedesktop.org/series/6017/
State : warning

== Summary ==

Series 6017v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/6017/revisions/1/mbox/

Test drv_module_reload_basic:
                pass       -> DMESG-WARN (snb-dellxps)

byt-nuc          total:192  pass:154  dwarn:0   dfail:0   fail:0   skip:38 
ilk-hp8440p      total:193  pass:136  dwarn:0   dfail:0   fail:0   skip:57 
ivb-t430s        total:193  pass:165  dwarn:0   dfail:0   fail:0   skip:28 
snb-dellxps      total:193  pass:154  dwarn:1   dfail:0   fail:0   skip:38 

Results at /archive/results/CI_IGT_test/Patchwork_2053/

1e81bacf1f7fdbdf83f46b55389713fa13cb1256 drm-intel-nightly: 2016y-04m-24d-10h-36m-11s UTC integration manifest
0cb9d63 drm/i915: Stop tracking execlists retired requests
a439651 drm/i915: Store LRC hardware id in the request
f87b3e6 drm/i915: Track the previous pinned context inside the request
63e2b33 drm/i915: Move releasing of the GEM request from free to retire/cancel
00fac5d drm/i915: Move the magical deferred context allocation into the request
7384fc6 drm/i915: Move context initialisation to first-use
d5ddbe9 drm/i915: Refactor execlists default context pinning
693817f drm/i915: Replace the pinned context address with its unique ID
6500526 drm/i915: Assign every HW context a unique ID
fb9d7b2 drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
ee3c163 drm/i915: Remove early l3-remap
d6edd4f drm/i915: Consolidate L3 remapping LRI
5c5f311 drm/i915: L3 cache remapping is part of context switching
efe4dc3 drm/i915: Mark the current context as lost on suspend
cabc51b drm/i915: Use i915_vma_pin_iomap on the ringbuffer object
3333a3c drm/i915: Move ioremap_wc tracking onto VMA
b080b56 drm/i915: Introduce i915_vm_to_ggtt()
a13aaeb io-mapping: Specify mapping size for io_mapping_map_wc()
6d95901 drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 19/19] drm/i915: Stop tracking execlists retired requests
  2016-04-21 14:57 Final CI pass for premature Chris Wilson
@ 2016-04-21 14:57 ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-21 14:57 UTC (permalink / raw)
  To: intel-gfx

From: Tvrtko Ursulin <tvrtko@ursulin.net>

With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of execlist retired queue entirely.

This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
request on the execlists retired queue.

The latter was a source of significant scalability issues in
the driver causing performance hits on some tests. Most
dramatical of which was igt/gem_close_race which had run time
in tens of minutes which is now reduced to tens of seconds.

Signed-off-by: Tvrtko Ursulin <tvrtko@ursulin.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         | 10 +--------
 drivers/gpu/drm/i915/intel_lrc.c        | 39 ++++++++++++---------------------
 drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
 4 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d16680b77e13..a88aa7f11a9c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2876,13 +2876,7 @@ static void i915_gem_reset_engine_cleanup(struct drm_i915_private *dev_priv,
 		/* Ensure irq handler finishes or is cancelled. */
 		tasklet_kill(&engine->irq_tasklet);
 
-		spin_lock_bh(&engine->execlist_lock);
-		/* list_splice_tail_init checks for empty lists */
-		list_splice_tail_init(&engine->execlist_queue,
-				      &engine->execlist_retired_req_list);
-		spin_unlock_bh(&engine->execlist_lock);
-
-		intel_execlists_retire_requests(engine);
+		intel_execlists_cancel_requests(engine);
 	}
 
 	/*
@@ -3006,8 +3000,6 @@ i915_gem_retire_requests(struct drm_device *dev)
 			spin_lock_bh(&engine->execlist_lock);
 			idle &= list_empty(&engine->execlist_queue);
 			spin_unlock_bh(&engine->execlist_lock);
-
-			intel_execlists_retire_requests(engine);
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0c46a5cf4be0..b6f12cb4dba9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -431,8 +431,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
 			/* Same ctx: ignore first request, as second request
 			 * will update tail past first request's workload */
 			cursor->elsp_submitted = req0->elsp_submitted;
-			list_move_tail(&req0->execlist_link,
-				       &engine->execlist_retired_req_list);
+			list_del(&req0->execlist_link);
+			i915_gem_request_unreference(req0);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -464,7 +464,7 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
 }
 
 static unsigned int
-execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
+execlists_check_remove_request(struct intel_engine_cs *engine, u32 ctx_id)
 {
 	struct drm_i915_gem_request *head_req;
 
@@ -474,19 +474,16 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
 					    struct drm_i915_gem_request,
 					    execlist_link);
 
-	if (!head_req)
-		return 0;
-
-	if (unlikely(head_req->ctx_hw_id != request_id))
-		return 0;
+	if (WARN_ON(!head_req || (head_req->ctx_hw_id != ctx_id)))
+               return 0;
 
 	WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
 
 	if (--head_req->elsp_submitted > 0)
 		return 0;
 
-	list_move_tail(&head_req->execlist_link,
-		       &engine->execlist_retired_req_list);
+	list_del(&head_req->execlist_link);
+	i915_gem_request_unreference(head_req);
 
 	return 1;
 }
@@ -590,9 +587,6 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	intel_lr_context_pin(request->ctx, request->engine);
-	i915_gem_request_reference(request);
-
 	spin_lock_bh(&engine->execlist_lock);
 
 	list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
@@ -609,11 +603,12 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 		if (request->ctx == tail_req->ctx) {
 			WARN(tail_req->elsp_submitted != 0,
 				"More than 2 already-submitted reqs queued\n");
-			list_move_tail(&tail_req->execlist_link,
-				       &engine->execlist_retired_req_list);
+			list_del(&tail_req->execlist_link);
+			i915_gem_request_unreference(tail_req);
 		}
 	}
 
+	i915_gem_request_reference(request);
 	list_add_tail(&request->execlist_link, &engine->execlist_queue);
 	request->ctx_hw_id = request->ctx->hw_id;
 	if (num_elements == 0)
@@ -1011,23 +1006,18 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	return 0;
 }
 
-void intel_execlists_retire_requests(struct intel_engine_cs *engine)
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *req, *tmp;
-	struct list_head retired_list;
+	LIST_HEAD(cancel_list);
 
 	WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
-	if (list_empty(&engine->execlist_retired_req_list))
-		return;
 
-	INIT_LIST_HEAD(&retired_list);
 	spin_lock_bh(&engine->execlist_lock);
-	list_replace_init(&engine->execlist_retired_req_list, &retired_list);
+	list_replace_init(&engine->execlist_queue, &cancel_list);
 	spin_unlock_bh(&engine->execlist_lock);
 
-	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
-		intel_lr_context_unpin(req->ctx, engine);
-
+	list_for_each_entry_safe(req, tmp, &cancel_list, execlist_link) {
 		list_del(&req->execlist_link);
 		i915_gem_request_unreference(req);
 	}
@@ -2119,7 +2109,6 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
 
 	INIT_LIST_HEAD(&engine->buffers);
 	INIT_LIST_HEAD(&engine->execlist_queue);
-	INIT_LIST_HEAD(&engine->execlist_retired_req_list);
 	spin_lock_init(&engine->execlist_lock);
 
 	tasklet_init(&engine->irq_tasklet,
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 8bea937973f6..4b1c896b5019 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -119,6 +119,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
 			       struct list_head *vmas);
 
-void intel_execlists_retire_requests(struct intel_engine_cs *engine);
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2ade194bbea9..527549dbeb3c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -269,7 +269,6 @@ struct  intel_engine_cs {
 	struct tasklet_struct irq_tasklet;
 	spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */
 	struct list_head execlist_queue;
-	struct list_head execlist_retired_req_list;
 	unsigned int fw_domains;
 	unsigned int next_context_status_buffer;
 	unsigned int idle_lite_restore_wa;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 19/19] drm/i915: Stop tracking execlists retired requests
  2016-04-21  8:58 [PATCH 01/19] " Chris Wilson
@ 2016-04-21  8:59 ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2016-04-21  8:59 UTC (permalink / raw)
  To: intel-gfx

From: Tvrtko Ursulin <tvrtko@ursulin.net>

With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of execlist retired queue entirely.

This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
request on the execlists retired queue.

The latter was a source of significant scalability issues in
the driver causing performance hits on some tests. Most
dramatical of which was igt/gem_close_race which had run time
in tens of minutes which is now reduced to tens of seconds.

Signed-off-by: Tvrtko Ursulin <tvrtko@ursulin.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         | 10 +--------
 drivers/gpu/drm/i915/intel_lrc.c        | 39 ++++++++++++---------------------
 drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
 4 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d16680b77e13..a88aa7f11a9c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2876,13 +2876,7 @@ static void i915_gem_reset_engine_cleanup(struct drm_i915_private *dev_priv,
 		/* Ensure irq handler finishes or is cancelled. */
 		tasklet_kill(&engine->irq_tasklet);
 
-		spin_lock_bh(&engine->execlist_lock);
-		/* list_splice_tail_init checks for empty lists */
-		list_splice_tail_init(&engine->execlist_queue,
-				      &engine->execlist_retired_req_list);
-		spin_unlock_bh(&engine->execlist_lock);
-
-		intel_execlists_retire_requests(engine);
+		intel_execlists_cancel_requests(engine);
 	}
 
 	/*
@@ -3006,8 +3000,6 @@ i915_gem_retire_requests(struct drm_device *dev)
 			spin_lock_bh(&engine->execlist_lock);
 			idle &= list_empty(&engine->execlist_queue);
 			spin_unlock_bh(&engine->execlist_lock);
-
-			intel_execlists_retire_requests(engine);
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0c46a5cf4be0..b6f12cb4dba9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -431,8 +431,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
 			/* Same ctx: ignore first request, as second request
 			 * will update tail past first request's workload */
 			cursor->elsp_submitted = req0->elsp_submitted;
-			list_move_tail(&req0->execlist_link,
-				       &engine->execlist_retired_req_list);
+			list_del(&req0->execlist_link);
+			i915_gem_request_unreference(req0);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -464,7 +464,7 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine)
 }
 
 static unsigned int
-execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
+execlists_check_remove_request(struct intel_engine_cs *engine, u32 ctx_id)
 {
 	struct drm_i915_gem_request *head_req;
 
@@ -474,19 +474,16 @@ execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
 					    struct drm_i915_gem_request,
 					    execlist_link);
 
-	if (!head_req)
-		return 0;
-
-	if (unlikely(head_req->ctx_hw_id != request_id))
-		return 0;
+	if (WARN_ON(!head_req || (head_req->ctx_hw_id != ctx_id)))
+               return 0;
 
 	WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
 
 	if (--head_req->elsp_submitted > 0)
 		return 0;
 
-	list_move_tail(&head_req->execlist_link,
-		       &engine->execlist_retired_req_list);
+	list_del(&head_req->execlist_link);
+	i915_gem_request_unreference(head_req);
 
 	return 1;
 }
@@ -590,9 +587,6 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	intel_lr_context_pin(request->ctx, request->engine);
-	i915_gem_request_reference(request);
-
 	spin_lock_bh(&engine->execlist_lock);
 
 	list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
@@ -609,11 +603,12 @@ static void execlists_context_queue(struct drm_i915_gem_request *request)
 		if (request->ctx == tail_req->ctx) {
 			WARN(tail_req->elsp_submitted != 0,
 				"More than 2 already-submitted reqs queued\n");
-			list_move_tail(&tail_req->execlist_link,
-				       &engine->execlist_retired_req_list);
+			list_del(&tail_req->execlist_link);
+			i915_gem_request_unreference(tail_req);
 		}
 	}
 
+	i915_gem_request_reference(request);
 	list_add_tail(&request->execlist_link, &engine->execlist_queue);
 	request->ctx_hw_id = request->ctx->hw_id;
 	if (num_elements == 0)
@@ -1011,23 +1006,18 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	return 0;
 }
 
-void intel_execlists_retire_requests(struct intel_engine_cs *engine)
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_request *req, *tmp;
-	struct list_head retired_list;
+	LIST_HEAD(cancel_list);
 
 	WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
-	if (list_empty(&engine->execlist_retired_req_list))
-		return;
 
-	INIT_LIST_HEAD(&retired_list);
 	spin_lock_bh(&engine->execlist_lock);
-	list_replace_init(&engine->execlist_retired_req_list, &retired_list);
+	list_replace_init(&engine->execlist_queue, &cancel_list);
 	spin_unlock_bh(&engine->execlist_lock);
 
-	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
-		intel_lr_context_unpin(req->ctx, engine);
-
+	list_for_each_entry_safe(req, tmp, &cancel_list, execlist_link) {
 		list_del(&req->execlist_link);
 		i915_gem_request_unreference(req);
 	}
@@ -2119,7 +2109,6 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine)
 
 	INIT_LIST_HEAD(&engine->buffers);
 	INIT_LIST_HEAD(&engine->execlist_queue);
-	INIT_LIST_HEAD(&engine->execlist_retired_req_list);
 	spin_lock_init(&engine->execlist_lock);
 
 	tasklet_init(&engine->irq_tasklet,
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 8bea937973f6..4b1c896b5019 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -119,6 +119,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 			       struct drm_i915_gem_execbuffer2 *args,
 			       struct list_head *vmas);
 
-void intel_execlists_retire_requests(struct intel_engine_cs *engine);
+void intel_execlists_cancel_requests(struct intel_engine_cs *engine);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2ade194bbea9..527549dbeb3c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -269,7 +269,6 @@ struct  intel_engine_cs {
 	struct tasklet_struct irq_tasklet;
 	spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */
 	struct list_head execlist_queue;
-	struct list_head execlist_retired_req_list;
 	unsigned int fw_domains;
 	unsigned int next_context_status_buffer;
 	unsigned int idle_lite_restore_wa;
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2016-04-25  6:51 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-20 18:42 Premature unpinning at last Chris Wilson
2016-04-20 18:42 ` [PATCH 01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Chris Wilson
2016-04-20 18:42 ` [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc() Chris Wilson
2016-04-20 18:42   ` Chris Wilson
2016-04-20 18:42   ` Chris Wilson
2016-04-20 18:58   ` Luis R. Rodriguez
2016-04-20 18:58     ` Luis R. Rodriguez
2016-04-20 18:58     ` Luis R. Rodriguez
2016-04-20 19:14     ` Chris Wilson
2016-04-20 19:14       ` Chris Wilson
2016-04-20 19:14       ` Chris Wilson
2016-04-20 21:23       ` Luis R. Rodriguez
2016-04-20 21:23         ` Luis R. Rodriguez
2016-04-20 18:42 ` [PATCH 03/19] drm/i915: Introduce i915_vm_to_ggtt() Chris Wilson
2016-04-20 18:42 ` [PATCH 04/19] drm/i915: Move ioremap_wc tracking onto VMA Chris Wilson
2016-04-20 18:42 ` [PATCH 05/19] drm/i915: Use i915_vma_pin_iomap on the ringbuffer object Chris Wilson
2016-04-21  7:19   ` Joonas Lahtinen
2016-04-20 18:42 ` [PATCH 06/19] drm/i915: Mark the current context as lost on suspend Chris Wilson
2016-04-20 18:42 ` [PATCH 07/19] drm/i915: L3 cache remapping is part of context switching Chris Wilson
2016-04-20 18:42 ` [PATCH 08/19] drm/i915: Consolidate L3 remapping LRI Chris Wilson
2016-04-20 18:42 ` [PATCH 09/19] drm/i915: Remove early l3-remap Chris Wilson
2016-04-20 18:42 ` [PATCH 10/19] drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use Chris Wilson
2016-04-21  6:48   ` Joonas Lahtinen
2016-04-21  6:58     ` Chris Wilson
2016-04-21  7:01     ` Chris Wilson
2016-04-21  7:24       ` Chris Wilson
2016-04-21  7:32       ` Joonas Lahtinen
2016-04-20 18:42 ` [PATCH 11/19] drm/i915: Assign every HW context a unique ID Chris Wilson
2016-04-20 18:42 ` [PATCH 12/19] drm/i915: Replace the pinned context address with its " Chris Wilson
2016-04-20 18:42 ` [PATCH 13/19] drm/i915: Refactor execlists default context pinning Chris Wilson
2016-04-20 18:42 ` [PATCH 14/19] drm/i915: Move context initialisation to first-use Chris Wilson
2016-04-21  6:57   ` Joonas Lahtinen
2016-04-21  7:08     ` Chris Wilson
2016-04-21  7:47       ` Joonas Lahtinen
2016-04-21  7:56         ` Chris Wilson
2016-04-21  8:37           ` Joonas Lahtinen
2016-04-20 18:42 ` [PATCH 15/19] drm/i915: Move the magical deferred context allocation into the request Chris Wilson
2016-04-20 18:42 ` [PATCH 16/19] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
2016-04-20 18:42 ` [PATCH 17/19] drm/i915: Track the previous pinned context inside the request Chris Wilson
2016-04-21  7:07   ` Joonas Lahtinen
2016-04-21  7:22     ` Chris Wilson
2016-04-21  7:35       ` Joonas Lahtinen
2016-04-20 18:42 ` [PATCH 18/19] drm/i915: Store LRC hardware id in " Chris Wilson
2016-04-21  7:58   ` Tvrtko Ursulin
2016-04-21  9:02     ` Chris Wilson
2016-04-20 18:42 ` [PATCH 19/19] drm/i915: Stop tracking execlists retired requests Chris Wilson
2016-04-21 11:27 ` ✓ Fi.CI.BAT: success for series starting with [01/19] drm/i915/overlay: Replace i915_gem_obj_ggtt_offset() with the known flip_addr Patchwork
2016-04-21 12:05 ` ✗ Fi.CI.BAT: failure " Patchwork
2016-04-23 10:53 ` ✗ Fi.CI.BAT: warning " Patchwork
2016-04-25  6:51 ` Patchwork
2016-04-21  8:58 [PATCH 01/19] " Chris Wilson
2016-04-21  8:59 ` [PATCH 19/19] drm/i915: Stop tracking execlists retired requests Chris Wilson
2016-04-21 14:57 Final CI pass for premature Chris Wilson
2016-04-21 14:57 ` [PATCH 19/19] drm/i915: Stop tracking execlists retired requests Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.