All of lore.kernel.org
 help / color / mirror / Atom feed
* Small GEM fixes
@ 2016-12-31 12:06 Chris Wilson
  2016-12-31 12:06 ` [PATCH 01/26] drm/i915: Assert that we do create the deferred context Chris Wilson
                   ` (26 more replies)
  0 siblings, 27 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Outside of making guc play nicely with high GGTT offsets, a big fix here
is the correct implementation of lazy fencing for gen4+.
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 01/26] drm/i915: Assert that we do create the deferred context
  2016-12-31 12:06 Small GEM fixes Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 02/26] drm/i915: Move a few utility macros into a separate header Chris Wilson
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

In order to convince static analyzers that the allocation function
returns an error or sets ce->state, assert that it is set afterwards.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 227978820320..51ecb395551b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -781,6 +781,7 @@ static int execlists_context_pin(struct intel_engine_cs *engine,
 		if (ret)
 			goto err;
 	}
+	GEM_BUG_ON(!ce->state);
 
 	flags = PIN_GLOBAL;
 	if (ctx->ggtt_offset_bias)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 02/26] drm/i915: Move a few utility macros into a separate header
  2016-12-31 12:06 Small GEM fixes Chris Wilson
  2016-12-31 12:06 ` [PATCH 01/26] drm/i915: Assert that we do create the deferred context Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 03/26] drm/i915: Use fixed-sized types for stolen Chris Wilson
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

In order to defeat some circular dependencies between headers to allow use
of e.g. range_overflows() in a header, move the simple independent macros
into their own header.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h   | 13 +-----------
 drivers/gpu/drm/i915/i915_gem.c   |  4 ----
 drivers/gpu/drm/i915/i915_utils.h | 44 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_utils.h

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 22d3f610212c..58dfe8ab9c0b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -52,6 +52,7 @@
 
 #include "i915_params.h"
 #include "i915_reg.h"
+#include "i915_utils.h"
 
 #include "intel_bios.h"
 #include "intel_dpll_mgr.h"
@@ -219,18 +220,6 @@ static inline const char *enableddisabled(bool v)
 	return v ? "enabled" : "disabled";
 }
 
-#define range_overflows(start, size, max) ({ \
-	typeof(start) start__ = (start); \
-	typeof(size) size__ = (size); \
-	typeof(max) max__ = (max); \
-	(void)(&start__ == &size__); \
-	(void)(&start__ == &max__); \
-	start__ > max__ || size__ > max__ - start__; \
-})
-
-#define range_overflows_t(type, start, size, max) \
-	range_overflows((type)(start), (type)(size), (type)(max))
-
 enum pipe {
 	INVALID_PIPE = -1,
 	PIPE_A = 0,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d8760acf8001..2957c96eb93d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3999,10 +3999,6 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
 	.put_pages = i915_gem_object_put_pages_gtt,
 };
 
-/* Note we don't consider signbits :| */
-#define overflows_type(x, T) \
-	(sizeof(x) > sizeof(T) && (x) >> (sizeof(T) * BITS_PER_BYTE))
-
 struct drm_i915_gem_object *
 i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 {
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
new file mode 100644
index 000000000000..9983ef30bac8
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __I915_UTILS_H
+#define __I915_UTILS_H
+
+#define range_overflows(start, size, max) ({ \
+	typeof(start) start__ = (start); \
+	typeof(size) size__ = (size); \
+	typeof(max) max__ = (max); \
+	(void)(&start__ == &size__); \
+	(void)(&start__ == &max__); \
+	start__ > max__ || size__ > max__ - start__; \
+})
+
+#define range_overflows_t(type, start, size, max) \
+	range_overflows((type)(start), (type)(size), (type)(max))
+
+/* Note we don't consider signbits :| */
+#define overflows_type(x, T) \
+	(sizeof(x) > sizeof(T) && (x) >> (sizeof(T) * BITS_PER_BYTE))
+
+#endif /* !__I915_UTILS_H */
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 03/26] drm/i915: Use fixed-sized types for stolen
  2016-12-31 12:06 Small GEM fixes Chris Wilson
  2016-12-31 12:06 ` [PATCH 01/26] drm/i915: Assert that we do create the deferred context Chris Wilson
  2016-12-31 12:06 ` [PATCH 02/26] drm/i915: Move a few utility macros into a separate header Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 04/26] drm/i915: Use range_overflows() Chris Wilson
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Stolen memory is a hardware resource of known size, so use an accurate
fixed integer type rather than the ambiguous variable size_t. This was
motivated by the next patch spotting inconsistencies in our types.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/char/agp/intel-gtt.c           |  6 ++++--
 drivers/gpu/drm/i915/i915_gem_gtt.c    |  8 +++++---
 drivers/gpu/drm/i915/i915_gem_gtt.h    | 13 +++++++------
 drivers/gpu/drm/i915/i915_gem_stolen.c |  6 +++---
 include/drm/intel-gtt.h                |  6 ++++--
 5 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 0f7d28a98b9a..9702c78f458d 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -1420,8 +1420,10 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
 }
 EXPORT_SYMBOL(intel_gmch_probe);
 
-void intel_gtt_get(u64 *gtt_total, size_t *stolen_size,
-		   phys_addr_t *mappable_base, u64 *mappable_end)
+void intel_gtt_get(u64 *gtt_total,
+		   u32 *stolen_size,
+		   phys_addr_t *mappable_base,
+		   u64 *mappable_end)
 {
 	*gtt_total = intel_private.gtt_total_entries << PAGE_SHIFT;
 	*stolen_size = intel_private.stolen_size;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6af9311f72f5..11aeba60b5d7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3139,8 +3139,10 @@ static int i915_gmch_probe(struct i915_ggtt *ggtt)
 		return -EIO;
 	}
 
-	intel_gtt_get(&ggtt->base.total, &ggtt->stolen_size,
-		      &ggtt->mappable_base, &ggtt->mappable_end);
+	intel_gtt_get(&ggtt->base.total,
+		      &ggtt->stolen_size,
+		      &ggtt->mappable_base,
+		      &ggtt->mappable_end);
 
 	ggtt->do_idle_maps = needs_idle_maps(dev_priv);
 	ggtt->base.insert_page = i915_ggtt_insert_page;
@@ -3195,7 +3197,7 @@ int i915_ggtt_probe_hw(struct drm_i915_private *dev_priv)
 	DRM_INFO("Memory usable by graphics device = %lluM\n",
 		 ggtt->base.total >> 20);
 	DRM_DEBUG_DRIVER("GMADR size = %lldM\n", ggtt->mappable_end >> 20);
-	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", ggtt->stolen_size >> 20);
+	DRM_DEBUG_DRIVER("GTT stolen size = %uM\n", ggtt->stolen_size >> 20);
 #ifdef CONFIG_INTEL_IOMMU
 	if (intel_iommu_gfx_mapped)
 		DRM_INFO("VT-d active for gfx access\n");
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0055b8567a43..9e91d7e6149c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -315,6 +315,9 @@ struct i915_ggtt {
 	struct i915_address_space base;
 	struct io_mapping mappable;	/* Mapping to our CPU mappable region */
 
+	phys_addr_t mappable_base;	/* PA of our GMADR */
+	u64 mappable_end;		/* End offset that we can CPU map */
+
 	/* Stolen memory is segmented in hardware with different portions
 	 * offlimits to certain functions.
 	 *
@@ -323,12 +326,10 @@ struct i915_ggtt {
 	 * avoid the first page! The upper end of stolen memory is reserved for
 	 * hardware functions and similarly removed from the accessible range.
 	 */
-	size_t stolen_size;		/* Total size of stolen memory */
-	size_t stolen_usable_size;	/* Total size minus reserved ranges */
-	size_t stolen_reserved_base;
-	size_t stolen_reserved_size;
-	u64 mappable_end;		/* End offset that we can CPU map */
-	phys_addr_t mappable_base;	/* PA of our GMADR */
+	u32 stolen_size;		/* Total size of stolen memory */
+	u32 stolen_usable_size;	/* Total size minus reserved ranges */
+	u32 stolen_reserved_base;
+	u32 stolen_reserved_size;
 
 	/** "Graphics Stolen Memory" holds the global PTEs */
 	void __iomem *gsm;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index f1a80bfa9919..112ca930ea4c 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -475,7 +475,7 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
 	 * memory, so just consider the start. */
 	reserved_total = stolen_top - reserved_base;
 
-	DRM_DEBUG_KMS("Memory reserved for graphics device: %zuK, usable: %luK\n",
+	DRM_DEBUG_KMS("Memory reserved for graphics device: %uK, usable: %luK\n",
 		      ggtt->stolen_size >> 10,
 		      (ggtt->stolen_size - reserved_total) >> 10);
 
@@ -484,8 +484,8 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
 	if (INTEL_GEN(dev_priv) >= 8)
 		stolen_usable_start = 4096;
 
-	ggtt->stolen_usable_size = ggtt->stolen_size - reserved_total -
-				   stolen_usable_start;
+	ggtt->stolen_usable_size =
+		ggtt->stolen_size - reserved_total - stolen_usable_start;
 
 	/* Basic memrange allocator for stolen space. */
 	drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h
index f49edecd66a3..b3bf717cfc45 100644
--- a/include/drm/intel-gtt.h
+++ b/include/drm/intel-gtt.h
@@ -3,8 +3,10 @@
 #ifndef _DRM_INTEL_GTT_H
 #define	_DRM_INTEL_GTT_H
 
-void intel_gtt_get(u64 *gtt_total, size_t *stolen_size,
-		   phys_addr_t *mappable_base, u64 *mappable_end);
+void intel_gtt_get(u64 *gtt_total,
+		   u32 *stolen_size,
+		   phys_addr_t *mappable_base,
+		   u64 *mappable_end);
 
 int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
 		     struct agp_bridge_data *bridge);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 04/26] drm/i915: Use range_overflows()
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (2 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 03/26] drm/i915: Use fixed-sized types for stolen Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 05/26] drm/i915/guc: Exclude the upper end of the Global GTT for the GuC Chris Wilson
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Replace a few more open-coded overflow checks with the macro.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c | 2 +-
 drivers/gpu/drm/i915/i915_vma.c        | 3 ++-
 drivers/gpu/drm/i915/intel_bios.c      | 7 +++++--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 112ca930ea4c..5fd851336a99 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -502,7 +502,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 	struct sg_table *st;
 	struct scatterlist *sg;
 
-	GEM_BUG_ON(offset > dev_priv->ggtt.stolen_size - size);
+	GEM_BUG_ON(range_overflows(offset, size, dev_priv->ggtt.stolen_size));
 
 	/* We hide that we have no struct page backing our stolen object
 	 * by wrapping the contiguous physical allocation with a fake
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index d48c68214611..58f2483362ad 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -403,7 +403,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
-		if (offset & (alignment - 1) || offset > end - size) {
+		if (offset & (alignment - 1) ||
+		    range_overflows(offset, size, end)) {
 			ret = -EINVAL;
 			goto err_unpin;
 		}
diff --git a/drivers/gpu/drm/i915/intel_bios.c b/drivers/gpu/drm/i915/intel_bios.c
index f6d37558301d..e144f033f4b5 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -1416,13 +1416,16 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 		return false;
 	}
 
-	if (vbt->bdb_offset + sizeof(struct bdb_header) > size) {
+	if (range_overflows_t(size_t,
+			      vbt->bdb_offset,
+			      sizeof(struct bdb_header),
+			      size)) {
 		DRM_DEBUG_DRIVER("BDB header incomplete\n");
 		return false;
 	}
 
 	bdb = get_bdb_header(vbt);
-	if (vbt->bdb_offset + bdb->bdb_size > size) {
+	if (range_overflows_t(size_t, vbt->bdb_offset, bdb->bdb_size, size)) {
 		DRM_DEBUG_DRIVER("BDB incomplete\n");
 		return false;
 	}
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 05/26] drm/i915/guc: Exclude the upper end of the Global GTT for the GuC
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (3 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 04/26] drm/i915: Use range_overflows() Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2017-01-02 15:25   ` Arkadiusz Hiler
  2016-12-31 12:06 ` [PATCH 06/26] drm/i915: Purge loose pages if we run out of DMA remap space Chris Wilson
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

The GuC uses a special mapping for the upper end of the Global GTT,
similar to the way it uses a special mapping for the lower end, so
exclude it from our drm_mm to prevent us using it.

v2: Rename to reflect that it is unmappable similar to the region at the
bottom of the GGTT, and couple it into the assertion that we don't feed
unmappable addresses to the GuC.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++++++++++
 drivers/gpu/drm/i915/i915_guc_reg.h |  3 +++
 drivers/gpu/drm/i915/intel_uc.h     |  1 +
 3 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 11aeba60b5d7..00520f27bea6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3178,6 +3178,16 @@ int i915_ggtt_probe_hw(struct drm_i915_private *dev_priv)
 	if (ret)
 		return ret;
 
+	/* Trim the GGTT to fit the GuC mappable upper range (when enabled).
+	 * This is easier than doing range restriction on the fly, as we
+	 * currently don't have any bits spare to pass in this upper
+	 * restriction!
+	 */
+	if (HAS_GUC(dev_priv) && i915.enable_guc_loading) {
+		ggtt->base.total = min_t(u64, ggtt->base.total, GUC_GGTT_TOP);
+		ggtt->mappable_end = min(ggtt->mappable_end, ggtt->base.total);
+	}
+
 	if ((ggtt->base.total - 1) >> 32) {
 		DRM_ERROR("We never expected a Global GTT with more than 32bits"
 			  " of address space! Found %lldM!\n",
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
index 5e638fc37208..6a0adafe0523 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -73,6 +73,9 @@
 #define   GUC_WOPCM_TOP			  (0x80 << 12)	/* 512KB */
 #define   BXT_GUC_WOPCM_RC6_RESERVED	  (0x10 << 12)	/* 64KB  */
 
+/* GuC addresses above GUC_GGTT_TOP also don't map through the GTT */
+#define GUC_GGTT_TOP			0xFEE00000
+
 #define GEN8_GT_PM_CONFIG		_MMIO(0x138140)
 #define GEN9LP_GT_PM_CONFIG		_MMIO(0x138140)
 #define GEN9_GT_PM_CONFIG		_MMIO(0x13816c)
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index d556215e691f..c594472d918b 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -204,6 +204,7 @@ static inline u32 guc_ggtt_offset(struct i915_vma *vma)
 {
 	u32 offset = i915_ggtt_offset(vma);
 	GEM_BUG_ON(offset < GUC_WOPCM_TOP);
+	GEM_BUG_ON(range_overflows_t(u64, offset, vma->size, GUC_GGTT_TOP));
 	return offset;
 }
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 06/26] drm/i915: Purge loose pages if we run out of DMA remap space
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (4 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 05/26] drm/i915/guc: Exclude the upper end of the Global GTT for the GuC Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 07/26] drm/i915: Fix phys pwrite for struct_mutex-less operation Chris Wilson
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

If the DMA remap fails, one cause can be that we have too many objects
pinned in a small remapping table, such as swiotlb. (DMA remapping does
not trigger the shrinker by itself on its normal failure paths.)  So try
purging all other objects (using i915_gem_shrink_all(), sparing our own
pages as we have yet to assign them to the obj->pages) and try again. If
there are no pages to reclaim (and consequently no pages to unmap), the
shrinker will report 0 and we fail with -ENOSPC as before.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 00520f27bea6..f698006fe883 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2361,10 +2361,24 @@ void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
 			       struct sg_table *pages)
 {
-	if (dma_map_sg(&obj->base.dev->pdev->dev,
-		       pages->sgl, pages->nents,
-		       PCI_DMA_BIDIRECTIONAL))
-		return 0;
+	do {
+		if (dma_map_sg(&obj->base.dev->pdev->dev,
+			       pages->sgl, pages->nents,
+			       PCI_DMA_BIDIRECTIONAL))
+			return 0;
+
+		/* If the DMA remap fails, one cause can be that we have
+		 * too many objects pinned in a small remapping table,
+		 * such as swiotlb. Incrementally purge all other objects and
+		 * try again - if there are no more pages to remove from
+		 * the DMA remapper, i915_gem_shrink will return 0.
+		 */
+		GEM_BUG_ON(obj->mm.pages == pages);
+	} while (i915_gem_shrink(to_i915(obj->base.dev),
+				 obj->base.size >> PAGE_SHIFT,
+				 I915_SHRINK_BOUND |
+				 I915_SHRINK_UNBOUND |
+				 I915_SHRINK_ACTIVE));
 
 	return -ENOSPC;
 }
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 07/26] drm/i915: Fix phys pwrite for struct_mutex-less operation
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (5 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 06/26] drm/i915: Purge loose pages if we run out of DMA remap space Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 08/26] drm/i915: Drain freed objects for mmap space exhaustion Chris Wilson
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: drm-intel-fixes

Since commit fe115628d567 ("drm/i915: Implement pwrite without
struct-mutex") the lowlevel pwrite calls are now called without the
protection of struct_mutex, but pwrite_phys was still asserting that it
held the struct_mutex and later tried to drop and relock it.

Fixes: fe115628d567 ("drm/i915: Implement pwrite without struct-mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org>
---
 drivers/gpu/drm/i915/i915_gem.c | 34 ++++------------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2957c96eb93d..e5b4a72645d4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -597,47 +597,21 @@ i915_gem_phys_pwrite(struct drm_i915_gem_object *obj,
 		     struct drm_i915_gem_pwrite *args,
 		     struct drm_file *file)
 {
-	struct drm_device *dev = obj->base.dev;
 	void *vaddr = obj->phys_handle->vaddr + args->offset;
 	char __user *user_data = u64_to_user_ptr(args->data_ptr);
-	int ret;
 
 	/* We manually control the domain here and pretend that it
 	 * remains coherent i.e. in the GTT domain, like shmem_pwrite.
 	 */
-	lockdep_assert_held(&obj->base.dev->struct_mutex);
-	ret = i915_gem_object_wait(obj,
-				   I915_WAIT_INTERRUPTIBLE |
-				   I915_WAIT_LOCKED |
-				   I915_WAIT_ALL,
-				   MAX_SCHEDULE_TIMEOUT,
-				   to_rps_client(file));
-	if (ret)
-		return ret;
-
 	intel_fb_obj_invalidate(obj, ORIGIN_CPU);
-	if (__copy_from_user_inatomic_nocache(vaddr, user_data, args->size)) {
-		unsigned long unwritten;
-
-		/* The physical object once assigned is fixed for the lifetime
-		 * of the obj, so we can safely drop the lock and continue
-		 * to access vaddr.
-		 */
-		mutex_unlock(&dev->struct_mutex);
-		unwritten = copy_from_user(vaddr, user_data, args->size);
-		mutex_lock(&dev->struct_mutex);
-		if (unwritten) {
-			ret = -EFAULT;
-			goto out;
-		}
-	}
+	if (copy_from_user(vaddr, user_data, args->size))
+		return -EFAULT;
 
 	drm_clflush_virt_range(vaddr, args->size);
-	i915_gem_chipset_flush(to_i915(dev));
+	i915_gem_chipset_flush(to_i915(obj->base.dev));
 
-out:
 	intel_fb_obj_flush(obj, false, ORIGIN_CPU);
-	return ret;
+	return 0;
 }
 
 void *i915_gem_object_alloc(struct drm_i915_private *dev_priv)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 08/26] drm/i915: Drain freed objects for mmap space exhaustion
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (6 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 07/26] drm/i915: Fix phys pwrite for struct_mutex-less operation Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 09/26] drm/i915: Simplify testing for am-I-the-kernel-context? Chris Wilson
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: drm-intel-fixes

As we now use a deferred free queue for objects, simply retiring the
active objects is not enough to immediately free them and recover their
mmap space - we must now also drain the freed object list.

Fixes: fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker"
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org>
---
 drivers/gpu/drm/i915/i915_gem.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e5b4a72645d4..fe7419cab716 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2090,23 +2090,21 @@ static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
 	int err;
 
 	err = drm_gem_create_mmap_offset(&obj->base);
-	if (!err)
+	if (likely(!err))
 		return 0;
 
-	/* We can idle the GPU locklessly to flush stale objects, but in order
-	 * to claim that space for ourselves, we need to take the big
-	 * struct_mutex to free the requests+objects and allocate our slot.
-	 */
-	err = i915_gem_wait_for_idle(dev_priv, I915_WAIT_INTERRUPTIBLE);
-	if (err)
-		return err;
+	/* Attempt to reap some mmap space from dead objects */
+	do {
+		err = i915_gem_wait_for_idle(dev_priv, I915_WAIT_INTERRUPTIBLE);
+		if (err)
+			break;
 
-	err = i915_mutex_lock_interruptible(&dev_priv->drm);
-	if (!err) {
-		i915_gem_retire_requests(dev_priv);
+		i915_gem_drain_freed_objects(dev_priv);
 		err = drm_gem_create_mmap_offset(&obj->base);
-		mutex_unlock(&dev_priv->drm.struct_mutex);
-	}
+		if (!err)
+			break;
+
+	} while (flush_delayed_work(&dev_priv->gt.retire_work));
 
 	return err;
 }
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 09/26] drm/i915: Simplify testing for am-I-the-kernel-context?
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (7 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 08/26] drm/i915: Drain freed objects for mmap space exhaustion Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 10/26] drm/i915: Name the anonymous structs inside i915_ggtt_view Chris Wilson
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

The kernel context (dev_priv->kernel_context) is unique in that it is
not associated with any user filp - it is the only one with
ctx->file_priv == NULL. This is a simpler test than comparing it against
dev_priv->kernel_context which involves some pointer dancing.

In checking that this is true, we notice that the gvt context is
allocating itself a i915_hw_ppgtt it doesn't use and not flagging that
its file_priv should be invalid.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         |  2 +-
 drivers/gpu/drm/i915/i915_gem_context.c | 13 ++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h |  5 +++++
 drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 ++--
 5 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fe7419cab716..4b461f01a910 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4192,7 +4192,7 @@ static void assert_kernel_context_is_current(struct drm_i915_private *dev_priv)
 	enum intel_engine_id id;
 
 	for_each_engine(engine, dev_priv, id)
-		GEM_BUG_ON(engine->last_retired_context != dev_priv->kernel_context);
+		GEM_BUG_ON(!i915_gem_context_is_kernel(engine->last_retired_context));
 }
 
 int i915_gem_suspend(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 07ac81103f44..40a6939e3956 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -413,14 +413,17 @@ i915_gem_context_create_gvt(struct drm_device *dev)
 	if (ret)
 		return ERR_PTR(ret);
 
-	ctx = i915_gem_create_context(to_i915(dev), NULL);
+	ctx = __create_hw_context(to_i915(dev), NULL);
 	if (IS_ERR(ctx))
 		goto out;
 
+	ctx->file_priv = ERR_PTR(-EBADF);
 	i915_gem_context_set_closed(ctx); /* not user accessible */
 	i915_gem_context_clear_bannable(ctx);
 	i915_gem_context_set_force_single_submission(ctx);
 	ctx->ring_size = 512 * PAGE_SIZE; /* Max ring buffer size */
+
+	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
 out:
 	mutex_unlock(&dev->struct_mutex);
 	return ctx;
@@ -472,6 +475,8 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
 	ctx->priority = I915_PRIORITY_MIN; /* lowest priority; idle task */
 	dev_priv->kernel_context = ctx;
 
+	GEM_BUG_ON(!i915_gem_context_is_kernel(ctx));
+
 	DRM_DEBUG_DRIVER("%s context support initialized\n",
 			i915.enable_execlists ? "LR" :
 			dev_priv->hw_context_size ? "HW" : "fake");
@@ -524,6 +529,8 @@ void i915_gem_context_fini(struct drm_i915_private *dev_priv)
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
+	GEM_BUG_ON(!i915_gem_context_is_kernel(dctx));
+
 	context_close(dctx);
 	dev_priv->kernel_context = NULL;
 
@@ -549,6 +556,8 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 	ctx = i915_gem_create_context(to_i915(dev), file_priv);
 	mutex_unlock(&dev->struct_mutex);
 
+	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
+
 	if (IS_ERR(ctx)) {
 		idr_destroy(&file_priv->context_idr);
 		return PTR_ERR(ctx);
@@ -968,6 +977,8 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
+
 	args->ctx_id = ctx->user_handle;
 	DRM_DEBUG("HW context %d created\n", args->ctx_id);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 89f6764fb338..0ac750b90f3d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -246,6 +246,11 @@ static inline bool i915_gem_context_is_default(const struct i915_gem_context *c)
 	return c->user_handle == DEFAULT_CONTEXT_HANDLE;
 }
 
+static inline bool i915_gem_context_is_kernel(struct i915_gem_context *ctx)
+{
+	return !ctx->file_priv;
+}
+
 /* i915_gem_context.c */
 int __must_check i915_gem_context_init(struct drm_i915_private *dev_priv);
 void i915_gem_context_lost(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 51ecb395551b..600518ee89db 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -786,7 +786,7 @@ static int execlists_context_pin(struct intel_engine_cs *engine,
 	flags = PIN_GLOBAL;
 	if (ctx->ggtt_offset_bias)
 		flags |= PIN_OFFSET_BIAS | ctx->ggtt_offset_bias;
-	if (ctx == ctx->i915->kernel_context)
+	if (i915_gem_context_is_kernel(ctx))
 		flags |= PIN_HIGH;
 
 	ret = i915_vma_pin(ce->state, 0, GEN8_LR_CONTEXT_ALIGN, flags);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e84d488e167a..0971ac396b60 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1974,7 +1974,7 @@ static int intel_ring_context_pin(struct intel_engine_cs *engine,
 		unsigned int flags;
 
 		flags = 0;
-		if (ctx == ctx->i915->kernel_context)
+		if (i915_gem_context_is_kernel(ctx))
 			flags = PIN_HIGH;
 
 		ret = context_pin(ctx, flags);
@@ -1989,7 +1989,7 @@ static int intel_ring_context_pin(struct intel_engine_cs *engine,
 	 * as during eviction we cannot allocate and pin the renderstate in
 	 * order to initialise the context.
 	 */
-	if (ctx == ctx->i915->kernel_context)
+	if (i915_gem_context_is_kernel(ctx))
 		ce->initialised = true;
 
 	i915_gem_context_get(ctx);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 10/26] drm/i915: Name the anonymous structs inside i915_ggtt_view
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (8 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 09/26] drm/i915: Simplify testing for am-I-the-kernel-context? Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 11/26] drm/i915: Pack the partial view size and offset into a single u64 Chris Wilson
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Naming this pair will become useful later...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_gtt.h | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 9e91d7e6149c..19ea4c942df4 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -149,20 +149,22 @@ enum i915_ggtt_view_type {
 };
 
 struct intel_rotation_info {
-	struct {
+	struct intel_rotation_plane_info {
 		/* tiles */
 		unsigned int width, height, stride, offset;
 	} plane[2];
 };
 
+struct intel_partial_info {
+	u64 offset;
+	unsigned int size;
+};
+
 struct i915_ggtt_view {
 	enum i915_ggtt_view_type type;
 
 	union {
-		struct {
-			u64 offset;
-			unsigned int size;
-		} partial;
+		struct intel_partial_info partial;
 		struct intel_rotation_info rotated;
 	} params;
 };
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 11/26] drm/i915: Pack the partial view size and offset into a single u64
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (9 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 10/26] drm/i915: Name the anonymous structs inside i915_ggtt_view Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 12/26] drm/i915: Convert i915_ggtt_view to use an anonymous union Chris Wilson
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Since the partial offset must be page aligned, we can use those low 12
bits to encode the side of the partial view (which then cannot be larger
than 8MiB in pages).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c     | 22 +++++++++++++---------
 drivers/gpu/drm/i915/i915_gem_gtt.c |  4 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.h | 25 +++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_vma.c     |  9 ++++-----
 4 files changed, 42 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4b461f01a910..87e41d714890 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1698,6 +1698,9 @@ static unsigned int tile_row_pages(struct drm_i915_gem_object *obj)
 {
 	u64 size;
 
+	BUILD_BUG_ON(ilog2(GEN7_FENCE_MAX_PITCH_VAL*128*32 >> PAGE_SHIFT) >
+		     INTEL_PARTIAL_SIZE_BITS);
+
 	size = i915_gem_object_get_stride(obj);
 	size *= i915_gem_object_get_tiling(obj) == I915_TILING_Y ? 32 : 8;
 
@@ -1831,24 +1834,25 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, flags);
 	if (IS_ERR(vma)) {
 		struct i915_ggtt_view view;
-		unsigned int chunk_size;
+		unsigned int chunk;
 
 		/* Use a partial view if it is bigger than available space */
-		chunk_size = MIN_CHUNK_PAGES;
+		chunk = MIN_CHUNK_PAGES;
 		if (i915_gem_object_is_tiled(obj))
-			chunk_size = roundup(chunk_size, tile_row_pages(obj));
+			chunk = roundup(chunk, tile_row_pages(obj));
 
 		memset(&view, 0, sizeof(view));
 		view.type = I915_GGTT_VIEW_PARTIAL;
-		view.params.partial.offset = rounddown(page_offset, chunk_size);
-		view.params.partial.size =
-			min_t(unsigned int, chunk_size,
-			      vma_pages(area) - view.params.partial.offset);
+		view.params.partial.offset_size = rounddown(page_offset, chunk);
+		view.params.partial.offset_size =
+			(view.params.partial.offset_size << INTEL_PARTIAL_SIZE_BITS) |
+			(min_t(unsigned int, chunk,
+			      vma_pages(area) - view.params.partial.offset_size) - 1);
 
 		/* If the partial covers the entire object, just create a
 		 * normal VMA.
 		 */
-		if (chunk_size >= obj->base.size >> PAGE_SHIFT)
+		if (chunk >= obj->base.size >> PAGE_SHIFT)
 			view.type = I915_GGTT_VIEW_NORMAL;
 
 		/* Userspace is now writing through an untracked VMA, abandon
@@ -1878,7 +1882,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 
 	/* Finally, remap it using the new GTT offset */
 	ret = remap_io_mapping(area,
-			       area->vm_start + (vma->ggtt_view.params.partial.offset << PAGE_SHIFT),
+			       area->vm_start + intel_partial_get_offset(&vma->ggtt_view.params.partial),
 			       (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->mappable);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f698006fe883..4e77baf7d652 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3492,7 +3492,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
 {
 	struct sg_table *st;
 	struct scatterlist *sg, *iter;
-	unsigned int count = view->params.partial.size;
+	unsigned int count = intel_partial_get_pages(&view->params.partial);
 	unsigned int offset;
 	int ret = -ENOMEM;
 
@@ -3505,7 +3505,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
 		goto err_sg_alloc;
 
 	iter = i915_gem_object_get_sg(obj,
-				      view->params.partial.offset,
+				      intel_partial_get_page_offset(&view->params.partial),
 				      &offset);
 	GEM_BUG_ON(!iter);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 19ea4c942df4..023bf6ac3dc7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -156,10 +156,31 @@ struct intel_rotation_info {
 };
 
 struct intel_partial_info {
-	u64 offset;
-	unsigned int size;
+	/* offset is page-aligned, leaving just enough bits for the size */
+#define INTEL_PARTIAL_SIZE_BITS PAGE_SHIFT
+	u64 offset_size;
 };
 
+static inline u32 intel_partial_get_pages(const struct intel_partial_info *pi)
+{
+	return 1 + (pi->offset_size & GENMASK(INTEL_PARTIAL_SIZE_BITS-1, 0));
+}
+
+static inline u32 intel_partial_get_size(const struct intel_partial_info *pi)
+{
+	return intel_partial_get_pages(pi) << PAGE_SHIFT;
+}
+
+static inline u64 intel_partial_get_offset(const struct intel_partial_info *pi)
+{
+	return pi->offset_size & GENMASK(63, INTEL_PARTIAL_SIZE_BITS);
+}
+
+static inline u64 intel_partial_get_page_offset(const struct intel_partial_info *pi)
+{
+	return pi->offset_size >> INTEL_PARTIAL_SIZE_BITS;
+}
+
 struct i915_ggtt_view {
 	enum i915_ggtt_view_type type;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 58f2483362ad..65770b7109c0 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -96,11 +96,10 @@ __i915_vma_create(struct drm_i915_gem_object *obj,
 		vma->ggtt_view = *view;
 		if (view->type == I915_GGTT_VIEW_PARTIAL) {
 			GEM_BUG_ON(range_overflows_t(u64,
-						     view->params.partial.offset,
-						     view->params.partial.size,
-						     obj->base.size >> PAGE_SHIFT));
-			vma->size = view->params.partial.size;
-			vma->size <<= PAGE_SHIFT;
+						     intel_partial_get_offset(&view->params.partial),
+						     intel_partial_get_size(&view->params.partial),
+						     obj->base.size));
+			vma->size = intel_partial_get_size(&view->params.partial);
 			GEM_BUG_ON(vma->size >= obj->base.size);
 		} else if (view->type == I915_GGTT_VIEW_ROTATED) {
 			vma->size =
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 12/26] drm/i915: Convert i915_ggtt_view to use an anonymous union
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (10 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 11/26] drm/i915: Pack the partial view size and offset into a single u64 Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 13/26] drm/i915: Eliminate superfluous i915_ggtt_view_rotated Chris Wilson
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Save a lot of characters by making the union anonymous, with the
side-effect of ignoring unset bits when comparing views.

v2: Roll up the memcmps back into one.
v3: And split again as Ville points out we can't trust the compiler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c      | 11 +++++------
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  7 ++++---
 drivers/gpu/drm/i915/i915_gem_gtt.h  | 16 ++++++++--------
 drivers/gpu/drm/i915/i915_vma.c      |  9 ++++-----
 drivers/gpu/drm/i915/i915_vma.h      | 16 ++++++++++------
 drivers/gpu/drm/i915/intel_display.c |  2 +-
 6 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 87e41d714890..fb59487f5271 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1841,13 +1841,12 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 		if (i915_gem_object_is_tiled(obj))
 			chunk = roundup(chunk, tile_row_pages(obj));
 
-		memset(&view, 0, sizeof(view));
 		view.type = I915_GGTT_VIEW_PARTIAL;
-		view.params.partial.offset_size = rounddown(page_offset, chunk);
-		view.params.partial.offset_size =
-			(view.params.partial.offset_size << INTEL_PARTIAL_SIZE_BITS) |
+		view.partial.offset_size = rounddown(page_offset, chunk);
+		view.partial.offset_size =
+			(view.partial.offset_size << INTEL_PARTIAL_SIZE_BITS) |
 			(min_t(unsigned int, chunk,
-			      vma_pages(area) - view.params.partial.offset_size) - 1);
+			      vma_pages(area) - view.partial.offset_size) - 1);
 
 		/* If the partial covers the entire object, just create a
 		 * normal VMA.
@@ -1882,7 +1881,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 
 	/* Finally, remap it using the new GTT offset */
 	ret = remap_io_mapping(area,
-			       area->vm_start + intel_partial_get_offset(&vma->ggtt_view.params.partial),
+			       area->vm_start + intel_partial_get_offset(&vma->ggtt_view.partial),
 			       (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->mappable);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4e77baf7d652..f863f2f98c7e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3492,7 +3492,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
 {
 	struct sg_table *st;
 	struct scatterlist *sg, *iter;
-	unsigned int count = intel_partial_get_pages(&view->params.partial);
+	unsigned int count = intel_partial_get_pages(&view->partial);
 	unsigned int offset;
 	int ret = -ENOMEM;
 
@@ -3505,7 +3505,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
 		goto err_sg_alloc;
 
 	iter = i915_gem_object_get_sg(obj,
-				      intel_partial_get_page_offset(&view->params.partial),
+				      intel_partial_get_page_offset(&view->partial),
 				      &offset);
 	GEM_BUG_ON(!iter);
 
@@ -3558,7 +3558,8 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma)
 		vma->pages = vma->obj->mm.pages;
 	else if (vma->ggtt_view.type == I915_GGTT_VIEW_ROTATED)
 		vma->pages =
-			intel_rotate_fb_obj_pages(&vma->ggtt_view.params.rotated, vma->obj);
+			intel_rotate_fb_obj_pages(&vma->ggtt_view.rotated,
+						  vma->obj);
 	else if (vma->ggtt_view.type == I915_GGTT_VIEW_PARTIAL)
 		vma->pages = intel_partial_pages(&vma->ggtt_view, vma->obj);
 	else
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 023bf6ac3dc7..b2daca712df7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -142,12 +142,6 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 
 struct sg_table;
 
-enum i915_ggtt_view_type {
-	I915_GGTT_VIEW_NORMAL = 0,
-	I915_GGTT_VIEW_ROTATED,
-	I915_GGTT_VIEW_PARTIAL,
-};
-
 struct intel_rotation_info {
 	struct intel_rotation_plane_info {
 		/* tiles */
@@ -181,13 +175,19 @@ static inline u64 intel_partial_get_page_offset(const struct intel_partial_info
 	return pi->offset_size >> INTEL_PARTIAL_SIZE_BITS;
 }
 
+enum i915_ggtt_view_type {
+	I915_GGTT_VIEW_NORMAL = 0,
+	I915_GGTT_VIEW_ROTATED = sizeof(struct intel_rotation_info),
+	I915_GGTT_VIEW_PARTIAL = sizeof(struct intel_partial_info),
+};
+
 struct i915_ggtt_view {
 	enum i915_ggtt_view_type type;
-
 	union {
+		/* Members need to contain no holes/padding */
 		struct intel_partial_info partial;
 		struct intel_rotation_info rotated;
-	} params;
+	};
 };
 
 extern const struct i915_ggtt_view i915_ggtt_view_normal;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 65770b7109c0..284273056ea2 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -96,14 +96,13 @@ __i915_vma_create(struct drm_i915_gem_object *obj,
 		vma->ggtt_view = *view;
 		if (view->type == I915_GGTT_VIEW_PARTIAL) {
 			GEM_BUG_ON(range_overflows_t(u64,
-						     intel_partial_get_offset(&view->params.partial),
-						     intel_partial_get_size(&view->params.partial),
+						     intel_partial_get_offset(&view->partial),
+						     intel_partial_get_size(&view->partial),
 						     obj->base.size));
-			vma->size = intel_partial_get_size(&view->params.partial);
+			vma->size = intel_partial_get_size(&view->partial);
 			GEM_BUG_ON(vma->size >= obj->base.size);
 		} else if (view->type == I915_GGTT_VIEW_ROTATED) {
-			vma->size =
-				intel_rotation_info_size(&view->params.rotated);
+			vma->size = intel_rotation_info_size(&view->rotated);
 			vma->size <<= PAGE_SHIFT;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index e3b2b3b1e056..6d936009f919 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -196,15 +196,19 @@ i915_vma_compare(struct i915_vma *vma,
 	if (cmp)
 		return cmp;
 
+	cmp = vma->ggtt_view.type;
 	if (!view)
-		return vma->ggtt_view.type;
+		return cmp;
+
+	cmp -= view->type;
+	if (cmp)
+		return cmp;
 
-	if (vma->ggtt_view.type != view->type)
-		return vma->ggtt_view.type - view->type;
+	BUILD_BUG_ON(I915_GGTT_VIEW_PARTIAL == I915_GGTT_VIEW_ROTATED);
+	BUILD_BUG_ON(offsetof(typeof(*view), rotated) !=
+		     offsetof(typeof(*view), partial));
 
-	return memcmp(&vma->ggtt_view.params,
-		      &view->params,
-		      sizeof(view->params));
+	return memcmp(&vma->ggtt_view.partial, &view->partial, view->type);
 }
 
 int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9b64edc44e26..7f12db705408 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2140,7 +2140,7 @@ intel_fill_fb_ggtt_view(struct i915_ggtt_view *view,
 {
 	if (drm_rotation_90_or_270(rotation)) {
 		*view = i915_ggtt_view_rotated;
-		view->params.rotated = to_intel_framebuffer(fb)->rot_info;
+		view->rotated = to_intel_framebuffer(fb)->rot_info;
 	} else {
 		*view = i915_ggtt_view_normal;
 	}
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 13/26] drm/i915: Eliminate superfluous i915_ggtt_view_rotated
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (11 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 12/26] drm/i915: Convert i915_ggtt_view to use an anonymous union Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 14/26] drm/i915: Eliminate superfluous i915_ggtt_view_normal Chris Wilson
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

It is only being used to clear a struct and set the type, after which it
is overwritten. Since we no longer check the unset bits of the union,
skipping the clear is permissible.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c  | 3 ---
 drivers/gpu/drm/i915/i915_gem_gtt.h  | 1 -
 drivers/gpu/drm/i915/intel_display.c | 5 ++---
 3 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f863f2f98c7e..2c7000310030 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -102,9 +102,6 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma);
 const struct i915_ggtt_view i915_ggtt_view_normal = {
 	.type = I915_GGTT_VIEW_NORMAL,
 };
-const struct i915_ggtt_view i915_ggtt_view_rotated = {
-	.type = I915_GGTT_VIEW_ROTATED,
-};
 
 int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
 			       	int enable_ppgtt)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b2daca712df7..bdfd6c210fb9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -191,7 +191,6 @@ struct i915_ggtt_view {
 };
 
 extern const struct i915_ggtt_view i915_ggtt_view_normal;
-extern const struct i915_ggtt_view i915_ggtt_view_rotated;
 
 enum i915_cache_level;
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 7f12db705408..dcd413ce4d84 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2138,11 +2138,10 @@ intel_fill_fb_ggtt_view(struct i915_ggtt_view *view,
 			const struct drm_framebuffer *fb,
 			unsigned int rotation)
 {
+	view->type = I915_GGTT_VIEW_NORMAL;
 	if (drm_rotation_90_or_270(rotation)) {
-		*view = i915_ggtt_view_rotated;
+		view->type = I915_GGTT_VIEW_ROTATED;
 		view->rotated = to_intel_framebuffer(fb)->rot_info;
-	} else {
-		*view = i915_ggtt_view_normal;
 	}
 }
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 14/26] drm/i915: Eliminate superfluous i915_ggtt_view_normal
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (12 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 13/26] drm/i915: Eliminate superfluous i915_ggtt_view_rotated Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 15/26] drm/i915: Extact compute_partial_view() Chris Wilson
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Since commit 058d88c4330f ("drm/i915: Track pinned VMA"), there is only
one user of i915_ggtt_view_normal rodate. Just treat NULL as no special
view in pin_to_display() like everywhere else.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c      | 2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c  | 4 ----
 drivers/gpu/drm/i915/i915_gem_gtt.h  | 2 --
 drivers/gpu/drm/i915/intel_overlay.c | 3 +--
 4 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fb59487f5271..5e7ecec2a73d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3487,7 +3487,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * try to preserve the existing ABI).
 	 */
 	vma = ERR_PTR(-ENOSPC);
-	if (view->type == I915_GGTT_VIEW_NORMAL)
+	if (!view || view->type == I915_GGTT_VIEW_NORMAL)
 		vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment,
 					       PIN_MAPPABLE | PIN_NONBLOCK);
 	if (IS_ERR(vma)) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2c7000310030..2c579946447c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -99,10 +99,6 @@
 static int
 i915_get_ggtt_vma_pages(struct i915_vma *vma);
 
-const struct i915_ggtt_view i915_ggtt_view_normal = {
-	.type = I915_GGTT_VIEW_NORMAL,
-};
-
 int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
 			       	int enable_ppgtt)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index bdfd6c210fb9..a4070c27d69b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -190,8 +190,6 @@ struct i915_ggtt_view {
 	};
 };
 
-extern const struct i915_ggtt_view i915_ggtt_view_normal;
-
 enum i915_cache_level;
 
 struct i915_vma;
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 4473a611c664..0608fad7f593 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -811,8 +811,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	if (ret != 0)
 		return ret;
 
-	vma = i915_gem_object_pin_to_display_plane(new_bo, 0,
-						   &i915_ggtt_view_normal);
+	vma = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL);
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 15/26] drm/i915: Extact compute_partial_view()
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (13 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 14/26] drm/i915: Eliminate superfluous i915_ggtt_view_normal Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 16/26] drm/i915: Clip the partial view against the object not vma Chris Wilson
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

In order to reuse the partial view for selftesting, extract the common
function for computing the view.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 49 +++++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5e7ecec2a73d..669c11f62376 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1757,6 +1757,33 @@ int i915_gem_mmap_gtt_version(void)
 	return 1;
 }
 
+static inline struct i915_ggtt_view
+compute_partial_view(struct drm_i915_gem_object *obj,
+		     struct vm_area_struct *area,
+		     pgoff_t page_offset,
+		     unsigned int chunk)
+{
+	struct i915_ggtt_view view;
+
+	if (i915_gem_object_is_tiled(obj))
+		chunk = roundup(chunk, tile_row_pages(obj));
+
+	view.type = I915_GGTT_VIEW_PARTIAL;
+	view.partial.offset_size = rounddown(page_offset, chunk);
+	view.partial.offset_size =
+		(view.partial.offset_size << INTEL_PARTIAL_SIZE_BITS) |
+		(min_t(unsigned int, chunk,
+		       vma_pages(area) - view.partial.offset_size) - 1);
+
+	/* If the partial covers the entire object, just create a
+	 * normal VMA.
+	 */
+	if (chunk >= obj->base.size >> PAGE_SHIFT)
+		view.type = I915_GGTT_VIEW_NORMAL;
+
+	return view;
+}
+
 /**
  * i915_gem_fault - fault a page into the GTT
  * @area: CPU VMA in question
@@ -1833,26 +1860,10 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 	/* Now pin it into the GTT as needed */
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, flags);
 	if (IS_ERR(vma)) {
-		struct i915_ggtt_view view;
-		unsigned int chunk;
-
 		/* Use a partial view if it is bigger than available space */
-		chunk = MIN_CHUNK_PAGES;
-		if (i915_gem_object_is_tiled(obj))
-			chunk = roundup(chunk, tile_row_pages(obj));
-
-		view.type = I915_GGTT_VIEW_PARTIAL;
-		view.partial.offset_size = rounddown(page_offset, chunk);
-		view.partial.offset_size =
-			(view.partial.offset_size << INTEL_PARTIAL_SIZE_BITS) |
-			(min_t(unsigned int, chunk,
-			      vma_pages(area) - view.partial.offset_size) - 1);
-
-		/* If the partial covers the entire object, just create a
-		 * normal VMA.
-		 */
-		if (chunk >= obj->base.size >> PAGE_SHIFT)
-			view.type = I915_GGTT_VIEW_NORMAL;
+		struct i915_ggtt_view view =
+			compute_partial_view(obj, area,
+					     page_offset, MIN_CHUNK_PAGES);
 
 		/* Userspace is now writing through an untracked VMA, abandon
 		 * all hope that the hardware is able to track future writes.
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 16/26] drm/i915: Clip the partial view against the object not vma
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (14 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 15/26] drm/i915: Extact compute_partial_view() Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 17/26] drm/i915: Align GGTT sizes to a fence tile row Chris Wilson
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

The VMA is later clipped against the vm_area_struct before insertion of
the faulting PTE so we are free to create the partial view as we desire.
If we use the object as the extents rather than the area, this partial
can then be used for other areas.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 669c11f62376..ea507d3eadba 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1759,7 +1759,6 @@ int i915_gem_mmap_gtt_version(void)
 
 static inline struct i915_ggtt_view
 compute_partial_view(struct drm_i915_gem_object *obj,
-		     struct vm_area_struct *area,
 		     pgoff_t page_offset,
 		     unsigned int chunk)
 {
@@ -1773,7 +1772,7 @@ compute_partial_view(struct drm_i915_gem_object *obj,
 	view.partial.offset_size =
 		(view.partial.offset_size << INTEL_PARTIAL_SIZE_BITS) |
 		(min_t(unsigned int, chunk,
-		       vma_pages(area) - view.partial.offset_size) - 1);
+		       (obj->base.size >> PAGE_SHIFT) - view.partial.offset_size) - 1);
 
 	/* If the partial covers the entire object, just create a
 	 * normal VMA.
@@ -1862,8 +1861,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 	if (IS_ERR(vma)) {
 		/* Use a partial view if it is bigger than available space */
 		struct i915_ggtt_view view =
-			compute_partial_view(obj, area,
-					     page_offset, MIN_CHUNK_PAGES);
+			compute_partial_view(obj, page_offset, MIN_CHUNK_PAGES);
 
 		/* Userspace is now writing through an untracked VMA, abandon
 		 * all hope that the hardware is able to track future writes.
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 17/26] drm/i915: Align GGTT sizes to a fence tile row
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (15 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 16/26] drm/i915: Clip the partial view against the object not vma Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 18/26] drm/i915: Replace WARNs in fence register writes with extensive asserts Chris Wilson
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Ensure the view occupies the full tile row so that reads/writes into the
VMA do not escape (via fenced detiling) into neighbouring objects - we
will pad the object with scratch pages to satisfy the fence. This
applies the lazy-tiling we employed on gen2/3 to gen4+.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |  5 +++--
 drivers/gpu/drm/i915/i915_gem.c        | 26 ++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_tiling.c | 14 +++++++++-----
 drivers/gpu/drm/i915/i915_vma.c        |  8 ++++++--
 4 files changed, 36 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 58dfe8ab9c0b..fae0f07b232c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3359,9 +3359,10 @@ int i915_gem_open(struct drm_device *dev, struct drm_file *file);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
 u64 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv, u64 size,
-			   int tiling_mode);
+			   int tiling_mode, unsigned int stride);
 u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
-				int tiling_mode, bool fenced);
+				int tiling_mode, unsigned int stride,
+				bool fenced);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ea507d3eadba..695b21b7ec8b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2038,21 +2038,28 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
  * @dev_priv: i915 device
  * @size: object size
  * @tiling_mode: tiling mode
+ * @stride: tiling stride
  *
  * Return the required global GTT size for an object, taking into account
  * potential fence register mapping.
  */
 u64 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv,
-			   u64 size, int tiling_mode)
+			   u64 size, int tiling_mode, unsigned int stride)
 {
 	u64 ggtt_size;
 
-	GEM_BUG_ON(size == 0);
+	GEM_BUG_ON(!size);
 
-	if (INTEL_GEN(dev_priv) >= 4 ||
-	    tiling_mode == I915_TILING_NONE)
+	if (tiling_mode == I915_TILING_NONE)
 		return size;
 
+	GEM_BUG_ON(!stride);
+
+	if (INTEL_GEN(dev_priv) >= 4) {
+		stride *= tiling_mode == I915_TILING_Y ? 32 : 8;
+		return roundup(size, stride);
+	}
+
 	/* Previous chips need a power-of-two fence region when tiling */
 	if (IS_GEN3(dev_priv))
 		ggtt_size = 1024*1024;
@@ -2070,15 +2077,17 @@ u64 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv,
  * @dev_priv: i915 device
  * @size: object size
  * @tiling_mode: tiling mode
+ * @stride: tiling stride
  * @fenced: is fenced alignment required or not
  *
  * Return the required global GTT alignment for an object, taking into account
  * potential fence register mapping.
  */
 u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
-				int tiling_mode, bool fenced)
+				int tiling_mode, unsigned int stride,
+				bool fenced)
 {
-	GEM_BUG_ON(size == 0);
+	GEM_BUG_ON(!size);
 
 	/*
 	 * Minimum alignment is 4k (GTT page size), but might be greater
@@ -2093,7 +2102,7 @@ u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
 	 * Previous chips need to be aligned to the size of the smallest
 	 * fence register that can contain the object.
 	 */
-	return i915_gem_get_ggtt_size(dev_priv, size, tiling_mode);
+	return i915_gem_get_ggtt_size(dev_priv, size, tiling_mode, stride);
 }
 
 static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
@@ -3703,7 +3712,8 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 			u32 fence_size;
 
 			fence_size = i915_gem_get_ggtt_size(dev_priv, vma->size,
-							    i915_gem_object_get_tiling(obj));
+							    i915_gem_object_get_tiling(obj),
+							    i915_gem_object_get_stride(obj));
 			/* If the required space is larger than the available
 			 * aperture, we will not able to find a slot for the
 			 * object and unbinding the object now will be in
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 62ad375de6ca..bc9eb7e1da0e 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -117,7 +117,8 @@ i915_tiling_ok(struct drm_i915_private *dev_priv,
 	return true;
 }
 
-static bool i915_vma_fence_prepare(struct i915_vma *vma, int tiling_mode)
+static bool i915_vma_fence_prepare(struct i915_vma *vma,
+				   int tiling_mode, unsigned int stride)
 {
 	struct drm_i915_private *dev_priv = vma->vm->i915;
 	u32 size;
@@ -133,7 +134,7 @@ static bool i915_vma_fence_prepare(struct i915_vma *vma, int tiling_mode)
 			return false;
 	}
 
-	size = i915_gem_get_ggtt_size(dev_priv, vma->size, tiling_mode);
+	size = i915_gem_get_ggtt_size(dev_priv, vma->size, tiling_mode, stride);
 	if (vma->node.size < size)
 		return false;
 
@@ -145,7 +146,8 @@ static bool i915_vma_fence_prepare(struct i915_vma *vma, int tiling_mode)
 
 /* Make the current GTT allocation valid for the change in tiling. */
 static int
-i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj, int tiling_mode)
+i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
+			      int tiling_mode, unsigned int stride)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 	struct i915_vma *vma;
@@ -158,7 +160,7 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj, int tiling_mode)
 		return 0;
 
 	list_for_each_entry(vma, &obj->vma_list, obj_link) {
-		if (i915_vma_fence_prepare(vma, tiling_mode))
+		if (i915_vma_fence_prepare(vma, tiling_mode, stride))
 			continue;
 
 		ret = i915_vma_unbind(vma);
@@ -255,7 +257,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		 * whilst executing a fenced command for an untiled object.
 		 */
 
-		err = i915_gem_object_fence_prepare(obj, args->tiling_mode);
+		err = i915_gem_object_fence_prepare(obj,
+						    args->tiling_mode,
+						    args->stride);
 		if (!err) {
 			struct i915_vma *vma;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 284273056ea2..028d011f805a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -282,10 +282,12 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma *vma)
 
 	fence_size = i915_gem_get_ggtt_size(dev_priv,
 					    vma->size,
-					    i915_gem_object_get_tiling(obj));
+					    i915_gem_object_get_tiling(obj),
+					    i915_gem_object_get_stride(obj));
 	fence_alignment = i915_gem_get_ggtt_alignment(dev_priv,
 						      vma->size,
 						      i915_gem_object_get_tiling(obj),
+						      i915_gem_object_get_stride(obj),
 						      true);
 
 	fenceable = (vma->node.size == fence_size &&
@@ -368,11 +370,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	size = max(size, vma->size);
 	if (flags & PIN_MAPPABLE)
 		size = i915_gem_get_ggtt_size(dev_priv, size,
-					      i915_gem_object_get_tiling(obj));
+					      i915_gem_object_get_tiling(obj),
+					      i915_gem_object_get_stride(obj));
 
 	alignment = max(max(alignment, vma->display_alignment),
 			i915_gem_get_ggtt_alignment(dev_priv, size,
 						    i915_gem_object_get_tiling(obj),
+						    i915_gem_object_get_stride(obj),
 						    flags & PIN_MAPPABLE));
 
 	start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 18/26] drm/i915: Replace WARNs in fence register writes with extensive asserts
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (16 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 17/26] drm/i915: Align GGTT sizes to a fence tile row Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 19/26] drm/i915: Store required fence size/alignment for GGTT vma Chris Wilson
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

All of these conditions are prechecked by i915_tiling_ok() before we
allow setting the tiling/stride on the object and so we should never
fail asserting those conditions before writing the register.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 50 ++++++++++++++-----------------
 1 file changed, 22 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index e03983973252..a6e3f2bfae15 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -83,8 +83,13 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
 		u32 row_size = stride * (is_y_tiled ? 32 : 8);
 		u32 size = rounddown((u32)vma->node.size, row_size);
 
-		val = ((vma->node.start + size - 4096) & 0xfffff000) << 32;
-		val |= vma->node.start & 0xfffff000;
+		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+		GEM_BUG_ON(vma->node.start & 4095);
+		GEM_BUG_ON(vma->node.size & 4095);
+		GEM_BUG_ON(stride & 127);
+
+		val = (vma->node.start + size - 4096) << 32;
+		val |= vma->node.start;
 		val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
 		if (is_y_tiled)
 			val |= BIT(I965_FENCE_TILING_Y_SHIFT);
@@ -122,31 +127,24 @@ static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
 		unsigned int tiling = i915_gem_object_get_tiling(vma->obj);
 		bool is_y_tiled = tiling == I915_TILING_Y;
 		unsigned int stride = i915_gem_object_get_stride(vma->obj);
-		int pitch_val;
-		int tile_width;
 
-		WARN((vma->node.start & ~I915_FENCE_START_MASK) ||
-		     !is_power_of_2(vma->node.size) ||
-		     (vma->node.start & (vma->node.size - 1)),
-		     "object 0x%08llx [fenceable? %d] not 1M or pot-size (0x%08llx) aligned\n",
-		     vma->node.start,
-		     i915_vma_is_map_and_fenceable(vma),
-		     vma->node.size);
+		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+		GEM_BUG_ON(vma->node.start & ~I915_FENCE_START_MASK);
+		GEM_BUG_ON(!is_power_of_2(vma->node.size));
+		GEM_BUG_ON(vma->node.start & (vma->node.size - 1));
 
 		if (is_y_tiled && HAS_128_BYTE_Y_TILING(fence->i915))
-			tile_width = 128;
+			stride /= 128;
 		else
-			tile_width = 512;
-
-		/* Note: pitch better be a power of two tile widths */
-		pitch_val = stride / tile_width;
-		pitch_val = ffs(pitch_val) - 1;
+			stride /= 512;
+		GEM_BUG_ON(!is_power_of_2(stride));
 
 		val = vma->node.start;
 		if (is_y_tiled)
 			val |= BIT(I830_FENCE_TILING_Y_SHIFT);
 		val |= I915_FENCE_SIZE_BITS(vma->node.size);
-		val |= pitch_val << I830_FENCE_PITCH_SHIFT;
+		val |= ilog2(stride) << I830_FENCE_PITCH_SHIFT;
+
 		val |= I830_FENCE_REG_VALID;
 	}
 
@@ -169,22 +167,18 @@ static void i830_write_fence_reg(struct drm_i915_fence_reg *fence,
 		unsigned int tiling = i915_gem_object_get_tiling(vma->obj);
 		bool is_y_tiled = tiling == I915_TILING_Y;
 		unsigned int stride = i915_gem_object_get_stride(vma->obj);
-		u32 pitch_val;
-
-		WARN((vma->node.start & ~I830_FENCE_START_MASK) ||
-		     !is_power_of_2(vma->node.size) ||
-		     (vma->node.start & (vma->node.size - 1)),
-		     "object 0x%08llx not 512K or pot-size 0x%08llx aligned\n",
-		     vma->node.start, vma->node.size);
 
-		pitch_val = stride / 128;
-		pitch_val = ffs(pitch_val) - 1;
+		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+		GEM_BUG_ON(vma->node.start & ~I830_FENCE_START_MASK);
+		GEM_BUG_ON(!is_power_of_2(vma->node.size));
+		GEM_BUG_ON(!is_power_of_2(stride / 128));
+		GEM_BUG_ON(vma->node.start & (vma->node.size - 1));
 
 		val = vma->node.start;
 		if (is_y_tiled)
 			val |= BIT(I830_FENCE_TILING_Y_SHIFT);
 		val |= I830_FENCE_SIZE_BITS(vma->node.size);
-		val |= pitch_val << I830_FENCE_PITCH_SHIFT;
+		val |= ilog2(stride / 128) << I830_FENCE_PITCH_SHIFT;
 		val |= I830_FENCE_REG_VALID;
 	}
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 19/26] drm/i915: Store required fence size/alignment for GGTT vma
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (17 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 18/26] drm/i915: Replace WARNs in fence register writes with extensive asserts Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 20/26] drm/i915: Remove the rounding down of the gen4+ fence region Chris Wilson
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h           |  7 ++--
 drivers/gpu/drm/i915/i915_gem.c           | 27 +++++----------
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 16 ++++-----
 drivers/gpu/drm/i915/i915_gem_tiling.c    | 40 +++++++++++-----------
 drivers/gpu/drm/i915/i915_vma.c           | 56 ++++++++++++++-----------------
 drivers/gpu/drm/i915/i915_vma.h           |  3 ++
 6 files changed, 69 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fae0f07b232c..4bfa3393c97b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3358,11 +3358,10 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
 int i915_gem_open(struct drm_device *dev, struct drm_file *file);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
-u64 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv, u64 size,
+u32 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv, u32 size,
 			   int tiling_mode, unsigned int stride);
-u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
-				int tiling_mode, unsigned int stride,
-				bool fenced);
+u32 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u32 size,
+				int tiling_mode, unsigned int stride);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 695b21b7ec8b..4ded4c42b9b6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2043,10 +2043,10 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
  * Return the required global GTT size for an object, taking into account
  * potential fence register mapping.
  */
-u64 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv,
-			   u64 size, int tiling_mode, unsigned int stride)
+u32 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv,
+			   u32 size, int tiling_mode, unsigned int stride)
 {
-	u64 ggtt_size;
+	u32 ggtt_size;
 
 	GEM_BUG_ON(!size);
 
@@ -2078,14 +2078,12 @@ u64 i915_gem_get_ggtt_size(struct drm_i915_private *dev_priv,
  * @size: object size
  * @tiling_mode: tiling mode
  * @stride: tiling stride
- * @fenced: is fenced alignment required or not
  *
  * Return the required global GTT alignment for an object, taking into account
  * potential fence register mapping.
  */
-u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
-				int tiling_mode, unsigned int stride,
-				bool fenced)
+u32 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u32 size,
+				int tiling_mode, unsigned int stride)
 {
 	GEM_BUG_ON(!size);
 
@@ -2093,9 +2091,7 @@ u64 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u64 size,
 	 * Minimum alignment is 4k (GTT page size), but might be greater
 	 * if a fence register is needed for the object.
 	 */
-	if (INTEL_GEN(dev_priv) >= 4 ||
-	    (!fenced && (IS_G33(dev_priv) || IS_PINEVIEW(dev_priv))) ||
-	    tiling_mode == I915_TILING_NONE)
+	if (INTEL_GEN(dev_priv) >= 4 || tiling_mode == I915_TILING_NONE)
 		return 4096;
 
 	/*
@@ -3564,7 +3560,7 @@ i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
 		return;
 
 	if (--vma->obj->pin_display == 0)
-		vma->display_alignment = 0;
+		vma->display_alignment = 4096;
 
 	/* Bump the LRU to try and avoid premature eviction whilst flipping  */
 	if (!i915_vma_is_active(vma))
@@ -3709,11 +3705,6 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 			return ERR_PTR(-ENOSPC);
 
 		if (flags & PIN_MAPPABLE) {
-			u32 fence_size;
-
-			fence_size = i915_gem_get_ggtt_size(dev_priv, vma->size,
-							    i915_gem_object_get_tiling(obj),
-							    i915_gem_object_get_stride(obj));
 			/* If the required space is larger than the available
 			 * aperture, we will not able to find a slot for the
 			 * object and unbinding the object now will be in
@@ -3721,7 +3712,7 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 			 * the object in and out of the Global GTT and
 			 * waste a lot of cycles under the mutex.
 			 */
-			if (fence_size > dev_priv->ggtt.mappable_end)
+			if (vma->fence_size > dev_priv->ggtt.mappable_end)
 				return ERR_PTR(-E2BIG);
 
 			/* If NONBLOCK is set the caller is optimistically
@@ -3740,7 +3731,7 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 			 * we could try to minimise harm to others.
 			 */
 			if (flags & PIN_NONBLOCK &&
-			    fence_size > dev_priv->ggtt.mappable_end / 2)
+			    vma->fence_size > dev_priv->ggtt.mappable_end / 2)
 				return ERR_PTR(-ENOSPC);
 		}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index a6e3f2bfae15..a1a0c7104576 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -81,11 +81,11 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
 		bool is_y_tiled = tiling == I915_TILING_Y;
 		unsigned int stride = i915_gem_object_get_stride(vma->obj);
 		u32 row_size = stride * (is_y_tiled ? 32 : 8);
-		u32 size = rounddown((u32)vma->node.size, row_size);
+		u32 size = rounddown((u32)vma->fence_size, row_size);
 
 		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
 		GEM_BUG_ON(vma->node.start & 4095);
-		GEM_BUG_ON(vma->node.size & 4095);
+		GEM_BUG_ON(vma->fence_size & 4095);
 		GEM_BUG_ON(stride & 127);
 
 		val = (vma->node.start + size - 4096) << 32;
@@ -130,8 +130,8 @@ static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
 
 		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
 		GEM_BUG_ON(vma->node.start & ~I915_FENCE_START_MASK);
-		GEM_BUG_ON(!is_power_of_2(vma->node.size));
-		GEM_BUG_ON(vma->node.start & (vma->node.size - 1));
+		GEM_BUG_ON(!is_power_of_2(vma->fence_size));
+		GEM_BUG_ON(vma->node.start & (vma->fence_size - 1));
 
 		if (is_y_tiled && HAS_128_BYTE_Y_TILING(fence->i915))
 			stride /= 128;
@@ -142,7 +142,7 @@ static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
 		val = vma->node.start;
 		if (is_y_tiled)
 			val |= BIT(I830_FENCE_TILING_Y_SHIFT);
-		val |= I915_FENCE_SIZE_BITS(vma->node.size);
+		val |= I915_FENCE_SIZE_BITS(vma->fence_size);
 		val |= ilog2(stride) << I830_FENCE_PITCH_SHIFT;
 
 		val |= I830_FENCE_REG_VALID;
@@ -170,14 +170,14 @@ static void i830_write_fence_reg(struct drm_i915_fence_reg *fence,
 
 		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
 		GEM_BUG_ON(vma->node.start & ~I830_FENCE_START_MASK);
-		GEM_BUG_ON(!is_power_of_2(vma->node.size));
+		GEM_BUG_ON(!is_power_of_2(vma->fence_size));
 		GEM_BUG_ON(!is_power_of_2(stride / 128));
-		GEM_BUG_ON(vma->node.start & (vma->node.size - 1));
+		GEM_BUG_ON(vma->node.start & (vma->fence_size - 1));
 
 		val = vma->node.start;
 		if (is_y_tiled)
 			val |= BIT(I830_FENCE_TILING_Y_SHIFT);
-		val |= I830_FENCE_SIZE_BITS(vma->node.size);
+		val |= I830_FENCE_SIZE_BITS(vma->fence_size);
 		val |= ilog2(stride / 128) << I830_FENCE_PITCH_SHIFT;
 		val |= I830_FENCE_REG_VALID;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index bc9eb7e1da0e..23a896cd934f 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -120,25 +120,18 @@ i915_tiling_ok(struct drm_i915_private *dev_priv,
 static bool i915_vma_fence_prepare(struct i915_vma *vma,
 				   int tiling_mode, unsigned int stride)
 {
-	struct drm_i915_private *dev_priv = vma->vm->i915;
-	u32 size;
+	struct drm_i915_private *i915 = vma->vm->i915;
+	u32 size, alignment;
 
 	if (!i915_vma_is_map_and_fenceable(vma))
 		return true;
 
-	if (INTEL_GEN(dev_priv) == 3) {
-		if (vma->node.start & ~I915_FENCE_START_MASK)
-			return false;
-	} else {
-		if (vma->node.start & ~I830_FENCE_START_MASK)
-			return false;
-	}
-
-	size = i915_gem_get_ggtt_size(dev_priv, vma->size, tiling_mode, stride);
+	size = i915_gem_get_ggtt_size(i915, vma->size, tiling_mode, stride);
 	if (vma->node.size < size)
 		return false;
 
-	if (vma->node.start & (size - 1))
+	alignment = i915_gem_get_ggtt_alignment(i915, vma->size, tiling_mode, stride);
+	if (vma->node.start & (alignment - 1))
 		return false;
 
 	return true;
@@ -149,17 +142,16 @@ static int
 i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
 			      int tiling_mode, unsigned int stride)
 {
-	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 	struct i915_vma *vma;
 	int ret;
 
 	if (tiling_mode == I915_TILING_NONE)
 		return 0;
 
-	if (INTEL_GEN(dev_priv) >= 4)
-		return 0;
-
 	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+		if (!i915_vma_is_ggtt(vma))
+			break;
+
 		if (i915_vma_fence_prepare(vma, tiling_mode, stride))
 			continue;
 
@@ -281,10 +273,18 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 			mutex_unlock(&obj->mm.lock);
 
 			list_for_each_entry(vma, &obj->vma_list, obj_link) {
-				if (!vma->fence)
-					continue;
-
-				vma->fence->dirty = true;
+				if (!i915_vma_is_ggtt(vma))
+					break;
+
+				vma->fence_size = i915_gem_get_ggtt_size(dev_priv, vma->size,
+									 args->tiling_mode,
+									 args->stride);
+				vma->fence_alignment = i915_gem_get_ggtt_alignment(dev_priv, vma->size,
+										   args->tiling_mode,
+										   args->stride);
+
+				if (vma->fence)
+					vma->fence->dirty = true;
 			}
 			obj->tiling_and_stride =
 				args->stride | args->tiling_mode;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 028d011f805a..ae00d9f4e520 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -91,6 +91,7 @@ __i915_vma_create(struct drm_i915_gem_object *obj,
 	vma->vm = vm;
 	vma->obj = obj;
 	vma->size = obj->base.size;
+	vma->display_alignment = 4096;
 
 	if (view) {
 		vma->ggtt_view = *view;
@@ -108,6 +109,14 @@ __i915_vma_create(struct drm_i915_gem_object *obj,
 	}
 
 	if (i915_is_ggtt(vm)) {
+		GEM_BUG_ON(overflows_type(vma->size, u32));
+		vma->fence_size = i915_gem_get_ggtt_size(vm->i915, vma->size,
+							 i915_gem_object_get_tiling(obj),
+							 i915_gem_object_get_stride(obj));
+		vma->fence_alignment = i915_gem_get_ggtt_alignment(vm->i915, vma->size,
+								   i915_gem_object_get_tiling(obj),
+								   i915_gem_object_get_stride(obj));
+
 		vma->flags |= I915_VMA_GGTT;
 		list_add(&vma->obj_link, &obj->vma_list);
 	} else {
@@ -275,33 +284,24 @@ i915_vma_misplaced(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 
 void __i915_vma_set_map_and_fenceable(struct i915_vma *vma)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
-	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 	bool mappable, fenceable;
-	u32 fence_size, fence_alignment;
-
-	fence_size = i915_gem_get_ggtt_size(dev_priv,
-					    vma->size,
-					    i915_gem_object_get_tiling(obj),
-					    i915_gem_object_get_stride(obj));
-	fence_alignment = i915_gem_get_ggtt_alignment(dev_priv,
-						      vma->size,
-						      i915_gem_object_get_tiling(obj),
-						      i915_gem_object_get_stride(obj),
-						      true);
 
-	fenceable = (vma->node.size == fence_size &&
-		     (vma->node.start & (fence_alignment - 1)) == 0);
-
-	mappable = (vma->node.start + fence_size <=
-		    dev_priv->ggtt.mappable_end);
+	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
+	GEM_BUG_ON(!vma->fence_size);
 
 	/*
 	 * Explicitly disable for rotated VMA since the display does not
 	 * need the fence and the VMA is not accessible to other users.
 	 */
-	if (mappable && fenceable &&
-	    vma->ggtt_view.type != I915_GGTT_VIEW_ROTATED)
+	if (vma->ggtt_view.type == I915_GGTT_VIEW_ROTATED)
+		return;
+
+	fenceable = (vma->node.size >= vma->fence_size &&
+		     (vma->node.start & (vma->fence_alignment - 1)) == 0);
+
+	mappable = vma->node.start + vma->fence_size <= i915_vm_to_ggtt(vma->vm)->mappable_end;
+
+	if (mappable && fenceable)
 		vma->flags |= I915_VMA_CAN_FENCE;
 	else
 		vma->flags &= ~I915_VMA_CAN_FENCE;
@@ -368,16 +368,12 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
 
 	size = max(size, vma->size);
-	if (flags & PIN_MAPPABLE)
-		size = i915_gem_get_ggtt_size(dev_priv, size,
-					      i915_gem_object_get_tiling(obj),
-					      i915_gem_object_get_stride(obj));
-
-	alignment = max(max(alignment, vma->display_alignment),
-			i915_gem_get_ggtt_alignment(dev_priv, size,
-						    i915_gem_object_get_tiling(obj),
-						    i915_gem_object_get_stride(obj),
-						    flags & PIN_MAPPABLE));
+	alignment = max(alignment, vma->display_alignment);
+	if (flags & PIN_MAPPABLE) {
+		size = max_t(typeof(size), size, vma->fence_size);
+		alignment = max_t(typeof(alignment),
+				  alignment, vma->fence_alignment);
+	}
 
 	start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 6d936009f919..8193ad34a1fe 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -55,6 +55,9 @@ struct i915_vma {
 	u64 size;
 	u64 display_alignment;
 
+	u32 fence_size;
+	u32 fence_alignment;
+
 	unsigned int flags;
 	/**
 	 * How many users have pinned this object in GTT space. The following
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 20/26] drm/i915: Remove the rounding down of the gen4+ fence region
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (18 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 19/26] drm/i915: Store required fence size/alignment for GGTT vma Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:06 ` [PATCH 21/26] drm/i915: Include ioctl in set-tiling and get-tiling function names Chris Wilson
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Restricting the fence to the end of the previous tile-row breaks access
to the final portion of the object. On gen2/3 we employed lazy fencing
to pad out the fence with scratch page to provide access to the tail,
and now we also pad out the object on gen4+ we can apply the same fix.

Fixes: af1a7301c7cf ("drm/i915: Only fence tiled region of object.")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index a1a0c7104576..6de527df6cdc 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -78,20 +78,17 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
 	val = 0;
 	if (vma) {
 		unsigned int tiling = i915_gem_object_get_tiling(vma->obj);
-		bool is_y_tiled = tiling == I915_TILING_Y;
 		unsigned int stride = i915_gem_object_get_stride(vma->obj);
-		u32 row_size = stride * (is_y_tiled ? 32 : 8);
-		u32 size = rounddown((u32)vma->fence_size, row_size);
 
 		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
 		GEM_BUG_ON(vma->node.start & 4095);
 		GEM_BUG_ON(vma->fence_size & 4095);
 		GEM_BUG_ON(stride & 127);
 
-		val = (vma->node.start + size - 4096) << 32;
+		val = (vma->node.start + vma->fence_size - 4096) << 32;
 		val |= vma->node.start;
 		val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
-		if (is_y_tiled)
+		if (tiling == I915_TILING_Y)
 			val |= BIT(I965_FENCE_TILING_Y_SHIFT);
 		val |= I965_FENCE_REG_VALID;
 	}
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 21/26] drm/i915: Include ioctl in set-tiling and get-tiling function names
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (19 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 20/26] drm/i915: Remove the rounding down of the gen4+ fence region Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2017-01-09 15:12   ` Tvrtko Ursulin
  2016-12-31 12:06 ` [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling() Chris Wilson
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Make it clear that these functions are the user entry points for
the tiling/fence registers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c        |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h        |  8 ++++----
 drivers/gpu/drm/i915/i915_gem_tiling.c | 16 ++++++++--------
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 2c020eafada6..2cb8682ae57f 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2548,8 +2548,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP_GTT, i915_gem_mmap_gtt_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_SET_DOMAIN, i915_gem_set_domain_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_SW_FINISH, i915_gem_sw_finish_ioctl, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_SET_TILING, i915_gem_set_tiling, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_GET_TILING, i915_gem_get_tiling, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_SET_TILING, i915_gem_set_tiling_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_GET_TILING, i915_gem_get_tiling_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_GET_APERTURE, i915_gem_get_aperture_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GET_PIPE_FROM_CRTC_ID, intel_get_pipe_from_crtc_id, 0),
 	DRM_IOCTL_DEF_DRV(I915_GEM_MADVISE, i915_gem_madvise_ioctl, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4bfa3393c97b..5d7d1a9c0179 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3100,10 +3100,10 @@ int i915_gem_throttle_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file_priv);
 int i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv);
-int i915_gem_set_tiling(struct drm_device *dev, void *data,
-			struct drm_file *file_priv);
-int i915_gem_get_tiling(struct drm_device *dev, void *data,
-			struct drm_file *file_priv);
+int i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
+			      struct drm_file *file_priv);
+int i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data,
+			      struct drm_file *file_priv);
 void i915_gem_init_userptr(struct drm_i915_private *dev_priv);
 int i915_gem_userptr_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file);
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 23a896cd934f..6c5b8a0e8325 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -34,8 +34,8 @@
 /**
  * DOC: buffer object tiling
  *
- * i915_gem_set_tiling() and i915_gem_get_tiling() is the userspace interface to
- * declare fence register requirements.
+ * i915_gem_set_tiling_ioctl() and i915_gem_get_tiling_ioctl() is the userspace
+ * interface to declare fence register requirements.
  *
  * In principle GEM doesn't care at all about the internal data layout of an
  * object, and hence it also doesn't care about tiling or swizzling. There's two
@@ -164,7 +164,7 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
 }
 
 /**
- * i915_gem_set_tiling - IOCTL handler to set tiling mode
+ * i915_gem_set_tiling_ioctl - IOCTL handler to set tiling mode
  * @dev: DRM device
  * @data: data pointer for the ioctl
  * @file: DRM file for the ioctl call
@@ -178,8 +178,8 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
  * Zero on success, negative errno on failure.
  */
 int
-i915_gem_set_tiling(struct drm_device *dev, void *data,
-		   struct drm_file *file)
+i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *file)
 {
 	struct drm_i915_gem_set_tiling *args = data;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -316,7 +316,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 }
 
 /**
- * i915_gem_get_tiling - IOCTL handler to get tiling mode
+ * i915_gem_get_tiling_ioctl - IOCTL handler to get tiling mode
  * @dev: DRM device
  * @data: data pointer for the ioctl
  * @file: DRM file for the ioctl call
@@ -329,8 +329,8 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
  * Zero on success, negative errno on failure.
  */
 int
-i915_gem_get_tiling(struct drm_device *dev, void *data,
-		   struct drm_file *file)
+i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *file)
 {
 	struct drm_i915_gem_get_tiling *args = data;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling()
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (20 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 21/26] drm/i915: Include ioctl in set-tiling and get-tiling function names Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2017-01-09 15:43   ` Tvrtko Ursulin
  2016-12-31 12:06 ` [PATCH 23/26] drm/i915: Construct a request even if the GPU is currently hung Chris Wilson
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

Expose an interface for changing the tiling and stride on an object,
that includes the complexity of checking for conflicting bindings and
fence registers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_object.h |   3 +
 drivers/gpu/drm/i915/i915_gem_tiling.c | 238 +++++++++++++++++----------------
 2 files changed, 128 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 6a368de9d81e..789bb44c9149 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -317,6 +317,9 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj)
 	return obj->tiling_and_stride & STRIDE_MASK;
 }
 
+int i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
+			       unsigned int tiling, unsigned int stride);
+
 static inline struct intel_engine_cs *
 i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 6c5b8a0e8325..4ec2620b7bf9 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -60,61 +60,56 @@
 
 /* Check pitch constriants for all chips & tiling formats */
 static bool
-i915_tiling_ok(struct drm_i915_private *dev_priv,
-	       int stride, int size, int tiling_mode)
+i915_tiling_ok(struct drm_i915_gem_object *obj,
+	       unsigned int tiling, unsigned int stride)
 {
-	int tile_width;
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	unsigned int tile_width;
 
 	/* Linear is always fine */
-	if (tiling_mode == I915_TILING_NONE)
+	if (tiling == I915_TILING_NONE)
 		return true;
 
-	if (tiling_mode > I915_TILING_LAST)
+	if (tiling > I915_TILING_LAST)
 		return false;
 
-	if (IS_GEN2(dev_priv) ||
-	    (tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev_priv)))
-		tile_width = 128;
-	else
-		tile_width = 512;
-
 	/* check maximum stride & object size */
 	/* i965+ stores the end address of the gtt mapping in the fence
 	 * reg, so dont bother to check the size */
-	if (INTEL_GEN(dev_priv) >= 7) {
+	if (INTEL_GEN(i915) >= 7) {
 		if (stride / 128 > GEN7_FENCE_MAX_PITCH_VAL)
 			return false;
-	} else if (INTEL_GEN(dev_priv) >= 4) {
+	} else if (INTEL_GEN(i915) >= 4) {
 		if (stride / 128 > I965_FENCE_MAX_PITCH_VAL)
 			return false;
 	} else {
 		if (stride > 8192)
 			return false;
 
-		if (IS_GEN3(dev_priv)) {
-			if (size > I830_FENCE_MAX_SIZE_VAL << 20)
+		if (IS_GEN3(i915)) {
+			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 20)
 				return false;
 		} else {
-			if (size > I830_FENCE_MAX_SIZE_VAL << 19)
+			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 19)
 				return false;
 		}
 	}
 
-	if (stride < tile_width)
+	if (IS_GEN2(i915) ||
+	    (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(i915)))
+		tile_width = 128;
+	else
+		tile_width = 512;
+
+	if (stride & (tile_width - 1))
 		return false;
 
 	/* 965+ just needs multiples of tile width */
-	if (INTEL_GEN(dev_priv) >= 4) {
-		if (stride & (tile_width - 1))
-			return false;
+	if (INTEL_GEN(i915) >= 4)
 		return true;
-	}
 
 	/* Pre-965 needs power of two tile widths */
-	if (stride & (stride - 1))
-		return false;
-
-	return true;
+	return is_power_of_2(stride);
 }
 
 static bool i915_vma_fence_prepare(struct i915_vma *vma,
@@ -163,6 +158,99 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
+int
+i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
+			   unsigned int tiling, unsigned int stride)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_vma *vma;
+	int err;
+
+	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
+	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
+
+	GEM_BUG_ON(!i915_tiling_ok(obj, tiling, stride));
+	GEM_BUG_ON(!stride ^ (tiling == I915_TILING_NONE));
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	if ((tiling | stride) == obj->tiling_and_stride)
+		return 0;
+
+	if (obj->pin_display || obj->framebuffer_references)
+		return -EBUSY;
+
+	/* We need to rebind the object if its current allocation
+	 * no longer meets the alignment restrictions for its new
+	 * tiling mode. Otherwise we can just leave it alone, but
+	 * need to ensure that any fence register is updated before
+	 * the next fenced (either through the GTT or by the BLT unit
+	 * on older GPUs) access.
+	 *
+	 * After updating the tiling parameters, we then flag whether
+	 * we need to update an associated fence register. Note this
+	 * has to also include the unfenced register the GPU uses
+	 * whilst executing a fenced command for an untiled object.
+	 */
+
+	err = i915_gem_object_fence_prepare(obj, tiling, stride);
+	if (err)
+		return err;
+
+	/* If the memory has unknown (i.e. varying) swizzling, we pin the
+	 * pages to prevent them being swapped out and causing corruption
+	 * due to the change in swizzling.
+	 */
+	mutex_lock(&obj->mm.lock);
+	if (obj->mm.pages &&
+	    obj->mm.madv == I915_MADV_WILLNEED &&
+	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
+		if (tiling == I915_TILING_NONE) {
+			GEM_BUG_ON(!obj->mm.quirked);
+			__i915_gem_object_unpin_pages(obj);
+			obj->mm.quirked = false;
+		}
+		if (!i915_gem_object_is_tiled(obj)) {
+			GEM_BUG_ON(!obj->mm.quirked);
+			__i915_gem_object_pin_pages(obj);
+			obj->mm.quirked = true;
+		}
+	}
+	mutex_unlock(&obj->mm.lock);
+
+	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+		if (!i915_vma_is_ggtt(vma))
+			break;
+
+		vma->fence_size =
+			i915_gem_get_ggtt_size(i915, vma->size,
+					       tiling, stride);
+		vma->fence_alignment =
+			i915_gem_get_ggtt_alignment(i915, vma->size,
+						    tiling, stride);
+
+		if (vma->fence)
+			vma->fence->dirty = true;
+	}
+
+	obj->tiling_and_stride = tiling | stride;
+
+	/* Force the fence to be reacquired for GTT access */
+	i915_gem_release_mmap(obj);
+
+	/* Try to preallocate memory required to save swizzling on put-pages */
+	if (i915_gem_object_needs_bit17_swizzle(obj)) {
+		if (!obj->bit_17) {
+			obj->bit_17 = kcalloc(BITS_TO_LONGS(obj->base.size >> PAGE_SHIFT),
+					      sizeof(long), GFP_KERNEL);
+		}
+	} else {
+		kfree(obj->bit_17);
+		obj->bit_17 = NULL;
+	}
+
+	return 0;
+}
+
 /**
  * i915_gem_set_tiling_ioctl - IOCTL handler to set tiling mode
  * @dev: DRM device
@@ -182,26 +270,15 @@ i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *file)
 {
 	struct drm_i915_gem_set_tiling *args = data;
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj;
-	int err = 0;
-
-	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
-	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
+	int err;
 
 	obj = i915_gem_object_lookup(file, args->handle);
 	if (!obj)
 		return -ENOENT;
 
-	if (!i915_tiling_ok(dev_priv,
-			    args->stride, obj->base.size, args->tiling_mode)) {
-		i915_gem_object_put(obj);
-		return -EINVAL;
-	}
-
-	mutex_lock(&dev->struct_mutex);
-	if (obj->pin_display || obj->framebuffer_references) {
-		err = -EBUSY;
+	if (!i915_tiling_ok(obj, args->tiling_mode, args->stride)) {
+		err = -EINVAL;
 		goto err;
 	}
 
@@ -210,9 +287,9 @@ i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
 		args->stride = 0;
 	} else {
 		if (args->tiling_mode == I915_TILING_X)
-			args->swizzle_mode = dev_priv->mm.bit_6_swizzle_x;
+			args->swizzle_mode = to_i915(dev)->mm.bit_6_swizzle_x;
 		else
-			args->swizzle_mode = dev_priv->mm.bit_6_swizzle_y;
+			args->swizzle_mode = to_i915(dev)->mm.bit_6_swizzle_y;
 
 		/* Hide bit 17 swizzling from the user.  This prevents old Mesa
 		 * from aborting the application on sw fallbacks to bit 17,
@@ -234,84 +311,19 @@ i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
 		}
 	}
 
-	if (args->tiling_mode != i915_gem_object_get_tiling(obj) ||
-	    args->stride != i915_gem_object_get_stride(obj)) {
-		/* We need to rebind the object if its current allocation
-		 * no longer meets the alignment restrictions for its new
-		 * tiling mode. Otherwise we can just leave it alone, but
-		 * need to ensure that any fence register is updated before
-		 * the next fenced (either through the GTT or by the BLT unit
-		 * on older GPUs) access.
-		 *
-		 * After updating the tiling parameters, we then flag whether
-		 * we need to update an associated fence register. Note this
-		 * has to also include the unfenced register the GPU uses
-		 * whilst executing a fenced command for an untiled object.
-		 */
+	err = mutex_lock_interruptible(&dev->struct_mutex);
+	if (err)
+		goto err;
 
-		err = i915_gem_object_fence_prepare(obj,
-						    args->tiling_mode,
-						    args->stride);
-		if (!err) {
-			struct i915_vma *vma;
-
-			mutex_lock(&obj->mm.lock);
-			if (obj->mm.pages &&
-			    obj->mm.madv == I915_MADV_WILLNEED &&
-			    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
-				if (args->tiling_mode == I915_TILING_NONE) {
-					GEM_BUG_ON(!obj->mm.quirked);
-					__i915_gem_object_unpin_pages(obj);
-					obj->mm.quirked = false;
-				}
-				if (!i915_gem_object_is_tiled(obj)) {
-					GEM_BUG_ON(!obj->mm.quirked);
-					__i915_gem_object_pin_pages(obj);
-					obj->mm.quirked = true;
-				}
-			}
-			mutex_unlock(&obj->mm.lock);
-
-			list_for_each_entry(vma, &obj->vma_list, obj_link) {
-				if (!i915_vma_is_ggtt(vma))
-					break;
-
-				vma->fence_size = i915_gem_get_ggtt_size(dev_priv, vma->size,
-									 args->tiling_mode,
-									 args->stride);
-				vma->fence_alignment = i915_gem_get_ggtt_alignment(dev_priv, vma->size,
-										   args->tiling_mode,
-										   args->stride);
-
-				if (vma->fence)
-					vma->fence->dirty = true;
-			}
-			obj->tiling_and_stride =
-				args->stride | args->tiling_mode;
-
-			/* Force the fence to be reacquired for GTT access */
-			i915_gem_release_mmap(obj);
-		}
-	}
-	/* we have to maintain this existing ABI... */
+	err = i915_gem_object_set_tiling(obj, args->tiling_mode, args->stride);
+	mutex_unlock(&dev->struct_mutex);
+
+	/* We have to maintain this existing ABI... */
 	args->stride = i915_gem_object_get_stride(obj);
 	args->tiling_mode = i915_gem_object_get_tiling(obj);
 
-	/* Try to preallocate memory required to save swizzling on put-pages */
-	if (i915_gem_object_needs_bit17_swizzle(obj)) {
-		if (obj->bit_17 == NULL) {
-			obj->bit_17 = kcalloc(BITS_TO_LONGS(obj->base.size >> PAGE_SHIFT),
-					      sizeof(long), GFP_KERNEL);
-		}
-	} else {
-		kfree(obj->bit_17);
-		obj->bit_17 = NULL;
-	}
-
 err:
 	i915_gem_object_put(obj);
-	mutex_unlock(&dev->struct_mutex);
-
 	return err;
 }
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 23/26] drm/i915: Construct a request even if the GPU is currently hung
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (21 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling() Chris Wilson
@ 2016-12-31 12:06 ` Chris Wilson
  2016-12-31 12:07 ` [PATCH 24/26] drm/i915: Skip switch to kernel context if already done Chris Wilson
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:06 UTC (permalink / raw)
  To: intel-gfx

As we now have the ability to directly reset the GPU from the waiter
(and so do not need to drop the lock in order to let the reset proceed)
and also do not lose requests over a reset, we can now simply queue the
request to occur after the reset rather than roundtripping to userspace
(or worse failing with EIO).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_request.c | 28 +++-------------------------
 1 file changed, 3 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 99056b948eda..ed76e7ecf9e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -307,26 +307,6 @@ void i915_gem_request_retire_upto(struct drm_i915_gem_request *req)
 	} while (tmp != req);
 }
 
-static int i915_gem_check_wedge(struct drm_i915_private *dev_priv)
-{
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
-
-	if (i915_terminally_wedged(error))
-		return -EIO;
-
-	if (i915_reset_in_progress(error)) {
-		/* Non-interruptible callers can't handle -EAGAIN, hence return
-		 * -EIO unconditionally for these.
-		 */
-		if (!dev_priv->mm.interruptible)
-			return -EIO;
-
-		return -EAGAIN;
-	}
-
-	return 0;
-}
-
 static int i915_gem_init_global_seqno(struct drm_i915_private *i915, u32 seqno)
 {
 	struct i915_gem_timeline *timeline = &i915->gt.global_timeline;
@@ -521,12 +501,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
 	/* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
-	 * and restart.
+	 * EIO if the GPU is already wedged.
 	 */
-	ret = i915_gem_check_wedge(dev_priv);
-	if (ret)
-		return ERR_PTR(ret);
+	if (i915_terminally_wedged(&dev_priv->gpu_error))
+		return ERR_PTR(-EIO);
 
 	/* Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 24/26] drm/i915: Skip switch to kernel context if already done
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (22 preceding siblings ...)
  2016-12-31 12:06 ` [PATCH 23/26] drm/i915: Construct a request even if the GPU is currently hung Chris Wilson
@ 2016-12-31 12:07 ` Chris Wilson
  2016-12-31 12:07 ` [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Chris Wilson
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:07 UTC (permalink / raw)
  To: intel-gfx

Some engines are never user or already sitting idle in the kernel
context and for those we can skip flushing the current context for
i915_gem_switch_to_kernel_context(). We used to perform this
optimisation but that was removed for convenience of converting over to
multiple timelines and handling the pending request queues.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         |  3 ++-
 drivers/gpu/drm/i915/i915_gem_context.c | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4ded4c42b9b6..0424cee5c767 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4205,7 +4205,8 @@ static void assert_kernel_context_is_current(struct drm_i915_private *dev_priv)
 	enum intel_engine_id id;
 
 	for_each_engine(engine, dev_priv, id)
-		GEM_BUG_ON(!i915_gem_context_is_kernel(engine->last_retired_context));
+		GEM_BUG_ON(!(engine->last_retired_context == NULL ||
+			     i915_gem_context_is_kernel(engine->last_retired_context)));
 }
 
 int i915_gem_suspend(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 40a6939e3956..a8f6b8843bdf 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -897,6 +897,26 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 	return do_rcs_switch(req);
 }
 
+static bool engine_has_kernel_context(struct intel_engine_cs *engine)
+{
+	struct i915_gem_timeline *timeline;
+
+	list_for_each_entry(timeline, &engine->i915->gt.timelines, link) {
+		struct intel_timeline *tl;
+
+		if (timeline == &engine->i915->gt.global_timeline)
+			continue;
+
+		tl = &timeline->engine[engine->id];
+		if (i915_gem_active_peek(&tl->last_request,
+					 &engine->i915->drm.struct_mutex))
+			return false;
+	}
+
+	return (engine->last_retired_context == NULL ||
+		i915_gem_context_is_kernel(engine->last_retired_context));
+}
+
 int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
@@ -905,10 +925,15 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv)
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
+	i915_gem_retire_requests(dev_priv);
+
 	for_each_engine(engine, dev_priv, id) {
 		struct drm_i915_gem_request *req;
 		int ret;
 
+		if (engine_has_kernel_context(engine))
+			continue;
+
 		req = i915_gem_request_alloc(engine, dev_priv->kernel_context);
 		if (IS_ERR(req))
 			return PTR_ERR(req);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (23 preceding siblings ...)
  2016-12-31 12:07 ` [PATCH 24/26] drm/i915: Skip switch to kernel context if already done Chris Wilson
@ 2016-12-31 12:07 ` Chris Wilson
  2017-01-09 16:19   ` Tvrtko Ursulin
  2016-12-31 12:07 ` [PATCH 26/26] drm/i915: Use the MRU stack search after evicting Chris Wilson
  2016-12-31 13:23 ` ✗ Fi.CI.BAT: failure for series starting with [01/26] drm/i915: Assert that we do create the deferred context Patchwork
  26 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:07 UTC (permalink / raw)
  To: intel-gfx

Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
coe comprehension and future changes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c              |  4 ++--
 drivers/gpu/drm/i915/i915_gem_context.c      |  5 +++--
 drivers/gpu/drm/i915/i915_gem_evict.c        |  4 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  2 +-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c    |  6 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.c          |  6 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.h          |  5 ++++-
 drivers/gpu/drm/i915/i915_gem_render_state.c |  4 ++--
 drivers/gpu/drm/i915/i915_gem_stolen.c       |  7 ++++---
 drivers/gpu/drm/i915/i915_vma.c              |  4 ++--
 drivers/gpu/drm/i915/intel_lrc.c             |  5 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.c      | 10 +++++-----
 12 files changed, 34 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0424cee5c767..bad57279e817 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2092,7 +2092,7 @@ u32 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u32 size,
 	 * if a fence register is needed for the object.
 	 */
 	if (INTEL_GEN(dev_priv) >= 4 || tiling_mode == I915_TILING_NONE)
-		return 4096;
+		return I915_GTT_MIN_ALIGNMENT;
 
 	/*
 	 * Previous chips need to be aligned to the size of the smallest
@@ -3560,7 +3560,7 @@ i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
 		return;
 
 	if (--vma->obj->pin_display == 0)
-		vma->display_alignment = 4096;
+		vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
 
 	/* Bump the LRU to try and avoid premature eviction whilst flipping  */
 	if (!i915_vma_is_active(vma))
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a8f6b8843bdf..d4780cdfb8a0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -341,7 +341,7 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 	if (HAS_GUC(dev_priv) && i915.enable_guc_loading)
 		ctx->ggtt_offset_bias = GUC_WOPCM_TOP;
 	else
-		ctx->ggtt_offset_bias = 4096;
+		ctx->ggtt_offset_bias = I915_GTT_PAGE_SIZE;
 
 	return ctx;
 
@@ -456,7 +456,8 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
 		dev_priv->hw_context_size = 0;
 	} else if (HAS_HW_CONTEXTS(dev_priv)) {
 		dev_priv->hw_context_size =
-			round_up(get_context_size(dev_priv), 4096);
+			round_up(get_context_size(dev_priv),
+				 I915_GTT_PAGE_SIZE);
 		if (dev_priv->hw_context_size > (1<<20)) {
 			DRM_DEBUG_DRIVER("Disabling HW Contexts; invalid size %d\n",
 					 dev_priv->hw_context_size);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 50129ec1caab..70ba92f91148 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -263,9 +263,9 @@ int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
 	if (check_color) {
 		/* Expand search to cover neighbouring guard pages (or lack!) */
 		if (start > target->vm->start)
-			start -= 4096;
+			start -= I915_GTT_PAGE_SIZE;
 		if (end < target->vm->start + target->vm->total)
-			end += 4096;
+			end += I915_GTT_PAGE_SIZE;
 	}
 
 	drm_mm_for_each_node_in_range(node, &target->vm->mm, start, end) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a5fe299da1d3..c6922a5f0514 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -438,7 +438,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
 			memset(&cache->node, 0, sizeof(cache->node));
 			ret = drm_mm_insert_node_in_range_generic
 				(&ggtt->base.mm, &cache->node,
-				 4096, 0, I915_COLOR_UNEVICTABLE,
+				 PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,
 				 0, ggtt->mappable_end,
 				 DRM_MM_SEARCH_DEFAULT,
 				 DRM_MM_CREATE_DEFAULT);
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index 6de527df6cdc..67835d7b39c7 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -81,11 +81,11 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
 		unsigned int stride = i915_gem_object_get_stride(vma->obj);
 
 		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
-		GEM_BUG_ON(vma->node.start & 4095);
-		GEM_BUG_ON(vma->fence_size & 4095);
+		GEM_BUG_ON(vma->node.start & (I915_GTT_PAGE_SIZE - 1));
+		GEM_BUG_ON(vma->fence_size & (I915_GTT_PAGE_SIZE - 1));
 		GEM_BUG_ON(stride & 127);
 
-		val = (vma->node.start + vma->fence_size - 4096) << 32;
+		val = (vma->node.start + vma->fence_size - I915_GTT_PAGE_SIZE) << 32;
 		val |= vma->node.start;
 		val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
 		if (tiling == I915_TILING_Y)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2c579946447c..116e43bffdb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2716,11 +2716,11 @@ static void i915_gtt_color_adjust(const struct drm_mm_node *node,
 				  u64 *end)
 {
 	if (node->color != color)
-		*start += 4096;
+		*start += I915_GTT_PAGE_SIZE;
 
 	node = list_next_entry(node, node_list);
 	if (node->allocated && node->color != color)
-		*end -= 4096;
+		*end -= I915_GTT_PAGE_SIZE;
 }
 
 int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
@@ -2747,7 +2747,7 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
 	/* Reserve a mappable slot for our lockless error capture */
 	ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm,
 						  &ggtt->error_capture,
-						  4096, 0,
+						  PAGE_SIZE, 0,
 						  I915_COLOR_UNEVICTABLE,
 						  0, ggtt->mappable_end,
 						  0, 0);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index a4070c27d69b..e217b14e14fd 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -40,6 +40,9 @@
 #include "i915_gem_timeline.h"
 #include "i915_gem_request.h"
 
+#define I915_GTT_PAGE_SIZE 4096
+#define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
+
 #define I915_FENCE_REG_NONE -1
 #define I915_MAX_NUM_FENCES 32
 /* 32 fences + sign bit for FENCE_REG_NONE */
@@ -563,6 +566,6 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
 #define PIN_HIGH		BIT(9)
 #define PIN_OFFSET_BIAS		BIT(10)
 #define PIN_OFFSET_FIXED	BIT(11)
-#define PIN_OFFSET_MASK		(~4095)
+#define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 5af19b0bf713..63ae7e813335 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -187,14 +187,14 @@ int i915_gem_render_state_init(struct intel_engine_cs *engine)
 	if (!rodata)
 		return 0;
 
-	if (rodata->batch_items * 4 > 4096)
+	if (rodata->batch_items * 4 > PAGE_SIZE)
 		return -EINVAL;
 
 	so = kmalloc(sizeof(*so), GFP_KERNEL);
 	if (!so)
 		return -ENOMEM;
 
-	obj = i915_gem_object_create_internal(engine->i915, 4096);
+	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
 	if (IS_ERR(obj)) {
 		ret = PTR_ERR(obj);
 		goto err_free;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 5fd851336a99..be611f1367b4 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -482,7 +482,7 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
 	stolen_usable_start = 0;
 	/* WaSkipStolenMemoryFirstPage:bdw+ */
 	if (INTEL_GEN(dev_priv) >= 8)
-		stolen_usable_start = 4096;
+		stolen_usable_start = PAGE_SIZE;
 
 	ggtt->stolen_usable_size =
 		ggtt->stolen_size - reserved_total - stolen_usable_start;
@@ -644,8 +644,9 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
 			stolen_offset, gtt_offset, size);
 
 	/* KISS and expect everything to be page-aligned */
-	if (WARN_ON(size == 0) || WARN_ON(size & 4095) ||
-	    WARN_ON(stolen_offset & 4095))
+	if (WARN_ON(size == 0) ||
+	    WARN_ON(size & (I915_GTT_PAGE_SIZE - 1)) ||
+	    WARN_ON(stolen_offset & (I915_GTT_MIN_ALIGNMENT - 1)))
 		return NULL;
 
 	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index ae00d9f4e520..6596d0963da1 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -91,7 +91,7 @@ __i915_vma_create(struct drm_i915_gem_object *obj,
 	vma->vm = vm;
 	vma->obj = obj;
 	vma->size = obj->base.size;
-	vma->display_alignment = 4096;
+	vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
 
 	if (view) {
 		vma->ggtt_view = *view;
@@ -435,7 +435,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		 * with zero alignment, so where possible use the optimal
 		 * path.
 		 */
-		if (alignment <= 4096)
+		if (alignment <= I915_GTT_MIN_ALIGNMENT)
 			alignment = 0;
 
 search_free:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 600518ee89db..c809eda65f61 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1937,7 +1937,7 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	engine->emit_breadcrumb = gen8_emit_breadcrumb_render;
 	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_render_sz;
 
-	ret = intel_engine_create_scratch(engine, 4096);
+	ret = intel_engine_create_scratch(engine, PAGE_SIZE);
 	if (ret)
 		return ret;
 
@@ -2219,7 +2219,8 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 
 	WARN_ON(ce->state);
 
-	context_size = round_up(intel_lr_context_size(engine), 4096);
+	context_size = round_up(intel_lr_context_size(engine),
+				I915_GTT_PAGE_SIZE);
 
 	/* One extra page as the sharing data between driver and GuC */
 	context_size += PAGE_SIZE * LRC_PPHWSP_PN;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0971ac396b60..ab83fc22d207 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1736,7 +1736,7 @@ static int init_status_page(struct intel_engine_cs *engine)
 	void *vaddr;
 	int ret;
 
-	obj = i915_gem_object_create_internal(engine->i915, 4096);
+	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
 	if (IS_ERR(obj)) {
 		DRM_ERROR("Failed to allocate status page\n");
 		return PTR_ERR(obj);
@@ -1777,7 +1777,7 @@ static int init_status_page(struct intel_engine_cs *engine)
 
 	engine->status_page.vma = vma;
 	engine->status_page.ggtt_offset = i915_ggtt_offset(vma);
-	engine->status_page.page_addr = memset(vaddr, 0, 4096);
+	engine->status_page.page_addr = memset(vaddr, 0, PAGE_SIZE);
 
 	DRM_DEBUG_DRIVER("%s hws offset: 0x%08x\n",
 			 engine->name, i915_ggtt_offset(vma));
@@ -2049,7 +2049,7 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	}
 
 	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
-	ret = intel_ring_pin(ring, 4096);
+	ret = intel_ring_pin(ring, I915_GTT_PAGE_SIZE);
 	if (ret) {
 		intel_ring_free(ring);
 		goto error;
@@ -2466,7 +2466,7 @@ static void intel_ring_init_semaphores(struct drm_i915_private *dev_priv,
 	if (INTEL_GEN(dev_priv) >= 8 && !dev_priv->semaphore) {
 		struct i915_vma *vma;
 
-		obj = i915_gem_object_create(dev_priv, 4096);
+		obj = i915_gem_object_create(dev_priv, PAGE_SIZE);
 		if (IS_ERR(obj))
 			goto err;
 
@@ -2683,7 +2683,7 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
 		return ret;
 
 	if (INTEL_GEN(dev_priv) >= 6) {
-		ret = intel_engine_create_scratch(engine, 4096);
+		ret = intel_engine_create_scratch(engine, PAGE_SIZE);
 		if (ret)
 			return ret;
 	} else if (HAS_BROKEN_CS_TLB(dev_priv)) {
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 26/26] drm/i915: Use the MRU stack search after evicting
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (24 preceding siblings ...)
  2016-12-31 12:07 ` [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Chris Wilson
@ 2016-12-31 12:07 ` Chris Wilson
  2016-12-31 13:23 ` ✗ Fi.CI.BAT: failure for series starting with [01/26] drm/i915: Assert that we do create the deferred context Patchwork
  26 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 12:07 UTC (permalink / raw)
  To: intel-gfx

When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implements (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gvt/aperture_gm.c | 33 +++++-----------
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 72 ++++++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.h    |  5 +++
 drivers/gpu/drm/i915/i915_vma.c        | 40 ++-----------------
 4 files changed, 70 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index 7d33b607bc89..1bb7a5b80d47 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -48,47 +48,34 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
 {
 	struct intel_gvt *gvt = vgpu->gvt;
 	struct drm_i915_private *dev_priv = gvt->dev_priv;
-	u32 alloc_flag, search_flag;
+	unsigned int flags;
 	u64 start, end, size;
 	struct drm_mm_node *node;
-	int retried = 0;
 	int ret;
 
 	if (high_gm) {
-		search_flag = DRM_MM_SEARCH_BELOW;
-		alloc_flag = DRM_MM_CREATE_TOP;
 		node = &vgpu->gm.high_gm_node;
 		size = vgpu_hidden_sz(vgpu);
 		start = gvt_hidden_gmadr_base(gvt);
 		end = gvt_hidden_gmadr_end(gvt);
+		flags = PIN_HIGH;
 	} else {
-		search_flag = DRM_MM_SEARCH_DEFAULT;
-		alloc_flag = DRM_MM_CREATE_DEFAULT;
 		node = &vgpu->gm.low_gm_node;
 		size = vgpu_aperture_sz(vgpu);
 		start = gvt_aperture_gmadr_base(gvt);
 		end = gvt_aperture_gmadr_end(gvt);
+		flags = PIN_MAPPABLE;
 	}
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
-search_again:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
-						  node, size, 4096,
-						  I915_COLOR_UNEVICTABLE,
-						  start, end, search_flag,
-						  alloc_flag);
-	if (ret) {
-		ret = i915_gem_evict_something(&dev_priv->ggtt.base,
-					       size, 4096,
-					       I915_COLOR_UNEVICTABLE,
-					       start, end, 0);
-		if (ret == 0 && ++retried < 3)
-			goto search_again;
-
-		gvt_err("fail to alloc %s gm space from host, retried %d\n",
-				high_gm ? "high" : "low", retried);
-	}
+	ret = i915_gem_gtt_insert(&dev_priv->ggtt.base, node,
+				  size, 4096, I915_COLOR_UNEVICTABLE,
+				  start, end, flags);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
+	if (ret)
+		gvt_err("fail to alloc %s gm space from host\n",
+			high_gm ? "high" : "low");
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 116e43bffdb1..d7119a210e9c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2037,7 +2037,6 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	struct i915_address_space *vm = &ppgtt->base;
 	struct drm_i915_private *dev_priv = ppgtt->base.i915;
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	bool retried = false;
 	int ret;
 
 	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
@@ -2050,29 +2049,14 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	if (ret)
 		return ret;
 
-alloc:
-	ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm, &ppgtt->node,
-						  GEN6_PD_SIZE, GEN6_PD_ALIGN,
-						  I915_COLOR_UNEVICTABLE,
-						  0, ggtt->base.total,
-						  DRM_MM_TOPDOWN);
-	if (ret == -ENOSPC && !retried) {
-		ret = i915_gem_evict_something(&ggtt->base,
-					       GEN6_PD_SIZE, GEN6_PD_ALIGN,
-					       I915_COLOR_UNEVICTABLE,
-					       0, ggtt->base.total,
-					       0);
-		if (ret)
-			goto err_out;
-
-		retried = true;
-		goto alloc;
-	}
-
+	ret = i915_gem_gtt_insert(&ggtt->base, &ppgtt->node,
+				  GEN6_PD_SIZE, GEN6_PD_ALIGN,
+				  I915_COLOR_UNEVICTABLE,
+				  0, ggtt->base.total,
+				  PIN_HIGH);
 	if (ret)
 		goto err_out;
 
-
 	if (ppgtt->node.start < ggtt->mappable_end)
 		DRM_DEBUG("Forced to use aperture for PDEs\n");
 
@@ -3573,3 +3557,49 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma)
 	return ret;
 }
 
+int i915_gem_gtt_insert(struct i915_address_space *vm,
+			struct drm_mm_node *node,
+			u64 size, u64 alignment, unsigned long color,
+			u64 start, u64 end, unsigned int flags)
+{
+	u32 search_flag, alloc_flag;
+	int err;
+
+	lockdep_assert_held(&vm->i915->drm.struct_mutex);
+
+	if (flags & PIN_HIGH) {
+		search_flag = DRM_MM_SEARCH_BELOW;
+		alloc_flag = DRM_MM_CREATE_TOP;
+	} else {
+		search_flag = DRM_MM_SEARCH_DEFAULT;
+		alloc_flag = DRM_MM_CREATE_DEFAULT;
+	}
+
+	/* We only allocate in PAGE_SIZE/GTT_PAGE_SIZE (4096) chunks,
+	 * so we know that we always have a minimum alignment of 4096.
+	 * The drm_mm range manager is optimised to return results
+	 * with zero alignment, so where possible use the optimal
+	 * path.
+	 */
+	GEM_BUG_ON(size & (I915_GTT_MIN_ALIGNMENT - 1));
+	if (alignment <= I915_GTT_MIN_ALIGNMENT)
+		alignment = 0;
+
+	err = drm_mm_insert_node_in_range_generic(&vm->mm, node,
+						  size, alignment, color,
+						  start, end,
+						  search_flag, alloc_flag);
+	if (err != -ENOSPC)
+		return err;
+
+	err = i915_gem_evict_something(vm, size, alignment, color,
+				       start, end, flags);
+	if (err)
+		return err;
+
+	search_flag = DRM_MM_SEARCH_DEFAULT;
+	return drm_mm_insert_node_in_range_generic(&vm->mm, node,
+						   size, alignment, color,
+						   start, end,
+						   search_flag, alloc_flag);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e217b14e14fd..4f78e0ff3efe 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -552,6 +552,11 @@ int __must_check i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
 void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
 			       struct sg_table *pages);
 
+int i915_gem_gtt_insert(struct i915_address_space *vm,
+			struct drm_mm_node *node,
+			u64 size, u64 alignment, unsigned long color,
+			u64 start, u64 end, unsigned int flags);
+
 /* Flags used by pin/bind&friends. */
 #define PIN_NONBLOCK		BIT(0)
 #define PIN_MAPPABLE		BIT(1)
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 6596d0963da1..9251787f8fc4 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -419,43 +419,11 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 				goto err_unpin;
 		}
 	} else {
-		u32 search_flag, alloc_flag;
-
-		if (flags & PIN_HIGH) {
-			search_flag = DRM_MM_SEARCH_BELOW;
-			alloc_flag = DRM_MM_CREATE_TOP;
-		} else {
-			search_flag = DRM_MM_SEARCH_DEFAULT;
-			alloc_flag = DRM_MM_CREATE_DEFAULT;
-		}
-
-		/* We only allocate in PAGE_SIZE/GTT_PAGE_SIZE (4096) chunks,
-		 * so we know that we always have a minimum alignment of 4096.
-		 * The drm_mm range manager is optimised to return results
-		 * with zero alignment, so where possible use the optimal
-		 * path.
-		 */
-		if (alignment <= I915_GTT_MIN_ALIGNMENT)
-			alignment = 0;
-
-search_free:
-		ret = drm_mm_insert_node_in_range_generic(&vma->vm->mm,
-							  &vma->node,
-							  size, alignment,
-							  obj->cache_level,
-							  start, end,
-							  search_flag,
-							  alloc_flag);
-		if (ret) {
-			ret = i915_gem_evict_something(vma->vm, size, alignment,
-						       obj->cache_level,
-						       start, end,
-						       flags);
-			if (ret == 0)
-				goto search_free;
-
+		ret = i915_gem_gtt_insert(vma->vm, &vma->node,
+					  size, alignment, obj->cache_level,
+					  start, end, flags);
+		if (ret)
 			goto err_unpin;
-		}
 
 		GEM_BUG_ON(vma->node.start < start);
 		GEM_BUG_ON(vma->node.start + vma->node.size > end);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/26] drm/i915: Assert that we do create the deferred context
  2016-12-31 12:06 Small GEM fixes Chris Wilson
                   ` (25 preceding siblings ...)
  2016-12-31 12:07 ` [PATCH 26/26] drm/i915: Use the MRU stack search after evicting Chris Wilson
@ 2016-12-31 13:23 ` Patchwork
  2016-12-31 13:29   ` Chris Wilson
  26 siblings, 1 reply; 36+ messages in thread
From: Patchwork @ 2016-12-31 13:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/26] drm/i915: Assert that we do create the deferred context
URL   : https://patchwork.freedesktop.org/series/17337/
State : failure

== Summary ==

Series 17337v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17337/revisions/1/mbox/

Test kms_addfb_basic:
        Subgroup addfb25-framebuffer-vs-set-tiling:
                pass       -> FAIL       (fi-byt-j1900)
                pass       -> FAIL       (fi-byt-n2820)
                pass       -> FAIL       (fi-hsw-4770r)
                pass       -> FAIL       (fi-snb-2600)
                pass       -> FAIL       (fi-skl-6700k)
                pass       -> FAIL       (fi-skl-6700hq)
                pass       -> FAIL       (fi-hsw-4770)
                pass       -> FAIL       (fi-kbl-7500u)
                pass       -> FAIL       (fi-bsw-n3050)
                pass       -> FAIL       (fi-skl-6770hq)
                pass       -> FAIL       (fi-bxt-j4205)
                pass       -> FAIL       (fi-ivb-3520m)
                pass       -> FAIL       (fi-snb-2520m)
                pass       -> FAIL       (fi-ivb-3770)
                pass       -> FAIL       (fi-skl-6260u)
                pass       -> FAIL       (fi-bdw-5557u)
        Subgroup framebuffer-vs-set-tiling:
                pass       -> FAIL       (fi-byt-j1900)
                pass       -> FAIL       (fi-byt-n2820)
                pass       -> FAIL       (fi-hsw-4770r)
                pass       -> FAIL       (fi-snb-2600)
                pass       -> FAIL       (fi-skl-6700k)
                pass       -> FAIL       (fi-skl-6700hq)
                pass       -> FAIL       (fi-hsw-4770)
                pass       -> FAIL       (fi-kbl-7500u)
                pass       -> FAIL       (fi-bsw-n3050)
                pass       -> FAIL       (fi-skl-6770hq)
                pass       -> FAIL       (fi-bxt-j4205)
                pass       -> FAIL       (fi-ivb-3520m)
                pass       -> FAIL       (fi-snb-2520m)
                pass       -> FAIL       (fi-ivb-3770)
                pass       -> FAIL       (fi-skl-6260u)
                pass       -> FAIL       (fi-bdw-5557u)
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                pass       -> DMESG-WARN (fi-ivb-3770)

fi-bdw-5557u     total:246  pass:230  dwarn:0   dfail:0   fail:2   skip:14 
fi-bsw-n3050     total:246  pass:205  dwarn:0   dfail:0   fail:2   skip:39 
fi-bxt-j4205     total:246  pass:222  dwarn:0   dfail:0   fail:2   skip:22 
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:246  pass:217  dwarn:0   dfail:0   fail:2   skip:27 
fi-byt-n2820     total:246  pass:213  dwarn:0   dfail:0   fail:2   skip:31 
fi-hsw-4770      total:246  pass:225  dwarn:0   dfail:0   fail:2   skip:19 
fi-hsw-4770r     total:246  pass:225  dwarn:0   dfail:0   fail:2   skip:19 
fi-ivb-3520m     total:246  pass:223  dwarn:0   dfail:0   fail:2   skip:21 
fi-ivb-3770      total:246  pass:222  dwarn:1   dfail:0   fail:2   skip:21 
fi-kbl-7500u     total:246  pass:223  dwarn:0   dfail:0   fail:2   skip:21 
fi-skl-6260u     total:246  pass:231  dwarn:0   dfail:0   fail:2   skip:13 
fi-skl-6700hq    total:246  pass:224  dwarn:0   dfail:0   fail:2   skip:20 
fi-skl-6700k     total:246  pass:220  dwarn:3   dfail:0   fail:2   skip:21 
fi-skl-6770hq    total:246  pass:231  dwarn:0   dfail:0   fail:2   skip:13 
fi-snb-2520m     total:246  pass:213  dwarn:0   dfail:0   fail:2   skip:31 
fi-snb-2600      total:246  pass:212  dwarn:0   dfail:0   fail:2   skip:32 

82a69977b9cf76b03f4abe89cd581b436b8405e5 drm-tip: 2016y-12m-31d-12h-02m-21s UTC integration manifest
4c3de03 drm/i915: Use the MRU stack search after evicting
f91c4da drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE
2d172a6 drm/i915: Skip switch to kernel context if already done
4ab2783 drm/i915: Construct a request even if the GPU is currently hung
9c35b8d drm/i915: Split out i915_gem_object_set_tiling()
b3d8dd4 drm/i915: Include ioctl in set-tiling and get-tiling function names
07f7b8e drm/i915: Remove the rounding down of the gen4+ fence region
84d909a drm/i915: Store required fence size/alignment for GGTT vma
762d552 drm/i915: Replace WARNs in fence register writes with extensive asserts
f6130d4 drm/i915: Align GGTT sizes to a fence tile row
5926e7a drm/i915: Clip the partial view against the object not vma
1571923 drm/i915: Extact compute_partial_view()
a56c803 drm/i915: Eliminate superfluous i915_ggtt_view_normal
28536b4 drm/i915: Eliminate superfluous i915_ggtt_view_rotated
7c0d284 drm/i915: Convert i915_ggtt_view to use an anonymous union
9d9ab42 drm/i915: Pack the partial view size and offset into a single u64
5cfd7df drm/i915: Name the anonymous structs inside i915_ggtt_view
2881fd3 drm/i915: Simplify testing for am-I-the-kernel-context?
5874a08 drm/i915: Drain freed objects for mmap space exhaustion
5e1d000 drm/i915: Fix phys pwrite for struct_mutex-less operation
dbfdd24 drm/i915: Purge loose pages if we run out of DMA remap space
26d3de2 drm/i915/guc: Exclude the upper end of the Global GTT for the GuC
dab3544 drm/i915: Use range_overflows()
b3dcd9c drm/i915: Use fixed-sized types for stolen
144f452 drm/i915: Move a few utility macros into a separate header
0036933 drm/i915: Assert that we do create the deferred context

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3416/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for series starting with [01/26] drm/i915: Assert that we do create the deferred context
  2016-12-31 13:23 ` ✗ Fi.CI.BAT: failure for series starting with [01/26] drm/i915: Assert that we do create the deferred context Patchwork
@ 2016-12-31 13:29   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2016-12-31 13:29 UTC (permalink / raw)
  To: intel-gfx

On Sat, Dec 31, 2016 at 01:23:42PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [01/26] drm/i915: Assert that we do create the deferred context
> URL   : https://patchwork.freedesktop.org/series/17337/
> State : failure
> 
> == Summary ==
> 
> Series 17337v1 Series without cover letter
> https://patchwork.freedesktop.org/api/1.0/series/17337/revisions/1/mbox/
> 
> Test kms_addfb_basic:
>         Subgroup addfb25-framebuffer-vs-set-tiling:
>                 pass       -> FAIL       (fi-byt-j1900)
>                 pass       -> FAIL       (fi-byt-n2820)
>                 pass       -> FAIL       (fi-hsw-4770r)
>                 pass       -> FAIL       (fi-snb-2600)
>                 pass       -> FAIL       (fi-skl-6700k)
>                 pass       -> FAIL       (fi-skl-6700hq)
>                 pass       -> FAIL       (fi-hsw-4770)
>                 pass       -> FAIL       (fi-kbl-7500u)
>                 pass       -> FAIL       (fi-bsw-n3050)
>                 pass       -> FAIL       (fi-skl-6770hq)
>                 pass       -> FAIL       (fi-bxt-j4205)
>                 pass       -> FAIL       (fi-ivb-3520m)
>                 pass       -> FAIL       (fi-snb-2520m)
>                 pass       -> FAIL       (fi-ivb-3770)
>                 pass       -> FAIL       (fi-skl-6260u)
>                 pass       -> FAIL       (fi-bdw-5557u)
>         Subgroup framebuffer-vs-set-tiling:
>                 pass       -> FAIL       (fi-byt-j1900)
>                 pass       -> FAIL       (fi-byt-n2820)
>                 pass       -> FAIL       (fi-hsw-4770r)
>                 pass       -> FAIL       (fi-snb-2600)
>                 pass       -> FAIL       (fi-skl-6700k)
>                 pass       -> FAIL       (fi-skl-6700hq)
>                 pass       -> FAIL       (fi-hsw-4770)
>                 pass       -> FAIL       (fi-kbl-7500u)
>                 pass       -> FAIL       (fi-bsw-n3050)
>                 pass       -> FAIL       (fi-skl-6770hq)
>                 pass       -> FAIL       (fi-bxt-j4205)
>                 pass       -> FAIL       (fi-ivb-3520m)
>                 pass       -> FAIL       (fi-snb-2520m)
>                 pass       -> FAIL       (fi-ivb-3770)
>                 pass       -> FAIL       (fi-skl-6260u)
>                 pass       -> FAIL       (fi-bdw-5557u)

Just a bad test (already fixed in igt), expecting a fail for a no-op.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 05/26] drm/i915/guc: Exclude the upper end of the Global GTT for the GuC
  2016-12-31 12:06 ` [PATCH 05/26] drm/i915/guc: Exclude the upper end of the Global GTT for the GuC Chris Wilson
@ 2017-01-02 15:25   ` Arkadiusz Hiler
  0 siblings, 0 replies; 36+ messages in thread
From: Arkadiusz Hiler @ 2017-01-02 15:25 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Dec 31, 2016 at 12:06:41PM +0000, Chris Wilson wrote:
> The GuC uses a special mapping for the upper end of the Global GTT,
> similar to the way it uses a special mapping for the lower end, so
> exclude it from our drm_mm to prevent us using it.
> 
> v2: Rename to reflect that it is unmappable similar to the region at the
> bottom of the GGTT, and couple it into the assertion that we don't feed
> unmappable addresses to the GuC.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++++++++++
>  drivers/gpu/drm/i915/i915_guc_reg.h |  3 +++
>  drivers/gpu/drm/i915/intel_uc.h     |  1 +
>  3 files changed, 14 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 11aeba60b5d7..00520f27bea6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -3178,6 +3178,16 @@ int i915_ggtt_probe_hw(struct drm_i915_private *dev_priv)
>  	if (ret)
>  		return ret;
>  
> +	/* Trim the GGTT to fit the GuC mappable upper range (when enabled).
> +	 * This is easier than doing range restriction on the fly, as we
> +	 * currently don't have any bits spare to pass in this upper
> +	 * restriction!
> +	 */
> +	if (HAS_GUC(dev_priv) && i915.enable_guc_loading) {
> +		ggtt->base.total = min_t(u64, ggtt->base.total, GUC_GGTT_TOP);
> +		ggtt->mappable_end = min(ggtt->mappable_end, ggtt->base.total);
> +	}
> +
>  	if ((ggtt->base.total - 1) >> 32) {
>  		DRM_ERROR("We never expected a Global GTT with more than 32bits"
>  			  " of address space! Found %lldM!\n",
> diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
> index 5e638fc37208..6a0adafe0523 100644
> --- a/drivers/gpu/drm/i915/i915_guc_reg.h
> +++ b/drivers/gpu/drm/i915/i915_guc_reg.h
> @@ -73,6 +73,9 @@
>  #define   GUC_WOPCM_TOP			  (0x80 << 12)	/* 512KB */
>  #define   BXT_GUC_WOPCM_RC6_RESERVED	  (0x10 << 12)	/* 64KB  */
>  
> +/* GuC addresses above GUC_GGTT_TOP also don't map through the GTT */
> +#define GUC_GGTT_TOP			0xFEE00000
> +
>  #define GEN8_GT_PM_CONFIG		_MMIO(0x138140)
>  #define GEN9LP_GT_PM_CONFIG		_MMIO(0x138140)
>  #define GEN9_GT_PM_CONFIG		_MMIO(0x13816c)
> diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
> index d556215e691f..c594472d918b 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -204,6 +204,7 @@ static inline u32 guc_ggtt_offset(struct i915_vma *vma)
>  {
>  	u32 offset = i915_ggtt_offset(vma);
>  	GEM_BUG_ON(offset < GUC_WOPCM_TOP);
> +	GEM_BUG_ON(range_overflows_t(u64, offset, vma->size, GUC_GGTT_TOP));
>  	return offset;
>  }
>  
> -- 
> 2.11.0
> 

-- 
Cheers,
Arek
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 21/26] drm/i915: Include ioctl in set-tiling and get-tiling function names
  2016-12-31 12:06 ` [PATCH 21/26] drm/i915: Include ioctl in set-tiling and get-tiling function names Chris Wilson
@ 2017-01-09 15:12   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2017-01-09 15:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 31/12/2016 12:06, Chris Wilson wrote:
> Make it clear that these functions are the user entry points for
> the tiling/fence registers.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.c        |  4 ++--
>  drivers/gpu/drm/i915/i915_drv.h        |  8 ++++----
>  drivers/gpu/drm/i915/i915_gem_tiling.c | 16 ++++++++--------
>  3 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 2c020eafada6..2cb8682ae57f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -2548,8 +2548,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP_GTT, i915_gem_mmap_gtt_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_SET_DOMAIN, i915_gem_set_domain_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_SW_FINISH, i915_gem_sw_finish_ioctl, DRM_RENDER_ALLOW),
> -	DRM_IOCTL_DEF_DRV(I915_GEM_SET_TILING, i915_gem_set_tiling, DRM_RENDER_ALLOW),
> -	DRM_IOCTL_DEF_DRV(I915_GEM_GET_TILING, i915_gem_get_tiling, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_SET_TILING, i915_gem_set_tiling_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_GET_TILING, i915_gem_get_tiling_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_GET_APERTURE, i915_gem_get_aperture_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GET_PIPE_FROM_CRTC_ID, intel_get_pipe_from_crtc_id, 0),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_MADVISE, i915_gem_madvise_ioctl, DRM_RENDER_ALLOW),
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4bfa3393c97b..5d7d1a9c0179 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3100,10 +3100,10 @@ int i915_gem_throttle_ioctl(struct drm_device *dev, void *data,
>  			    struct drm_file *file_priv);
>  int i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
>  			   struct drm_file *file_priv);
> -int i915_gem_set_tiling(struct drm_device *dev, void *data,
> -			struct drm_file *file_priv);
> -int i915_gem_get_tiling(struct drm_device *dev, void *data,
> -			struct drm_file *file_priv);
> +int i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
> +			      struct drm_file *file_priv);
> +int i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data,
> +			      struct drm_file *file_priv);
>  void i915_gem_init_userptr(struct drm_i915_private *dev_priv);
>  int i915_gem_userptr_ioctl(struct drm_device *dev, void *data,
>  			   struct drm_file *file);
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 23a896cd934f..6c5b8a0e8325 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -34,8 +34,8 @@
>  /**
>   * DOC: buffer object tiling
>   *
> - * i915_gem_set_tiling() and i915_gem_get_tiling() is the userspace interface to
> - * declare fence register requirements.
> + * i915_gem_set_tiling_ioctl() and i915_gem_get_tiling_ioctl() is the userspace
> + * interface to declare fence register requirements.
>   *
>   * In principle GEM doesn't care at all about the internal data layout of an
>   * object, and hence it also doesn't care about tiling or swizzling. There's two
> @@ -164,7 +164,7 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
>  }
>
>  /**
> - * i915_gem_set_tiling - IOCTL handler to set tiling mode
> + * i915_gem_set_tiling_ioctl - IOCTL handler to set tiling mode
>   * @dev: DRM device
>   * @data: data pointer for the ioctl
>   * @file: DRM file for the ioctl call
> @@ -178,8 +178,8 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
>   * Zero on success, negative errno on failure.
>   */
>  int
> -i915_gem_set_tiling(struct drm_device *dev, void *data,
> -		   struct drm_file *file)
> +i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
> +			  struct drm_file *file)
>  {
>  	struct drm_i915_gem_set_tiling *args = data;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> @@ -316,7 +316,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  }
>
>  /**
> - * i915_gem_get_tiling - IOCTL handler to get tiling mode
> + * i915_gem_get_tiling_ioctl - IOCTL handler to get tiling mode
>   * @dev: DRM device
>   * @data: data pointer for the ioctl
>   * @file: DRM file for the ioctl call
> @@ -329,8 +329,8 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>   * Zero on success, negative errno on failure.
>   */
>  int
> -i915_gem_get_tiling(struct drm_device *dev, void *data,
> -		   struct drm_file *file)
> +i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data,
> +			  struct drm_file *file)
>  {
>  	struct drm_i915_gem_get_tiling *args = data;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling()
  2016-12-31 12:06 ` [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling() Chris Wilson
@ 2017-01-09 15:43   ` Tvrtko Ursulin
  2017-01-09 15:58     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2017-01-09 15:43 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 31/12/2016 12:06, Chris Wilson wrote:
> Expose an interface for changing the tiling and stride on an object,
> that includes the complexity of checking for conflicting bindings and
> fence registers.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem_object.h |   3 +
>  drivers/gpu/drm/i915/i915_gem_tiling.c | 238 +++++++++++++++++----------------
>  2 files changed, 128 insertions(+), 113 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> index 6a368de9d81e..789bb44c9149 100644
> --- a/drivers/gpu/drm/i915/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> @@ -317,6 +317,9 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj)
>  	return obj->tiling_and_stride & STRIDE_MASK;
>  }
>
> +int i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
> +			       unsigned int tiling, unsigned int stride);
> +
>  static inline struct intel_engine_cs *
>  i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 6c5b8a0e8325..4ec2620b7bf9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -60,61 +60,56 @@
>
>  /* Check pitch constriants for all chips & tiling formats */
>  static bool
> -i915_tiling_ok(struct drm_i915_private *dev_priv,
> -	       int stride, int size, int tiling_mode)
> +i915_tiling_ok(struct drm_i915_gem_object *obj,
> +	       unsigned int tiling, unsigned int stride)

Should it, or should it not now have an _obj(ect)_ somewhere in the 
function name? i915_gem_object_tiling_ok, or i915_gem_tiling_ok_for_object?

>  {
> -	int tile_width;
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	unsigned int tile_width;
>
>  	/* Linear is always fine */
> -	if (tiling_mode == I915_TILING_NONE)
> +	if (tiling == I915_TILING_NONE)
>  		return true;
>
> -	if (tiling_mode > I915_TILING_LAST)
> +	if (tiling > I915_TILING_LAST)
>  		return false;
>
> -	if (IS_GEN2(dev_priv) ||
> -	    (tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev_priv)))
> -		tile_width = 128;
> -	else
> -		tile_width = 512;
> -
>  	/* check maximum stride & object size */
>  	/* i965+ stores the end address of the gtt mapping in the fence
>  	 * reg, so dont bother to check the size */
> -	if (INTEL_GEN(dev_priv) >= 7) {
> +	if (INTEL_GEN(i915) >= 7) {
>  		if (stride / 128 > GEN7_FENCE_MAX_PITCH_VAL)
>  			return false;
> -	} else if (INTEL_GEN(dev_priv) >= 4) {
> +	} else if (INTEL_GEN(i915) >= 4) {
>  		if (stride / 128 > I965_FENCE_MAX_PITCH_VAL)
>  			return false;
>  	} else {
>  		if (stride > 8192)
>  			return false;
>
> -		if (IS_GEN3(dev_priv)) {
> -			if (size > I830_FENCE_MAX_SIZE_VAL << 20)
> +		if (IS_GEN3(i915)) {
> +			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 20)
>  				return false;
>  		} else {
> -			if (size > I830_FENCE_MAX_SIZE_VAL << 19)
> +			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 19)
>  				return false;
>  		}
>  	}
>
> -	if (stride < tile_width)
> +	if (IS_GEN2(i915) ||
> +	    (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(i915)))
> +		tile_width = 128;
> +	else
> +		tile_width = 512;
> +
> +	if (stride & (tile_width - 1))
>  		return false;
>
>  	/* 965+ just needs multiples of tile width */
> -	if (INTEL_GEN(dev_priv) >= 4) {
> -		if (stride & (tile_width - 1))
> -			return false;
> +	if (INTEL_GEN(i915) >= 4)
>  		return true;
> -	}
>
>  	/* Pre-965 needs power of two tile widths */

Is this comment actually referring to the stride? tile_width is always a 
power of two since it is set above.

> -	if (stride & (stride - 1))
> -		return false;
> -
> -	return true;
> +	return is_power_of_2(stride);
>  }

Looks OK, but would have been better if you left the optimising bits out 
from this patch.

>  static bool i915_vma_fence_prepare(struct i915_vma *vma,
> @@ -163,6 +158,99 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
>  	return 0;
>  }
>
> +int
> +i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
> +			   unsigned int tiling, unsigned int stride)
> +{
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	struct i915_vma *vma;
> +	int err;
> +
> +	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
> +	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
> +
> +	GEM_BUG_ON(!i915_tiling_ok(obj, tiling, stride));
> +	GEM_BUG_ON(!stride ^ (tiling == I915_TILING_NONE));

Why is i915_tiling_ok not checking for this and handling it gracefully? 
It might be even user triggerable, not sure, don't see the whole 
function in the diff.

> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +
> +	if ((tiling | stride) == obj->tiling_and_stride)
> +		return 0;
> +
> +	if (obj->pin_display || obj->framebuffer_references)
> +		return -EBUSY;
> +
> +	/* We need to rebind the object if its current allocation
> +	 * no longer meets the alignment restrictions for its new
> +	 * tiling mode. Otherwise we can just leave it alone, but
> +	 * need to ensure that any fence register is updated before
> +	 * the next fenced (either through the GTT or by the BLT unit
> +	 * on older GPUs) access.
> +	 *
> +	 * After updating the tiling parameters, we then flag whether
> +	 * we need to update an associated fence register. Note this
> +	 * has to also include the unfenced register the GPU uses
> +	 * whilst executing a fenced command for an untiled object.
> +	 */
> +
> +	err = i915_gem_object_fence_prepare(obj, tiling, stride);
> +	if (err)
> +		return err;
> +
> +	/* If the memory has unknown (i.e. varying) swizzling, we pin the
> +	 * pages to prevent them being swapped out and causing corruption
> +	 * due to the change in swizzling.
> +	 */
> +	mutex_lock(&obj->mm.lock);
> +	if (obj->mm.pages &&
> +	    obj->mm.madv == I915_MADV_WILLNEED &&
> +	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
> +		if (tiling == I915_TILING_NONE) {
> +			GEM_BUG_ON(!obj->mm.quirked);
> +			__i915_gem_object_unpin_pages(obj);
> +			obj->mm.quirked = false;
> +		}
> +		if (!i915_gem_object_is_tiled(obj)) {
> +			GEM_BUG_ON(!obj->mm.quirked);
> +			__i915_gem_object_pin_pages(obj);
> +			obj->mm.quirked = true;
> +		}
> +	}
> +	mutex_unlock(&obj->mm.lock);
> +
> +	list_for_each_entry(vma, &obj->vma_list, obj_link) {
> +		if (!i915_vma_is_ggtt(vma))
> +			break;
> +
> +		vma->fence_size =
> +			i915_gem_get_ggtt_size(i915, vma->size,
> +					       tiling, stride);
> +		vma->fence_alignment =
> +			i915_gem_get_ggtt_alignment(i915, vma->size,
> +						    tiling, stride);
> +
> +		if (vma->fence)
> +			vma->fence->dirty = true;
> +	}
> +
> +	obj->tiling_and_stride = tiling | stride;
> +
> +	/* Force the fence to be reacquired for GTT access */
> +	i915_gem_release_mmap(obj);
> +
> +	/* Try to preallocate memory required to save swizzling on put-pages */
> +	if (i915_gem_object_needs_bit17_swizzle(obj)) {
> +		if (!obj->bit_17) {
> +			obj->bit_17 = kcalloc(BITS_TO_LONGS(obj->base.size >> PAGE_SHIFT),
> +					      sizeof(long), GFP_KERNEL);
> +		}
> +	} else {
> +		kfree(obj->bit_17);
> +		obj->bit_17 = NULL;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * i915_gem_set_tiling_ioctl - IOCTL handler to set tiling mode
>   * @dev: DRM device
> @@ -182,26 +270,15 @@ i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
>  			  struct drm_file *file)
>  {
>  	struct drm_i915_gem_set_tiling *args = data;
> -	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct drm_i915_gem_object *obj;
> -	int err = 0;
> -
> -	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
> -	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
> +	int err;
>
>  	obj = i915_gem_object_lookup(file, args->handle);
>  	if (!obj)
>  		return -ENOENT;
>
> -	if (!i915_tiling_ok(dev_priv,
> -			    args->stride, obj->base.size, args->tiling_mode)) {
> -		i915_gem_object_put(obj);
> -		return -EINVAL;
> -	}
> -
> -	mutex_lock(&dev->struct_mutex);
> -	if (obj->pin_display || obj->framebuffer_references) {
> -		err = -EBUSY;
> +	if (!i915_tiling_ok(obj, args->tiling_mode, args->stride)) {
> +		err = -EINVAL;
>  		goto err;
>  	}
>
> @@ -210,9 +287,9 @@ i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
>  		args->stride = 0;
>  	} else {
>  		if (args->tiling_mode == I915_TILING_X)
> -			args->swizzle_mode = dev_priv->mm.bit_6_swizzle_x;
> +			args->swizzle_mode = to_i915(dev)->mm.bit_6_swizzle_x;
>  		else
> -			args->swizzle_mode = dev_priv->mm.bit_6_swizzle_y;
> +			args->swizzle_mode = to_i915(dev)->mm.bit_6_swizzle_y;
>
>  		/* Hide bit 17 swizzling from the user.  This prevents old Mesa
>  		 * from aborting the application on sw fallbacks to bit 17,
> @@ -234,84 +311,19 @@ i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
>  		}
>  	}
>
> -	if (args->tiling_mode != i915_gem_object_get_tiling(obj) ||
> -	    args->stride != i915_gem_object_get_stride(obj)) {
> -		/* We need to rebind the object if its current allocation
> -		 * no longer meets the alignment restrictions for its new
> -		 * tiling mode. Otherwise we can just leave it alone, but
> -		 * need to ensure that any fence register is updated before
> -		 * the next fenced (either through the GTT or by the BLT unit
> -		 * on older GPUs) access.
> -		 *
> -		 * After updating the tiling parameters, we then flag whether
> -		 * we need to update an associated fence register. Note this
> -		 * has to also include the unfenced register the GPU uses
> -		 * whilst executing a fenced command for an untiled object.
> -		 */
> +	err = mutex_lock_interruptible(&dev->struct_mutex);
> +	if (err)
> +		goto err;
>
> -		err = i915_gem_object_fence_prepare(obj,
> -						    args->tiling_mode,
> -						    args->stride);
> -		if (!err) {
> -			struct i915_vma *vma;
> -
> -			mutex_lock(&obj->mm.lock);
> -			if (obj->mm.pages &&
> -			    obj->mm.madv == I915_MADV_WILLNEED &&
> -			    dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
> -				if (args->tiling_mode == I915_TILING_NONE) {
> -					GEM_BUG_ON(!obj->mm.quirked);
> -					__i915_gem_object_unpin_pages(obj);
> -					obj->mm.quirked = false;
> -				}
> -				if (!i915_gem_object_is_tiled(obj)) {
> -					GEM_BUG_ON(!obj->mm.quirked);
> -					__i915_gem_object_pin_pages(obj);
> -					obj->mm.quirked = true;
> -				}
> -			}
> -			mutex_unlock(&obj->mm.lock);
> -
> -			list_for_each_entry(vma, &obj->vma_list, obj_link) {
> -				if (!i915_vma_is_ggtt(vma))
> -					break;
> -
> -				vma->fence_size = i915_gem_get_ggtt_size(dev_priv, vma->size,
> -									 args->tiling_mode,
> -									 args->stride);
> -				vma->fence_alignment = i915_gem_get_ggtt_alignment(dev_priv, vma->size,
> -										   args->tiling_mode,
> -										   args->stride);
> -
> -				if (vma->fence)
> -					vma->fence->dirty = true;
> -			}
> -			obj->tiling_and_stride =
> -				args->stride | args->tiling_mode;
> -
> -			/* Force the fence to be reacquired for GTT access */
> -			i915_gem_release_mmap(obj);
> -		}
> -	}
> -	/* we have to maintain this existing ABI... */
> +	err = i915_gem_object_set_tiling(obj, args->tiling_mode, args->stride);
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	/* We have to maintain this existing ABI... */
>  	args->stride = i915_gem_object_get_stride(obj);
>  	args->tiling_mode = i915_gem_object_get_tiling(obj);
>
> -	/* Try to preallocate memory required to save swizzling on put-pages */
> -	if (i915_gem_object_needs_bit17_swizzle(obj)) {
> -		if (obj->bit_17 == NULL) {
> -			obj->bit_17 = kcalloc(BITS_TO_LONGS(obj->base.size >> PAGE_SHIFT),
> -					      sizeof(long), GFP_KERNEL);
> -		}
> -	} else {
> -		kfree(obj->bit_17);
> -		obj->bit_17 = NULL;
> -	}
> -
>  err:
>  	i915_gem_object_put(obj);
> -	mutex_unlock(&dev->struct_mutex);
> -
>  	return err;
>  }
>
>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling()
  2017-01-09 15:43   ` Tvrtko Ursulin
@ 2017-01-09 15:58     ` Chris Wilson
  2017-01-09 16:10       ` Tvrtko Ursulin
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2017-01-09 15:58 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Jan 09, 2017 at 03:43:46PM +0000, Tvrtko Ursulin wrote:
> 
> On 31/12/2016 12:06, Chris Wilson wrote:
> >Expose an interface for changing the tiling and stride on an object,
> >that includes the complexity of checking for conflicting bindings and
> >fence registers.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_gem_object.h |   3 +
> > drivers/gpu/drm/i915/i915_gem_tiling.c | 238 +++++++++++++++++----------------
> > 2 files changed, 128 insertions(+), 113 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> >index 6a368de9d81e..789bb44c9149 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_object.h
> >+++ b/drivers/gpu/drm/i915/i915_gem_object.h
> >@@ -317,6 +317,9 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj)
> > 	return obj->tiling_and_stride & STRIDE_MASK;
> > }
> >
> >+int i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
> >+			       unsigned int tiling, unsigned int stride);
> >+
> > static inline struct intel_engine_cs *
> > i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
> > {
> >diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> >index 6c5b8a0e8325..4ec2620b7bf9 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> >@@ -60,61 +60,56 @@
> >
> > /* Check pitch constriants for all chips & tiling formats */
> > static bool
> >-i915_tiling_ok(struct drm_i915_private *dev_priv,
> >-	       int stride, int size, int tiling_mode)
> >+i915_tiling_ok(struct drm_i915_gem_object *obj,
> >+	       unsigned int tiling, unsigned int stride)
> 
> Should it, or should it not now have an _obj(ect)_ somewhere in the
> function name? i915_gem_object_tiling_ok, or
> i915_gem_tiling_ok_for_object?

It's static. I'm not too fussed.

> > {
> >-	int tile_width;
> >+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> >+	unsigned int tile_width;
> >
> > 	/* Linear is always fine */
> >-	if (tiling_mode == I915_TILING_NONE)
> >+	if (tiling == I915_TILING_NONE)
> > 		return true;
> >
> >-	if (tiling_mode > I915_TILING_LAST)
> >+	if (tiling > I915_TILING_LAST)
> > 		return false;
> >
> >-	if (IS_GEN2(dev_priv) ||
> >-	    (tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev_priv)))
> >-		tile_width = 128;
> >-	else
> >-		tile_width = 512;
> >-
> > 	/* check maximum stride & object size */
> > 	/* i965+ stores the end address of the gtt mapping in the fence
> > 	 * reg, so dont bother to check the size */
> >-	if (INTEL_GEN(dev_priv) >= 7) {
> >+	if (INTEL_GEN(i915) >= 7) {
> > 		if (stride / 128 > GEN7_FENCE_MAX_PITCH_VAL)
> > 			return false;
> >-	} else if (INTEL_GEN(dev_priv) >= 4) {
> >+	} else if (INTEL_GEN(i915) >= 4) {
> > 		if (stride / 128 > I965_FENCE_MAX_PITCH_VAL)
> > 			return false;
> > 	} else {
> > 		if (stride > 8192)
> > 			return false;
> >
> >-		if (IS_GEN3(dev_priv)) {
> >-			if (size > I830_FENCE_MAX_SIZE_VAL << 20)
> >+		if (IS_GEN3(i915)) {
> >+			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 20)
> > 				return false;
> > 		} else {
> >-			if (size > I830_FENCE_MAX_SIZE_VAL << 19)
> >+			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 19)
> > 				return false;
> > 		}
> > 	}
> >
> >-	if (stride < tile_width)
> >+	if (IS_GEN2(i915) ||
> >+	    (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(i915)))
> >+		tile_width = 128;
> >+	else
> >+		tile_width = 512;
> >+
> >+	if (stride & (tile_width - 1))
> > 		return false;
> >
> > 	/* 965+ just needs multiples of tile width */
> >-	if (INTEL_GEN(dev_priv) >= 4) {
> >-		if (stride & (tile_width - 1))
> >-			return false;
> >+	if (INTEL_GEN(i915) >= 4)
> > 		return true;
> >-	}
> >
> > 	/* Pre-965 needs power of two tile widths */
> 
> Is this comment actually referring to the stride? tile_width is
> always a power of two since it is set above.

It does mean power-of-two x tile-width (the unit for the old fence
registers is tile-width). Just that since tile-width is a power of two,
so must stride.
 
> >-	if (stride & (stride - 1))
> >-		return false;
> >-
> >-	return true;
> >+	return is_power_of_2(stride);
> > }
> 
> Looks OK, but would have been better if you left the optimising bits
> out from this patch.

Optimising? I was just replacing it with the equivalent helper so that
the code was a more obvious match to the comment.
 
> > static bool i915_vma_fence_prepare(struct i915_vma *vma,
> >@@ -163,6 +158,99 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
> > 	return 0;
> > }
> >
> >+int
> >+i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
> >+			   unsigned int tiling, unsigned int stride)
> >+{
> >+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> >+	struct i915_vma *vma;
> >+	int err;
> >+
> >+	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
> >+	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
> >+
> >+	GEM_BUG_ON(!i915_tiling_ok(obj, tiling, stride));
> >+	GEM_BUG_ON(!stride ^ (tiling == I915_TILING_NONE));
> 
> Why is i915_tiling_ok not checking for this and handling it
> gracefully? It might be even user triggerable, not sure, don't see
> the whole function in the diff.

It is specifically handled in the caller (ioctl interface), so I put the
assert in so that future users didn't rely on unsupported behaviour.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling()
  2017-01-09 15:58     ` Chris Wilson
@ 2017-01-09 16:10       ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2017-01-09 16:10 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 09/01/2017 15:58, Chris Wilson wrote:
> On Mon, Jan 09, 2017 at 03:43:46PM +0000, Tvrtko Ursulin wrote:
>>
>> On 31/12/2016 12:06, Chris Wilson wrote:
>>> Expose an interface for changing the tiling and stride on an object,
>>> that includes the complexity of checking for conflicting bindings and
>>> fence registers.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem_object.h |   3 +
>>> drivers/gpu/drm/i915/i915_gem_tiling.c | 238 +++++++++++++++++----------------
>>> 2 files changed, 128 insertions(+), 113 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
>>> index 6a368de9d81e..789bb44c9149 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_object.h
>>> +++ b/drivers/gpu/drm/i915/i915_gem_object.h
>>> @@ -317,6 +317,9 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj)
>>> 	return obj->tiling_and_stride & STRIDE_MASK;
>>> }
>>>
>>> +int i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
>>> +			       unsigned int tiling, unsigned int stride);
>>> +
>>> static inline struct intel_engine_cs *
>>> i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
>>> {
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
>>> index 6c5b8a0e8325..4ec2620b7bf9 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
>>> @@ -60,61 +60,56 @@
>>>
>>> /* Check pitch constriants for all chips & tiling formats */
>>> static bool
>>> -i915_tiling_ok(struct drm_i915_private *dev_priv,
>>> -	       int stride, int size, int tiling_mode)
>>> +i915_tiling_ok(struct drm_i915_gem_object *obj,
>>> +	       unsigned int tiling, unsigned int stride)
>>
>> Should it, or should it not now have an _obj(ect)_ somewhere in the
>> function name? i915_gem_object_tiling_ok, or
>> i915_gem_tiling_ok_for_object?
>
> It's static. I'm not too fussed.
>
>>> {
>>> -	int tile_width;
>>> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>> +	unsigned int tile_width;
>>>
>>> 	/* Linear is always fine */
>>> -	if (tiling_mode == I915_TILING_NONE)
>>> +	if (tiling == I915_TILING_NONE)
>>> 		return true;
>>>
>>> -	if (tiling_mode > I915_TILING_LAST)
>>> +	if (tiling > I915_TILING_LAST)
>>> 		return false;
>>>
>>> -	if (IS_GEN2(dev_priv) ||
>>> -	    (tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev_priv)))
>>> -		tile_width = 128;
>>> -	else
>>> -		tile_width = 512;
>>> -
>>> 	/* check maximum stride & object size */
>>> 	/* i965+ stores the end address of the gtt mapping in the fence
>>> 	 * reg, so dont bother to check the size */
>>> -	if (INTEL_GEN(dev_priv) >= 7) {
>>> +	if (INTEL_GEN(i915) >= 7) {
>>> 		if (stride / 128 > GEN7_FENCE_MAX_PITCH_VAL)
>>> 			return false;
>>> -	} else if (INTEL_GEN(dev_priv) >= 4) {
>>> +	} else if (INTEL_GEN(i915) >= 4) {
>>> 		if (stride / 128 > I965_FENCE_MAX_PITCH_VAL)
>>> 			return false;
>>> 	} else {
>>> 		if (stride > 8192)
>>> 			return false;
>>>
>>> -		if (IS_GEN3(dev_priv)) {
>>> -			if (size > I830_FENCE_MAX_SIZE_VAL << 20)
>>> +		if (IS_GEN3(i915)) {
>>> +			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 20)
>>> 				return false;
>>> 		} else {
>>> -			if (size > I830_FENCE_MAX_SIZE_VAL << 19)
>>> +			if (obj->base.size > I830_FENCE_MAX_SIZE_VAL << 19)
>>> 				return false;
>>> 		}
>>> 	}
>>>
>>> -	if (stride < tile_width)
>>> +	if (IS_GEN2(i915) ||
>>> +	    (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(i915)))
>>> +		tile_width = 128;
>>> +	else
>>> +		tile_width = 512;
>>> +
>>> +	if (stride & (tile_width - 1))
>>> 		return false;
>>>
>>> 	/* 965+ just needs multiples of tile width */
>>> -	if (INTEL_GEN(dev_priv) >= 4) {
>>> -		if (stride & (tile_width - 1))
>>> -			return false;
>>> +	if (INTEL_GEN(i915) >= 4)
>>> 		return true;
>>> -	}
>>>
>>> 	/* Pre-965 needs power of two tile widths */
>>
>> Is this comment actually referring to the stride? tile_width is
>> always a power of two since it is set above.
>
> It does mean power-of-two x tile-width (the unit for the old fence
> registers is tile-width). Just that since tile-width is a power of two,
> so must stride.
>
>>> -	if (stride & (stride - 1))
>>> -		return false;
>>> -
>>> -	return true;
>>> +	return is_power_of_2(stride);
>>> }
>>
>> Looks OK, but would have been better if you left the optimising bits
>> out from this patch.
>
> Optimising? I was just replacing it with the equivalent helper so that
> the code was a more obvious match to the comment.

Optimising as in one conditional less in total, the removal of "if 
(stride < tile_width)".

>>> static bool i915_vma_fence_prepare(struct i915_vma *vma,
>>> @@ -163,6 +158,99 @@ i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj,
>>> 	return 0;
>>> }
>>>
>>> +int
>>> +i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
>>> +			   unsigned int tiling, unsigned int stride)
>>> +{
>>> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>> +	struct i915_vma *vma;
>>> +	int err;
>>> +
>>> +	/* Make sure we don't cross-contaminate obj->tiling_and_stride */
>>> +	BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
>>> +
>>> +	GEM_BUG_ON(!i915_tiling_ok(obj, tiling, stride));
>>> +	GEM_BUG_ON(!stride ^ (tiling == I915_TILING_NONE));
>>
>> Why is i915_tiling_ok not checking for this and handling it
>> gracefully? It might be even user triggerable, not sure, don't see
>> the whole function in the diff.
>
> It is specifically handled in the caller (ioctl interface), so I put the
> assert in so that future users didn't rely on unsupported behaviour.

Yes true, I see the setting of args->stride to zero for linear in the 
full file.

Looks OK then.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE
  2016-12-31 12:07 ` [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Chris Wilson
@ 2017-01-09 16:19   ` Tvrtko Ursulin
  2017-01-09 16:28     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2017-01-09 16:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 31/12/2016 12:07, Chris Wilson wrote:
> Start converting over from the byte count to its semantic macro, either
> we want to allocate the size of a physical page in main memory or we
> want the size of a virtual page in the GTT. 4096 could mean either, but
> PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
> coe comprehension and future changes.

code

>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c              |  4 ++--
>  drivers/gpu/drm/i915/i915_gem_context.c      |  5 +++--
>  drivers/gpu/drm/i915/i915_gem_evict.c        |  4 ++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  2 +-
>  drivers/gpu/drm/i915/i915_gem_fence_reg.c    |  6 +++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c          |  6 +++---
>  drivers/gpu/drm/i915/i915_gem_gtt.h          |  5 ++++-
>  drivers/gpu/drm/i915/i915_gem_render_state.c |  4 ++--
>  drivers/gpu/drm/i915/i915_gem_stolen.c       |  7 ++++---
>  drivers/gpu/drm/i915/i915_vma.c              |  4 ++--
>  drivers/gpu/drm/i915/intel_lrc.c             |  5 +++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c      | 10 +++++-----
>  12 files changed, 34 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0424cee5c767..bad57279e817 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2092,7 +2092,7 @@ u32 i915_gem_get_ggtt_alignment(struct drm_i915_private *dev_priv, u32 size,
>  	 * if a fence register is needed for the object.
>  	 */
>  	if (INTEL_GEN(dev_priv) >= 4 || tiling_mode == I915_TILING_NONE)
> -		return 4096;
> +		return I915_GTT_MIN_ALIGNMENT;
>
>  	/*
>  	 * Previous chips need to be aligned to the size of the smallest
> @@ -3560,7 +3560,7 @@ i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
>  		return;
>
>  	if (--vma->obj->pin_display == 0)
> -		vma->display_alignment = 4096;
> +		vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
>
>  	/* Bump the LRU to try and avoid premature eviction whilst flipping  */
>  	if (!i915_vma_is_active(vma))
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index a8f6b8843bdf..d4780cdfb8a0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -341,7 +341,7 @@ __create_hw_context(struct drm_i915_private *dev_priv,
>  	if (HAS_GUC(dev_priv) && i915.enable_guc_loading)
>  		ctx->ggtt_offset_bias = GUC_WOPCM_TOP;
>  	else
> -		ctx->ggtt_offset_bias = 4096;
> +		ctx->ggtt_offset_bias = I915_GTT_PAGE_SIZE;
>
>  	return ctx;
>
> @@ -456,7 +456,8 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
>  		dev_priv->hw_context_size = 0;
>  	} else if (HAS_HW_CONTEXTS(dev_priv)) {
>  		dev_priv->hw_context_size =
> -			round_up(get_context_size(dev_priv), 4096);
> +			round_up(get_context_size(dev_priv),
> +				 I915_GTT_PAGE_SIZE);
>  		if (dev_priv->hw_context_size > (1<<20)) {
>  			DRM_DEBUG_DRIVER("Disabling HW Contexts; invalid size %d\n",
>  					 dev_priv->hw_context_size);
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 50129ec1caab..70ba92f91148 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -263,9 +263,9 @@ int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
>  	if (check_color) {
>  		/* Expand search to cover neighbouring guard pages (or lack!) */
>  		if (start > target->vm->start)
> -			start -= 4096;
> +			start -= I915_GTT_PAGE_SIZE;
>  		if (end < target->vm->start + target->vm->total)
> -			end += 4096;
> +			end += I915_GTT_PAGE_SIZE;
>  	}
>
>  	drm_mm_for_each_node_in_range(node, &target->vm->mm, start, end) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index a5fe299da1d3..c6922a5f0514 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -438,7 +438,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
>  			memset(&cache->node, 0, sizeof(cache->node));
>  			ret = drm_mm_insert_node_in_range_generic
>  				(&ggtt->base.mm, &cache->node,
> -				 4096, 0, I915_COLOR_UNEVICTABLE,
> +				 PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,

This one is both really, hardcodes that they have to be equal. Hm, yeah 
don't know. Still there is value in doing this patch so leave the 
problem for later.

>  				 0, ggtt->mappable_end,
>  				 DRM_MM_SEARCH_DEFAULT,
>  				 DRM_MM_CREATE_DEFAULT);
> diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
> index 6de527df6cdc..67835d7b39c7 100644
> --- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
> +++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
> @@ -81,11 +81,11 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
>  		unsigned int stride = i915_gem_object_get_stride(vma->obj);
>
>  		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
> -		GEM_BUG_ON(vma->node.start & 4095);
> -		GEM_BUG_ON(vma->fence_size & 4095);
> +		GEM_BUG_ON(vma->node.start & (I915_GTT_PAGE_SIZE - 1));
> +		GEM_BUG_ON(vma->fence_size & (I915_GTT_PAGE_SIZE - 1));
>  		GEM_BUG_ON(stride & 127);
>
> -		val = (vma->node.start + vma->fence_size - 4096) << 32;
> +		val = (vma->node.start + vma->fence_size - I915_GTT_PAGE_SIZE) << 32;
>  		val |= vma->node.start;
>  		val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
>  		if (tiling == I915_TILING_Y)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 2c579946447c..116e43bffdb1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2716,11 +2716,11 @@ static void i915_gtt_color_adjust(const struct drm_mm_node *node,
>  				  u64 *end)
>  {
>  	if (node->color != color)
> -		*start += 4096;
> +		*start += I915_GTT_PAGE_SIZE;
>
>  	node = list_next_entry(node, node_list);
>  	if (node->allocated && node->color != color)
> -		*end -= 4096;
> +		*end -= I915_GTT_PAGE_SIZE;
>  }
>
>  int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
> @@ -2747,7 +2747,7 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
>  	/* Reserve a mappable slot for our lockless error capture */
>  	ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm,
>  						  &ggtt->error_capture,
> -						  4096, 0,
> +						  PAGE_SIZE, 0,
>  						  I915_COLOR_UNEVICTABLE,
>  						  0, ggtt->mappable_end,
>  						  0, 0);
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index a4070c27d69b..e217b14e14fd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -40,6 +40,9 @@
>  #include "i915_gem_timeline.h"
>  #include "i915_gem_request.h"
>
> +#define I915_GTT_PAGE_SIZE 4096
> +#define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
> +
>  #define I915_FENCE_REG_NONE -1
>  #define I915_MAX_NUM_FENCES 32
>  /* 32 fences + sign bit for FENCE_REG_NONE */
> @@ -563,6 +566,6 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
>  #define PIN_HIGH		BIT(9)
>  #define PIN_OFFSET_BIAS		BIT(10)
>  #define PIN_OFFSET_FIXED	BIT(11)
> -#define PIN_OFFSET_MASK		(~4095)
> +#define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)
>
>  #endif
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index 5af19b0bf713..63ae7e813335 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -187,14 +187,14 @@ int i915_gem_render_state_init(struct intel_engine_cs *engine)
>  	if (!rodata)
>  		return 0;
>
> -	if (rodata->batch_items * 4 > 4096)
> +	if (rodata->batch_items * 4 > PAGE_SIZE)
>  		return -EINVAL;
>
>  	so = kmalloc(sizeof(*so), GFP_KERNEL);
>  	if (!so)
>  		return -ENOMEM;
>
> -	obj = i915_gem_object_create_internal(engine->i915, 4096);
> +	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
>  	if (IS_ERR(obj)) {
>  		ret = PTR_ERR(obj);
>  		goto err_free;
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 5fd851336a99..be611f1367b4 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -482,7 +482,7 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
>  	stolen_usable_start = 0;
>  	/* WaSkipStolenMemoryFirstPage:bdw+ */
>  	if (INTEL_GEN(dev_priv) >= 8)
> -		stolen_usable_start = 4096;
> +		stolen_usable_start = PAGE_SIZE;
>
>  	ggtt->stolen_usable_size =
>  		ggtt->stolen_size - reserved_total - stolen_usable_start;
> @@ -644,8 +644,9 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
>  			stolen_offset, gtt_offset, size);
>
>  	/* KISS and expect everything to be page-aligned */
> -	if (WARN_ON(size == 0) || WARN_ON(size & 4095) ||
> -	    WARN_ON(stolen_offset & 4095))
> +	if (WARN_ON(size == 0) ||
> +	    WARN_ON(size & (I915_GTT_PAGE_SIZE - 1)) ||
> +	    WARN_ON(stolen_offset & (I915_GTT_MIN_ALIGNMENT - 1)))
>  		return NULL;
>
>  	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index ae00d9f4e520..6596d0963da1 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -91,7 +91,7 @@ __i915_vma_create(struct drm_i915_gem_object *obj,
>  	vma->vm = vm;
>  	vma->obj = obj;
>  	vma->size = obj->base.size;
> -	vma->display_alignment = 4096;
> +	vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
>
>  	if (view) {
>  		vma->ggtt_view = *view;
> @@ -435,7 +435,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>  		 * with zero alignment, so where possible use the optimal
>  		 * path.
>  		 */
> -		if (alignment <= 4096)
> +		if (alignment <= I915_GTT_MIN_ALIGNMENT)
>  			alignment = 0;

Luckily for the comment, or this would look a bit strange. :)

>
>  search_free:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 600518ee89db..c809eda65f61 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1937,7 +1937,7 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
>  	engine->emit_breadcrumb = gen8_emit_breadcrumb_render;
>  	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_render_sz;
>
> -	ret = intel_engine_create_scratch(engine, 4096);
> +	ret = intel_engine_create_scratch(engine, PAGE_SIZE);
>  	if (ret)
>  		return ret;
>
> @@ -2219,7 +2219,8 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>
>  	WARN_ON(ce->state);
>
> -	context_size = round_up(intel_lr_context_size(engine), 4096);
> +	context_size = round_up(intel_lr_context_size(engine),
> +				I915_GTT_PAGE_SIZE);
>
>  	/* One extra page as the sharing data between driver and GuC */
>  	context_size += PAGE_SIZE * LRC_PPHWSP_PN;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 0971ac396b60..ab83fc22d207 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1736,7 +1736,7 @@ static int init_status_page(struct intel_engine_cs *engine)
>  	void *vaddr;
>  	int ret;
>
> -	obj = i915_gem_object_create_internal(engine->i915, 4096);
> +	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
>  	if (IS_ERR(obj)) {
>  		DRM_ERROR("Failed to allocate status page\n");
>  		return PTR_ERR(obj);
> @@ -1777,7 +1777,7 @@ static int init_status_page(struct intel_engine_cs *engine)
>
>  	engine->status_page.vma = vma;
>  	engine->status_page.ggtt_offset = i915_ggtt_offset(vma);
> -	engine->status_page.page_addr = memset(vaddr, 0, 4096);
> +	engine->status_page.page_addr = memset(vaddr, 0, PAGE_SIZE);
>
>  	DRM_DEBUG_DRIVER("%s hws offset: 0x%08x\n",
>  			 engine->name, i915_ggtt_offset(vma));
> @@ -2049,7 +2049,7 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
>  	}
>
>  	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
> -	ret = intel_ring_pin(ring, 4096);
> +	ret = intel_ring_pin(ring, I915_GTT_PAGE_SIZE);
>  	if (ret) {
>  		intel_ring_free(ring);
>  		goto error;
> @@ -2466,7 +2466,7 @@ static void intel_ring_init_semaphores(struct drm_i915_private *dev_priv,
>  	if (INTEL_GEN(dev_priv) >= 8 && !dev_priv->semaphore) {
>  		struct i915_vma *vma;
>
> -		obj = i915_gem_object_create(dev_priv, 4096);
> +		obj = i915_gem_object_create(dev_priv, PAGE_SIZE);
>  		if (IS_ERR(obj))
>  			goto err;
>
> @@ -2683,7 +2683,7 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
>  		return ret;
>
>  	if (INTEL_GEN(dev_priv) >= 6) {
> -		ret = intel_engine_create_scratch(engine, 4096);
> +		ret = intel_engine_create_scratch(engine, PAGE_SIZE);
>  		if (ret)
>  			return ret;
>  	} else if (HAS_BROKEN_CS_TLB(dev_priv)) {
>

Replacing of 4096-es with names is welcomed, I am just not sure we need 
two. Since it will be fixed to eternity perhaps we could just have 
PAGE_SIZE for brevity. Not sure. Second opinions please! :)

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE
  2017-01-09 16:19   ` Tvrtko Ursulin
@ 2017-01-09 16:28     ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-01-09 16:28 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Jan 09, 2017 at 04:19:03PM +0000, Tvrtko Ursulin wrote:
> >diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >index a5fe299da1d3..c6922a5f0514 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >@@ -438,7 +438,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
> > 			memset(&cache->node, 0, sizeof(cache->node));
> > 			ret = drm_mm_insert_node_in_range_generic
> > 				(&ggtt->base.mm, &cache->node,
> >-				 4096, 0, I915_COLOR_UNEVICTABLE,
> >+				 PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,
> 
> This one is both really, hardcodes that they have to be equal. Hm,
> yeah don't know. Still there is value in doing this patch so leave
> the problem for later.

Hmm, true. Such usage does pose interesting times ahead. We will
probably need a MIN_GTT_PAGE_SIZE == PAGE_SIZE.

> Replacing of 4096-es with names is welcomed, I am just not sure we
> need two. Since it will be fixed to eternity perhaps we could just
> have PAGE_SIZE for brevity. Not sure. Second opinions please! :)

GTT_PAGE_SIZE may become variable, which is the fun. Not sure if
MIN_ALIGNMENT will change with it or be implementation specific. This
pass was more to do with splitting the actual PAGE_SIZE from our GTT
units. There are still a few 4096s that I don't know the meaning behind.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2017-01-09 16:28 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-31 12:06 Small GEM fixes Chris Wilson
2016-12-31 12:06 ` [PATCH 01/26] drm/i915: Assert that we do create the deferred context Chris Wilson
2016-12-31 12:06 ` [PATCH 02/26] drm/i915: Move a few utility macros into a separate header Chris Wilson
2016-12-31 12:06 ` [PATCH 03/26] drm/i915: Use fixed-sized types for stolen Chris Wilson
2016-12-31 12:06 ` [PATCH 04/26] drm/i915: Use range_overflows() Chris Wilson
2016-12-31 12:06 ` [PATCH 05/26] drm/i915/guc: Exclude the upper end of the Global GTT for the GuC Chris Wilson
2017-01-02 15:25   ` Arkadiusz Hiler
2016-12-31 12:06 ` [PATCH 06/26] drm/i915: Purge loose pages if we run out of DMA remap space Chris Wilson
2016-12-31 12:06 ` [PATCH 07/26] drm/i915: Fix phys pwrite for struct_mutex-less operation Chris Wilson
2016-12-31 12:06 ` [PATCH 08/26] drm/i915: Drain freed objects for mmap space exhaustion Chris Wilson
2016-12-31 12:06 ` [PATCH 09/26] drm/i915: Simplify testing for am-I-the-kernel-context? Chris Wilson
2016-12-31 12:06 ` [PATCH 10/26] drm/i915: Name the anonymous structs inside i915_ggtt_view Chris Wilson
2016-12-31 12:06 ` [PATCH 11/26] drm/i915: Pack the partial view size and offset into a single u64 Chris Wilson
2016-12-31 12:06 ` [PATCH 12/26] drm/i915: Convert i915_ggtt_view to use an anonymous union Chris Wilson
2016-12-31 12:06 ` [PATCH 13/26] drm/i915: Eliminate superfluous i915_ggtt_view_rotated Chris Wilson
2016-12-31 12:06 ` [PATCH 14/26] drm/i915: Eliminate superfluous i915_ggtt_view_normal Chris Wilson
2016-12-31 12:06 ` [PATCH 15/26] drm/i915: Extact compute_partial_view() Chris Wilson
2016-12-31 12:06 ` [PATCH 16/26] drm/i915: Clip the partial view against the object not vma Chris Wilson
2016-12-31 12:06 ` [PATCH 17/26] drm/i915: Align GGTT sizes to a fence tile row Chris Wilson
2016-12-31 12:06 ` [PATCH 18/26] drm/i915: Replace WARNs in fence register writes with extensive asserts Chris Wilson
2016-12-31 12:06 ` [PATCH 19/26] drm/i915: Store required fence size/alignment for GGTT vma Chris Wilson
2016-12-31 12:06 ` [PATCH 20/26] drm/i915: Remove the rounding down of the gen4+ fence region Chris Wilson
2016-12-31 12:06 ` [PATCH 21/26] drm/i915: Include ioctl in set-tiling and get-tiling function names Chris Wilson
2017-01-09 15:12   ` Tvrtko Ursulin
2016-12-31 12:06 ` [PATCH 22/26] drm/i915: Split out i915_gem_object_set_tiling() Chris Wilson
2017-01-09 15:43   ` Tvrtko Ursulin
2017-01-09 15:58     ` Chris Wilson
2017-01-09 16:10       ` Tvrtko Ursulin
2016-12-31 12:06 ` [PATCH 23/26] drm/i915: Construct a request even if the GPU is currently hung Chris Wilson
2016-12-31 12:07 ` [PATCH 24/26] drm/i915: Skip switch to kernel context if already done Chris Wilson
2016-12-31 12:07 ` [PATCH 25/26] drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Chris Wilson
2017-01-09 16:19   ` Tvrtko Ursulin
2017-01-09 16:28     ` Chris Wilson
2016-12-31 12:07 ` [PATCH 26/26] drm/i915: Use the MRU stack search after evicting Chris Wilson
2016-12-31 13:23 ` ✗ Fi.CI.BAT: failure for series starting with [01/26] drm/i915: Assert that we do create the deferred context Patchwork
2016-12-31 13:29   ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.