* Hopefully the last round of stolen work
@ 2012-11-15 11:32 Chris Wilson
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

So nearly everything has been reviewed, thanks mostly to Ben and Jesse.
Both Ben and Paulo have tested the patches, and a smattering of bug
reporters have, unbeknownst to them, also provided testing feedback.
It has been running on all of my machines for several months. The
stolen handling is a prerequisite for fastboot.

The big changes since last time are that a few patches have been
reworked to hopefully make Ben happier, notably
  03/16: Fix detection of base of stolen memory
  05/16: Delay allocation of stolen space for FBC
and the realisation that there is one mmu-notifier callback which is
invoked outside of the mm locks and which we can therefore use to sleep
on GPU activity before the page tables are modified. Hence, provided
the mmu-notifier is present, we can do userptr safely for normal users.
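
For the curious, the shape of that hook is roughly the following (a
hedged sketch against the mmu-notifier API of the day, with
hypothetical i915 names; the real userptr patch differs in detail):

static void i915_userptr_mn_invalidate_range_start(struct mmu_notifier *mn,
						   struct mm_struct *mm,
						   unsigned long start,
						   unsigned long end)
{
	/* Invoked without the mm locks held, so we are allowed to sleep:
	 * wait for the GPU to finish with any userptr objects overlapping
	 * [start, end) before the page tables are modified.
	 */
	i915_userptr_wait_for_range(mn, start, end); /* hypothetical helper */
}

static const struct mmu_notifier_ops i915_userptr_mn_ops = {
	.invalidate_range_start = i915_userptr_mn_invalidate_range_start,
};
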
-Chris


* [PATCH 01/16] drm: Introduce drm_mm_create_block()
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Airlie, dri-devel

To be used later by i915 to preallocate exact blocks of space from the
range manager.
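
For illustration only, the intended use by a driver looks something
like this (hypothetical caller; locking and error handling elided):

	struct drm_mm_node *node;

	/* Claim an exact, pre-determined range (e.g. one already in use
	 * by the BIOS) from the range manager.
	 */
	node = drm_mm_create_block(mm, offset, size, false);
	if (node == NULL)
		return -ENOSPC; /* range occupied or outside any hole */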

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/drm_mm.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++
 include/drm/drm_mm.h     |    4 ++++
 2 files changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 0761a03..bd203b6 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -161,6 +161,56 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 	}
 }
 
+struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
+					unsigned long start,
+					unsigned long size,
+					bool atomic)
+{
+	struct drm_mm_node *hole, *node;
+	unsigned long end = start + size;
+
+	list_for_each_entry(hole, &mm->hole_stack, hole_stack) {
+		unsigned long hole_start;
+		unsigned long hole_end;
+
+		BUG_ON(!hole->hole_follows);
+		hole_start = drm_mm_hole_node_start(hole);
+		hole_end = drm_mm_hole_node_end(hole);
+
+		if (hole_start > start || hole_end < end)
+			continue;
+
+		node = drm_mm_kmalloc(mm, atomic);
+		if (unlikely(node == NULL))
+			return NULL;
+
+		node->start = start;
+		node->size = size;
+		node->mm = mm;
+		node->allocated = 1;
+
+		INIT_LIST_HEAD(&node->hole_stack);
+		list_add(&node->node_list, &hole->node_list);
+
+		if (start == hole_start) {
+			hole->hole_follows = 0;
+			list_del_init(&hole->hole_stack);
+		}
+
+		node->hole_follows = 0;
+		if (end != hole_end) {
+			list_add(&node->hole_stack, &mm->hole_stack);
+			node->hole_follows = 1;
+		}
+
+		return node;
+	}
+
+	WARN(1, "no hole found for block 0x%lx + 0x%lx\n", start, size);
+	return NULL;
+}
+EXPORT_SYMBOL(drm_mm_create_block);
+
 struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
 					     unsigned long size,
 					     unsigned alignment,
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 06d7f79..4020f96 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -102,6 +102,10 @@ static inline bool drm_mm_initialized(struct drm_mm *mm)
 /*
  * Basic range manager support (drm_mm.c)
  */
+extern struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
+					       unsigned long start,
+					       unsigned long size,
+					       bool atomic);
 extern struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *node,
 						    unsigned long size,
 						    unsigned alignment,
-- 
1.7.10.4


* [PATCH 02/16] drm: Introduce an iterator over holes in the drm_mm range manager
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Airlie, dri-devel

This will be used by i915 in forthcoming patches in order to measure the
largest contiguous chunk of memory available for enabling chipset
features.

v2: Try to make the macro marginally safer and more readable by not
depending upon drm_mm_hole_node_end() being non-zero. Note that we need
to open code list_for_each() in order to update the hole_start and
hole_end variables on each iteration and keep the macro sane.

v3: Tidy up a few BUG_ONs that fell foul of adding additional tests to
drm_mm_hole_node_start().
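
As a rough sketch of the forthcoming i915 use (hypothetical, not part
of this patch), measuring the largest contiguous hole then reduces to:

	struct drm_mm_node *entry;
	unsigned long hole_start, hole_end;
	unsigned long max_size = 0;

	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
			     hole_start, hole_end) {
		if (hole_end - hole_start > max_size)
			max_size = hole_end - hole_start;
	}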

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/drm_mm.c |   62 ++++++++++++++++------------------------------
 include/drm/drm_mm.h     |   36 +++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index bd203b6..b751b8e 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -102,20 +102,6 @@ int drm_mm_pre_get(struct drm_mm *mm)
 }
 EXPORT_SYMBOL(drm_mm_pre_get);
 
-static inline unsigned long drm_mm_hole_node_start(struct drm_mm_node *hole_node)
-{
-	return hole_node->start + hole_node->size;
-}
-
-static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
-{
-	struct drm_mm_node *next_node =
-		list_entry(hole_node->node_list.next, struct drm_mm_node,
-			   node_list);
-
-	return next_node->start;
-}
-
 static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 				 struct drm_mm_node *node,
 				 unsigned long size, unsigned alignment,
@@ -127,7 +113,7 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 	unsigned long adj_start = hole_start;
 	unsigned long adj_end = hole_end;
 
-	BUG_ON(!hole_node->hole_follows || node->allocated);
+	BUG_ON(node->allocated);
 
 	if (mm->color_adjust)
 		mm->color_adjust(hole_node, color, &adj_start, &adj_end);
@@ -155,7 +141,7 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 	BUG_ON(node->start + node->size > adj_end);
 
 	node->hole_follows = 0;
-	if (node->start + node->size < hole_end) {
+	if (__drm_mm_hole_node_start(node) < hole_end) {
 		list_add(&node->hole_stack, &mm->hole_stack);
 		node->hole_follows = 1;
 	}
@@ -168,15 +154,10 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
 {
 	struct drm_mm_node *hole, *node;
 	unsigned long end = start + size;
+	unsigned long hole_start;
+	unsigned long hole_end;
 
-	list_for_each_entry(hole, &mm->hole_stack, hole_stack) {
-		unsigned long hole_start;
-		unsigned long hole_end;
-
-		BUG_ON(!hole->hole_follows);
-		hole_start = drm_mm_hole_node_start(hole);
-		hole_end = drm_mm_hole_node_end(hole);
-
+	drm_mm_for_each_hole(hole, mm, hole_start, hole_end) {
 		if (hole_start > start || hole_end < end)
 			continue;
 
@@ -293,7 +274,7 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
 	BUG_ON(node->start + node->size > end);
 
 	node->hole_follows = 0;
-	if (node->start + node->size < hole_end) {
+	if (__drm_mm_hole_node_start(node) < hole_end) {
 		list_add(&node->hole_stack, &mm->hole_stack);
 		node->hole_follows = 1;
 	}
@@ -358,12 +339,13 @@ void drm_mm_remove_node(struct drm_mm_node *node)
 	    list_entry(node->node_list.prev, struct drm_mm_node, node_list);
 
 	if (node->hole_follows) {
-		BUG_ON(drm_mm_hole_node_start(node)
-				== drm_mm_hole_node_end(node));
+		BUG_ON(__drm_mm_hole_node_start(node) ==
+		       __drm_mm_hole_node_end(node));
 		list_del(&node->hole_stack);
 	} else
-		BUG_ON(drm_mm_hole_node_start(node)
-				!= drm_mm_hole_node_end(node));
+		BUG_ON(__drm_mm_hole_node_start(node) !=
+		       __drm_mm_hole_node_end(node));
+
 
 	if (!prev_node->hole_follows) {
 		prev_node->hole_follows = 1;
@@ -421,6 +403,8 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
 {
 	struct drm_mm_node *entry;
 	struct drm_mm_node *best;
+	unsigned long adj_start;
+	unsigned long adj_end;
 	unsigned long best_size;
 
 	BUG_ON(mm->scanned_blocks);
@@ -428,17 +412,13 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
 	best = NULL;
 	best_size = ~0UL;
 
-	list_for_each_entry(entry, &mm->hole_stack, hole_stack) {
-		unsigned long adj_start = drm_mm_hole_node_start(entry);
-		unsigned long adj_end = drm_mm_hole_node_end(entry);
-
+	drm_mm_for_each_hole(entry, mm, adj_start, adj_end) {
 		if (mm->color_adjust) {
 			mm->color_adjust(entry, color, &adj_start, &adj_end);
 			if (adj_end <= adj_start)
 				continue;
 		}
 
-		BUG_ON(!entry->hole_follows);
 		if (!check_free_hole(adj_start, adj_end, size, alignment))
 			continue;
 
@@ -465,6 +445,8 @@ struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
 {
 	struct drm_mm_node *entry;
 	struct drm_mm_node *best;
+	unsigned long adj_start;
+	unsigned long adj_end;
 	unsigned long best_size;
 
 	BUG_ON(mm->scanned_blocks);
@@ -472,13 +454,11 @@ struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
 	best = NULL;
 	best_size = ~0UL;
 
-	list_for_each_entry(entry, &mm->hole_stack, hole_stack) {
-		unsigned long adj_start = drm_mm_hole_node_start(entry) < start ?
-			start : drm_mm_hole_node_start(entry);
-		unsigned long adj_end = drm_mm_hole_node_end(entry) > end ?
-			end : drm_mm_hole_node_end(entry);
-
-		BUG_ON(!entry->hole_follows);
+	drm_mm_for_each_hole(entry, mm, adj_start, adj_end) {
+		if (adj_start < start)
+			adj_start = start;
+		if (adj_end > end)
+			adj_end = end;
 
 		if (mm->color_adjust) {
 			mm->color_adjust(entry, color, &adj_start, &adj_end);
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 4020f96..cd45365 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -89,6 +89,29 @@ static inline bool drm_mm_initialized(struct drm_mm *mm)
 {
 	return mm->hole_stack.next;
 }
+
+static inline unsigned long __drm_mm_hole_node_start(struct drm_mm_node *hole_node)
+{
+	return hole_node->start + hole_node->size;
+}
+
+static inline unsigned long drm_mm_hole_node_start(struct drm_mm_node *hole_node)
+{
+	BUG_ON(!hole_node->hole_follows);
+	return __drm_mm_hole_node_start(hole_node);
+}
+
+static inline unsigned long __drm_mm_hole_node_end(struct drm_mm_node *hole_node)
+{
+	return list_entry(hole_node->node_list.next,
+			  struct drm_mm_node, node_list)->start;
+}
+
+static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
+{
+	return __drm_mm_hole_node_end(hole_node);
+}
+
 #define drm_mm_for_each_node(entry, mm) list_for_each_entry(entry, \
 						&(mm)->head_node.node_list, \
 						node_list)
@@ -99,6 +122,19 @@ static inline bool drm_mm_initialized(struct drm_mm *mm)
 	     entry != NULL; entry = next, \
 		next = entry ? list_entry(entry->node_list.next, \
 			struct drm_mm_node, node_list) : NULL) \
+
+/* Note that we need to unroll list_for_each_entry in order to inline
+ * setting hole_start and hole_end on each iteration and keep the
+ * macro sane.
+ */
+#define drm_mm_for_each_hole(entry, mm, hole_start, hole_end) \
+	for (entry = list_entry((mm)->hole_stack.next, struct drm_mm_node, hole_stack); \
+	     &entry->hole_stack != &(mm)->hole_stack ? \
+	     hole_start = drm_mm_hole_node_start(entry), \
+	     hole_end = drm_mm_hole_node_end(entry), \
+	     1 : 0; \
+	     entry = list_entry(entry->hole_stack.next, struct drm_mm_node, hole_stack))
+
 /*
  * Basic range manager support (drm_mm.c)
  */
-- 
1.7.10.4


* [PATCH 03/16] drm/i915: Fix detection of base of stolen memory
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

The routine to query the base of stolen memory was using the wrong
registers and the wrong encodings on virtually every platform.

It was not until the G33 refresh that a PCI config register was
introduced that explicitly said where the stolen memory was. Prior to
865G there was not even a register that said where the end of usable
low memory was and where the stolen memory began (or ended, depending
upon the chipset). Before then, one had to look at the BIOS memory maps
to find the Top of Memory. Alas, that is not exported by arch/x86 and
so we have to resort to disabling stolen memory on gen2 for the time
being.

Then SandyBridge enlarged the PCI register to a full 32 bits and
changed the encoding of the address, so even though we happened to be
querying the right register, we read the wrong bits and ended up using
address 0 for our stolen data, notably for FBC.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 +
 drivers/gpu/drm/i915/i915_gem_stolen.c |   81 ++++++++++++++++----------------
 2 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db6d71c..500066a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -765,6 +765,7 @@ typedef struct drm_i915_private {
 		unsigned long gtt_start;
 		unsigned long gtt_mappable_end;
 		unsigned long gtt_end;
+		unsigned long stolen_base; /* limited to low memory (32-bit) */
 
 		struct io_mapping *gtt_mapping;
 		phys_addr_t gtt_base_addr;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 8e91083..be24312 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -42,56 +42,50 @@
  * for is a boon.
  */
 
-#define PTE_ADDRESS_MASK		0xfffff000
-#define PTE_ADDRESS_MASK_HIGH		0x000000f0 /* i915+ */
-#define PTE_MAPPING_TYPE_UNCACHED	(0 << 1)
-#define PTE_MAPPING_TYPE_DCACHE		(1 << 1) /* i830 only */
-#define PTE_MAPPING_TYPE_CACHED		(3 << 1)
-#define PTE_MAPPING_TYPE_MASK		(3 << 1)
-#define PTE_VALID			(1 << 0)
-
-/**
- * i915_stolen_to_phys - take an offset into stolen memory and turn it into
- *                       a physical one
- * @dev: drm device
- * @offset: address to translate
- *
- * Some chip functions require allocations from stolen space and need the
- * physical address of the memory in question.
- */
-static unsigned long i915_stolen_to_phys(struct drm_device *dev, u32 offset)
+static unsigned long i915_stolen_to_physical(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct pci_dev *pdev = dev_priv->bridge_dev;
 	u32 base;
 
-#if 0
 	/* On the machines I have tested the Graphics Base of Stolen Memory
-	 * is unreliable, so compute the base by subtracting the stolen memory
-	 * from the Top of Low Usable DRAM which is where the BIOS places
-	 * the graphics stolen memory.
+	 * is unreliable, so on those compute the base by subtracting the
+	 * stolen memory from the Top of Low Usable DRAM which is where the
+	 * BIOS places the graphics stolen memory.
+	 *
+	 * On gen2, the layout is slightly different with the Graphics Segment
+	 * immediately following Top of Memory (or Top of Usable DRAM). Note
+	 * it appears that TOUD is only reported by 865g, so we just use the
+	 * top of memory as determined by the e820 probe.
+	 *
+	 * XXX gen2 requires an unavailable symbol and 945gm fails with
+	 * its value of TOLUD.
 	 */
-	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
-		/* top 32bits are reserved = 0 */
+	base = 0;
+	if (INTEL_INFO(dev)->gen >= 6) {
+		/* Read Base Data of Stolen Memory Register (BDSM) directly.
+		 * Note that there is also a MCHBAR mirror at 0x1080c0 or
+		 * we could use device 2:0x5c instead.
+		*/
+		pci_read_config_dword(pdev, 0xB0, &base);
+		base &= ~4095; /* lower bits used for locking register */
+	} else if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
+		/* Read Graphics Base of Stolen Memory directly */
 		pci_read_config_dword(pdev, 0xA4, &base);
-	} else {
-		/* XXX presume 8xx is the same as i915 */
-		pci_bus_read_config_dword(pdev->bus, 2, 0x5C, &base);
-	}
-#else
-	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
-		u16 val;
-		pci_read_config_word(pdev, 0xb0, &val);
-		base = val >> 4 << 20;
-	} else {
+#if 0
+	} else if (IS_GEN3(dev)) {
 		u8 val;
+		/* Stolen is immediately below Top of Low Usable DRAM */
 		pci_read_config_byte(pdev, 0x9c, &val);
 		base = val >> 3 << 27;
-	}
-	base -= dev_priv->mm.gtt->stolen_size;
+		base -= dev_priv->mm.gtt->stolen_size;
+	} else {
+		/* Stolen is immediately above Top of Memory */
+		base = max_low_pfn_mapped << PAGE_SHIFT;
 #endif
+	}
 
-	return base + offset;
+	return base;
 }
 
 static void i915_warn_stolen(struct drm_device *dev)
@@ -116,7 +110,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 	if (!compressed_fb)
 		goto err;
 
-	cfb_base = i915_stolen_to_phys(dev, compressed_fb->start);
+	cfb_base = dev_priv->mm.stolen_base + compressed_fb->start;
 	if (!cfb_base)
 		goto err_fb;
 
@@ -129,7 +123,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 		if (!compressed_llb)
 			goto err_fb;
 
-		ll_base = i915_stolen_to_phys(dev, compressed_llb->start);
+		ll_base = dev_priv->mm.stolen_base + compressed_llb->start;
 		if (!ll_base)
 			goto err_llb;
 	}
@@ -148,7 +142,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 	}
 
 	DRM_DEBUG_KMS("FBC base 0x%08lx, ll base 0x%08lx, size %dM\n",
-		      cfb_base, ll_base, size >> 20);
+		      (long)cfb_base, (long)ll_base, size >> 20);
 	return;
 
 err_llb:
@@ -180,6 +174,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned long prealloc_size = dev_priv->mm.gtt->stolen_size;
 
+	dev_priv->mm.stolen_base = i915_stolen_to_physical(dev);
+	if (dev_priv->mm.stolen_base == 0)
+		return 0;
+
+	DRM_DEBUG_KMS("found %d bytes of stolen memory at %08lx\n",
+		      dev_priv->mm.gtt->stolen_size, dev_priv->mm.stolen_base);
+
 	/* Basic memrange allocator for stolen space */
 	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
 
-- 
1.7.10.4


* [PATCH 04/16] drm/i915: Avoid clearing preallocated regions from the GTT
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

As yet we do not do any preallocation (chicken-and-egg problem), but we
may like to preserve anything already allocated by the BIOS or GRUB and
reuse it for our own purposes after initialising the driver.
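
The counterpart arrives in a later patch; purely as a hedged sketch
(names as used in this series, flow simplified), anything inherited
from the BIOS is recorded on the bound list with the I915_GTT_RESERVED
placeholder before the GTT range manager exists:

	/* Hypothetical setup performed before i915_gem_init_global_gtt():
	 * remember where the BIOS placed the object; the placeholder is
	 * swapped for a real drm_mm node once the GTT is initialised.
	 */
	obj->gtt_offset = gtt_offset;
	obj->gtt_space = I915_GTT_RESERVED;
	list_add_tail(&obj->gtt_list, &dev_priv->mm.bound_list);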

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |    2 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c |   32 +++++++++++++++++++++++++++++---
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 500066a..db5fb57 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -935,6 +935,8 @@ enum i915_cache_level {
 	I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */
 };
 
+#define I915_GTT_RESERVED ((struct drm_mm_node *)0x1)
+
 struct drm_i915_gem_object_ops {
 	/* Interface between the GEM object and its backing storage.
 	 * get_pages() is called once prior to the use of the associated set
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 35fec1e..612d735 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -529,20 +529,46 @@ void i915_gem_init_global_gtt(struct drm_device *dev,
 			      unsigned long end)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_mm_node *entry;
+	struct drm_i915_gem_object *obj;
+	unsigned long hole_start, hole_end;
 
-	/* Substract the guard page ... */
+	/* Subtract the guard page ... */
 	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - PAGE_SIZE);
 	if (!HAS_LLC(dev))
 		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
 
+	/* Mark any preallocated objects as occupied */
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
+		DRM_DEBUG_KMS("reserving preallocated space: %x + %zx\n",
+			      obj->gtt_offset, obj->base.size);
+
+		BUG_ON(obj->gtt_space != I915_GTT_RESERVED);
+		obj->gtt_space = drm_mm_create_block(&dev_priv->mm.gtt_space,
+						     obj->gtt_offset,
+						     obj->base.size,
+						     false);
+		obj->has_global_gtt_mapping = 1;
+	}
+
 	dev_priv->mm.gtt_start = start;
 	dev_priv->mm.gtt_mappable_end = mappable_end;
 	dev_priv->mm.gtt_end = end;
 	dev_priv->mm.gtt_total = end - start;
 	dev_priv->mm.mappable_gtt_total = min(end, mappable_end) - start;
 
-	/* ... but ensure that we clear the entire range. */
-	i915_ggtt_clear_range(dev, start / PAGE_SIZE, (end-start) / PAGE_SIZE);
+	/* Clear any non-preallocated blocks */
+	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
+			     hole_start, hole_end) {
+		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
+			      hole_start, hole_end);
+		i915_ggtt_clear_range(dev,
+				      hole_start / PAGE_SIZE,
+				      (hole_end-hole_start) / PAGE_SIZE);
+	}
+
+	/* And finally clear the reserved guard page */
+	i915_ggtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
 }
 
 static int setup_scratch_page(struct drm_device *dev)
-- 
1.7.10.4


* [PATCH 05/16] drm/i915: Delay allocation of stolen space for FBC
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

As FBC is commonly disabled due to chipset limitations upon certain
output configurations, on many systems FBC is never enabled. For those
systems, it is advantageous to make use of the stolen memory for other
objects and so we defer allocation of the FBC chunk until we actually
require it. This increases the likelihood of that allocation failing,
but that in turn means that we are already taking advantage of the
stolen memory!

As well as delaying the allocation from driver initialisation until the
first use of FBC, we also return the stolen block after we finish using
it - allowing greater flexibility in our usage of stolen space. A side
effect of this is that we can then attempt to allocate only the required
amount of space (with a little slack to reduce reallocation rate and
avoid fragmentation).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    2 +
 drivers/gpu/drm/i915/i915_gem_stolen.c |  108 +++++++++++++++-----------------
 drivers/gpu/drm/i915/intel_display.c   |    3 +
 drivers/gpu/drm/i915/intel_pm.c        |   15 +++--
 4 files changed, 64 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db5fb57..196d1b7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1560,6 +1560,8 @@ int i915_gem_evict_everything(struct drm_device *dev);
 
 /* i915_gem_stolen.c */
 int i915_gem_init_stolen(struct drm_device *dev);
+int i915_gem_stolen_setup_compression(struct drm_device *dev, int size);
+void i915_gem_stolen_cleanup_compression(struct drm_device *dev);
 void i915_gem_cleanup_stolen(struct drm_device *dev);
 
 /* i915_gem_tiling.c */
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index be24312..10ca473 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -88,33 +88,27 @@ static unsigned long i915_stolen_to_physical(struct drm_device *dev)
 	return base;
 }
 
-static void i915_warn_stolen(struct drm_device *dev)
-{
-	DRM_INFO("not enough stolen space for compressed buffer, disabling\n");
-	DRM_INFO("hint: you may be able to increase stolen memory size in the BIOS to avoid this\n");
-}
-
-static void i915_setup_compression(struct drm_device *dev, int size)
+static int i915_setup_compression(struct drm_device *dev, int size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_mm_node *compressed_fb, *uninitialized_var(compressed_llb);
-	unsigned long cfb_base;
-	unsigned long ll_base = 0;
 
-	/* Just in case the BIOS is doing something questionable. */
-	intel_disable_fbc(dev);
-
-	compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
+	/* Try to over-allocate to reduce reallocations and fragmentation */
+	compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
+					   size <<= 1, 4096, 0);
+	if (!compressed_fb)
+		compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
+						   size >>= 1, 4096, 0);
 	if (compressed_fb)
 		compressed_fb = drm_mm_get_block(compressed_fb, size, 4096);
 	if (!compressed_fb)
 		goto err;
 
-	cfb_base = dev_priv->mm.stolen_base + compressed_fb->start;
-	if (!cfb_base)
-		goto err_fb;
-
-	if (!(IS_GM45(dev) || HAS_PCH_SPLIT(dev))) {
+	if (HAS_PCH_SPLIT(dev))
+		I915_WRITE(ILK_DPFC_CB_BASE, compressed_fb->start);
+	else if (IS_GM45(dev)) {
+		I915_WRITE(DPFC_CB_BASE, compressed_fb->start);
+	} else {
 		compressed_llb = drm_mm_search_free(&dev_priv->mm.stolen,
 						    4096, 4096, 0);
 		if (compressed_llb)
@@ -123,56 +117,68 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 		if (!compressed_llb)
 			goto err_fb;
 
-		ll_base = dev_priv->mm.stolen_base + compressed_llb->start;
-		if (!ll_base)
-			goto err_llb;
+		dev_priv->compressed_llb = compressed_llb;
+
+		I915_WRITE(FBC_CFB_BASE,
+			   dev_priv->mm.stolen_base + compressed_fb->start);
+		I915_WRITE(FBC_LL_BASE,
+			   dev_priv->mm.stolen_base + compressed_llb->start);
 	}
 
+	dev_priv->compressed_fb = compressed_fb;
 	dev_priv->cfb_size = size;
 
-	dev_priv->compressed_fb = compressed_fb;
-	if (HAS_PCH_SPLIT(dev))
-		I915_WRITE(ILK_DPFC_CB_BASE, compressed_fb->start);
-	else if (IS_GM45(dev)) {
-		I915_WRITE(DPFC_CB_BASE, compressed_fb->start);
-	} else {
-		I915_WRITE(FBC_CFB_BASE, cfb_base);
-		I915_WRITE(FBC_LL_BASE, ll_base);
-		dev_priv->compressed_llb = compressed_llb;
-	}
+	DRM_DEBUG_KMS("reserved %d bytes of contiguous stolen space for FBC\n",
+		      size);
 
-	DRM_DEBUG_KMS("FBC base 0x%08lx, ll base 0x%08lx, size %dM\n",
-		      (long)cfb_base, (long)ll_base, size >> 20);
-	return;
+	return 0;
 
-err_llb:
-	drm_mm_put_block(compressed_llb);
 err_fb:
 	drm_mm_put_block(compressed_fb);
 err:
-	dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
-	i915_warn_stolen(dev);
+	return -ENOSPC;
+}
+
+int i915_gem_stolen_setup_compression(struct drm_device *dev, int size)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->mm.stolen_base == 0)
+		return -ENODEV;
+
+	if (size < dev_priv->cfb_size)
+		return 0;
+
+	/* Release any current block */
+	i915_gem_stolen_cleanup_compression(dev);
+
+	return i915_setup_compression(dev, size);
 }
 
-static void i915_cleanup_compression(struct drm_device *dev)
+void i915_gem_stolen_cleanup_compression(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	drm_mm_put_block(dev_priv->compressed_fb);
+	if (dev_priv->cfb_size == 0)
+		return;
+
+	if (dev_priv->compressed_fb)
+		drm_mm_put_block(dev_priv->compressed_fb);
+
 	if (dev_priv->compressed_llb)
 		drm_mm_put_block(dev_priv->compressed_llb);
+
+	dev_priv->cfb_size = 0;
 }
 
 void i915_gem_cleanup_stolen(struct drm_device *dev)
 {
-	if (I915_HAS_FBC(dev) && i915_powersave)
-		i915_cleanup_compression(dev);
+	i915_gem_stolen_cleanup_compression(dev);
 }
 
 int i915_gem_init_stolen(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long prealloc_size = dev_priv->mm.gtt->stolen_size;
 
 	dev_priv->mm.stolen_base = i915_stolen_to_physical(dev);
 	if (dev_priv->mm.stolen_base == 0)
@@ -182,21 +188,7 @@ int i915_gem_init_stolen(struct drm_device *dev)
 		      dev_priv->mm.gtt->stolen_size, dev_priv->mm.stolen_base);
 
 	/* Basic memrange allocator for stolen space */
-	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
-
-	/* Try to set up FBC with a reasonable compressed buffer size */
-	if (I915_HAS_FBC(dev) && i915_powersave) {
-		int cfb_size;
-
-		/* Leave 1M for line length buffer & misc. */
-
-		/* Try to get a 32M buffer... */
-		if (prealloc_size > (36*1024*1024))
-			cfb_size = 32*1024*1024;
-		else /* fall back to 7/8 of the stolen space */
-			cfb_size = prealloc_size * 7 / 8;
-		i915_setup_compression(dev, cfb_size);
-	}
+	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->mm.gtt->stolen_size);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 6d8a5ed..cd9cc89 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8773,6 +8773,9 @@ void intel_modeset_init(struct drm_device *dev)
 	/* Just disable it once at startup */
 	i915_disable_vga(dev);
 	intel_setup_outputs(dev);
+
+	/* Just in case the BIOS is doing something questionable. */
+	intel_disable_fbc(dev);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0edb549..5169158 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -440,12 +440,6 @@ void intel_update_fbc(struct drm_device *dev)
 		dev_priv->no_fbc_reason = FBC_MODULE_PARAM;
 		goto out_disable;
 	}
-	if (intel_fb->obj->base.size > dev_priv->cfb_size) {
-		DRM_DEBUG_KMS("framebuffer too large, disabling "
-			      "compression\n");
-		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
-		goto out_disable;
-	}
 	if ((crtc->mode.flags & DRM_MODE_FLAG_INTERLACE) ||
 	    (crtc->mode.flags & DRM_MODE_FLAG_DBLSCAN)) {
 		DRM_DEBUG_KMS("mode incompatible with compression, "
@@ -479,6 +473,14 @@ void intel_update_fbc(struct drm_device *dev)
 	if (in_dbg_master())
 		goto out_disable;
 
+	if (i915_gem_stolen_setup_compression(dev, intel_fb->obj->base.size)) {
+		DRM_INFO("not enough stolen space for compressed buffer (need %zd bytes), disabling\n", intel_fb->obj->base.size);
+		DRM_INFO("hint: you may be able to increase stolen memory size in the BIOS to avoid this\n");
+		DRM_DEBUG_KMS("framebuffer too large, disabling compression\n");
+		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
+		goto out_disable;
+	}
+
 	/* If the scanout has not changed, don't modify the FBC settings.
 	 * Note that we make the fundamental assumption that the fb->obj
 	 * cannot be unpinned (and have its GTT offset and fence revoked)
@@ -526,6 +528,7 @@ out_disable:
 		DRM_DEBUG_KMS("unsupported config, disabling FBC\n");
 		intel_disable_fbc(dev);
 	}
+	i915_gem_stolen_cleanup_compression(dev);
 }
 
 static void i915_pineview_get_mem_freq(struct drm_device *dev)
-- 
1.7.10.4


* [PATCH 06/16] drm/i915: Allow objects to be created with no backing pages, but stolen space
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

In order to accommodate objects that are not backed by struct pages, but
instead point into a contiguous region of stolen space, we need to make
various changes to avoid dereferencing obj->pages or obj->base.filp.

First introduce a marker for the stolen object that specifies its
offset into the stolen region and implies that it has no backing pages.
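
Later patches in the series then key off this marker, along the lines
of (simplified from the pread/pwrite handling below):

	if (obj->stolen) {
		/* contiguous physical backing: address it linearly */
		char *vaddr = (char *)dev_priv->mm.stolen_base +
			      obj->stolen->start;
		/* read or write through vaddr instead of obj->pages */
	}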

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |    2 ++
 drivers/gpu/drm/i915/i915_drv.h     |    2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 4568e7d..d6e9198 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -124,6 +124,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 	if (obj->gtt_space != NULL)
 		seq_printf(m, " (gtt offset: %08x, size: %08x)",
 			   obj->gtt_offset, (unsigned int)obj->gtt_space->size);
+	if (obj->stolen)
+		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
 		char s[3], *t = s;
 		if (obj->pin_mappable)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 196d1b7..9705380 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -962,6 +962,8 @@ struct drm_i915_gem_object {
 
 	/** Current space allocated to this object in the GTT, if any. */
 	struct drm_mm_node *gtt_space;
+	/** Stolen memory for this object, instead of being backed by shmem. */
+	struct drm_mm_node *stolen;
 	struct list_head gtt_list;
 
 	/** This object's place on the active/inactive lists */
-- 
1.7.10.4


* [PATCH 07/16] drm/i915: Differentiate between prime and stolen objects
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Stolen objects also share the property that they have no backing
shmemfs filp but, unlike prime objects, they can still be used with
pwrite/pread/gtt-mapping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h |    5 +++++
 drivers/gpu/drm/i915/i915_gem.c |    4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9705380..b29ee03 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1098,6 +1098,11 @@ struct drm_i915_gem_object {
 	atomic_t pending_flip;
 };
 
+inline static bool i915_gem_object_is_prime(struct drm_i915_gem_object *obj)
+{
+	return obj->base.import_attach != NULL;
+}
+
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cdcf19d..a6a2893 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -552,7 +552,7 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 	/* prime objects have no backing filp to GEM pread/pwrite
 	 * pages from.
 	 */
-	if (!obj->base.filp) {
+	if (i915_gem_object_is_prime(obj)) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -901,7 +901,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	/* prime objects have no backing filp to GEM pread/pwrite
 	 * pages from.
 	 */
-	if (!obj->base.filp) {
+	if (i915_gem_object_is_prime(obj)) {
 		ret = -EINVAL;
 		goto out;
 	}
-- 
1.7.10.4


* [PATCH 08/16] drm/i915: Support readback of stolen objects upon error
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2604867..205413f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -928,6 +928,14 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 						     reloc_offset);
 			memcpy_fromio(d, s, PAGE_SIZE);
 			io_mapping_unmap_atomic(s);
+		} else if (src->stolen) {
+			unsigned long offset;
+
+			offset = dev_priv->mm.stolen_base;
+			offset += src->stolen->start;
+			offset += i << PAGE_SHIFT;
+
+			memcpy_fromio(d, (void *)offset, PAGE_SIZE);
 		} else {
 			struct page *page;
 			void *s;
-- 
1.7.10.4


* [PATCH 09/16] drm/i915: Handle stolen objects in pwrite
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |  159 ++++++++++++++++++++++++---------------
 1 file changed, 100 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a6a2893..9e66e29 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -663,19 +663,17 @@ out:
  * needs_clflush_before is set and flushes out any written cachelines after
  * writing if needs_clflush is set. */
 static int
-shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
+shmem_pwrite_fast(char *vaddr, int shmem_page_offset, int page_length,
 		  char __user *user_data,
 		  bool page_do_bit17_swizzling,
 		  bool needs_clflush_before,
 		  bool needs_clflush_after)
 {
-	char *vaddr;
 	int ret;
 
 	if (unlikely(page_do_bit17_swizzling))
 		return -EINVAL;
 
-	vaddr = kmap_atomic(page);
 	if (needs_clflush_before)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
@@ -685,7 +683,6 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 	if (needs_clflush_after)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
-	kunmap_atomic(vaddr);
 
 	return ret ? -EFAULT : 0;
 }
@@ -693,16 +690,14 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 /* Only difference to the fast-path function is that this can handle bit17
  * and uses non-atomic copy and kmap functions. */
 static int
-shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
+shmem_pwrite_slow(char *vaddr, int shmem_page_offset, int page_length,
 		  char __user *user_data,
 		  bool page_do_bit17_swizzling,
 		  bool needs_clflush_before,
 		  bool needs_clflush_after)
 {
-	char *vaddr;
 	int ret;
 
-	vaddr = kmap(page);
 	if (unlikely(needs_clflush_before || page_do_bit17_swizzling))
 		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
 					     page_length,
@@ -719,7 +714,6 @@ shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
 		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
 					     page_length,
 					     page_do_bit17_swizzling);
-	kunmap(page);
 
 	return ret ? -EFAULT : 0;
 }
@@ -730,10 +724,11 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		      struct drm_i915_gem_pwrite *args,
 		      struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	ssize_t remain;
 	loff_t offset;
 	char __user *user_data;
-	int shmem_page_offset, page_length, ret = 0;
+	int page_length, ret = 0;
 	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
 	int hit_slowpath = 0;
 	int needs_clflush_after = 0;
@@ -769,74 +764,120 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	if (ret)
 		return ret;
 
-	i915_gem_object_pin_pages(obj);
-
 	offset = args->offset;
 	obj->dirty = 1;
 
-	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
-		struct page *page;
-		int partial_cacheline_write;
+	if (obj->stolen) {
+		char *vaddr;
 
-		if (i < offset >> PAGE_SHIFT)
-			continue;
+		vaddr = (char *)dev_priv->mm.stolen_base;
+		vaddr += obj->stolen->start + (offset & PAGE_MASK);
 
-		if (remain <= 0)
-			break;
+		offset = offset_in_page(offset);
+		while (remain > 0) {
+			int partial_cacheline_write;
 
-		/* Operation in this page
-		 *
-		 * shmem_page_offset = offset within page in shmem file
-		 * page_length = bytes to copy for this page
-		 */
-		shmem_page_offset = offset_in_page(offset);
+			page_length = remain;
+			if ((offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - offset;
 
-		page_length = remain;
-		if ((shmem_page_offset + page_length) > PAGE_SIZE)
-			page_length = PAGE_SIZE - shmem_page_offset;
+			/* If we don't overwrite a cacheline completely we need to be
+			 * careful to have up-to-date data by first clflushing. Don't
+			 * overcomplicate things and flush the entire patch. */
+			partial_cacheline_write = needs_clflush_before &&
+				((offset | page_length) & (boot_cpu_data.x86_clflush_size - 1));
 
-		/* If we don't overwrite a cacheline completely we need to be
-		 * careful to have up-to-date data by first clflushing. Don't
-		 * overcomplicate things and flush the entire patch. */
-		partial_cacheline_write = needs_clflush_before &&
-			((shmem_page_offset | page_length)
-				& (boot_cpu_data.x86_clflush_size - 1));
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				((uintptr_t)vaddr & (1 << 17)) != 0;
 
-		page = sg_page(sg);
-		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
-			(page_to_phys(page) & (1 << 17)) != 0;
+			ret = shmem_pwrite_fast(vaddr, offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
 
-		ret = shmem_pwrite_fast(page, shmem_page_offset, page_length,
-					user_data, page_do_bit17_swizzling,
-					partial_cacheline_write,
-					needs_clflush_after);
-		if (ret == 0)
-			goto next_page;
+			if (ret == 0)
+				goto next_stolen;
 
-		hit_slowpath = 1;
-		mutex_unlock(&dev->struct_mutex);
-		ret = shmem_pwrite_slow(page, shmem_page_offset, page_length,
-					user_data, page_do_bit17_swizzling,
-					partial_cacheline_write,
-					needs_clflush_after);
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
 
-		mutex_lock(&dev->struct_mutex);
+			ret = shmem_pwrite_slow(vaddr, offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
 
-next_page:
-		set_page_dirty(page);
-		mark_page_accessed(page);
+			mutex_lock(&dev->struct_mutex);
+			if (ret)
+				goto out;
 
-		if (ret)
-			goto out;
+next_stolen:
+			remain -= page_length;
+			user_data += page_length;
+			offset = 0;
+		}
+	} else {
+		i915_gem_object_pin_pages(obj);
 
-		remain -= page_length;
-		user_data += page_length;
-		offset += page_length;
+		offset = offset_in_page(offset);
+		for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
+			struct page *page;
+			char *vaddr;
+			int partial_cacheline_write;
+
+			if (i < args->offset >> PAGE_SHIFT)
+				continue;
+
+			if (remain <= 0)
+				break;
+
+			page_length = remain;
+			if ((offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - offset;
+
+			/* If we don't overwrite a cacheline completely we need to be
+			 * careful to have up-to-date data by first clflushing. Don't
+			 * overcomplicate things and flush the entire patch. */
+			partial_cacheline_write = needs_clflush_before &&
+				((offset | page_length) & (boot_cpu_data.x86_clflush_size - 1));
+
+			page = sg_page(sg);
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				(page_to_phys(page) & (1 << 17)) != 0;
+
+			vaddr = kmap_atomic(page);
+			ret = shmem_pwrite_fast(vaddr, offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
+			kunmap_atomic(vaddr);
+
+			if (ret == 0)
+				goto next_page;
+
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
+
+			vaddr = kmap(page);
+			ret = shmem_pwrite_slow(vaddr, offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
+			kunmap(page);
+
+			mutex_lock(&dev->struct_mutex);
+			if (ret)
+				goto out_unpin;
+
+next_page:
+			remain -= page_length;
+			user_data += page_length;
+			offset = 0;
+		}
+out_unpin:
+		i915_gem_object_unpin_pages(obj);
 	}
 
 out:
-	i915_gem_object_unpin_pages(obj);
-
 	if (hit_slowpath) {
 		/* Fixup: Kill any reinstated backing storage pages */
 		if (obj->madv == __I915_MADV_PURGED)
-- 
1.7.10.4


* [PATCH 10/16] drm/i915: Handle stolen objects for pread
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |  175 ++++++++++++++++++++++++++-------------
 1 file changed, 116 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9e66e29..db87ce4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -323,24 +323,21 @@ __copy_from_user_swizzled(char *gpu_vaddr, int gpu_offset,
  * Flushes invalid cachelines before reading the target if
  * needs_clflush is set. */
 static int
-shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
+shmem_pread_fast(char *vaddr, int shmem_page_offset, int page_length,
 		 char __user *user_data,
 		 bool page_do_bit17_swizzling, bool needs_clflush)
 {
-	char *vaddr;
 	int ret;
 
 	if (unlikely(page_do_bit17_swizzling))
 		return -EINVAL;
 
-	vaddr = kmap_atomic(page);
 	if (needs_clflush)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
 	ret = __copy_to_user_inatomic(user_data,
 				      vaddr + shmem_page_offset,
 				      page_length);
-	kunmap_atomic(vaddr);
 
 	return ret ? -EFAULT : 0;
 }
@@ -370,14 +367,12 @@ shmem_clflush_swizzled_range(char *addr, unsigned long length,
 /* Only difference to the fast-path function is that this can handle bit17
  * and uses non-atomic copy and kmap functions. */
 static int
-shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
+shmem_pread_slow(char *vaddr, int shmem_page_offset, int page_length,
 		 char __user *user_data,
 		 bool page_do_bit17_swizzling, bool needs_clflush)
 {
-	char *vaddr;
 	int ret;
 
-	vaddr = kmap(page);
 	if (needs_clflush)
 		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
 					     page_length,
@@ -391,7 +386,6 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
 		ret = __copy_to_user(user_data,
 				     vaddr + shmem_page_offset,
 				     page_length);
-	kunmap(page);
 
 	return ret ? - EFAULT : 0;
 }
@@ -402,6 +396,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		     struct drm_i915_gem_pread *args,
 		     struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	char __user *user_data;
 	ssize_t remain;
 	loff_t offset;
@@ -432,76 +427,138 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		}
 	}
 
-	ret = i915_gem_object_get_pages(obj);
-	if (ret)
-		return ret;
+	offset = args->offset;
 
-	i915_gem_object_pin_pages(obj);
+	if (obj->stolen) {
+		char *vaddr;
 
-	offset = args->offset;
+		vaddr = (char *)dev_priv->mm.stolen_base;
+		vaddr += obj->stolen->start + (offset & PAGE_MASK);
 
-	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
-		struct page *page;
+		shmem_page_offset = offset_in_page(offset);
+		while (remain > 0) {
+			/* Operation in this page
+			 *
+			 * shmem_page_offset = offset within page in shmem file
+			 * page_length = bytes to copy for this page
+			 */
+			page_length = remain;
+			if ((shmem_page_offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - shmem_page_offset;
 
-		if (i < offset >> PAGE_SHIFT)
-			continue;
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				((uintptr_t)vaddr & (1 << 17)) != 0;
 
-		if (remain <= 0)
-			break;
+			ret = shmem_pread_fast(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
+			if (ret == 0)
+				goto next_stolen;
 
-		/* Operation in this page
-		 *
-		 * shmem_page_offset = offset within page in shmem file
-		 * page_length = bytes to copy for this page
-		 */
-		shmem_page_offset = offset_in_page(offset);
-		page_length = remain;
-		if ((shmem_page_offset + page_length) > PAGE_SIZE)
-			page_length = PAGE_SIZE - shmem_page_offset;
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
 
-		page = sg_page(sg);
-		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
-			(page_to_phys(page) & (1 << 17)) != 0;
+			if (!prefaulted) {
+				ret = fault_in_multipages_writeable(user_data, remain);
+				/* Userspace is tricking us, but we've already clobbered
+				 * its pages with the prefault and promised to write the
+				 * data up to the first fault. Hence ignore any errors
+				 * and just continue. */
+				(void)ret;
+				prefaulted = 1;
+			}
 
-		ret = shmem_pread_fast(page, shmem_page_offset, page_length,
-				       user_data, page_do_bit17_swizzling,
-				       needs_clflush);
-		if (ret == 0)
-			goto next_page;
+			ret = shmem_pread_slow(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
 
-		hit_slowpath = 1;
-		mutex_unlock(&dev->struct_mutex);
+			mutex_lock(&dev->struct_mutex);
+			if (ret)
+				goto out;
 
-		if (!prefaulted) {
-			ret = fault_in_multipages_writeable(user_data, remain);
-			/* Userspace is tricking us, but we've already clobbered
-			 * its pages with the prefault and promised to write the
-			 * data up to the first fault. Hence ignore any errors
-			 * and just continue. */
-			(void)ret;
-			prefaulted = 1;
+next_stolen:
+			remain -= page_length;
+			user_data += page_length;
+			vaddr += page_length;
+			shmem_page_offset = 0;
 		}
+	} else {
+		ret = i915_gem_object_get_pages(obj);
+		if (ret)
+			return ret;
 
-		ret = shmem_pread_slow(page, shmem_page_offset, page_length,
-				       user_data, page_do_bit17_swizzling,
-				       needs_clflush);
+		i915_gem_object_pin_pages(obj);
 
-		mutex_lock(&dev->struct_mutex);
+		for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
+			char *vaddr;
+			struct page *page;
 
-next_page:
-		mark_page_accessed(page);
+			if (i < offset >> PAGE_SHIFT)
+				continue;
 
-		if (ret)
-			goto out;
+			if (remain <= 0)
+				break;
 
-		remain -= page_length;
-		user_data += page_length;
-		offset += page_length;
+			/* Operation in this page
+			 *
+			 * shmem_page_offset = offset within page in shmem file
+			 * page_length = bytes to copy for this page
+			 */
+			shmem_page_offset = offset_in_page(offset);
+			page_length = remain;
+			if ((shmem_page_offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - shmem_page_offset;
+
+			page = sg_page(sg);
+			mark_page_accessed(page);
+
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				(page_to_phys(page) & (1 << 17)) != 0;
+
+			vaddr = kmap_atomic(page);
+			ret = shmem_pread_fast(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
+			kunmap_atomic(vaddr);
+
+			if (ret == 0)
+				goto next_page;
+
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
+
+			if (!prefaulted) {
+				ret = fault_in_multipages_writeable(user_data, remain);
+				/* Userspace is tricking us, but we've already clobbered
+				 * its pages with the prefault and promised to write the
+				 * data up to the first fault. Hence ignore any errors
+				 * and just continue. */
+				(void)ret;
+				prefaulted = 1;
+			}
+
+			vaddr = kmap(page);
+			ret = shmem_pread_slow(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
+			kunmap(page);
+
+			mutex_lock(&dev->struct_mutex);
+
+			if (ret)
+				goto out_unpin;
+
+next_page:
+			remain -= page_length;
+			user_data += page_length;
+			offset += page_length;
+		}
+out_unpin:
+		i915_gem_object_unpin_pages(obj);
 	}
 
-out:
-	i915_gem_object_unpin_pages(obj);
 
+out:
 	if (hit_slowpath) {
 		/* Fixup: Kill any reinstated backing storage pages */
 		if (obj->madv == __I915_MADV_PURGED)
-- 
1.7.10.4


* [PATCH 11/16] drm/i915: Introduce i915_gem_object_create_stolen()
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Allow for the creation of GEM objects backed by stolen memory. As these
are not backed by ordinary pages, we create a fake dma mapping and store
the address in the scatterlist rather than obj->pages.

v2: Mark _i915_gem_object_create_stolen() as static, as noticed by Jesse
Barnes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h        |    3 +
 drivers/gpu/drm/i915/i915_gem.c        |    1 +
 drivers/gpu/drm/i915/i915_gem_stolen.c |  125 ++++++++++++++++++++++++++++++++
 3 files changed, 129 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b29ee03..b626260 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1570,6 +1570,9 @@ int i915_gem_init_stolen(struct drm_device *dev);
 int i915_gem_stolen_setup_compression(struct drm_device *dev, int size);
 void i915_gem_stolen_cleanup_compression(struct drm_device *dev);
 void i915_gem_cleanup_stolen(struct drm_device *dev);
+struct drm_i915_gem_object *
+i915_gem_object_create_stolen(struct drm_device *dev, u32 size);
+void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
 
 /* i915_gem_tiling.c */
 void i915_gem_detect_bit_6_swizzle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index db87ce4..f611d9a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3863,6 +3863,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	obj->pages_pin_count = 0;
 	i915_gem_object_put_pages(obj);
 	i915_gem_object_free_mmap_offset(obj);
+	i915_gem_object_release_stolen(obj);
 
 	BUG_ON(obj->pages);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 10ca473..7299d63 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -192,3 +192,128 @@ int i915_gem_init_stolen(struct drm_device *dev)
 
 	return 0;
 }
+
+static struct sg_table *
+i915_pages_create_for_stolen(struct drm_device *dev,
+			     u32 offset, u32 size)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct sg_table *st;
+	struct scatterlist *sg;
+
+	DRM_DEBUG_DRIVER("offset=0x%x, size=%d\n", offset, size);
+	BUG_ON(offset > dev_priv->mm.gtt->stolen_size - size);
+
+	/* We hide that we have no struct page backing our stolen object
+	 * by wrapping the contiguous physical allocation with a fake
+	 * dma mapping in a single scatterlist.
+	 */
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (st == NULL)
+		return NULL;
+
+	if (sg_alloc_table(st, 1, GFP_KERNEL)) {
+		kfree(st);
+		return NULL;
+	}
+
+	sg = st->sgl;
+	sg->offset = offset;
+	sg->length = size;
+
+	sg_dma_address(sg) = (dma_addr_t)dev_priv->mm.stolen_base + offset;
+	sg_dma_len(sg) = size;
+
+	return st;
+}
+
+static int i915_gem_object_get_pages_stolen(struct drm_i915_gem_object *obj)
+{
+	BUG();
+	return -EINVAL;
+}
+
+static void i915_gem_object_put_pages_stolen(struct drm_i915_gem_object *obj)
+{
+	/* Should only be called during free */
+	sg_free_table(obj->pages);
+	kfree(obj->pages);
+}
+
+static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
+	.get_pages = i915_gem_object_get_pages_stolen,
+	.put_pages = i915_gem_object_put_pages_stolen,
+};
+
+static struct drm_i915_gem_object *
+_i915_gem_object_create_stolen(struct drm_device *dev,
+			       struct drm_mm_node *stolen)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	if (obj == NULL)
+		return NULL;
+
+	if (drm_gem_private_object_init(dev, &obj->base, stolen->size))
+		goto cleanup;
+
+	i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
+
+	obj->pages = i915_pages_create_for_stolen(dev,
+						  stolen->start, stolen->size);
+	if (obj->pages == NULL)
+		goto cleanup;
+
+	obj->has_dma_mapping = true;
+	obj->pages_pin_count = 1;
+	obj->stolen = stolen;
+
+	obj->base.write_domain = I915_GEM_DOMAIN_GTT;
+	obj->base.read_domains = I915_GEM_DOMAIN_GTT;
+	obj->cache_level = I915_CACHE_NONE;
+
+	return obj;
+
+cleanup:
+	kfree(obj);
+	return NULL;
+}
+
+struct drm_i915_gem_object *
+i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj;
+	struct drm_mm_node *stolen;
+
+	if (dev_priv->mm.stolen_base == 0)
+		return NULL;
+
+	DRM_DEBUG_KMS("creating stolen object: size=%x\n", size);
+	if (size == 0)
+		return NULL;
+
+	stolen = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
+	if (stolen)
+		stolen = drm_mm_get_block(stolen, size, 4096);
+	if (stolen == NULL)
+		return NULL;
+
+	obj = _i915_gem_object_create_stolen(dev, stolen);
+	if (obj)
+		return obj;
+
+	drm_mm_put_block(stolen);
+	return NULL;
+}
+
+void
+i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
+{
+	if (obj->stolen) {
+		drm_mm_put_block(obj->stolen);
+		obj->stolen = NULL;
+	}
+}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 12/16] drm/i915: Allocate fbcon from stolen memory
  2012-11-15 11:32 Hopefully the last round of stolen work Chris Wilson
                   ` (10 preceding siblings ...)
  2012-11-15 11:32 ` [PATCH 11/16] drm/i915: Introduce i915_gem_object_create_stolen() Chris Wilson
@ 2012-11-15 11:32 ` Chris Wilson
  2012-11-15 11:32 ` [PATCH 13/16] drm/i915: Allocate ringbuffers " Chris Wilson
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_fb.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 7b30b5c..b7773e5 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -83,7 +83,9 @@ static int intelfb_create(struct intel_fbdev *ifbdev,
 
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = ALIGN(size, PAGE_SIZE);
-	obj = i915_gem_alloc_object(dev, size);
+	obj = i915_gem_object_create_stolen(dev, size);
+	if (obj == NULL)
+		obj = i915_gem_alloc_object(dev, size);
 	if (!obj) {
 		DRM_ERROR("failed to allocate framebuffer\n");
 		ret = -ENOMEM;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 13/16] drm/i915: Allocate ringbuffers from stolen memory
  2012-11-15 11:32 Hopefully the last round of stolen work Chris Wilson
                   ` (11 preceding siblings ...)
  2012-11-15 11:32 ` [PATCH 12/16] drm/i915: Allocate fbcon from stolen memory Chris Wilson
@ 2012-11-15 11:32 ` Chris Wilson
  2012-11-15 11:32 ` [PATCH 14/16] drm/i915: Allocate overlay registers " Chris Wilson
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 221c2a5..50b54e9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1096,7 +1096,11 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 			return ret;
 	}
 
-	obj = i915_gem_alloc_object(dev, ring->size);
+	obj = NULL;
+	if (!HAS_LLC(dev))
+		obj = i915_gem_object_create_stolen(dev, ring->size);
+	if (obj == NULL)
+		obj = i915_gem_alloc_object(dev, ring->size);
 	if (obj == NULL) {
 		DRM_ERROR("Failed to allocate ringbuffer\n");
 		ret = -ENOMEM;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 14/16] drm/i915: Allocate overlay registers from stolen memory
  2012-11-15 11:32 Hopefully the last round of stolen work Chris Wilson
                   ` (12 preceding siblings ...)
  2012-11-15 11:32 ` [PATCH 13/16] drm/i915: Allocate ringbuffers " Chris Wilson
@ 2012-11-15 11:32 ` Chris Wilson
  2012-11-15 11:32 ` [PATCH 15/16] drm/i915: Use a slab for object allocation Chris Wilson
  2012-11-15 11:32 ` [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
  15 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_overlay.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 4956259..8877b9c 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1325,8 +1325,10 @@ void intel_setup_overlay(struct drm_device *dev)
 
 	overlay->dev = dev;
 
-	reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
-	if (!reg_bo)
+	reg_bo = i915_gem_object_create_stolen(dev, PAGE_SIZE);
+	if (reg_bo == NULL)
+		reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
+	if (reg_bo == NULL)
 		goto out_free;
 	overlay->reg_bo = reg_bo;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 15/16] drm/i915: Use a slab for object allocation
  2012-11-15 11:32 Hopefully the last round of stolen work Chris Wilson
                   ` (13 preceding siblings ...)
  2012-11-15 11:32 ` [PATCH 14/16] drm/i915: Allocate overlay registers " Chris Wilson
@ 2012-11-15 11:32 ` Chris Wilson
  2012-11-30 22:45   ` Daniel Vetter
  2012-11-15 11:32 ` [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
  15 siblings, 1 reply; 26+ messages in thread
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

The primary purpose of this was to debug some use-after-free memory
corruption that was causing an OOPS inside drm/i915. As it turned out
the corruption was being caused elsewhere and i915.ko as a major user of
many objects was being hit hardest.

Indeed as we do frequent the generic kmalloc caches, dedicating one to
ourselves (or at least naming one for us depending upon the core) aids
debugging our own slab usage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c        |    3 +++
 drivers/gpu/drm/i915/i915_drv.h        |    4 ++++
 drivers/gpu/drm/i915/i915_gem.c        |   28 +++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |    5 ++---
 drivers/gpu/drm/i915/i915_gem_stolen.c |    4 ++--
 5 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 1eea5be..e347a20 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1769,6 +1769,9 @@ int i915_driver_unload(struct drm_device *dev)
 
 	destroy_workqueue(dev_priv->wq);
 
+	if (dev_priv->slab)
+		kmem_cache_destroy(dev_priv->slab);
+
 	pci_dev_put(dev_priv->bridge_dev);
 	kfree(dev->dev_private);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b626260..930270c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -614,6 +614,7 @@ struct intel_l3_parity {
 
 typedef struct drm_i915_private {
 	struct drm_device *dev;
+	struct kmem_cache *slab;
 
 	const struct intel_device_info *info;
 
@@ -1372,12 +1373,15 @@ int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 int i915_gem_wait_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 void i915_gem_load(struct drm_device *dev);
+void *i915_gem_object_alloc(struct drm_device *dev);
+void i915_gem_object_free(struct drm_i915_gem_object *obj);
 int i915_gem_init_object(struct drm_gem_object *obj);
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			 const struct drm_i915_gem_object_ops *ops);
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
+
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f611d9a..0bb5f86 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -192,6 +192,18 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+void *i915_gem_object_alloc(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	return kmem_cache_alloc(dev_priv->slab, GFP_KERNEL | __GFP_ZERO);
+}
+
+void i915_gem_object_free(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	kmem_cache_free(dev_priv->slab, obj);
+}
+
 static int
 i915_gem_create(struct drm_file *file,
 		struct drm_device *dev,
@@ -215,7 +227,7 @@ i915_gem_create(struct drm_file *file,
 	if (ret) {
 		drm_gem_object_release(&obj->base);
 		i915_gem_info_remove_obj(dev->dev_private, obj->base.size);
-		kfree(obj);
+		i915_gem_object_free(obj);
 		return ret;
 	}
 
@@ -3786,12 +3798,12 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	struct address_space *mapping;
 	u32 mask;
 
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL)
 		return NULL;
 
 	if (drm_gem_object_init(dev, &obj->base, size) != 0) {
-		kfree(obj);
+		i915_gem_object_free(obj);
 		return NULL;
 	}
 
@@ -3874,7 +3886,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	i915_gem_info_remove_obj(dev_priv, obj->base.size);
 
 	kfree(obj->bit_17);
-	kfree(obj);
+	i915_gem_object_free(obj);
 }
 
 int
@@ -4192,8 +4204,14 @@ init_ring_lists(struct intel_ring_buffer *ring)
 void
 i915_gem_load(struct drm_device *dev)
 {
-	int i;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	int i;
+
+	dev_priv->slab =
+		kmem_cache_create("i915_gem_object",
+				  sizeof(struct drm_i915_gem_object), 0,
+				  SLAB_HWCACHE_ALIGN,
+				  NULL);
 
 	INIT_LIST_HEAD(&dev_priv->mm.active_list);
 	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 773ef77..defb888 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -276,8 +276,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 	if (IS_ERR(attach))
 		return ERR_CAST(attach);
 
-
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL) {
 		ret = -ENOMEM;
 		goto fail_detach;
@@ -285,7 +284,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 
 	ret = drm_gem_private_object_init(dev, &obj->base, dma_buf->size);
 	if (ret) {
-		kfree(obj);
+		i915_gem_object_free(obj);
 		goto fail_detach;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 7299d63..f817b0c 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -252,7 +252,7 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 {
 	struct drm_i915_gem_object *obj;
 
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL)
 		return NULL;
 
@@ -277,7 +277,7 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 	return obj;
 
 cleanup:
-	kfree(obj);
+	i915_gem_object_free(obj);
 	return NULL;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl
  2012-11-15 11:32 Hopefully the last round of stolen work Chris Wilson
                   ` (14 preceding siblings ...)
  2012-11-15 11:32 ` [PATCH 15/16] drm/i915: Use a slab for object allocation Chris Wilson
@ 2012-11-15 11:32 ` Chris Wilson
  2012-11-30 23:06   ` Daniel Vetter
  15 siblings, 1 reply; 26+ messages in thread
From: Chris Wilson @ 2012-11-15 11:32 UTC (permalink / raw)
  To: intel-gfx

By exporting the ability to map user address and inserting PTEs
representing their backing pages into the GTT, we can exploit UMA in order
to utilize normal application data as a texture source or even as a
render target (depending upon the capabilities of the chipset). This has
a number of uses, with zero-copy downloads to the GPU and efficient
readback making the intermixed streaming of CPU and GPU operations
fairly efficient. This ability has many widespread implications from
faster rendering of client-side software rasterisers (chromium),
mitigation of stalls due to read back (firefox) and to faster pipelining
of texture data (such as pixel buffer objects in GL).

v2: Compile with CONFIG_MMU_NOTIFIER
v3: We can sleep while performing invalidate-range, which we can utilise
to drop our page references prior to the kernel manipulating the vma
(for either discard or cloning) and so protect normal users.
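
For illustration, a hypothetical userspace caller (not part of the patch;
wrap_user_memory() is an invented helper, error handling is elided, and it
assumes the updated i915_drm.h from this patch is installed) fills in the new
uapi struct and gets back an ordinary GEM handle:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Hypothetical helper: wrap 'size' bytes of an existing application
 * allocation at 'ptr' in a GEM object on the already-open DRM fd. */
static uint32_t wrap_user_memory(int fd, void *ptr, uint32_t size)
{
	struct drm_i915_gem_userptr arg;

	memset(&arg, 0, sizeof(arg));
	arg.user_ptr = (uintptr_t)ptr;	/* __u64 in the uapi struct */
	arg.user_size = size;
	arg.flags = 0;			/* or I915_USERPTR_READ_ONLY */

	if (ioctl(fd, DRM_IOCTL_I915_GEM_USERPTR, &arg))
		return 0;		/* handles are nonzero, so 0 signals failure here */

	return arg.handle;
}

The returned handle can then be used like any other GEM handle, which is what
makes the zero-copy upload/readback cases above interesting.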

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile           |    1 +
 drivers/gpu/drm/i915/i915_dma.c         |    1 +
 drivers/gpu/drm/i915/i915_drv.h         |   22 +++
 drivers/gpu/drm/i915/i915_gem.c         |   11 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c |  320 +++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             |   15 ++
 6 files changed, 366 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_userptr.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0f2c549..754d665 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -14,6 +14,7 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o \
 	  i915_gem_gtt.o \
 	  i915_gem_stolen.o \
 	  i915_gem_tiling.o \
+	  i915_gem_userptr.o \
 	  i915_sysfs.o \
 	  i915_trace_points.o \
 	  intel_display.o \
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e347a20..7c025ae 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1892,6 +1892,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
+	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_UNLOCKED),
 };
 
 int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 930270c..c86ba1d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -40,6 +40,7 @@
 #include <linux/backlight.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
+#include <linux/mmu_notifier.h>
 
 /* General customization:
  */
@@ -954,6 +955,7 @@ struct drm_i915_gem_object_ops {
 	 */
 	int (*get_pages)(struct drm_i915_gem_object *);
 	void (*put_pages)(struct drm_i915_gem_object *);
+	void (*release)(struct drm_i915_gem_object *);
 };
 
 struct drm_i915_gem_object {
@@ -1099,6 +1101,23 @@ struct drm_i915_gem_object {
 	atomic_t pending_flip;
 };
 
+struct i915_gem_userptr_object {
+	struct drm_i915_gem_object gem;
+	uintptr_t user_ptr;
+	size_t user_size;
+	int read_only;
+
+	struct mm_struct *mm;
+#if defined(CONFIG_MMU_NOTIFIER)
+	struct mmu_notifier mn;
+#endif
+};
+
+union drm_i915_gem_objects {
+	struct drm_i915_gem_object base;
+	struct i915_gem_userptr_object userptr;
+};
+
 inline static bool i915_gem_object_is_prime(struct drm_i915_gem_object *obj)
 {
 	return obj->base.import_attach != NULL;
@@ -1364,6 +1383,8 @@ int i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv);
 int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv);
+int i915_gem_userptr_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file);
 int i915_gem_set_tiling(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 int i915_gem_get_tiling(struct drm_device *dev, void *data,
@@ -1416,6 +1437,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->pages_pin_count == 0);
 	obj->pages_pin_count--;
 }
+int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0bb5f86..c4731e2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1802,7 +1802,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 	kfree(obj->pages);
 }
 
-static int
+int
 i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
 	const struct drm_i915_gem_object_ops *ops = obj->ops;
@@ -2571,9 +2571,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
+	obj->gtt_offset -= obj->gtt_space->start;
 	drm_mm_put_block(obj->gtt_space);
 	obj->gtt_space = NULL;
-	obj->gtt_offset = 0;
 
 	return 0;
 }
@@ -3089,7 +3089,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	list_move_tail(&obj->gtt_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
 
-	obj->gtt_offset = obj->gtt_space->start;
+	obj->gtt_offset += obj->gtt_space->start;
 
 	fenceable =
 		obj->gtt_space->size == fence_size &&
@@ -3882,6 +3882,9 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	if (obj->base.import_attach)
 		drm_prime_gem_destroy(&obj->base, NULL);
 
+	if (obj->ops->release)
+		obj->ops->release(obj);
+
 	drm_gem_object_release(&obj->base);
 	i915_gem_info_remove_obj(dev_priv, obj->base.size);
 
@@ -4209,7 +4212,7 @@ i915_gem_load(struct drm_device *dev)
 
 	dev_priv->slab =
 		kmem_cache_create("i915_gem_object",
-				  sizeof(struct drm_i915_gem_object), 0,
+				  sizeof(union drm_i915_gem_objects), 0,
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
new file mode 100644
index 0000000..5545181
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -0,0 +1,320 @@
+/*
+ * Copyright © 2012 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "drmP.h"
+#include "i915_drm.h"
+#include "i915_drv.h"
+#include "i915_trace.h"
+#include "intel_drv.h"
+#include <linux/mmu_notifier.h>
+#include <linux/swap.h>
+
+static struct i915_gem_userptr_object *to_userptr_object(struct drm_i915_gem_object *obj)
+{
+	return container_of(obj, struct i915_gem_userptr_object, gem);
+}
+
+#if defined(CONFIG_MMU_NOTIFIER)
+static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *mn,
+						       struct mm_struct *mm,
+						       unsigned long start,
+						       unsigned long end)
+{
+	struct i915_gem_userptr_object *vmap;
+	struct drm_device *dev;
+
+	/* XXX race between obj unref and mmu notifier? */
+	vmap = container_of(mn, struct i915_gem_userptr_object, mn);
+	BUG_ON(vmap->mm != mm);
+
+	dev = vmap->gem.base.dev;
+	mutex_lock(&dev->struct_mutex);
+	if (vmap->gem.gtt_space) {
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		bool was_interruptible;
+		int ret;
+
+		was_interruptible = dev_priv->mm.interruptible;
+		dev_priv->mm.interruptible = false;
+
+		ret = i915_gem_object_unbind(&vmap->gem);
+		BUG_ON(ret != -EIO);
+
+		dev_priv->mm.interruptible = was_interruptible;
+	}
+
+	BUG_ON(i915_gem_object_put_pages(&vmap->gem));
+	mutex_unlock(&dev->struct_mutex);
+}
+
+static void i915_gem_userptr_mn_release(struct mmu_notifier *mn,
+					struct mm_struct *mm)
+{
+	struct i915_gem_userptr_object *vmap;
+
+	vmap = container_of(mn, struct i915_gem_userptr_object, mn);
+	BUG_ON(vmap->mm != mm);
+	vmap->mm = NULL;
+
+	/* XXX Schedule an eventual unbind? E.g. hook into require request?
+	 * However, locking will be complicated.
+	 */
+}
+
+static const struct mmu_notifier_ops i915_gem_userptr_notifier = {
+	.invalidate_range_start = i915_gem_userptr_mn_invalidate_range_start,
+	.release = i915_gem_userptr_mn_release,
+};
+
+static void
+i915_gem_userptr_release__mmu_notifier(struct i915_gem_userptr_object *vmap)
+{
+	if (vmap->mm) {
+		mmu_notifier_unregister(&vmap->mn, vmap->mm);
+		BUG_ON(vmap->mm);
+	}
+}
+
+static int
+i915_gem_userptr_init__mmu_notifier(struct i915_gem_userptr_object *vmap)
+{
+	vmap->mn.ops = &i915_gem_userptr_notifier;
+	return mmu_notifier_register(&vmap->mn, vmap->mm);
+}
+
+#else
+
+static void
+i915_gem_userptr_release__mmu_notifier(struct i915_gem_userptr_object *vmap)
+{
+}
+
+static int
+i915_gem_userptr_init__mmu_notifier(struct i915_gem_userptr_object *vmap)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	return 0;
+}
+#endif
+
+static int
+i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
+{
+	struct i915_gem_userptr_object *vmap = to_userptr_object(obj);
+	int num_pages = obj->base.size >> PAGE_SHIFT;
+	struct sg_table *st;
+	struct scatterlist *sg;
+	struct page **pvec;
+	int n, pinned, ret;
+
+	if (vmap->mm == NULL)
+		return -EFAULT;
+
+	if (!access_ok(vmap->read_only ? VERIFY_READ : VERIFY_WRITE,
+		       (char __user *)vmap->user_ptr, vmap->user_size))
+		return -EFAULT;
+
+	/* If userspace should engineer that these pages are replaced in
+	 * the vma between us binding this page into the GTT and completion
+	 * of rendering... Their loss. If they change the mapping of their
+	 * pages they need to create a new bo to point to the new vma.
+	 *
+	 * However, that still leaves open the possibility of the vma
+	 * being copied upon fork. Which falls under the same userspace
+	 * synchronisation issue as a regular bo, except that this time
+	 * the process may not be expecting that a particular piece of
+	 * memory is tied to the GPU.
+	 *
+	 * Fortunately, we can hook into the mmu_notifier in order to
+	 * discard the page references prior to anything nasty happening
+	 * to the vma (discard or cloning) which should prevent the more
+	 * egregious cases from causing harm.
+	 */
+
+	pvec = kmalloc(num_pages*sizeof(struct page *),
+		       GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
+	if (pvec == NULL) {
+		pvec = drm_malloc_ab(num_pages, sizeof(struct page *));
+		if (pvec == NULL)
+			return -ENOMEM;
+	}
+
+	pinned = 0;
+	if (vmap->mm == current->mm)
+		pinned = __get_user_pages_fast(vmap->user_ptr, num_pages,
+					       !vmap->read_only, pvec);
+	if (pinned < num_pages) {
+		struct mm_struct *mm = vmap->mm;
+		ret = 0;
+		mutex_unlock(&obj->base.dev->struct_mutex);
+		down_read(&mm->mmap_sem);
+		if (vmap->mm != NULL)
+			ret = get_user_pages(current, mm,
+					     vmap->user_ptr + (pinned << PAGE_SHIFT),
+					     num_pages - pinned,
+					     !vmap->read_only, 0,
+					     pvec + pinned,
+					     NULL);
+		up_read(&mm->mmap_sem);
+		mutex_lock(&obj->base.dev->struct_mutex);
+		if (ret > 0)
+			pinned += ret;
+
+		if (obj->pages || pinned < num_pages) {
+			ret = obj->pages ? 0 : -EFAULT;
+			goto cleanup_pinned;
+		}
+	}
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (st == NULL) {
+		ret = -ENOMEM;
+		goto cleanup_pinned;
+	}
+
+	if (sg_alloc_table(st, num_pages, GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto cleanup_st;
+	}
+
+	for_each_sg(st->sgl, sg, num_pages, n)
+		sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
+	drm_free_large(pvec);
+
+	obj->pages = st;
+	return 0;
+
+cleanup_st:
+	kfree(st);
+cleanup_pinned:
+	release_pages(pvec, pinned, 0);
+	drm_free_large(pvec);
+	return ret;
+}
+
+static void
+i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (obj->madv != I915_MADV_WILLNEED)
+		obj->dirty = 0;
+
+	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
+		struct page *page = sg_page(sg);
+
+		if (obj->dirty)
+			set_page_dirty(page);
+
+		mark_page_accessed(page);
+		page_cache_release(page);
+	}
+	obj->dirty = 0;
+
+	sg_free_table(obj->pages);
+	kfree(obj->pages);
+}
+
+static void
+i915_gem_userptr_release(struct drm_i915_gem_object *obj)
+{
+	struct i915_gem_userptr_object *vmap = to_userptr_object(obj);
+
+	i915_gem_userptr_release__mmu_notifier(vmap);
+}
+
+static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
+	.get_pages = i915_gem_userptr_get_pages,
+	.put_pages = i915_gem_userptr_put_pages,
+	.release = i915_gem_userptr_release,
+};
+
+/**
+ * Creates a new mm object that wraps some user memory.
+ */
+int
+i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_userptr *args = data;
+	struct i915_gem_userptr_object *obj;
+	loff_t first_data_page, last_data_page;
+	int num_pages;
+	int ret;
+	u32 handle;
+
+	if (args->flags & ~(I915_USERPTR_READ_ONLY))
+		return -EINVAL;
+
+	first_data_page = args->user_ptr / PAGE_SIZE;
+	last_data_page = (args->user_ptr + args->user_size - 1) / PAGE_SIZE;
+	num_pages = last_data_page - first_data_page + 1;
+	if (num_pages * PAGE_SIZE > dev_priv->mm.gtt_total)
+		return -E2BIG;
+
+	ret = fault_in_multipages_readable((char __user *)(uintptr_t)args->user_ptr,
+					   args->user_size);
+	if (ret)
+		return ret;
+
+	/* Allocate the new object */
+	obj = i915_gem_object_alloc(dev);
+	if (obj == NULL)
+		return -ENOMEM;
+
+	if (drm_gem_private_object_init(dev, &obj->gem.base,
+					num_pages * PAGE_SIZE)) {
+		i915_gem_object_free(&obj->gem);
+		return -ENOMEM;
+	}
+
+	i915_gem_object_init(&obj->gem, &i915_gem_userptr_ops);
+	obj->gem.cache_level = I915_CACHE_LLC_MLC;
+
+	obj->gem.gtt_offset = offset_in_page(args->user_ptr);
+	obj->user_ptr = args->user_ptr;
+	obj->user_size = args->user_size;
+	obj->read_only = args->flags & I915_USERPTR_READ_ONLY;
+
+	/* And keep a pointer to the current->mm for resolving the user pages
+	 * at binding. This means that we need to hook into the mmu_notifier
+	 * in order to detect if the mmu is destroyed.
+	 */
+	obj->mm = current->mm;
+	ret = i915_gem_userptr_init__mmu_notifier(obj);
+	if (ret)
+		return ret;
+
+	ret = drm_gem_handle_create(file, &obj->gem.base, &handle);
+	/* drop reference from allocate - handle holds it now */
+	drm_gem_object_unreference(&obj->gem.base);
+	if (ret)
+		return ret;
+
+	args->handle = handle;
+	return 0;
+}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index b746a3c..067b98e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -198,6 +198,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_SET_CACHING	0x2f
 #define DRM_I915_GEM_GET_CACHING	0x30
 #define DRM_I915_REG_READ		0x31
+#define DRM_I915_GEM_USERPTR		0x32
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -247,6 +248,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
 #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
 #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
+#define DRM_IOCTL_I915_GEM_USERPTR			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_USERPTR, struct drm_i915_gem_userptr)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -950,4 +952,17 @@ struct drm_i915_reg_read {
 	__u64 offset;
 	__u64 val; /* Return value */
 };
+
+struct drm_i915_gem_userptr {
+	__u64 user_ptr;
+	__u32 user_size;
+	__u32 flags;
+#define I915_USERPTR_READ_ONLY 0x1
+	/**
+	 * Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+};
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.7.10.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 10/16] drm/i915: Handle stolen objects for pread
  2012-11-15 11:32 ` [PATCH 10/16] drm/i915: Handle stolen objects for pread Chris Wilson
@ 2012-11-30 22:33   ` Daniel Vetter
  2012-11-30 23:46     ` Chris Wilson
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Vetter @ 2012-11-30 22:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, Nov 15, 2012 at 11:32:25AM +0000, Chris Wilson wrote:
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Tbh I don't see any reason for handling pwrite/pread support for stolen
mem backed objects. I'll leave these two patches here (plus the prep one
to differentiate between stolen and dma_bufs) for now until a user pops
up.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 15/16] drm/i915: Use a slab for object allocation
  2012-11-15 11:32 ` [PATCH 15/16] drm/i915: Use a slab for object allocation Chris Wilson
@ 2012-11-30 22:45   ` Daniel Vetter
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Vetter @ 2012-11-30 22:45 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, Nov 15, 2012 at 11:32:30AM +0000, Chris Wilson wrote:
> The primary purpose of this was to debug some use-after-free memory
> corruption that was causing an OOPS inside drm/i915. As it turned out
> the corruption was being caused elsewhere and i915.ko as a major user of
> many objects was being hit hardest.
> 
> Indeed as we do frequent the generic kmalloc caches, dedicating one to
> ourselves (or at least naming one for us depending upon the core) aids
> debugging our own slab usage.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

Slurped in thus far, save for the three patches I've already mentioned in
the previous reply.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl
  2012-11-15 11:32 ` [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
@ 2012-11-30 23:06   ` Daniel Vetter
  2012-11-30 23:57     ` Chris Wilson
  2012-12-01 11:25     ` Chris Wilson
  0 siblings, 2 replies; 26+ messages in thread
From: Daniel Vetter @ 2012-11-30 23:06 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, Nov 15, 2012 at 11:32:31AM +0000, Chris Wilson wrote:
> By exporting the ability to map user address and inserting PTEs
> representing their backing pages into the GTT, we can exploit UMA in order
> to utilize normal application data as a texture source or even as a
> render target (depending upon the capabilities of the chipset). This has
> a number of uses, with zero-copy downloads to the GPU and efficient
> readback making the intermixed streaming of CPU and GPU operations
> fairly efficient. This ability has many widespread implications from
> faster rendering of client-side software rasterisers (chromium),
> mitigation of stalls due to read back (firefox) and to faster pipelining
> of texture data (such as pixel buffer objects in GL).
> 
> v2: Compile with CONFIG_MMU_NOTIFIER
> v3: We can sleep while performing invalidate-range, which we can utilise
> to drop our page references prior to the kernel manipulating the vma
> (for either discard or cloning) and so protect normal users.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Quick review:
- I don't like the internal byte offset we add/subtract for relocations.
  mmap works on page boundaries, imo gpu mmap should do the same.
- The mmu_notifier_register stuff looks ridiculously expensive - does it
  matter or do we need some sort of per-mm rbtree to keep track of our own
  gtt windows into that mm through the different bos? (roughly sketched
  after this list)
- select CONFIG_MMU_NOTIFIER? Especially since the mm handling differs
  quite a bit, I don't really like to keep the other code around.
- I admit I haven't checked the refcounting/lifetime rules around holding
  pointers to random mms and how the mmu_notifier affects that. Since
  we're not core mm hackers I think a quick comment in the code is in
  order explaining how that works.
- gem api review in the view of these strange new objects - e.g. I think
  disallowing flink and dma_buf export would make sense here. I also
  think that tiling doesn't make much sense, at least if userspace doesn't
  want its memory to get swizzled once in a while ...
- api tests in i-g-t both for anything the above api review brings up, but
  also for resource lifetime things (e.g. killing a process while a
  userptr is still busy on the gpu). Also some tests whether evil/buggy
  userspace gets the proper sigbus/-EINVAL if it unmaps a section of its
  mm which is used by a userptr.
- I think the locking needs some clarification ...
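
A purely speculative sketch of the per-mm tracking suggested in the second
point above, with all names hypothetical: register a single notifier per mm
and keep that mm's userptr objects in a tree, so invalidation only has to
touch the ranges that actually overlap:

#include <linux/mmu_notifier.h>
#include <linux/rbtree.h>

/* Hypothetical per-mm bookkeeping shared by every userptr bo of one mm. */
struct i915_mm_struct {
	struct mm_struct *mm;
	struct mmu_notifier mn;	/* registered once, on the first userptr */
	struct rb_root objects;	/* userptr bos keyed by [user_ptr, user_ptr + size) */
};

/* invalidate_range_start() would then only unbind the objects whose range
 * overlaps [start, end) rather than acting on each bo's private notifier. */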

For lifting the root-only flag I think only some form of memory/swap test
would be required. Actually this would be good anyway, since then lockdep
can do the deadlock potential review for me ;-)

Cheers, Daniel


> ---
>  drivers/gpu/drm/i915/Makefile           |    1 +
>  drivers/gpu/drm/i915/i915_dma.c         |    1 +
>  drivers/gpu/drm/i915/i915_drv.h         |   22 +++
>  drivers/gpu/drm/i915/i915_gem.c         |   11 +-
>  drivers/gpu/drm/i915/i915_gem_userptr.c |  320 +++++++++++++++++++++++++++++++
>  include/uapi/drm/i915_drm.h             |   15 ++
>  6 files changed, 366 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_gem_userptr.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 0f2c549..754d665 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -14,6 +14,7 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o \
>  	  i915_gem_gtt.o \
>  	  i915_gem_stolen.o \
>  	  i915_gem_tiling.o \
> +	  i915_gem_userptr.o \
>  	  i915_sysfs.o \
>  	  i915_trace_points.o \
>  	  intel_display.o \
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index e347a20..7c025ae 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1892,6 +1892,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
>  	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_UNLOCKED),
>  };
>  
>  int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 930270c..c86ba1d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -40,6 +40,7 @@
>  #include <linux/backlight.h>
>  #include <linux/intel-iommu.h>
>  #include <linux/kref.h>
> +#include <linux/mmu_notifier.h>
>  
>  /* General customization:
>   */
> @@ -954,6 +955,7 @@ struct drm_i915_gem_object_ops {
>  	 */
>  	int (*get_pages)(struct drm_i915_gem_object *);
>  	void (*put_pages)(struct drm_i915_gem_object *);
> +	void (*release)(struct drm_i915_gem_object *);
>  };
>  
>  struct drm_i915_gem_object {
> @@ -1099,6 +1101,23 @@ struct drm_i915_gem_object {
>  	atomic_t pending_flip;
>  };
>  
> +struct i915_gem_userptr_object {
> +	struct drm_i915_gem_object gem;
> +	uintptr_t user_ptr;
> +	size_t user_size;
> +	int read_only;
> +
> +	struct mm_struct *mm;
> +#if defined(CONFIG_MMU_NOTIFIER)
> +	struct mmu_notifier mn;
> +#endif
> +};
> +
> +union drm_i915_gem_objects {
> +	struct drm_i915_gem_object base;
> +	struct i915_gem_userptr_object userptr;
> +};
> +
>  inline static bool i915_gem_object_is_prime(struct drm_i915_gem_object *obj)
>  {
>  	return obj->base.import_attach != NULL;
> @@ -1364,6 +1383,8 @@ int i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
>  			   struct drm_file *file_priv);
>  int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data,
>  			   struct drm_file *file_priv);
> +int i915_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +			   struct drm_file *file);
>  int i915_gem_set_tiling(struct drm_device *dev, void *data,
>  			struct drm_file *file_priv);
>  int i915_gem_get_tiling(struct drm_device *dev, void *data,
> @@ -1416,6 +1437,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>  	BUG_ON(obj->pages_pin_count == 0);
>  	obj->pages_pin_count--;
>  }
> +int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  
>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0bb5f86..c4731e2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1802,7 +1802,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
>  	kfree(obj->pages);
>  }
>  
> -static int
> +int
>  i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>  {
>  	const struct drm_i915_gem_object_ops *ops = obj->ops;
> @@ -2571,9 +2571,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	obj->map_and_fenceable = true;
>  
> +	obj->gtt_offset -= obj->gtt_space->start;
>  	drm_mm_put_block(obj->gtt_space);
>  	obj->gtt_space = NULL;
> -	obj->gtt_offset = 0;
>  
>  	return 0;
>  }
> @@ -3089,7 +3089,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  	list_move_tail(&obj->gtt_list, &dev_priv->mm.bound_list);
>  	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
>  
> -	obj->gtt_offset = obj->gtt_space->start;
> +	obj->gtt_offset += obj->gtt_space->start;
>  
>  	fenceable =
>  		obj->gtt_space->size == fence_size &&
> @@ -3882,6 +3882,9 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	if (obj->base.import_attach)
>  		drm_prime_gem_destroy(&obj->base, NULL);
>  
> +	if (obj->ops->release)
> +		obj->ops->release(obj);
> +
>  	drm_gem_object_release(&obj->base);
>  	i915_gem_info_remove_obj(dev_priv, obj->base.size);
>  
> @@ -4209,7 +4212,7 @@ i915_gem_load(struct drm_device *dev)
>  
>  	dev_priv->slab =
>  		kmem_cache_create("i915_gem_object",
> -				  sizeof(struct drm_i915_gem_object), 0,
> +				  sizeof(union drm_i915_gem_objects), 0,
>  				  SLAB_HWCACHE_ALIGN,
>  				  NULL);
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> new file mode 100644
> index 0000000..5545181
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -0,0 +1,320 @@
> +/*
> + * Copyright © 2012 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include "drmP.h"
> +#include "i915_drm.h"
> +#include "i915_drv.h"
> +#include "i915_trace.h"
> +#include "intel_drv.h"
> +#include <linux/mmu_notifier.h>
> +#include <linux/swap.h>
> +
> +static struct i915_gem_userptr_object *to_userptr_object(struct drm_i915_gem_object *obj)
> +{
> +	return container_of(obj, struct i915_gem_userptr_object, gem);
> +}
> +
> +#if defined(CONFIG_MMU_NOTIFIER)
> +static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *mn,
> +						       struct mm_struct *mm,
> +						       unsigned long start,
> +						       unsigned long end)
> +{
> +	struct i915_gem_userptr_object *vmap;
> +	struct drm_device *dev;
> +
> +	/* XXX race between obj unref and mmu notifier? */
> +	vmap = container_of(mn, struct i915_gem_userptr_object, mn);
> +	BUG_ON(vmap->mm != mm);
> +
> +	dev = vmap->gem.base.dev;
> +	mutex_lock(&dev->struct_mutex);
> +	if (vmap->gem.gtt_space) {
> +		struct drm_i915_private *dev_priv = dev->dev_private;
> +		bool was_interruptible;
> +		int ret;
> +
> +		was_interruptible = dev_priv->mm.interruptible;
> +		dev_priv->mm.interruptible = false;
> +
> +		ret = i915_gem_object_unbind(&vmap->gem);
> +		BUG_ON(ret != -EIO);
> +
> +		dev_priv->mm.interruptible = was_interruptible;
> +	}
> +
> +	BUG_ON(i915_gem_object_put_pages(&vmap->gem));
> +	mutex_unlock(&dev->struct_mutex);
> +}
> +
> +static void i915_gem_userptr_mn_release(struct mmu_notifier *mn,
> +					struct mm_struct *mm)
> +{
> +	struct i915_gem_userptr_object *vmap;
> +
> +	vmap = container_of(mn, struct i915_gem_userptr_object, mn);
> +	BUG_ON(vmap->mm != mm);
> +	vmap->mm = NULL;
> +
> +	/* XXX Schedule an eventual unbind? E.g. hook into require request?
> +	 * However, locking will be complicated.
> +	 */
> +}
> +
> +static const struct mmu_notifier_ops i915_gem_userptr_notifier = {
> +	.invalidate_range_start = i915_gem_userptr_mn_invalidate_range_start,
> +	.release = i915_gem_userptr_mn_release,
> +};
> +
> +static void
> +i915_gem_userptr_release__mmu_notifier(struct i915_gem_userptr_object *vmap)
> +{
> +	if (vmap->mm) {
> +		mmu_notifier_unregister(&vmap->mn, vmap->mm);
> +		BUG_ON(vmap->mm);
> +	}
> +}
> +
> +static int
> +i915_gem_userptr_init__mmu_notifier(struct i915_gem_userptr_object *vmap)
> +{
> +	vmap->mn.ops = &i915_gem_userptr_notifier;
> +	return mmu_notifier_register(&vmap->mn, vmap->mm);
> +}
> +
> +#else
> +
> +static void
> +i915_gem_userptr_release__mmu_notifier(struct i915_gem_userptr_object *vmap)
> +{
> +}
> +
> +static int
> +i915_gem_userptr_init__mmu_notifier(struct i915_gem_userptr_object *vmap)
> +{
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	return 0;
> +}
> +#endif
> +
> +static int
> +i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
> +{
> +	struct i915_gem_userptr_object *vmap = to_userptr_object(obj);
> +	int num_pages = obj->base.size >> PAGE_SHIFT;
> +	struct sg_table *st;
> +	struct scatterlist *sg;
> +	struct page **pvec;
> +	int n, pinned, ret;
> +
> +	if (vmap->mm == NULL)
> +		return -EFAULT;
> +
> +	if (!access_ok(vmap->read_only ? VERIFY_READ : VERIFY_WRITE,
> +		       (char __user *)vmap->user_ptr, vmap->user_size))
> +		return -EFAULT;
> +
> +	/* If userspace should engineer that these pages are replaced in
> +	 * the vma between us binding this page into the GTT and completion
> +	 * of rendering... Their loss. If they change the mapping of their
> +	 * pages they need to create a new bo to point to the new vma.
> +	 *
> +	 * However, that still leaves open the possibility of the vma
> +	 * being copied upon fork. Which falls under the same userspace
> +	 * synchronisation issue as a regular bo, except that this time
> +	 * the process may not be expecting that a particular piece of
> +	 * memory is tied to the GPU.
> +	 *
> +	 * Fortunately, we can hook into the mmu_notifier in order to
> +	 * discard the page references prior to anything nasty happening
> +	 * to the vma (discard or cloning) which should prevent the more
> +	 * egregious cases from causing harm.
> +	 */
> +
> +	pvec = kmalloc(num_pages*sizeof(struct page *),
> +		       GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
> +	if (pvec == NULL) {
> +		pvec = drm_malloc_ab(num_pages, sizeof(struct page *));
> +		if (pvec == NULL)
> +			return -ENOMEM;
> +	}
> +
> +	pinned = 0;
> +	if (vmap->mm == current->mm)
> +		pinned = __get_user_pages_fast(vmap->user_ptr, num_pages,
> +					       !vmap->read_only, pvec);
> +	if (pinned < num_pages) {
> +		struct mm_struct *mm = vmap->mm;
> +		ret = 0;
> +		mutex_unlock(&obj->base.dev->struct_mutex);
> +		down_read(&mm->mmap_sem);
> +		if (vmap->mm != NULL)
> +			ret = get_user_pages(current, mm,
> +					     vmap->user_ptr + (pinned << PAGE_SHIFT),
> +					     num_pages - pinned,
> +					     !vmap->read_only, 0,
> +					     pvec + pinned,
> +					     NULL);
> +		up_read(&mm->mmap_sem);
> +		mutex_lock(&obj->base.dev->struct_mutex);
> +		if (ret > 0)
> +			pinned += ret;
> +
> +		if (obj->pages || pinned < num_pages) {
> +			ret = obj->pages ? 0 : -EFAULT;
> +			goto cleanup_pinned;
> +		}
> +	}
> +
> +	st = kmalloc(sizeof(*st), GFP_KERNEL);
> +	if (st == NULL) {
> +		ret = -ENOMEM;
> +		goto cleanup_pinned;
> +	}
> +
> +	if (sg_alloc_table(st, num_pages, GFP_KERNEL)) {
> +		ret = -ENOMEM;
> +		goto cleanup_st;
> +	}
> +
> +	for_each_sg(st->sgl, sg, num_pages, n)
> +		sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> +	drm_free_large(pvec);
> +
> +	obj->pages = st;
> +	return 0;
> +
> +cleanup_st:
> +	kfree(st);
> +cleanup_pinned:
> +	release_pages(pvec, pinned, 0);
> +	drm_free_large(pvec);
> +	return ret;
> +}
> +
> +static void
> +i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj)
> +{
> +	struct scatterlist *sg;
> +	int i;
> +
> +	if (obj->madv != I915_MADV_WILLNEED)
> +		obj->dirty = 0;
> +
> +	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
> +		struct page *page = sg_page(sg);
> +
> +		if (obj->dirty)
> +			set_page_dirty(page);
> +
> +		mark_page_accessed(page);
> +		page_cache_release(page);
> +	}
> +	obj->dirty = 0;
> +
> +	sg_free_table(obj->pages);
> +	kfree(obj->pages);
> +}
> +
> +static void
> +i915_gem_userptr_release(struct drm_i915_gem_object *obj)
> +{
> +	struct i915_gem_userptr_object *vmap = to_userptr_object(obj);
> +
> +	i915_gem_userptr_release__mmu_notifier(vmap);
> +}
> +
> +static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
> +	.get_pages = i915_gem_userptr_get_pages,
> +	.put_pages = i915_gem_userptr_put_pages,
> +	.release = i915_gem_userptr_release,
> +};
> +
> +/**
> + * Creates a new mm object that wraps some user memory.
> + */
> +int
> +i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_userptr *args = data;
> +	struct i915_gem_userptr_object *obj;
> +	loff_t first_data_page, last_data_page;
> +	int num_pages;
> +	int ret;
> +	u32 handle;
> +
> +	if (args->flags & ~(I915_USERPTR_READ_ONLY))
> +		return -EINVAL;
> +
> +	first_data_page = args->user_ptr / PAGE_SIZE;
> +	last_data_page = (args->user_ptr + args->user_size - 1) / PAGE_SIZE;
> +	num_pages = last_data_page - first_data_page + 1;
> +	if (num_pages * PAGE_SIZE > dev_priv->mm.gtt_total)
> +		return -E2BIG;
> +
> +	ret = fault_in_multipages_readable((char __user *)(uintptr_t)args->user_ptr,
> +					   args->user_size);
> +	if (ret)
> +		return ret;
> +
> +	/* Allocate the new object */
> +	obj = i915_gem_object_alloc(dev);
> +	if (obj == NULL)
> +		return -ENOMEM;
> +
> +	if (drm_gem_private_object_init(dev, &obj->gem.base,
> +					num_pages * PAGE_SIZE)) {
> +		i915_gem_object_free(&obj->gem);
> +		return -ENOMEM;
> +	}
> +
> +	i915_gem_object_init(&obj->gem, &i915_gem_userptr_ops);
> +	obj->gem.cache_level = I915_CACHE_LLC_MLC;
> +
> +	obj->gem.gtt_offset = offset_in_page(args->user_ptr);
> +	obj->user_ptr = args->user_ptr;
> +	obj->user_size = args->user_size;
> +	obj->read_only = args->flags & I915_USERPTR_READ_ONLY;
> +
> +	/* And keep a pointer to the current->mm for resolving the user pages
> +	 * at binding. This means that we need to hook into the mmu_notifier
> +	 * in order to detect if the mmu is destroyed.
> +	 */
> +	obj->mm = current->mm;
> +	ret = i915_gem_userptr_init__mmu_notifier(obj);
> +	if (ret)
> +		return ret;
> +
> +	ret = drm_gem_handle_create(file, &obj->gem.base, &handle);
> +	/* drop reference from allocate - handle holds it now */
> +	drm_gem_object_unreference(&obj->gem.base);
> +	if (ret)
> +		return ret;
> +
> +	args->handle = handle;
> +	return 0;
> +}
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index b746a3c..067b98e 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -198,6 +198,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_I915_GEM_SET_CACHING	0x2f
>  #define DRM_I915_GEM_GET_CACHING	0x30
>  #define DRM_I915_REG_READ		0x31
> +#define DRM_I915_GEM_USERPTR		0x32
>  
>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
>  #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
> @@ -247,6 +248,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
>  #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
>  #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
> +#define DRM_IOCTL_I915_GEM_USERPTR			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_USERPTR, struct drm_i915_gem_userptr)
>  
>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -950,4 +952,17 @@ struct drm_i915_reg_read {
>  	__u64 offset;
>  	__u64 val; /* Return value */
>  };
> +
> +struct drm_i915_gem_userptr {
> +	__u64 user_ptr;
> +	__u32 user_size;
> +	__u32 flags;
> +#define I915_USERPTR_READ_ONLY 0x1
> +	/**
> +	 * Returned handle for the object.
> +	 *
> +	 * Object handles are nonzero.
> +	 */
> +	__u32 handle;
> +};
>  #endif /* _UAPI_I915_DRM_H_ */
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 10/16] drm/i915: Handle stolen objects for pread
  2012-11-30 22:33   ` Daniel Vetter
@ 2012-11-30 23:46     ` Chris Wilson
  2012-12-01  0:03       ` Daniel Vetter
  0 siblings, 1 reply; 26+ messages in thread
From: Chris Wilson @ 2012-11-30 23:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Fri, 30 Nov 2012 23:33:43 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Nov 15, 2012 at 11:32:25AM +0000, Chris Wilson wrote:
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Tbh I don't see any reason for handling pwrite/pread support for stolen
> mem backed objects. I'll leave these two patches here (plus the prep one
> to differentiate between stolen and dma_bufs) for now until a user pops

The problem is that that constitutes a change in ABI, so I was trying to
make sure stolen buffer objects were first class from day one. The only
saving grace is that the only stolen object userspace can currently
access is the one used for fbcon, but it is still available...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl
  2012-11-30 23:06   ` Daniel Vetter
@ 2012-11-30 23:57     ` Chris Wilson
  2012-12-01 11:25     ` Chris Wilson
  1 sibling, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2012-11-30 23:57 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sat, 1 Dec 2012 00:06:43 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Nov 15, 2012 at 11:32:31AM +0000, Chris Wilson wrote:
> > By exporting the ability to map user addresses and insert PTEs
> > representing their backing pages into the GTT, we can exploit UMA in
> > order to utilize normal application data as a texture source or even
> > as a render target (depending upon the capabilities of the chipset).
> > This has a number of uses, with zero-copy downloads to the GPU and
> > efficient readback making the intermixed streaming of CPU and GPU
> > operations fairly efficient. This ability has many widespread
> > implications, from faster rendering of client-side software
> > rasterisers (chromium) and mitigation of stalls due to readback
> > (firefox) to faster pipelining of texture data (such as pixel buffer
> > objects in GL).

> - gem api review in view of these strange new objects - e.g. I think
>   disallowing flink and dma_buf export would make sense here. I also
>   think that tiling doesn't make much sense, at least if userspace
>   doesn't want its memory to get swizzled once in a while ...

Over the years this has been sitting around, I've changed my mind.
Originally, I thought it would be wise to prevent userspace from trying
to do things like pwrite, mmap-gtt, or set-tiling. However, there is
nothing fundamental in the hardware that prevents userspace from doing
so, so long as it is fully cognisant of what it is doing (which it has
to be anyway). Similarly for flink/dma-buf: I have not thought it
through fully, but the mmu-notifier should be sufficient to tear down
cross-process attachments - and it would be useful (cf. SHM). The
handwaving here is why it was originally root-only, and why we are
looking at the mmu-notifiers to shoot down the vma as it is copied
between processes.

mmu_notifier is seemingly expensive, and I note it is only compiled in
for one of my distro kernels - it is not a burden I want to impose on my
machines, but zero-copy uploads are too important to give up.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 26+ messages in thread
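
To make the mmu-notifier handwaving above a little more concrete, a rough
sketch of the kind of hook that i915_gem_userptr_init__mmu_notifier() implies,
written against the notifier API of this kernel generation. The object type
and every helper marked hypothetical are invented for illustration; this is
not the actual patch, and the sketch assumes (per the cover letter) that the
invalidate callback used is the one called outside the mm locks.

#include <linux/kernel.h>
#include <linux/mm_types.h>
#include <linux/mmu_notifier.h>

/* Hypothetical per-object state; the real patch hangs this off the GEM
 * object itself. */
struct i915_userptr_notifier {
	struct mmu_notifier mn;
	struct i915_gem_userptr_object *obj;	/* hypothetical */
};

static void i915_userptr_mn_release(struct mmu_notifier *mn,
				    struct mm_struct *mm)
{
	struct i915_userptr_notifier *n =
		container_of(mn, struct i915_userptr_notifier, mn);

	/* The address space is being torn down: drop the user pages before
	 * the page tables disappear underneath the object. */
	i915_gem_userptr_release_pages(n->obj);		/* hypothetical */
}

static void i915_userptr_mn_invalidate_range_start(struct mmu_notifier *mn,
						   struct mm_struct *mm,
						   unsigned long start,
						   unsigned long end)
{
	struct i915_userptr_notifier *n =
		container_of(mn, struct i915_userptr_notifier, mn);

	/* Assumed safe to sleep on outstanding GPU activity here before the
	 * backing pages are modified or unmapped. */
	if (i915_gem_userptr_range_overlaps(n->obj, start, end))	/* hypothetical */
		i915_gem_userptr_wait_and_unbind(n->obj);		/* hypothetical */
}

static const struct mmu_notifier_ops i915_userptr_mn_ops = {
	.release		= i915_userptr_mn_release,
	.invalidate_range_start	= i915_userptr_mn_invalidate_range_start,
};

/* Roughly what the registration step amounts to. */
static int i915_userptr_register(struct i915_userptr_notifier *n,
				 struct mm_struct *mm)
{
	n->mn.ops = &i915_userptr_mn_ops;
	return mmu_notifier_register(&n->mn, mm);
}
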

* Re: [PATCH 10/16] drm/i915: Handle stolen objects for pread
  2012-11-30 23:46     ` Chris Wilson
@ 2012-12-01  0:03       ` Daniel Vetter
  2012-12-01  0:14         ` Chris Wilson
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Vetter @ 2012-12-01  0:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Dec 1, 2012 at 12:46 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Fri, 30 Nov 2012 23:33:43 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Thu, Nov 15, 2012 at 11:32:25AM +0000, Chris Wilson wrote:
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> Tbh I don't see any reason for handling pwrite/pread support for stolen
>> mem backed objects. I'll leave these two patches here (plus the prep one
>> to differentiate between stolen and dma_bufs) for now until a user pops
>
> The problem is that that constitutes a change in ABI, so I was trying to
> make sure stolen buffer objects were first class from day one. The only
> saving grace is that the only stolen object userspace can currently
> access is the one used for fbcon, but it is still available...

Hm, I've forgotten that userspace can get at the fbcon. Does it really
do that? The problem I see here is untested code and no sane way to
test it.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 10/16] drm/i915: Handle stolen objects for pread
  2012-12-01  0:03       ` Daniel Vetter
@ 2012-12-01  0:14         ` Chris Wilson
  0 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2012-12-01  0:14 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sat, 1 Dec 2012 01:03:39 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Dec 1, 2012 at 12:46 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Fri, 30 Nov 2012 23:33:43 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Thu, Nov 15, 2012 at 11:32:25AM +0000, Chris Wilson wrote:
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>
> >> Tbh I don't see any reason for handling pwrite/pread support for stolen
> >> mem backed objects. I'll leave these two patches here (plus the prep one
> >> to differentiate between stolen and dma_bufs) for now until a user pops
> >
> > The problem is that that constitutes a change in ABI, so I was trying to
> > make sure stolen buffer objects were first class from day one. The only
> > saving grace is that the only stolen object userspace can currently
> > access is the one used for fbcon, but it is still available...
> 
> Hm, I've forgotten that userspace can get at the fbcon. Does it really
> do that? The problem I see here is untested code and no sane way to
> test it.

Hint: the bogus obj->gtt_offset == 0 check was found experimentally. :)

No one pwrites/preads the fbcon today. If it makes you feel any better,
I would like to be able to create objects preferentially from stolen
memory (e.g. scanouts), and you could build a test interface on top.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl
  2012-11-30 23:06   ` Daniel Vetter
  2012-11-30 23:57     ` Chris Wilson
@ 2012-12-01 11:25     ` Chris Wilson
  2012-12-01 11:35       ` Daniel Vetter
  1 sibling, 1 reply; 26+ messages in thread
From: Chris Wilson @ 2012-12-01 11:25 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sat, 1 Dec 2012 00:06:43 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
> Quick review:
> - I don't like the internal byte offset we add/substract for relocations.
>   mmap works on page boundaries, imo gpu mmap should do the same.

However, I am forced to work on unaligned boundaries since I have no
control over how the clients allocate the memory to be mapped, or which
bytes within that memory are associated with a particular pixmap. The
XShm API is such that not even the ShmSegment is passed down to the
driver. That vies with the lack of explicit synchronisation points for
being the worst element of its design.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl
  2012-12-01 11:25     ` Chris Wilson
@ 2012-12-01 11:35       ` Daniel Vetter
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Vetter @ 2012-12-01 11:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Dec 1, 2012 at 12:25 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Sat, 1 Dec 2012 00:06:43 +0100, Daniel Vetter <daniel@ffwll.ch> wrote:
>> Quick review:
>> - I don't like the internal byte offset we add/substract for relocations.
>>   mmap works on page boundaries, imo gpu mmap should do the same.
>
> However, the necessity is that I work on unaligned boundaries since I
> have no control over how the clients allocate the memory to be mapped,
> or which bytes within that memory are associated with a particular
> pixmap. The XShm API is such that not even the ShmSegment is passed down
> to the driver. That vies with the lack of explicit synchronisation
> points as being the worst elements of its design.

Well, I just want to move the page_offset computation from the kernel
to userspace. I know this requires that you store that offset somewhere
in userspace, which means a new field right next to the existing u32
handle.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread
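
For illustration, a minimal sketch of what keeping that sub-page offset on
the userspace side could look like, as suggested above. The struct and helper
names are invented for the example; it assumes the pointer handed to the
userptr ioctl is first truncated to a page boundary, with the remainder
remembered next to the handle and folded into every relocation delta.

#include <stdint.h>

#define USERPTR_PAGE_SIZE 4096ull

/* Userspace-side bookkeeping: the GEM handle plus the byte offset of the
 * client's data within its first page (names invented for illustration). */
struct userptr_bo {
	uint32_t handle;
	uint32_t page_offset;
};

/* Split an arbitrary client pointer into the page-aligned address handed to
 * the userptr ioctl and the sub-page offset kept next to the handle. */
static void userptr_split(uint64_t user_ptr, uint64_t *aligned_ptr,
			  uint32_t *page_offset)
{
	*aligned_ptr = user_ptr & ~(USERPTR_PAGE_SIZE - 1);
	*page_offset = (uint32_t)(user_ptr & (USERPTR_PAGE_SIZE - 1));
}

/* When emitting a relocation against such an object, fold the sub-page
 * offset into the delta instead of having the kernel add it back. */
static uint64_t userptr_reloc_delta(const struct userptr_bo *bo,
				    uint64_t offset_in_data)
{
	return (uint64_t)bo->page_offset + offset_in_data;
}
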

end of thread, other threads:[~2012-12-01 13:40 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-15 11:32 Hopefully the last round of stolen work Chris Wilson
2012-11-15 11:32 ` [PATCH 01/16] drm: Introduce drm_mm_create_block() Chris Wilson
2012-11-15 11:32 ` [PATCH 02/16] drm: Introduce an iterator over holes in the drm_mm range manager Chris Wilson
2012-11-15 11:32 ` [PATCH 03/16] drm/i915: Fix detection of base of stolen memory Chris Wilson
2012-11-15 11:32 ` [PATCH 04/16] drm/i915: Avoid clearing preallocated regions from the GTT Chris Wilson
2012-11-15 11:32 ` [PATCH 05/16] drm/i915: Delay allocation of stolen space for FBC Chris Wilson
2012-11-15 11:32 ` [PATCH 06/16] drm/i915: Allow objects to be created with no backing pages, but stolen space Chris Wilson
2012-11-15 11:32 ` [PATCH 07/16] drm/i915: Differentiate between prime and stolen objects Chris Wilson
2012-11-15 11:32 ` [PATCH 08/16] drm/i915: Support readback of stolen objects upon error Chris Wilson
2012-11-15 11:32 ` [PATCH 09/16] drm/i915: Handle stolen objects in pwrite Chris Wilson
2012-11-15 11:32 ` [PATCH 10/16] drm/i915: Handle stolen objects for pread Chris Wilson
2012-11-30 22:33   ` Daniel Vetter
2012-11-30 23:46     ` Chris Wilson
2012-12-01  0:03       ` Daniel Vetter
2012-12-01  0:14         ` Chris Wilson
2012-11-15 11:32 ` [PATCH 11/16] drm/i915: Introduce i915_gem_object_create_stolen() Chris Wilson
2012-11-15 11:32 ` [PATCH 12/16] drm/i915: Allocate fbcon from stolen memory Chris Wilson
2012-11-15 11:32 ` [PATCH 13/16] drm/i915: Allocate ringbuffers " Chris Wilson
2012-11-15 11:32 ` [PATCH 14/16] drm/i915: Allocate overlay registers " Chris Wilson
2012-11-15 11:32 ` [PATCH 15/16] drm/i915: Use a slab for object allocation Chris Wilson
2012-11-30 22:45   ` Daniel Vetter
2012-11-15 11:32 ` [PATCH 16/16] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
2012-11-30 23:06   ` Daniel Vetter
2012-11-30 23:57     ` Chris Wilson
2012-12-01 11:25     ` Chris Wilson
2012-12-01 11:35       ` Daniel Vetter
