* [PATCH 00/11] ppgtt: just the VMA
@ 2013-07-09  6:08 Ben Widawsky
  2013-07-09  6:08 ` [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
                   ` (12 more replies)
  0 siblings, 13 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

By Daniel's request, to make the PPGTT merging more manageable, here are the
patches associated with the VM/VMA infrastructure. They are not as well tested
as the previous series, although since nothing actually changes address spaces
yet, I would hope most of this series is just massaging code.

Even though these patches were all cherry-picked from the original,
working series, the amount of rework was not insignificant, i.e. there may
still be bugs present or changes needed.

There should be little to no functional effect, since there will only ever be
one VM until the rest of the PPGTT series is merged.

Finally, Daniel, is this more or less what you wanted first?

References:
http://lists.freedesktop.org/archives/intel-gfx/2013-June/029408.html

Ben Widawsky (11):
  drm/i915: Move gtt and ppgtt under address space umbrella
  drm/i915: Put the mm in the parent address space
  drm/i915: Create a global list of vms
  drm/i915: Move active/inactive lists to new mm
  drm/i915: Create VMAs
  drm/i915: plumb VM into object operations
  drm/i915: Fix up map and fenceable for VMA
  drm/i915: mm_list is per VMA
  drm/i915: Update error capture for VMs
  drm/i915: create an object_is_active()
  drm/i915: Move active to vma

 drivers/gpu/drm/i915/i915_debugfs.c        |  88 ++++--
 drivers/gpu/drm/i915/i915_dma.c            |   9 +-
 drivers/gpu/drm/i915/i915_drv.h            | 243 +++++++++-------
 drivers/gpu/drm/i915/i915_gem.c            | 432 ++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
 drivers/gpu/drm/i915/i915_gem_debug.c      |   2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  67 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  87 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 193 +++++++------
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  19 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
 drivers/gpu/drm/i915/i915_irq.c            | 158 ++++++++---
 drivers/gpu/drm/i915/i915_trace.h          |  20 +-
 drivers/gpu/drm/i915/intel_fb.c            |   1 -
 drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
 drivers/gpu/drm/i915/intel_pm.c            |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
 17 files changed, 902 insertions(+), 456 deletions(-)

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  6:37   ` Daniel Vetter
  2013-07-11 11:14   ` Imre Deak
  2013-07-09  6:08 ` [PATCH 02/11] drm/i915: Put the mm in the parent address space Ben Widawsky
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their operations (inserting entries), state (LRU lists),
and characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up; it just fits in with the grand scheme. The GGTT will usually be
a special case, where we know an object must be in the GGTT (display
engine, workarounds, etc.).
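
As a rough sketch of the layering (condensed from the diff below; the
hypothetical example_cleanup() stands in for callbacks like
gen6_ppgtt_cleanup()): both GTT flavours embed the common base, and the vm
callbacks recover their concrete type with container_of().

struct i915_gtt      { struct i915_address_space base; /* ... */ };
struct i915_hw_ppgtt { struct i915_address_space base; /* ... */ };

static void example_cleanup(struct i915_address_space *vm)
{
	struct i915_hw_ppgtt *ppgtt =
		container_of(vm, struct i915_hw_ppgtt, base);

	/* tear down ppgtt-specific state here */
}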

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
 drivers/gpu/drm/i915/i915_dma.c     |   4 +-
 drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
 drivers/gpu/drm/i915/i915_gem.c     |   4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
 5 files changed, 121 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c8059f5..d870f27 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   count, size);
 
 	seq_printf(m, "%zu [%lu] gtt total\n",
-		   dev_priv->gtt.total,
-		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
+		   dev_priv->gtt.base.total,
+		   dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
 
 	seq_putc(m, '\n');
 	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 0e22142..15bca96 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1669,7 +1669,7 @@ out_gem_unload:
 out_mtrrfree:
 	arch_phys_wc_del(dev_priv->gtt.mtrr);
 	io_mapping_free(dev_priv->gtt.mappable);
-	dev_priv->gtt.gtt_remove(dev);
+	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
 out_rmmap:
 	pci_iounmap(dev->pdev, dev_priv->regs);
 put_bridge:
@@ -1764,7 +1764,7 @@ int i915_driver_unload(struct drm_device *dev)
 	destroy_workqueue(dev_priv->wq);
 	pm_qos_remove_request(&dev_priv->pm_qos);
 
-	dev_priv->gtt.gtt_remove(dev);
+	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
 
 	if (dev_priv->slab)
 		kmem_cache_destroy(dev_priv->slab);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c8d6104..d6d4d7d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -446,6 +446,29 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+struct i915_address_space {
+	struct drm_device *dev;
+	unsigned long start;		/* Start offset always 0 for dri2 */
+	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
+
+	struct {
+		dma_addr_t addr;
+		struct page *page;
+	} scratch;
+
+	/* FIXME: Need a more generic return type */
+	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
+				     enum i915_cache_level level);
+	void (*clear_range)(struct i915_address_space *vm,
+			    unsigned int first_entry,
+			    unsigned int num_entries);
+	void (*insert_entries)(struct i915_address_space *vm,
+			       struct sg_table *st,
+			       unsigned int first_entry,
+			       enum i915_cache_level cache_level);
+	void (*cleanup)(struct i915_address_space *vm);
+};
+
 /* The Graphics Translation Table is the way in which GEN hardware translates a
  * Graphics Virtual Address into a Physical Address. In addition to the normal
  * collateral associated with any va->pa translations GEN hardware also has a
@@ -454,8 +477,7 @@ typedef uint32_t gen6_gtt_pte_t;
  * the spec.
  */
 struct i915_gtt {
-	unsigned long start;		/* Start offset of used GTT */
-	size_t total;			/* Total size GTT can map */
+	struct i915_address_space base;
 	size_t stolen_size;		/* Total size of stolen memory */
 
 	unsigned long mappable_end;	/* End offset that we can CPU map */
@@ -466,10 +488,6 @@ struct i915_gtt {
 	void __iomem *gsm;
 
 	bool do_idle_maps;
-	struct {
-		dma_addr_t addr;
-		struct page *page;
-	} scratch;
 
 	int mtrr;
 
@@ -477,38 +495,17 @@ struct i915_gtt {
 	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
 			  size_t *stolen, phys_addr_t *mappable_base,
 			  unsigned long *mappable_end);
-	void (*gtt_remove)(struct drm_device *dev);
-	void (*gtt_clear_range)(struct drm_device *dev,
-				unsigned int first_entry,
-				unsigned int num_entries);
-	void (*gtt_insert_entries)(struct drm_device *dev,
-				   struct sg_table *st,
-				   unsigned int pg_start,
-				   enum i915_cache_level cache_level);
-	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
-				     enum i915_cache_level level);
 };
-#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
+#define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
 
 struct i915_hw_ppgtt {
-	struct drm_device *dev;
+	struct i915_address_space base;
 	unsigned num_pd_entries;
 	struct page **pt_pages;
 	uint32_t pd_offset;
 	dma_addr_t *pt_dma_addr;
 
-	/* pte functions, mirroring the interface of the global gtt. */
-	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
-			    unsigned int first_entry,
-			    unsigned int num_entries);
-	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
-			       struct sg_table *st,
-			       unsigned int pg_start,
-			       enum i915_cache_level cache_level);
-	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
-				     enum i915_cache_level level);
 	int (*enable)(struct drm_device *dev);
-	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
 };
 
 struct i915_ctx_hang_stats {
@@ -1124,7 +1121,7 @@ typedef struct drm_i915_private {
 	enum modeset_restore modeset_restore;
 	struct mutex modeset_restore_lock;
 
-	struct i915_gtt gtt;
+	struct i915_gtt gtt; /* VM representing the global address space */
 
 	struct i915_gem_mm mm;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index af61be8..3ecedfd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			pinned += i915_gem_obj_ggtt_size(obj);
 	mutex_unlock(&dev->struct_mutex);
 
-	args->aper_size = dev_priv->gtt.total;
+	args->aper_size = dev_priv->gtt.base.total;
 	args->aper_available_size = args->aper_size - pinned;
 
 	return 0;
@@ -3070,7 +3070,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	size_t gtt_max = map_and_fenceable ?
-		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
+		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
 	int ret;
 
 	fence_size = i915_gem_get_gtt_size(dev,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 242d0f9..693115a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 
 static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
 {
-	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
+	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
 	gen6_gtt_pte_t __iomem *pd_addr;
 	uint32_t pd_entry;
 	int i;
@@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
 }
 
 /* PPGTT support for Sandybdrige/Gen6 and later */
-static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
+static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 				   unsigned first_entry,
 				   unsigned num_entries)
 {
-	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 	unsigned last_pte, i;
 
-	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
-					I915_CACHE_LLC);
+	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
 
 	while (num_entries) {
 		last_pte = first_pte + num_entries;
@@ -212,11 +212,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 	}
 }
 
-static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
+static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct sg_table *pages,
 				      unsigned first_entry,
 				      enum i915_cache_level cache_level)
 {
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_gtt_pte_t *pt_vaddr;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
@@ -227,7 +229,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
 		dma_addr_t page_addr;
 
 		page_addr = sg_page_iter_dma_address(&sg_iter);
-		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
+		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
 		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
 			kunmap_atomic(pt_vaddr);
 			act_pt++;
@@ -239,13 +241,15 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
 	kunmap_atomic(pt_vaddr);
 }
 
-static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
+static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 {
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	int i;
 
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
-			pci_unmap_page(ppgtt->dev->pdev,
+			pci_unmap_page(ppgtt->base.dev->pdev,
 				       ppgtt->pt_dma_addr[i],
 				       4096, PCI_DMA_BIDIRECTIONAL);
 	}
@@ -259,7 +263,7 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 
 static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
-	struct drm_device *dev = ppgtt->dev;
+	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned first_pd_entry_in_global_pt;
 	int i;
@@ -271,17 +275,17 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
 
 	if (IS_HASWELL(dev)) {
-		ppgtt->pte_encode = hsw_pte_encode;
+		ppgtt->base.pte_encode = hsw_pte_encode;
 	} else if (IS_VALLEYVIEW(dev)) {
-		ppgtt->pte_encode = byt_pte_encode;
+		ppgtt->base.pte_encode = byt_pte_encode;
 	} else {
-		ppgtt->pte_encode = gen6_pte_encode;
+		ppgtt->base.pte_encode = gen6_pte_encode;
 	}
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
-	ppgtt->clear_range = gen6_ppgtt_clear_range;
-	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
-	ppgtt->cleanup = gen6_ppgtt_cleanup;
+	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
 				  GFP_KERNEL);
 	if (!ppgtt->pt_pages)
@@ -312,8 +316,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		ppgtt->pt_dma_addr[i] = pt_addr;
 	}
 
-	ppgtt->clear_range(ppgtt, 0,
-			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
+	ppgtt->base.clear_range(&ppgtt->base, 0,
+				ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
 
 	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
 
@@ -346,7 +350,7 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
 	if (!ppgtt)
 		return -ENOMEM;
 
-	ppgtt->dev = dev;
+	ppgtt->base.dev = dev;
 
 	if (INTEL_INFO(dev)->gen < 8)
 		ret = gen6_ppgtt_init(ppgtt);
@@ -369,7 +373,7 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	if (!ppgtt)
 		return;
 
-	ppgtt->cleanup(ppgtt);
+	ppgtt->base.cleanup(&ppgtt->base);
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
@@ -377,17 +381,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
 {
-	ppgtt->insert_entries(ppgtt, obj->pages,
-			      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-			      cache_level);
+	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
+				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
+				   cache_level);
 }
 
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
-	ppgtt->clear_range(ppgtt,
-			   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-			   obj->base.size >> PAGE_SHIFT);
+	ppgtt->base.clear_range(&ppgtt->base,
+				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
+				obj->base.size >> PAGE_SHIFT);
 }
 
 extern int intel_iommu_gfx_mapped;
@@ -434,8 +438,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	struct drm_i915_gem_object *obj;
 
 	/* First fill our portion of the GTT with scratch pages */
-	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
-				      dev_priv->gtt.total / PAGE_SIZE);
+	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
+				       dev_priv->gtt.base.start / PAGE_SIZE,
+				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
@@ -464,12 +469,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
  * within the global GTT as well as accessible by the GPU through the GMADR
  * mapped BAR (dev_priv->mm.gtt->gtt).
  */
-static void gen6_ggtt_insert_entries(struct drm_device *dev,
+static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     unsigned int first_entry,
 				     enum i915_cache_level level)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = vm->dev->dev_private;
 	gen6_gtt_pte_t __iomem *gtt_entries =
 		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
 	int i = 0;
@@ -478,8 +483,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_page_iter_dma_address(&sg_iter);
-		iowrite32(dev_priv->gtt.pte_encode(addr, level),
-			  &gtt_entries[i]);
+		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
 		i++;
 	}
 
@@ -490,8 +494,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 	 * hardware should work, we must keep this posting read for paranoia.
 	 */
 	if (i != 0)
-		WARN_ON(readl(&gtt_entries[i-1])
-			!= dev_priv->gtt.pte_encode(addr, level));
+		WARN_ON(readl(&gtt_entries[i-1]) !=
+			vm->pte_encode(addr, level));
 
 	/* This next bit makes the above posting read even more important. We
 	 * want to flush the TLBs only after we're certain all the PTE updates
@@ -501,11 +505,11 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 	POSTING_READ(GFX_FLSH_CNTL_GEN6);
 }
 
-static void gen6_ggtt_clear_range(struct drm_device *dev,
+static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = vm->dev->dev_private;
 	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
 		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
@@ -516,15 +520,14 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
-					       I915_CACHE_LLC);
+	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
 	for (i = 0; i < num_entries; i++)
 		iowrite32(scratch_pte, &gtt_base[i]);
 	readl(gtt_base);
 }
 
 
-static void i915_ggtt_insert_entries(struct drm_device *dev,
+static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     unsigned int pg_start,
 				     enum i915_cache_level cache_level)
@@ -536,7 +539,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
 
 }
 
-static void i915_ggtt_clear_range(struct drm_device *dev,
+static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
 {
@@ -549,10 +552,11 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
 
-	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
-					 i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-					 cache_level);
+	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
+					  entry,
+					  cache_level);
 
 	obj->has_global_gtt_mapping = 1;
 }
@@ -561,10 +565,11 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
 
-	dev_priv->gtt.gtt_clear_range(obj->base.dev,
-				      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				      obj->base.size >> PAGE_SHIFT);
+	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
+				       entry,
+				       obj->base.size >> PAGE_SHIFT);
 
 	obj->has_global_gtt_mapping = 0;
 }
@@ -641,20 +646,23 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 		obj->has_global_gtt_mapping = 1;
 	}
 
-	dev_priv->gtt.start = start;
-	dev_priv->gtt.total = end - start;
+	dev_priv->gtt.base.start = start;
+	dev_priv->gtt.base.total = end - start;
 
 	/* Clear any non-preallocated blocks */
 	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
 			     hole_start, hole_end) {
+		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
-					      (hole_end-hole_start) / PAGE_SIZE);
+		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
+					       hole_start / PAGE_SIZE,
+					       count);
 	}
 
 	/* And finally clear the reserved guard page */
-	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
+	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
+				       end / PAGE_SIZE - 1, 1);
 }
 
 static bool
@@ -677,7 +685,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned long gtt_size, mappable_size;
 
-	gtt_size = dev_priv->gtt.total;
+	gtt_size = dev_priv->gtt.base.total;
 	mappable_size = dev_priv->gtt.mappable_end;
 
 	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
@@ -722,8 +730,8 @@ static int setup_scratch_page(struct drm_device *dev)
 #else
 	dma_addr = page_to_phys(page);
 #endif
-	dev_priv->gtt.scratch.page = page;
-	dev_priv->gtt.scratch.addr = dma_addr;
+	dev_priv->gtt.base.scratch.page = page;
+	dev_priv->gtt.base.scratch.addr = dma_addr;
 
 	return 0;
 }
@@ -731,11 +739,13 @@ static int setup_scratch_page(struct drm_device *dev)
 static void teardown_scratch_page(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	set_pages_wb(dev_priv->gtt.scratch.page, 1);
-	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
+	struct page *page = dev_priv->gtt.base.scratch.page;
+
+	set_pages_wb(page, 1);
+	pci_unmap_page(dev->pdev, dev_priv->gtt.base.scratch.addr,
 		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	put_page(dev_priv->gtt.scratch.page);
-	__free_page(dev_priv->gtt.scratch.page);
+	put_page(page);
+	__free_page(page);
 }
 
 static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -798,17 +808,18 @@ static int gen6_gmch_probe(struct drm_device *dev,
 	if (ret)
 		DRM_ERROR("Scratch setup failed\n");
 
-	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
-	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
 
 	return ret;
 }
 
-static void gen6_gmch_remove(struct drm_device *dev)
+static void gen6_gmch_remove(struct i915_address_space *vm)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	iounmap(dev_priv->gtt.gsm);
-	teardown_scratch_page(dev_priv->dev);
+
+	struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
+	iounmap(gtt->gsm);
+	teardown_scratch_page(vm->dev);
 }
 
 static int i915_gmch_probe(struct drm_device *dev,
@@ -829,13 +840,13 @@ static int i915_gmch_probe(struct drm_device *dev,
 	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
-	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
-	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
 
 	return 0;
 }
 
-static void i915_gmch_remove(struct drm_device *dev)
+static void i915_gmch_remove(struct i915_address_space *vm)
 {
 	intel_gmch_remove();
 }
@@ -848,25 +859,28 @@ int i915_gem_gtt_init(struct drm_device *dev)
 
 	if (INTEL_INFO(dev)->gen <= 5) {
 		gtt->gtt_probe = i915_gmch_probe;
-		gtt->gtt_remove = i915_gmch_remove;
+		gtt->base.cleanup = i915_gmch_remove;
 	} else {
 		gtt->gtt_probe = gen6_gmch_probe;
-		gtt->gtt_remove = gen6_gmch_remove;
+		gtt->base.cleanup = gen6_gmch_remove;
 		if (IS_HASWELL(dev))
-			gtt->pte_encode = hsw_pte_encode;
+			gtt->base.pte_encode = hsw_pte_encode;
 		else if (IS_VALLEYVIEW(dev))
-			gtt->pte_encode = byt_pte_encode;
+			gtt->base.pte_encode = byt_pte_encode;
 		else
-			gtt->pte_encode = gen6_pte_encode;
+			gtt->base.pte_encode = gen6_pte_encode;
 	}
 
-	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
+	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
 			     &gtt->mappable_base, &gtt->mappable_end);
 	if (ret)
 		return ret;
 
+	gtt->base.dev = dev;
+
 	/* GMADR is the PCI mmio aperture into the global GTT. */
-	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
+	DRM_INFO("Memory usable by graphics device = %zdM\n",
+		 gtt->base.total >> 20);
 	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
 	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
 
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 02/11] drm/i915: Put the mm in the parent address space
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
  2013-07-09  6:08 ` [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  6:08 ` [PATCH 03/11] drm/i915: Create a global list of vms Ben Widawsky
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction, but we'd hope that one day this becomes
almost irrelevant.
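
A minimal sketch of the intent (mirroring the hunks below; bind_into_vm() is
a hypothetical helper, not part of the patch): binding now allocates out of
the VM's own drm_mm rather than the GGTT-only dev_priv->mm.gtt_space.

static int bind_into_vm(struct i915_address_space *vm,
			struct drm_i915_gem_object *obj,
			u32 size, u32 alignment, size_t gtt_max)
{
	/* vm->mm was set up with drm_mm_init(&vm->mm, vm->start, vm->total) */
	return drm_mm_insert_node_in_range_generic(&vm->mm, &obj->gtt_space,
						   size, alignment,
						   obj->cache_level, 0,
						   gtt_max);
}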

v2: Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/i915_gem_gtt.c
	drivers/gpu/drm/i915/i915_gem_stolen.c
---
 drivers/gpu/drm/i915/i915_dma.c        |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h        |  3 +--
 drivers/gpu/drm/i915/i915_gem.c        |  2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c  | 10 +++++-----
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 17 +++++++++++------
 drivers/gpu/drm/i915/i915_gem_stolen.c |  4 ++--
 6 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 15bca96..3ac9dcc 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1363,7 +1363,7 @@ cleanup_gem:
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
 	i915_gem_cleanup_aliasing_ppgtt(dev);
-	drm_mm_takedown(&dev_priv->mm.gtt_space);
+	drm_mm_takedown(&dev_priv->gtt.base.mm);
 cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_gem_stolen:
@@ -1754,7 +1754,7 @@ int i915_driver_unload(struct drm_device *dev)
 			i915_free_hws(dev);
 	}
 
-	drm_mm_takedown(&dev_priv->mm.gtt_space);
+	drm_mm_takedown(&dev_priv->gtt.base.mm);
 	if (dev_priv->regs != NULL)
 		pci_iounmap(dev->pdev, dev_priv->regs);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d6d4d7d..1296565 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -447,6 +447,7 @@ enum i915_cache_level {
 typedef uint32_t gen6_gtt_pte_t;
 
 struct i915_address_space {
+	struct drm_mm mm;
 	struct drm_device *dev;
 	unsigned long start;		/* Start offset always 0 for dri2 */
 	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
@@ -819,8 +820,6 @@ struct intel_l3_parity {
 struct i915_gem_mm {
 	/** Memory allocator for GTT stolen memory */
 	struct drm_mm stolen;
-	/** Memory allocator for GTT */
-	struct drm_mm gtt_space;
 	/** List of all objects in gtt_space. Used to restore gtt
 	 * mappings on resume */
 	struct list_head bound_list;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3ecedfd..ad763e3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3112,7 +3112,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	i915_gem_object_pin_pages(obj);
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space,
+	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
 						  &obj->gtt_space,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 5f8afc4..f1c9ab0 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -78,12 +78,12 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	INIT_LIST_HEAD(&unwind_list);
 	if (mappable)
-		drm_mm_init_scan_with_range(&dev_priv->mm.gtt_space,
-					    min_size, alignment, cache_level,
-					    0, dev_priv->gtt.mappable_end);
+		drm_mm_init_scan_with_range(&dev_priv->gtt.base.mm, min_size,
+					    alignment, cache_level, 0,
+					    dev_priv->gtt.mappable_end);
 	else
-		drm_mm_init_scan(&dev_priv->mm.gtt_space,
-				 min_size, alignment, cache_level);
+		drm_mm_init_scan(&dev_priv->gtt.base.mm, min_size, alignment,
+				 cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
 	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 693115a..b9400e9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -247,6 +247,8 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 		container_of(vm, struct i915_hw_ppgtt, base);
 	int i;
 
+	drm_mm_takedown(&ppgtt->base.mm);
+
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
 			pci_unmap_page(ppgtt->base.dev->pdev,
@@ -359,8 +361,11 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
 
 	if (ret)
 		kfree(ppgtt);
-	else
+	else {
 		dev_priv->mm.aliasing_ppgtt = ppgtt;
+		drm_mm_init(&ppgtt->base.mm, ppgtt->base.start,
+			    ppgtt->base.total);
+	}
 
 	return ret;
 }
@@ -628,9 +633,9 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	BUG_ON(mappable_end > end);
 
 	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - PAGE_SIZE);
+	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
 	if (!HAS_LLC(dev))
-		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
+		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
@@ -639,7 +644,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
 
 		WARN_ON(i915_gem_obj_ggtt_bound(obj));
-		ret = drm_mm_reserve_node(&dev_priv->mm.gtt_space,
+		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm,
 					  &obj->gtt_space);
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
@@ -650,7 +655,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	dev_priv->gtt.base.total = end - start;
 
 	/* Clear any non-preallocated blocks */
-	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
+	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
 			     hole_start, hole_end) {
 		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
@@ -704,7 +709,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 			return;
 
 		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
-		drm_mm_takedown(&dev_priv->mm.gtt_space);
+		drm_mm_takedown(&dev_priv->gtt.base.mm);
 		gtt_size += GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
 	}
 	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 24cae1c..c201321 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -396,8 +396,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 */
 	obj->gtt_space.start = gtt_offset;
 	obj->gtt_space.size = size;
-	if (drm_mm_initialized(&dev_priv->mm.gtt_space)) {
-		ret = drm_mm_reserve_node(&dev_priv->mm.gtt_space,
+	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
+		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm,
 					  &obj->gtt_space);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 03/11] drm/i915: Create a global list of vms
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
  2013-07-09  6:08 ` [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
  2013-07-09  6:08 ` [PATCH 02/11] drm/i915: Put the mm in the parent address space Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  6:37   ` Daniel Vetter
  2013-07-09  6:08 ` [PATCH 04/11] drm/i915: Move active/inactive lists to new mm Ben Widawsky
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

After we plumb our code to support multiple address spaces (VMs), there
are a few situations where we want to be able to traverse the list of
all address spaces in the system. Cases like eviction or error state
collection are obvious examples.
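
As a sketch of what this enables (a hypothetical traversal, not part of this
patch): every i915_address_space is linked into dev_priv->vm_list via its
global_link, so later eviction and error-capture code can simply walk all
address spaces.

struct i915_address_space *vm;

list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
	/* e.g. scan vm->mm for eviction, or capture per-VM state */
}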

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c | 5 +++++
 drivers/gpu/drm/i915/i915_drv.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 3ac9dcc..d13e21f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1497,6 +1497,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_dump_device_info(dev_priv);
 
+	INIT_LIST_HEAD(&dev_priv->vm_list);
+	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
+	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
+
 	if (i915_get_bridge_dev(dev)) {
 		ret = -EIO;
 		goto free_priv;
@@ -1754,6 +1758,7 @@ int i915_driver_unload(struct drm_device *dev)
 			i915_free_hws(dev);
 	}
 
+	list_del(&dev_priv->vm_list);
 	drm_mm_takedown(&dev_priv->gtt.base.mm);
 	if (dev_priv->regs != NULL)
 		pci_iounmap(dev->pdev, dev_priv->regs);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1296565..997c9a5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -449,6 +449,7 @@ typedef uint32_t gen6_gtt_pte_t;
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
+	struct list_head global_link;
 	unsigned long start;		/* Start offset always 0 for dri2 */
 	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
 
@@ -1120,6 +1121,7 @@ typedef struct drm_i915_private {
 	enum modeset_restore modeset_restore;
 	struct mutex modeset_restore_lock;
 
+	struct list_head vm_list; /* Global list of all address spaces */
 	struct i915_gtt gtt; /* VM representing the global address space */
 
 	struct i915_gem_mm mm;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 04/11] drm/i915: Move active/inactive lists to new mm
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 03/11] drm/i915: Create a global list of vms Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  6:08 ` [PATCH 05/11] drm/i915: Create VMAs Ben Widawsky
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Shamelessly manipulated out of Daniel :-)
"When moving the lists around explain that the active/inactive stuff is
used by eviction when we run out of address space, so needs to be
per-vma and per-address space. Bound/unbound otoh is used by the
shrinker which only cares about the amount of memory used and not one
bit about which address space this memory is used in. Of course
to actually kick out an object we need to unbind it from every address
space, but for that we have the per-object list of vmas."
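
A minimal sketch of that split (loops modelled on the hunks below;
try_to_evict() and count_pages() are hypothetical stand-ins): eviction walks
the per-VM LRU lists, while the shrinker stays on the global bound list.

struct i915_address_space *vm = &dev_priv->gtt.base;	/* only VM for now */
struct drm_i915_gem_object *obj;

/* eviction: per address space */
list_for_each_entry(obj, &vm->inactive_list, mm_list)
	try_to_evict(obj);

/* shrinker: address-space agnostic */
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
	count_pages(obj);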

v2: Leave the bound list as a global one. (Chris, indirectly)

v3: Rebased with no i915_gtt_vm. In most places I added a new *vm local,
since it will eventually be replaced by a vm argument.
Put comment back inline, since it no longer makes sense to do otherwise.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    | 16 +++++++-----
 drivers/gpu/drm/i915/i915_drv.h        | 46 +++++++++++++++++-----------------
 drivers/gpu/drm/i915/i915_gem.c        | 33 ++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_debug.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c  | 18 ++++++-------
 drivers/gpu/drm/i915/i915_gem_stolen.c |  3 ++-
 drivers/gpu/drm/i915/i915_irq.c        |  8 +++---
 7 files changed, 67 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d870f27..16b2aaf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -146,7 +146,8 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	uintptr_t list = (uintptr_t) node->info_ent->data;
 	struct list_head *head;
 	struct drm_device *dev = node->minor->dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	size_t total_obj_size, total_gtt_size;
 	int count, ret;
@@ -158,11 +159,11 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_puts(m, "Active:\n");
-		head = &dev_priv->mm.active_list;
+		head = &vm->active_list;
 		break;
 	case INACTIVE_LIST:
 		seq_puts(m, "Inactive:\n");
-		head = &dev_priv->mm.inactive_list;
+		head = &vm->inactive_list;
 		break;
 	default:
 		mutex_unlock(&dev->struct_mutex);
@@ -230,6 +231,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	u32 count, mappable_count, purgeable_count;
 	size_t size, mappable_size, purgeable_size;
 	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_file *file;
 	int ret;
 
@@ -247,12 +249,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&dev_priv->mm.active_list, mm_list);
+	count_objects(&vm->active_list, mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&dev_priv->mm.inactive_list, mm_list);
+	count_objects(&vm->inactive_list, mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -2025,6 +2027,7 @@ i915_drop_caches_set(void *data, u64 val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	int ret;
 
 	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
@@ -2045,7 +2048,8 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		list_for_each_entry_safe(obj, next, &dev_priv->mm.inactive_list, mm_list)
+		list_for_each_entry_safe(obj, next, &vm->inactive_list,
+					 mm_list)
 			if (obj->pin_count == 0) {
 				ret = i915_gem_object_unbind(obj);
 				if (ret)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 997c9a5..3759c09 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -458,6 +458,29 @@ struct i915_address_space {
 		struct page *page;
 	} scratch;
 
+	/**
+	 * List of objects currently involved in rendering.
+	 *
+	 * Includes buffers having the contents of their GPU caches
+	 * flushed, not necessarily primitives.  last_rendering_seqno
+	 * represents when the rendering involved will be completed.
+	 *
+	 * A reference is held on the buffer while on this list.
+	 */
+	struct list_head active_list;
+
+	/**
+	 * LRU list of objects which are not in the ringbuffer and
+	 * are ready to unbind, but are still in the GTT.
+	 *
+	 * last_rendering_seqno is 0 while an object is in this list.
+	 *
+	 * A reference is not held on the buffer while on this list,
+	 * as merely being GTT-bound shouldn't prevent its being
+	 * freed, and we'll pull it off the list in the free path.
+	 */
+	struct list_head inactive_list;
+
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
@@ -840,29 +863,6 @@ struct i915_gem_mm {
 	struct shrinker inactive_shrinker;
 	bool shrinker_no_lock_stealing;
 
-	/**
-	 * List of objects currently involved in rendering.
-	 *
-	 * Includes buffers having the contents of their GPU caches
-	 * flushed, not necessarily primitives.  last_rendering_seqno
-	 * represents when the rendering involved will be completed.
-	 *
-	 * A reference is held on the buffer while on this list.
-	 */
-	struct list_head active_list;
-
-	/**
-	 * LRU list of objects which are not in the ringbuffer and
-	 * are ready to unbind, but are still in the GTT.
-	 *
-	 * last_rendering_seqno is 0 while an object is in this list.
-	 *
-	 * A reference is not held on the buffer while on this list,
-	 * as merely being GTT-bound shouldn't prevent its being
-	 * freed, and we'll pull it off the list in the free path.
-	 */
-	struct list_head inactive_list;
-
 	/** LRU list of objects with fence regs on them. */
 	struct list_head fence_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ad763e3..525aa8f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1692,6 +1692,7 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		  bool purgeable_only)
 {
 	struct drm_i915_gem_object *obj, *next;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	long count = 0;
 
 	list_for_each_entry_safe(obj, next,
@@ -1705,9 +1706,7 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		}
 	}
 
-	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list,
-				 mm_list) {
+	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
 		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
 		    i915_gem_object_unbind(obj) == 0 &&
 		    i915_gem_object_put_pages(obj) == 0) {
@@ -1878,6 +1877,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 seqno = intel_ring_get_seqno(ring);
 
 	BUG_ON(ring == NULL);
@@ -1890,7 +1890,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 
 	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &dev_priv->mm.active_list);
+	list_move_tail(&obj->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1914,11 +1914,12 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+	list_move_tail(&obj->mm_list, &vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2262,6 +2263,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
 	int i;
@@ -2272,12 +2274,8 @@ void i915_gem_reset(struct drm_device *dev)
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
-	list_for_each_entry(obj,
-			    &dev_priv->mm.inactive_list,
-			    mm_list)
-	{
+	list_for_each_entry(obj, &vm->inactive_list, mm_list)
 		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
-	}
 
 	i915_gem_restore_fences(dev);
 }
@@ -3067,6 +3065,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	size_t gtt_max = map_and_fenceable ?
@@ -3142,7 +3141,7 @@ search_free:
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+	list_add_tail(&obj->mm_list, &vm->inactive_list);
 
 	fenceable =
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
@@ -3290,7 +3289,8 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 
 	/* And bump the LRU for this access */
 	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+		list_move_tail(&obj->mm_list,
+			       &dev_priv->gtt.base.inactive_list);
 
 	return 0;
 }
@@ -4240,7 +4240,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 		return ret;
 	}
 
-	BUG_ON(!list_empty(&dev_priv->mm.active_list));
+	BUG_ON(!list_empty(&dev_priv->gtt.base.active_list));
 	mutex_unlock(&dev->struct_mutex);
 
 	ret = drm_irq_install(dev);
@@ -4301,8 +4301,8 @@ i915_gem_load(struct drm_device *dev)
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
-	INIT_LIST_HEAD(&dev_priv->mm.active_list);
-	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
+	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
+	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
@@ -4573,6 +4573,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	int nr_to_scan = sc->nr_to_scan;
 	bool unlock = true;
@@ -4601,7 +4602,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
 		if (obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
-	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list)
+	list_for_each_entry(obj, &vm->inactive_list, mm_list)
 		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
index 582e6a5..bf945a3 100644
--- a/drivers/gpu/drm/i915/i915_gem_debug.c
+++ b/drivers/gpu/drm/i915/i915_gem_debug.c
@@ -97,7 +97,7 @@ i915_verify_lists(struct drm_device *dev)
 		}
 	}
 
-	list_for_each_entry(obj, &dev_priv->mm.inactive_list, list) {
+	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, list) {
 		if (obj->base.dev != dev ||
 		    !atomic_read(&obj->base.refcount.refcount)) {
 			DRM_ERROR("freed inactive %p\n", obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index f1c9ab0..43b8235 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -47,6 +47,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 			 bool mappable, bool nonblocking)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct list_head eviction_list, unwind_list;
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
@@ -78,15 +79,14 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	INIT_LIST_HEAD(&unwind_list);
 	if (mappable)
-		drm_mm_init_scan_with_range(&dev_priv->gtt.base.mm, min_size,
+		drm_mm_init_scan_with_range(&vm->mm, min_size,
 					    alignment, cache_level, 0,
 					    dev_priv->gtt.mappable_end);
 	else
-		drm_mm_init_scan(&dev_priv->gtt.base.mm, min_size, alignment,
-				 cache_level);
+		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
+	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
 		if (mark_free(obj, &unwind_list))
 			goto found;
 	}
@@ -95,7 +95,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
+	list_for_each_entry(obj, &vm->active_list, mm_list) {
 		if (mark_free(obj, &unwind_list))
 			goto found;
 	}
@@ -154,12 +154,13 @@ int
 i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj, *next;
 	bool lists_empty;
 	int ret;
 
-	lists_empty = (list_empty(&dev_priv->mm.inactive_list) &&
-		       list_empty(&dev_priv->mm.active_list));
+	lists_empty = (list_empty(&vm->inactive_list) &&
+		       list_empty(&vm->active_list));
 	if (lists_empty)
 		return -ENOSPC;
 
@@ -176,8 +177,7 @@ i915_gem_evict_everything(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
 	/* Having flushed everything, unbind() should never raise an error */
-	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list, mm_list)
+	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
 		if (obj->pin_count == 0)
 			WARN_ON(i915_gem_object_unbind(obj));
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index c201321..a4c3136 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -348,6 +348,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	int ret;
@@ -408,7 +409,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+	list_add_tail(&obj->mm_list, &vm->inactive_list);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 56199ef..79fbb17 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1707,6 +1707,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			     struct intel_ring_buffer *ring)
 {
 	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 seqno;
 
 	if (!ring->get_seqno)
@@ -1725,7 +1726,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
+	list_for_each_entry(obj, &vm->active_list, mm_list) {
 		if (obj->ring != ring)
 			continue;
 
@@ -1858,10 +1859,11 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 				     struct drm_i915_error_state *error)
 {
 	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm = &dev_priv->gtt.base;
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
+	list_for_each_entry(obj, &vm->active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
@@ -1881,7 +1883,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 		error->active_bo_count =
 			capture_active_bo(error->active_bo,
 					  error->active_bo_count,
-					  &dev_priv->mm.active_list);
+					  &vm->active_list);
 
 	if (error->pinned_bo)
 		error->pinned_bo_count =
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 05/11] drm/i915: Create VMAs
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 04/11] drm/i915: Move active/inactive lists to new mm Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-11 11:20   ` Imre Deak
  2013-07-09  6:08 ` [PATCH 06/11] drm/i915: plumb VM into object operations Ben Widawsky
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
a region within a VM's address space. A VMA is similar to the concept in
the Linux mm; however, instead of representing regular memory, a VMA is
backed by a GEM BO. There may be many VMAs for a given object, one for
each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited to handling multiple VMs.
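
As a sketch of how the transitional helpers below are meant to be used
(obj_bound_anywhere() is a hypothetical wrapper, not part of the patch):
until the rest of the series lands, an object has at most one VMA, and the
GGTT queries reduce to the first entry on obj->vma_list.

static inline bool obj_bound_anywhere(struct drm_i915_gem_object *obj)
{
	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);

	return vma && drm_mm_node_allocated(&vma->node);
}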

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h        | 48 ++++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem.c        | 57 +++++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_evict.c  | 12 ++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c    |  5 +--
 drivers/gpu/drm/i915/i915_gem_stolen.c | 14 ++++++---
 5 files changed, 110 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3759c09..38cccc8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -533,6 +533,17 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
+/* To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+};
+
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
@@ -1224,8 +1235,9 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
-	/** Current space allocated to this object in the GTT, if any. */
-	struct drm_mm_node gtt_space;
+	/** List of VMAs backed by this object */
+	struct list_head vma_list;
+
 	/** Stolen memory for this object, instead of being backed by shmem. */
 	struct drm_mm_node *stolen;
 	struct list_head global_list;
@@ -1351,18 +1363,32 @@ struct drm_i915_gem_object {
 
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
-/* Offset of the first PTE pointing to this object */
-static inline unsigned long
-i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
+/* This is a temporary define to help transition us to real VMAs. If you see
+ * this, you're either reviewing code, or bisecting it. */
+static inline struct i915_vma *
+__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
 {
-	return o->gtt_space.start;
+	if (list_empty(&obj->vma_list))
+		return NULL;
+	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
 }
 
 /* Whether or not this object is currently mapped by the translation tables */
 static inline bool
 i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
 {
-	return drm_mm_node_allocated(&o->gtt_space);
+	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
+	if (vma == NULL)
+		return false;
+	return drm_mm_node_allocated(&vma->node);
+}
+
+/* Offset of the first PTE pointing to this object */
+static inline unsigned long
+i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
+{
+	BUG_ON(list_empty(&o->vma_list));
+	return __i915_gem_obj_to_vma(o)->node.start;
 }
 
 /* The size used in the translation tables may be larger than the actual size of
@@ -1372,14 +1398,15 @@ i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
 static inline unsigned long
 i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
 {
-	return o->gtt_space.size;
+	BUG_ON(list_empty(&o->vma_list));
+	return __i915_gem_obj_to_vma(o)->node.size;
 }
 
 static inline void
 i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
 			    enum i915_cache_level color)
 {
-	o->gtt_space.color = color;
+	__i915_gem_obj_to_vma(o)->node.color = color;
 }
 
 /**
@@ -1694,6 +1721,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
+struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm);
+void i915_gem_vma_destroy(struct i915_vma *vma);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 525aa8f..058ad44 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2578,6 +2578,7 @@ int
 i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 {
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
+	struct i915_vma *vma;
 	int ret;
 
 	if (!i915_gem_obj_ggtt_bound(obj))
@@ -2615,11 +2616,20 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 
 	list_del(&obj->mm_list);
-	list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
-	drm_mm_remove_node(&obj->gtt_space);
+	vma = __i915_gem_obj_to_vma(obj);
+	list_del(&vma->vma_link);
+	drm_mm_remove_node(&vma->node);
+	i915_gem_vma_destroy(vma);
+
+	/* Since the unbound list is global, only move to that list if
+	 * no more VMAs exist.
+	 * NB: Until we have real VMAs there will only ever be one */
+	WARN_ON(!list_empty(&obj->vma_list));
+	if (list_empty(&obj->vma_list))
+		list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
 
 	return 0;
 }
@@ -3070,8 +3080,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	bool mappable, fenceable;
 	size_t gtt_max = map_and_fenceable ?
 		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
+	struct i915_vma *vma;
 	int ret;
 
+	if (WARN_ON(!list_empty(&obj->vma_list)))
+		return -EBUSY;
+
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -3110,9 +3124,15 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
+	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	if (vma == NULL) {
+		i915_gem_object_unpin_pages(obj);
+		return -ENOMEM;
+	}
+
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
-						  &obj->gtt_space,
+						  &vma->node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max);
 	if (ret) {
@@ -3126,22 +3146,23 @@ search_free:
 		i915_gem_object_unpin_pages(obj);
 		return ret;
 	}
-	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &obj->gtt_space,
+	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
 					      obj->cache_level))) {
 		i915_gem_object_unpin_pages(obj);
-		drm_mm_remove_node(&obj->gtt_space);
+		drm_mm_remove_node(&vma->node);
 		return -EINVAL;
 	}
 
 	ret = i915_gem_gtt_prepare_object(obj);
 	if (ret) {
 		i915_gem_object_unpin_pages(obj);
-		drm_mm_remove_node(&obj->gtt_space);
+		drm_mm_remove_node(&vma->node);
 		return ret;
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add(&vma->vma_link, &obj->vma_list);
 
 	fenceable =
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
@@ -3300,6 +3321,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -3310,7 +3332,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 	}
 
-	if (!i915_gem_valid_gtt_space(dev, &obj->gtt_space, cache_level)) {
+	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
 		ret = i915_gem_object_unbind(obj);
 		if (ret)
 			return ret;
@@ -3855,6 +3877,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->exec_list);
+	INIT_LIST_HEAD(&obj->vma_list);
 
 	obj->ops = ops;
 
@@ -3975,6 +3998,26 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	i915_gem_object_free(obj);
 }
 
+struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm)
+{
+	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
+	if (vma == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->vma_link);
+	vma->vm = vm;
+	vma->obj = obj;
+
+	return vma;
+}
+
+void i915_gem_vma_destroy(struct i915_vma *vma)
+{
+	WARN_ON(vma->node.allocated);
+	kfree(vma);
+}
+
 int
 i915_gem_idle(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 43b8235..df61f33 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -34,11 +34,13 @@
 static bool
 mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
 {
+	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+
 	if (obj->pin_count)
 		return false;
 
 	list_add(&obj->exec_list, unwind);
-	return drm_mm_scan_add_block(&obj->gtt_space);
+	return drm_mm_scan_add_block(&vma->node);
 }
 
 int
@@ -49,6 +51,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct list_head eviction_list, unwind_list;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
 
@@ -106,8 +109,8 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-
-		ret = drm_mm_scan_remove_block(&obj->gtt_space);
+		vma = __i915_gem_obj_to_vma(obj);
+		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
 		list_del_init(&obj->exec_list);
@@ -127,7 +130,8 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		if (drm_mm_scan_remove_block(&obj->gtt_space)) {
+		vma = __i915_gem_obj_to_vma(obj);
+		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
 			continue;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b9400e9..298fc42 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -639,16 +639,17 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
 
 		WARN_ON(i915_gem_obj_ggtt_bound(obj));
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm,
-					  &obj->gtt_space);
+		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
+		list_add(&vma->vma_link, &obj->vma_list);
 	}
 
 	dev_priv->gtt.base.start = start;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index a4c3136..245eb1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -351,6 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
+	struct i915_vma *vma;
 	int ret;
 
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
@@ -390,16 +391,21 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == I915_GTT_OFFSET_NONE)
 		return obj;
 
+	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	if (!vma) {
+		drm_gem_object_unreference(&obj->base);
+		return NULL;
+	}
+
 	/* To simplify the initialisation sequence between KMS and GTT,
 	 * we allow construction of the stolen object prior to
 	 * setting up the GTT space. The actual reservation will occur
 	 * later.
 	 */
-	obj->gtt_space.start = gtt_offset;
-	obj->gtt_space.size = size;
+	vma->node.start = gtt_offset;
+	vma->node.size = size;
 	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm,
-					  &obj->gtt_space);
+		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			goto unref_out;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 05/11] drm/i915: Create VMAs Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  7:15   ` Daniel Vetter
  2013-07-09  6:08 ` [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This patch was formerly known as:
"drm/i915: Create VMAs (part 3) - plumbing"

This patch adds a VM argument, bind/unbind, and the object
offset/size/color getters/setters. It preserves the old ggtt helper
functions because things still need them, and will continue to need them.

Some code will still need to be ported over after this.
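
To make the shape of the interface change concrete, here is a standalone
sketch (toy types and names, not the real i915 entry points) of the
pattern being applied: the core routine gains an explicit VM parameter,
while a thin GGTT wrapper keeps the existing GGTT-only call sites simple:

#include <stdio.h>

/* Toy stand-ins; only the calling convention is the point here. */
struct vm_model { const char *name; };
struct obj_model { const char *name; };

static struct vm_model ggtt = { "ggtt" };

/* New-style entry point: the caller names the address space explicitly. */
static int pin(struct obj_model *obj, struct vm_model *vm, unsigned int align)
{
        printf("pin %s into %s (align %u)\n", obj->name, vm->name, align);
        return 0;
}

/* Preserved convenience wrapper for GGTT-only callers. */
static int ggtt_pin(struct obj_model *obj, unsigned int align)
{
        return pin(obj, &ggtt, align);
}

int main(void)
{
        struct vm_model ppgtt = { "ppgtt" };
        struct obj_model obj = { "bo" };

        ggtt_pin(&obj, 4096);           /* legacy call sites, unchanged */
        return pin(&obj, &ppgtt, 4096); /* new callers pick their VM */
}

In the patch, i915_gem_object_pin(), unbind and the per-object getters
take a struct i915_address_space * argument, and helpers such as
i915_gem_ggtt_pin() play the wrapper role for the common GGTT case.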

v2: Fix purge to pick an object and unbind all vmas
This was doable because of the global bound list change.

v3: With the commit to actually pin/unpin pages in place, there is no
longer a need to check if unbind succeeded before calling put_pages().
Make put_pages only BUG() after checking pin count.

v4: Rebased on top of the new hangcheck work by Mika
plumbed eb_destroy also
Many checkpatch related fixes

v5: Very large rebase

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  31 ++-
 drivers/gpu/drm/i915/i915_dma.c            |   4 -
 drivers/gpu/drm/i915/i915_drv.h            | 107 +++++-----
 drivers/gpu/drm/i915/i915_gem.c            | 316 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
 drivers/gpu/drm/i915/i915_gem_stolen.c     |   6 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
 drivers/gpu/drm/i915/i915_irq.c            |   6 +-
 drivers/gpu/drm/i915/i915_trace.h          |  20 +-
 drivers/gpu/drm/i915/intel_fb.c            |   1 -
 drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
 drivers/gpu/drm/i915/intel_pm.c            |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
 16 files changed, 468 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 16b2aaf..867ed07 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (pinned x %d)", obj->pin_count);
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
-	if (i915_gem_obj_ggtt_bound(obj))
-		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
-			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
+	if (i915_gem_obj_bound_any(obj)) {
+		struct i915_vma *vma;
+		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+			if (!i915_is_ggtt(vma->vm))
+				seq_puts(m, " (pp");
+			else
+				seq_puts(m, " (g");
+			seq_printf(m, " gtt offset: %08lx, size: %08lx)",
+				   i915_gem_obj_offset(obj, vma->vm),
+				   i915_gem_obj_size(obj, vma->vm));
+		}
+	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
@@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+/* FIXME: Support multiple VM? */
 #define count_objects(list, member) do { \
 	list_for_each_entry(obj, list, member) { \
 		size += i915_gem_obj_ggtt_size(obj); \
@@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val)
 
 	if (val & DROP_BOUND) {
 		list_for_each_entry_safe(obj, next, &vm->inactive_list,
-					 mm_list)
-			if (obj->pin_count == 0) {
-				ret = i915_gem_object_unbind(obj);
-				if (ret)
-					goto unlock;
-			}
+					 mm_list) {
+			if (obj->pin_count)
+				continue;
+
+			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
+			if (ret)
+				goto unlock;
+		}
 	}
 
 	if (val & DROP_UNBOUND) {
 		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
 					 global_list)
 			if (obj->pages_pin_count == 0) {
+				/* FIXME: Do this for all vms? */
 				ret = i915_gem_object_put_pages(obj);
 				if (ret)
 					goto unlock;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index d13e21f..b190439 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_dump_device_info(dev_priv);
 
-	INIT_LIST_HEAD(&dev_priv->vm_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
-	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
-
 	if (i915_get_bridge_dev(dev)) {
 		ret = -EIO;
 		goto free_priv;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 38cccc8..48baccc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1363,52 +1363,6 @@ struct drm_i915_gem_object {
 
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
-/* This is a temporary define to help transition us to real VMAs. If you see
- * this, you're either reviewing code, or bisecting it. */
-static inline struct i915_vma *
-__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
-{
-	if (list_empty(&obj->vma_list))
-		return NULL;
-	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
-}
-
-/* Whether or not this object is currently mapped by the translation tables */
-static inline bool
-i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
-{
-	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
-	if (vma == NULL)
-		return false;
-	return drm_mm_node_allocated(&vma->node);
-}
-
-/* Offset of the first PTE pointing to this object */
-static inline unsigned long
-i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
-{
-	BUG_ON(list_empty(&o->vma_list));
-	return __i915_gem_obj_to_vma(o)->node.start;
-}
-
-/* The size used in the translation tables may be larger than the actual size of
- * the object on GEN2/GEN3 because of the way tiling is handled. See
- * i915_gem_get_gtt_size() for more details.
- */
-static inline unsigned long
-i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
-{
-	BUG_ON(list_empty(&o->vma_list));
-	return __i915_gem_obj_to_vma(o)->node.size;
-}
-
-static inline void
-i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
-			    enum i915_cache_level color)
-{
-	__i915_gem_obj_to_vma(o)->node.color = color;
-}
-
 /**
  * Request queue structure.
  *
@@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 void i915_gem_vma_destroy(struct i915_vma *vma);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
-int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
+int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
+					struct i915_address_space *vm);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
@@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
 void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    struct intel_ring_buffer *ring);
 
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
 			    int tiling_mode, bool fenced);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    enum i915_cache_level cache_level);
 
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
@@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 
 void i915_gem_restore_fences(struct drm_device *dev);
 
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm);
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm);
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm);
+void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+			    struct i915_address_space *vm,
+			    enum i915_cache_level color);
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm);
+/* Some GGTT VM helpers */
+#define obj_to_ggtt(obj) \
+	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
+static inline bool i915_is_ggtt(struct i915_address_space *vm)
+{
+	struct i915_address_space *ggtt =
+		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
+	return vm == ggtt;
+}
+
+static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
+}
+
+static inline int __must_check
+i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
+		  uint32_t alignment,
+		  bool map_and_fenceable,
+		  bool nonblocking)
+{
+	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
+				   map_and_fenceable, nonblocking);
+}
+#undef obj_to_ggtt
+
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
@@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
+/* FIXME: this is never okay with full PPGTT */
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 				enum i915_cache_level cache_level);
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
@@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
 
 
 /* i915_gem_evict.c */
-int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
+int __must_check i915_gem_evict_something(struct drm_device *dev,
+					  struct i915_address_space *vm,
+					  int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
 					  bool mappable,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 058ad44..21015cd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -38,10 +38,12 @@
 
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
-static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
-						    unsigned alignment,
-						    bool map_and_fenceable,
-						    bool nonblocking);
+static __must_check int
+i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
+			    struct i915_address_space *vm,
+			    unsigned alignment,
+			    bool map_and_fenceable,
+			    bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !obj->active;
 }
 
 int
@@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		if (i915_gem_obj_ggtt_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, false);
 			if (ret)
 				return ret;
@@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	char __user *user_data;
 	int page_offset, page_length, ret;
 
-	ret = i915_gem_object_pin(obj, 0, true, true);
+	ret = i915_gem_ggtt_pin(obj, 0, true, true);
 	if (ret)
 		goto out;
 
@@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		if (i915_gem_obj_ggtt_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, true);
 			if (ret)
 				return ret;
@@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	/* Now bind it into the GTT if needed */
-	ret = i915_gem_object_pin(obj, 0, true, false);
+	ret = i915_gem_ggtt_pin(obj,  0, true, false);
 	if (ret)
 		goto unlock;
 
@@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->pages == NULL)
 		return 0;
 
-	BUG_ON(i915_gem_obj_ggtt_bound(obj));
-
 	if (obj->pages_pin_count)
 		return -EBUSY;
 
+	BUG_ON(i915_gem_obj_bound_any(obj));
+
 	/* ->put_pages might need to allocate memory for the bit17 swizzle
 	 * array, hence protect them from being reaped by removing them from gtt
 	 * lists early. */
@@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		  bool purgeable_only)
 {
 	struct drm_i915_gem_object *obj, *next;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	long count = 0;
 
 	list_for_each_entry_safe(obj, next,
@@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		}
 	}
 
-	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
-		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
-		    i915_gem_object_unbind(obj) == 0 &&
-		    i915_gem_object_put_pages(obj) == 0) {
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
+				 global_list) {
+		struct i915_vma *vma, *v;
+
+		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
+			continue;
+
+		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
+			if (i915_gem_object_unbind(obj, vma->vm))
+				break;
+
+		if (!i915_gem_object_put_pages(obj))
 			count += obj->base.size >> PAGE_SHIFT;
-			if (count >= target)
-				return count;
-		}
+
+		if (count >= target)
+			return count;
 	}
 
 	return count;
@@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
 void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
+			       struct i915_address_space *vm,
 			       struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 seqno = intel_ring_get_seqno(ring);
 
 	BUG_ON(ring == NULL);
@@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
+i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
+				 struct i915_address_space *vm)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
-
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
@@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
+static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm)
 {
-	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
-	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
+	if (acthd >= i915_gem_obj_offset(obj, vm) &&
+	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
 		return true;
 
 	return false;
@@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
 	return false;
 }
 
+static struct i915_address_space *
+request_to_vm(struct drm_i915_gem_request *request)
+{
+	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
+	struct i915_address_space *vm;
+
+	vm = &dev_priv->gtt.base;
+
+	return vm;
+}
+
 static bool i915_request_guilty(struct drm_i915_gem_request *request,
 				const u32 acthd, bool *inside)
 {
@@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
 	 * pointing inside the ring, matches the batch_obj address range.
 	 * However this is extremely unlikely.
 	 */
-
 	if (request->batch_obj) {
-		if (i915_head_inside_object(acthd, request->batch_obj)) {
+		if (i915_head_inside_object(acthd, request->batch_obj,
+					    request_to_vm(request))) {
 			*inside = true;
 			return true;
 		}
@@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
 {
 	struct i915_ctx_hang_stats *hs = NULL;
 	bool inside, guilty;
+	unsigned long offset = 0;
 
 	/* Innocent until proven guilty */
 	guilty = false;
 
+	if (request->batch_obj)
+		offset = i915_gem_obj_offset(request->batch_obj,
+					     request_to_vm(request));
+
 	if (ring->hangcheck.action != wait &&
 	    i915_request_guilty(request, acthd, &inside)) {
 		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
 			  ring->name,
 			  inside ? "inside" : "flushing",
-			  request->batch_obj ?
-			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
+			  offset,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
 
@@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 	}
 
 	while (!list_empty(&ring->active_list)) {
+		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
 				       struct drm_i915_gem_object,
 				       ring_list);
 
-		i915_gem_object_move_to_inactive(obj);
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+			i915_gem_object_move_to_inactive(obj, vm);
 	}
 }
 
@@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
 	int i;
@@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev)
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
-		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		list_for_each_entry(obj, &vm->inactive_list, mm_list)
+			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	i915_gem_restore_fences(dev);
 }
@@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	 * by the ringbuffer to the flushing/inactive lists as appropriate.
 	 */
 	while (!list_empty(&ring->active_list)) {
+		struct drm_i915_private *dev_priv = ring->dev->dev_private;
+		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
@@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
-		i915_gem_object_move_to_inactive(obj);
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+			i915_gem_object_move_to_inactive(obj, vm);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
@@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
  * Unbinds an object from the GTT aperture.
  */
 int
-i915_gem_object_unbind(struct drm_i915_gem_object *obj)
+i915_gem_object_unbind(struct drm_i915_gem_object *obj,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound(obj, vm))
 		return 0;
 
 	if (obj->pin_count)
@@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (ret)
 		return ret;
 
-	trace_i915_gem_object_unbind(obj);
+	trace_i915_gem_object_unbind(obj, vm);
 
 	if (obj->has_global_gtt_mapping)
 		i915_gem_gtt_unbind_object(obj);
@@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
-	vma = __i915_gem_obj_to_vma(obj);
+	vma = i915_gem_obj_to_vma(obj, vm);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
 		     i915_gem_obj_ggtt_offset(obj), size);
 
+
 		pitch_val = obj->stride / 128;
 		pitch_val = ffs(pitch_val) - 1;
 
@@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
  */
 static int
 i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
+			    struct i915_address_space *vm,
 			    unsigned alignment,
 			    bool map_and_fenceable,
 			    bool nonblocking)
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
-	size_t gtt_max = map_and_fenceable ?
-		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
+	size_t gtt_max =
+		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
 
 	if (WARN_ON(!list_empty(&obj->vma_list)))
 		return -EBUSY;
 
+	BUG_ON(!i915_is_ggtt(vm));
+
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	i915_gem_object_pin_pages(obj);
 
 	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	/* For now we only ever use 1 vma per object */
+	WARN_ON(!list_empty(&obj->vma_list));
+
+	vma = i915_gem_vma_create(obj, vm);
 	if (vma == NULL) {
 		i915_gem_object_unpin_pages(obj);
 		return -ENOMEM;
 	}
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
-						  &vma->node,
+	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max);
 	if (ret) {
-		ret = i915_gem_evict_something(dev, size, alignment,
+		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
 					       map_and_fenceable,
 					       nonblocking);
@@ -3162,18 +3197,25 @@ search_free:
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &vm->inactive_list);
-	list_add(&vma->vma_link, &obj->vma_list);
+
+	/* Keep GGTT vmas first to make debug easier */
+	if (i915_is_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
 
 	fenceable =
+		i915_is_ggtt(vm) &&
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
 		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
 
-	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
-		dev_priv->gtt.mappable_end;
+	mappable =
+		i915_is_ggtt(vm) &&
+		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
 	obj->map_and_fenceable = mappable && fenceable;
 
-	trace_i915_gem_object_bind(obj, map_and_fenceable);
+	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
 	return 0;
 }
@@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	int ret;
 
 	/* Not valid to be called on unbound objects. */
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return -EINVAL;
 
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
@@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 }
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
-		ret = i915_gem_object_unbind(obj);
+		ret = i915_gem_object_unbind(obj, vm);
 		if (ret)
 			return ret;
 	}
 
-	if (i915_gem_obj_ggtt_bound(obj)) {
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		if (!i915_gem_obj_bound(obj, vm))
+			continue;
+
 		ret = i915_gem_object_finish_gpu(obj);
 		if (ret)
 			return ret;
@@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
 					       obj, cache_level);
 
-		i915_gem_obj_ggtt_set_color(obj, cache_level);
+		i915_gem_obj_set_color(obj, vm, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file)
 {
 	struct drm_i915_gem_caching *args = data;
+	struct drm_i915_private *dev_priv;
 	struct drm_i915_gem_object *obj;
 	enum i915_cache_level level;
 	int ret;
@@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 		ret = -ENOENT;
 		goto unlock;
 	}
+	dev_priv = obj->base.dev->dev_private;
 
-	ret = i915_gem_object_set_cache_level(obj, level);
+	/* FIXME: Add interface for specific VM? */
+	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
 
 	drm_gem_object_unreference(&obj->base);
 unlock:
@@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_ring_buffer *pipelined)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	u32 old_read_domains, old_write_domain;
 	int ret;
 
@@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * of uncaching, which would allow us to flush all the LLC-cached data
 	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
 	 */
-	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
+	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					      I915_CACHE_NONE);
 	if (ret)
 		return ret;
 
@@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * (e.g. libkms for the bootup splash), we have to ensure that we
 	 * always use map_and_fenceable for all scanout buffers.
 	 */
-	ret = i915_gem_object_pin(obj, alignment, true, false);
+	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
 	if (ret)
 		return ret;
 
@@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 int
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
+		    struct i915_address_space *vm,
 		    uint32_t alignment,
 		    bool map_and_fenceable,
 		    bool nonblocking)
@@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 		return -EBUSY;
 
-	if (i915_gem_obj_ggtt_bound(obj)) {
-		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
+	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
+
+	if (i915_gem_obj_bound(obj, vm)) {
+		if ((alignment &&
+		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
 		    (map_and_fenceable && !obj->map_and_fenceable)) {
 			WARN(obj->pin_count,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     i915_gem_obj_ggtt_offset(obj), alignment,
+			     i915_gem_obj_offset(obj, vm), alignment,
 			     map_and_fenceable,
 			     obj->map_and_fenceable);
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_unbind(obj, vm);
 			if (ret)
 				return ret;
 		}
 	}
 
-	if (!i915_gem_obj_ggtt_bound(obj)) {
+	if (!i915_gem_obj_bound(obj, vm)) {
 		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 
-		ret = i915_gem_object_bind_to_gtt(obj, alignment,
+		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
 						  map_and_fenceable,
 						  nonblocking);
 		if (ret)
@@ -3684,7 +3739,7 @@ void
 i915_gem_object_unpin(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pin_count == 0);
-	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
+	BUG_ON(!i915_gem_obj_bound_any(obj));
 
 	if (--obj->pin_count == 0)
 		obj->pin_mappable = false;
@@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	}
 
 	if (obj->user_pin_count == 0) {
-		ret = i915_gem_object_pin(obj, args->alignment, true, false);
+		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
 		if (ret)
 			goto out;
 	}
@@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_vma *vma, *next;
 
 	trace_i915_gem_object_destroy(obj);
 
@@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
 	obj->pin_count = 0;
-	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
-		bool was_interruptible;
+	/* NB: 0 or 1 elements */
+	WARN_ON(!list_empty(&obj->vma_list) &&
+		!list_is_singular(&obj->vma_list));
+	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+		int ret = i915_gem_object_unbind(obj, vma->vm);
+		if (WARN_ON(ret == -ERESTARTSYS)) {
+			bool was_interruptible;
 
-		was_interruptible = dev_priv->mm.interruptible;
-		dev_priv->mm.interruptible = false;
+			was_interruptible = dev_priv->mm.interruptible;
+			dev_priv->mm.interruptible = false;
 
-		WARN_ON(i915_gem_object_unbind(obj));
+			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
 
-		dev_priv->mm.interruptible = was_interruptible;
+			dev_priv->mm.interruptible = was_interruptible;
+		}
 	}
 
 	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
@@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
 	INIT_LIST_HEAD(&ring->request_list);
 }
 
+static void i915_init_vm(struct drm_i915_private *dev_priv,
+			 struct i915_address_space *vm)
+{
+	vm->dev = dev_priv->dev;
+	INIT_LIST_HEAD(&vm->active_list);
+	INIT_LIST_HEAD(&vm->inactive_list);
+	INIT_LIST_HEAD(&vm->global_link);
+	list_add(&vm->global_link, &dev_priv->vm_list);
+}
+
 void
 i915_gem_load(struct drm_device *dev)
 {
@@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
-	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
+	INIT_LIST_HEAD(&dev_priv->vm_list);
+	i915_init_vm(dev_priv, &dev_priv->gtt.base);
+
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
@@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj;
-	int nr_to_scan = sc->nr_to_scan;
+	int nr_to_scan;
 	bool unlock = true;
 	int cnt;
 
@@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 		unlock = false;
 	}
 
+	nr_to_scan = sc->nr_to_scan;
 	if (nr_to_scan) {
 		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
 		if (nr_to_scan > 0)
@@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
 		if (obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
-		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
-			cnt += obj->base.size >> PAGE_SHIFT;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		list_for_each_entry(obj, &vm->inactive_list, global_list)
+			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
+				cnt += obj->base.size >> PAGE_SHIFT;
 
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);
 	return cnt;
 }
+
+/* All the new VM stuff */
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return vma->node.start;
+
+	}
+	return -1;
+}
+
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
+{
+	return !list_empty(&o->vma_list);
+}
+
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return true;
+	}
+	return false;
+}
+
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return vma->node.size;
+	}
+
+	return 0;
+}
+
+void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+			    struct i915_address_space *vm,
+			    enum i915_cache_level color)
+{
+	struct i915_vma *vma;
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm) {
+			vma->node.color = color;
+			return;
+		}
+	}
+
+	WARN(1, "Couldn't set color for VM %p\n", vm);
+}
+
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->vm == vm)
+			return vma;
+
+	return NULL;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2074544..c92fd81 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
 
 	if (INTEL_INFO(dev)->gen >= 7) {
 		ret = i915_gem_object_set_cache_level(ctx->obj,
+						      &dev_priv->gtt.base,
 						      I915_CACHE_LLC_MLC);
 		/* Failure shouldn't ever happen this early */
 		if (WARN_ON(ret))
@@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	 * default context.
 	 */
 	dev_priv->ring[RCS].default_context = ctx;
-	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
 		goto err_destroy;
@@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
 	if (from == to)
 		return 0;
 
-	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		return ret;
 
@@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
 	 */
 	if (from != NULL) {
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		i915_gem_object_move_to_active(from->obj, ring);
+		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
+					       ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index df61f33..32efdc0 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -32,24 +32,21 @@
 #include "i915_trace.h"
 
 static bool
-mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
+mark_free(struct i915_vma *vma, struct list_head *unwind)
 {
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
-
-	if (obj->pin_count)
+	if (vma->obj->pin_count)
 		return false;
 
-	list_add(&obj->exec_list, unwind);
+	list_add(&vma->obj->exec_list, unwind);
 	return drm_mm_scan_add_block(&vma->node);
 }
 
 int
-i915_gem_evict_something(struct drm_device *dev, int min_size,
-			 unsigned alignment, unsigned cache_level,
+i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
+			 int min_size, unsigned alignment, unsigned cache_level,
 			 bool mappable, bool nonblocking)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct list_head eviction_list, unwind_list;
 	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
@@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 	 */
 
 	INIT_LIST_HEAD(&unwind_list);
-	if (mappable)
+	if (mappable) {
+		BUG_ON(!i915_is_ggtt(vm));
 		drm_mm_init_scan_with_range(&vm->mm, min_size,
 					    alignment, cache_level, 0,
 					    dev_priv->gtt.mappable_end);
-	else
+	} else
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
 	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	/* Now merge in the soon-to-be-expired objects... */
 	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -109,7 +109,7 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
@@ -130,7 +130,7 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
@@ -145,7 +145,7 @@ found:
 				       struct drm_i915_gem_object,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_unbind(obj, vm);
 
 		list_del_init(&obj->exec_list);
 		drm_gem_object_unreference(&obj->base);
@@ -158,13 +158,18 @@ int
 i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj, *next;
-	bool lists_empty;
+	bool lists_empty = true;
 	int ret;
 
-	lists_empty = (list_empty(&vm->inactive_list) &&
-		       list_empty(&vm->active_list));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		lists_empty = (list_empty(&vm->inactive_list) &&
+			       list_empty(&vm->active_list));
+		if (!lists_empty)
+			lists_empty = false;
+	}
+
 	if (lists_empty)
 		return -ENOSPC;
 
@@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
 	/* Having flushed everything, unbind() should never raise an error */
-	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-		if (obj->pin_count == 0)
-			WARN_ON(i915_gem_object_unbind(obj));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
+			if (obj->pin_count == 0)
+				WARN_ON(i915_gem_object_unbind(obj, vm));
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 5aeb447..e90182d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
 }
 
 static void
-eb_destroy(struct eb_objects *eb)
+eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
 {
 	while (!list_empty(&eb->objects)) {
 		struct drm_i915_gem_object *obj;
@@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 				   struct eb_objects *eb,
-				   struct drm_i915_gem_relocation_entry *reloc)
+				   struct drm_i915_gem_relocation_entry *reloc,
+				   struct i915_address_space *vm)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
@@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 
 static int
 i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
-				    struct eb_objects *eb)
+				    struct eb_objects *eb,
+				    struct i915_address_space *vm)
 {
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
@@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
+			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
+								 vm);
 			if (ret)
 				return ret;
 
@@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 static int
 i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 					 struct eb_objects *eb,
-					 struct drm_i915_gem_relocation_entry *relocs)
+					 struct drm_i915_gem_relocation_entry *relocs,
+					 struct i915_address_space *vm)
 {
 	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 	int i, ret;
 
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
+		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
+							 vm);
 		if (ret)
 			return ret;
 	}
@@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate(struct eb_objects *eb)
+i915_gem_execbuffer_relocate(struct eb_objects *eb,
+			     struct i915_address_space *vm)
 {
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
@@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
 	 */
 	pagefault_disable();
 	list_for_each_entry(obj, &eb->objects, exec_list) {
-		ret = i915_gem_execbuffer_relocate_object(obj, eb);
+		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
 		if (ret)
 			break;
 	}
@@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 				   struct intel_ring_buffer *ring,
+				   struct i915_address_space *vm,
 				   bool *need_reloc)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
@@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->tiling_mode != I915_TILING_NONE;
 	need_mappable = need_fence || need_reloc_mappable(obj);
 
-	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
+	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
+				  false);
 	if (ret)
 		return ret;
 
@@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
-		entry->offset = i915_gem_obj_ggtt_offset(obj);
+	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
+		entry->offset = i915_gem_obj_offset(obj, vm);
 		*need_reloc = true;
 	}
 
@@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return;
 
 	entry = obj->exec_entry;
@@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    struct list_head *objects,
+			    struct i915_address_space *vm,
 			    bool *need_relocs)
 {
 	struct drm_i915_gem_object *obj;
@@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		list_for_each_entry(obj, objects, exec_list) {
 			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 			bool need_fence, need_mappable;
+			u32 obj_offset;
 
-			if (!i915_gem_obj_ggtt_bound(obj))
+			if (!i915_gem_obj_bound(obj, vm))
 				continue;
 
+			obj_offset = i915_gem_obj_offset(obj, vm);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
 			need_mappable = need_fence || need_reloc_mappable(obj);
 
+			BUG_ON((need_mappable || need_fence) &&
+			       !i915_is_ggtt(vm));
+
 			if ((entry->alignment &&
-			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
+			     obj_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-				ret = i915_gem_object_unbind(obj);
+				ret = i915_gem_object_unbind(obj, vm);
 			else
-				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 		/* Bind fresh objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			if (i915_gem_obj_ggtt_bound(obj))
+			if (i915_gem_obj_bound(obj, vm))
 				continue;
 
-			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
@@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_file *file,
 				  struct intel_ring_buffer *ring,
 				  struct eb_objects *eb,
-				  struct drm_i915_gem_exec_object2 *exec)
+				  struct drm_i915_gem_exec_object2 *exec,
+				  struct i915_address_space *vm)
 {
 	struct drm_i915_gem_relocation_entry *reloc;
 	struct drm_i915_gem_object *obj;
@@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 		goto err;
 
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	list_for_each_entry(obj, &eb->objects, exec_list) {
 		int offset = obj->exec_entry - exec;
 		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
-							       reloc + reloc_offset[offset]);
+							       reloc + reloc_offset[offset],
+							       vm);
 		if (ret)
 			goto err;
 	}
@@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *objects,
+				   struct i915_address_space *vm,
 				   struct intel_ring_buffer *ring)
 {
 	struct drm_i915_gem_object *obj;
@@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		i915_gem_object_move_to_active(obj, ring);
+		i915_gem_object_move_to_active(obj, vm, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
@@ -836,7 +853,8 @@ static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct eb_objects *eb;
@@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	/* Move the objects en-masse into the GTT, evicting if necessary. */
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	/* The objects are in their final locations, apply the relocations. */
 	if (need_relocs)
-		ret = i915_gem_execbuffer_relocate(eb);
+		ret = i915_gem_execbuffer_relocate(eb, vm);
 	if (ret) {
 		if (ret == -EFAULT) {
 			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
-								eb, exec);
+								eb, exec, vm);
 			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 		}
 		if (ret)
@@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
+	exec_start = i915_gem_obj_offset(batch_obj, vm) +
+		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
@@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
-	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
+	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
 err:
-	eb_destroy(eb);
+	eb_destroy(eb, vm);
 
 	mutex_unlock(&dev->struct_mutex);
 
@@ -1105,6 +1124,7 @@ int
 i915_gem_execbuffer(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer *args = data;
 	struct drm_i915_gem_execbuffer2 exec2;
 	struct drm_i915_gem_exec_object *exec_list = NULL;
@@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 	exec2.flags = I915_EXEC_RENDER;
 	i915_execbuffer2_set_context_id(exec2, 0);
 
-	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		for (i = 0; i < args->buffer_count; i++)
@@ -1186,6 +1207,7 @@ int
 i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		     struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
 	int ret;
@@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 298fc42..70ce2f6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
 			    ppgtt->base.total);
 	}
 
+	/* i915_init_vm(dev_priv, &ppgtt->base) */
+
 	return ret;
 }
 
@@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
 {
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
+	struct i915_address_space *vm = &ppgtt->base;
+	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
+
+	vm->insert_entries(vm, obj->pages,
+			   obj_offset >> PAGE_SHIFT,
+			   cache_level);
 }
 
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
+	struct i915_address_space *vm = &ppgtt->base;
+	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
+
+	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
+			obj->base.size >> PAGE_SHIFT);
 }
 
 extern int intel_iommu_gfx_mapped;
@@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.start / PAGE_SIZE,
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
+	if (dev_priv->mm.aliasing_ppgtt)
+		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
+
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
@@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	 * aperture.  One page should be enough to keep any prefetching inside
 	 * of the aperture.
 	 */
-	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
 	struct drm_mm_node *entry;
 	struct drm_i915_gem_object *obj;
 	unsigned long hole_start, hole_end;
@@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	BUG_ON(mappable_end > end);
 
 	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
+	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
 	if (!HAS_LLC(dev))
 		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
 
 		WARN_ON(i915_gem_obj_ggtt_bound(obj));
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
+		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
@@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	dev_priv->gtt.base.total = end - start;
 
 	/* Clear any non-preallocated blocks */
-	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
-			     hole_start, hole_end) {
+	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
 		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-					       hole_start / PAGE_SIZE,
-					       count);
+		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
 	}
 
 	/* And finally clear the reserved guard page */
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       end / PAGE_SIZE - 1, 1);
+	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
 }
 
 static bool
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 245eb1d..bfe61fa 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == I915_GTT_OFFSET_NONE)
 		return obj;
 
-	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	vma = i915_gem_vma_create(obj, vm);
 	if (!vma) {
 		drm_gem_object_unreference(&obj->base);
 		return NULL;
@@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 */
 	vma->node.start = gtt_offset;
 	vma->node.size = size;
-	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
+	if (drm_mm_initialized(&vm->mm)) {
+		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			goto unref_out;
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 92a8d27..808ca2a 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 
 		obj->map_and_fenceable =
 			!i915_gem_obj_ggtt_bound(obj) ||
-			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
+			(i915_gem_obj_ggtt_offset(obj) +
+			 obj->base.size <= dev_priv->gtt.mappable_end &&
 			 i915_gem_object_fence_ok(obj, args->tiling_mode));
 
 		/* Rebind if we need a change of alignment */
 		if (!obj->map_and_fenceable) {
-			u32 unfenced_alignment =
+			struct i915_address_space *ggtt = &dev_priv->gtt.base;
+			u32 unfenced_align =
 				i915_gem_get_gtt_alignment(dev, obj->base.size,
 							    args->tiling_mode,
 							    false);
-			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
-				ret = i915_gem_object_unbind(obj);
+			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
+				ret = i915_gem_object_unbind(obj, ggtt);
 		}
 
 		if (ret == 0) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 79fbb17..28fa0ff 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
 		u32 acthd = I915_READ(ACTHD);
 
+		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
+			return NULL;
+
 		if (WARN_ON(ring->id != RCS))
 			return NULL;
 
@@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 		return;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
+		if ((error->ccid & PAGE_MASK) ==
+		    i915_gem_obj_ggtt_offset(obj)) {
 			ering->ctx = i915_error_object_create_sized(dev_priv,
 								    obj, 1);
 			break;
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 7d283b5..3f019d3 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
 );
 
 TRACE_EVENT(i915_gem_object_bind,
-	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
-	    TP_ARGS(obj, mappable),
+	    TP_PROTO(struct drm_i915_gem_object *obj,
+		     struct i915_address_space *vm, bool mappable),
+	    TP_ARGS(obj, vm, mappable),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     __field(bool, mappable)
@@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
-			   __entry->size = i915_gem_obj_ggtt_size(obj);
+			   __entry->offset = i915_gem_obj_offset(obj, vm);
+			   __entry->size = i915_gem_obj_size(obj, vm);
 			   __entry->mappable = mappable;
 			   ),
 
@@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
 );
 
 TRACE_EVENT(i915_gem_object_unbind,
-	    TP_PROTO(struct drm_i915_gem_object *obj),
-	    TP_ARGS(obj),
+	    TP_PROTO(struct drm_i915_gem_object *obj,
+		     struct i915_address_space *vm),
+	    TP_ARGS(obj, vm),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     ),
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
-			   __entry->size = i915_gem_obj_ggtt_size(obj);
+			   __entry->offset = i915_gem_obj_offset(obj, vm);
+			   __entry->size = i915_gem_obj_size(obj, vm);
 			   ),
 
 	    TP_printk("obj=%p, offset=%08x size=%x",
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index f3c97e0..b69cc63 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
 		      fb->width, fb->height,
 		      i915_gem_obj_ggtt_offset(obj), obj);
 
-
 	mutex_unlock(&dev->struct_mutex);
 	vga_switcheroo_client_fb_set(dev->pdev, info);
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 81c3ca1..517e278 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
 		}
 		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
 	} else {
-		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
+		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
 		if (ret) {
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 125a741..449e57c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev)
 		return NULL;
 	}
 
-	ret = i915_gem_object_pin(ctx, 4096, true, false);
+	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
 	if (ret) {
 		DRM_ERROR("failed to pin power context: %d\n", ret);
 		goto err_unref;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bc4c11b..ebed61d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -481,6 +481,7 @@ out:
 static int
 init_pipe_control(struct intel_ring_buffer *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct pipe_control *pc;
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
 		goto err;
 	}
 
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
 static int init_status_page(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
 		goto err;
 	}
 
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
 	if (ret != 0) {
 		goto err_unref;
 	}
@@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 
 	ring->obj = obj;
 
-	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
+	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			return -ENOMEM;
 		}
 
-		ret = i915_gem_object_pin(obj, 0, true, false);
+		ret = i915_gem_ggtt_pin(obj, 0, true, false);
 		if (ret != 0) {
 			drm_gem_object_unreference(&obj->base);
 			DRM_ERROR("Failed to ping batch bo\n");
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 06/11] drm/i915: plumb VM into object operations Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  7:16   ` Daniel Vetter
  2013-07-09  6:08 ` [PATCH 08/11] drm/i915: mm_list is per VMA Ben Widawsky
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
tracking"

The map_and_fenceable tracking is per object. GTT mapping and fences
only apply to the global GTT. As such, object operations which are not
performed on the global GTT should not affect the mappable or fenceable
characteristics.

Functionally, this commit could very well be squashed into the previous
patch which updated object operations to take a VM argument.  This
commit is split out because it's a bit tricky (or at least it was for
me).
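
For illustration only (not part of the diff below), the bind-side rule
this change enforces boils down to something like the following;
update_map_and_fenceable() is a made-up helper name:

static void update_map_and_fenceable(struct drm_i915_gem_object *obj,
				     struct i915_address_space *vm,
				     bool mappable, bool fenceable)
{
	/* Only a binding in the global GTT can change map/fence state */
	if (!i915_is_ggtt(vm))
		return;

	obj->map_and_fenceable = mappable && fenceable;
}
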
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 21015cd..501c590 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2635,7 +2635,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	trace_i915_gem_object_unbind(obj, vm);
 
-	if (obj->has_global_gtt_mapping)
+	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
 		i915_gem_gtt_unbind_object(obj);
 	if (obj->has_aliasing_ppgtt_mapping) {
 		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
@@ -2646,7 +2646,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
-	obj->map_and_fenceable = true;
+	if (i915_is_ggtt(vm))
+		obj->map_and_fenceable = true;
 
 	vma = i915_gem_obj_to_vma(obj, vm);
 	list_del(&vma->vma_link);
@@ -3213,7 +3214,9 @@ search_free:
 		i915_is_ggtt(vm) &&
 		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
-	obj->map_and_fenceable = mappable && fenceable;
+	/* Map and fenceable only changes if the VM is the global GGTT */
+	if (i915_is_ggtt(vm))
+		obj->map_and_fenceable = mappable && fenceable;
 
 	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 08/11] drm/i915: mm_list is per VMA
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (6 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  7:18   ` Daniel Vetter
  2013-07-09  6:08 ` [PATCH 09/11] drm/i915: Update error capture for VMs Ben Widawsky
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 5) - move mm_list"

The mm_list is used for the active/inactive LRUs. Since those LRUs are
per address space, the link should be per VMA.

Because we'll only ever have 1 VMA before this point in the patch
series, it's not incorrect to defer this change until now, and doing it
here makes the change much easier to understand.
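
For illustration only (not part of the diff below): once mm_list lives
on the VMA, a per-VM LRU walk hands back VMAs and reaches object data
through vma->obj. A made-up example of the end state:

static size_t inactive_bytes(struct i915_address_space *vm)
{
	struct i915_vma *vma;
	size_t bytes = 0;

	list_for_each_entry(vma, &vm->inactive_list, mm_list)
		bytes += vma->obj->base.size;

	return bytes;
}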

v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)

v3: Moved earlier in the series

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    | 53 ++++++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
 drivers/gpu/drm/i915/i915_gem.c        | 34 ++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_evict.c  | 14 ++++-----
 drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
 drivers/gpu/drm/i915/i915_irq.c        | 37 ++++++++++++++----------
 6 files changed, 87 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 867ed07..163ca6b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -157,7 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	size_t total_obj_size, total_gtt_size;
 	int count, ret;
 
@@ -165,6 +165,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	if (ret)
 		return ret;
 
+	/* FIXME: the user of this interface might want more than just GGTT */
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_puts(m, "Active:\n");
@@ -180,12 +181,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	}
 
 	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(obj, head, mm_list) {
-		seq_puts(m, "   ");
-		describe_obj(m, obj);
-		seq_putc(m, '\n');
-		total_obj_size += obj->base.size;
-		total_gtt_size += i915_gem_obj_ggtt_size(obj);
+	list_for_each_entry(vma, head, mm_list) {
+		seq_printf(m, "   ");
+		describe_obj(m, vma->obj);
+		seq_printf(m, "\n");
+		total_obj_size += vma->obj->base.size;
+		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
 		count++;
 	}
 	mutex_unlock(&dev->struct_mutex);
@@ -233,7 +234,18 @@ static int per_file_stats(int id, void *ptr, void *data)
 	return 0;
 }
 
-static int i915_gem_object_info(struct seq_file *m, void *data)
+#define count_vmas(list, member) do { \
+	list_for_each_entry(vma, list, member) { \
+		size += i915_gem_obj_ggtt_size(vma->obj); \
+		++count; \
+		if (vma->obj->map_and_fenceable) { \
+			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
+			++mappable_count; \
+		} \
+	} \
+} while (0)
+
+static int i915_gem_object_info(struct seq_file *m, void* data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
@@ -243,6 +255,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_file *file;
+	struct i915_vma *vma;
 	int ret;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -259,12 +272,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&vm->active_list, mm_list);
+	count_vmas(&vm->active_list, mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&vm->inactive_list, mm_list);
+	count_vmas(&vm->inactive_list, mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -2037,7 +2050,8 @@ i915_drop_caches_set(void *data, u64 val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
+	struct i915_vma *vma, *x;
 	int ret;
 
 	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
@@ -2058,14 +2072,15 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list,
-					 mm_list) {
-			if (obj->pin_count)
-				continue;
-
-			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
-			if (ret)
-				goto unlock;
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+			list_for_each_entry_safe(vma, x, &vm->inactive_list,
+						 mm_list)
+				if (vma->obj->pin_count == 0) {
+					ret = i915_gem_object_unbind(vma->obj,
+								     vm);
+					if (ret)
+						goto unlock;
+				}
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48baccc..48105f8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -541,6 +541,9 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
 	struct list_head vma_link; /* Link in the object's VMA list */
 };
 
@@ -1242,9 +1245,7 @@ struct drm_i915_gem_object {
 	struct drm_mm_node *stolen;
 	struct list_head global_list;
 
-	/** This object's place on the active/inactive lists */
 	struct list_head ring_list;
-	struct list_head mm_list;
 	/** This object's place in the batchbuffer or on the eviction list */
 	struct list_head exec_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 501c590..9a58363 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1888,6 +1888,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 seqno = intel_ring_get_seqno(ring);
+	struct i915_vma *vma;
 
 	BUG_ON(ring == NULL);
 	obj->ring = ring;
@@ -1899,7 +1900,8 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 
 	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &vm->active_list);
+	vma = i915_gem_obj_to_vma(obj, vm);
+	list_move_tail(&vma->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1922,10 +1924,13 @@ static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 				 struct i915_address_space *vm)
 {
+	struct i915_vma *vma;
+
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &vm->inactive_list);
+	vma = i915_gem_obj_to_vma(obj, vm);
+	list_move_tail(&vma->mm_list, &vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2287,9 +2292,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm;
-	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	int i;
 
 	for_each_ring(ring, dev_priv, i)
@@ -2299,8 +2304,8 @@ void i915_gem_reset(struct drm_device *dev)
 	 * necessary invalidation upon reuse.
 	 */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-		list_for_each_entry(obj, &vm->inactive_list, mm_list)
-			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+		list_for_each_entry(vma, &vm->inactive_list, mm_list)
+			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	i915_gem_restore_fences(dev);
 }
@@ -2644,12 +2649,12 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
-	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	if (i915_is_ggtt(vm))
 		obj->map_and_fenceable = true;
 
 	vma = i915_gem_obj_to_vma(obj, vm);
+	list_del(&vma->mm_list);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -3197,7 +3202,7 @@ search_free:
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
 	/* Keep GGTT vmas first to make debug easier */
 	if (i915_is_ggtt(vm))
@@ -3354,9 +3359,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 					    old_write_domain);
 
 	/* And bump the LRU for this access */
-	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list,
-			       &dev_priv->gtt.base.inactive_list);
+	if (i915_gem_object_is_inactive(obj)) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
+		if (vma)
+			list_move_tail(&vma->mm_list,
+				       &dev_priv->gtt.base.inactive_list);
+
+	}
 
 	return 0;
 }
@@ -3931,7 +3941,6 @@ unlock:
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops)
 {
-	INIT_LIST_HEAD(&obj->mm_list);
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->exec_list);
@@ -4071,6 +4080,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
 	vma->vm = vm;
 	vma->obj = obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 32efdc0..18a44a9 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->active_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_address_space *vm;
-	struct drm_i915_gem_object *obj, *next;
+	struct i915_vma *vma, *next;
 	bool lists_empty = true;
 	int ret;
 
@@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-			if (obj->pin_count == 0)
-				WARN_ON(i915_gem_object_unbind(obj, vm));
+		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+			if (vma->obj->pin_count == 0)
+				WARN_ON(i915_gem_object_unbind(vma->obj, vm));
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index bfe61fa..58b2613 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -415,7 +415,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&vma->mm_list, &dev_priv->gtt.base.inactive_list);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 28fa0ff..e065232 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1640,11 +1640,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 static u32 capture_active_bo(struct drm_i915_error_buffer *err,
 			     int count, struct list_head *head)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int i = 0;
 
-	list_for_each_entry(obj, head, mm_list) {
-		capture_bo(err++, obj);
+	list_for_each_entry(vma, head, mm_list) {
+		capture_bo(err++, vma->obj);
 		if (++i == count)
 			break;
 	}
@@ -1706,8 +1706,9 @@ static struct drm_i915_error_object *
 i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			     struct intel_ring_buffer *ring)
 {
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 seqno;
 
 	if (!ring->get_seqno)
@@ -1729,20 +1730,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		if (obj->ring != ring)
-			continue;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry(vma, &vm->active_list, mm_list) {
+			obj = vma->obj;
+			if (obj->ring != ring)
+				continue;
 
-		if (i915_seqno_passed(seqno, obj->last_read_seqno))
-			continue;
+			if (i915_seqno_passed(seqno, obj->last_read_seqno))
+				continue;
 
-		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
-			continue;
+			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
+				continue;
 
-		/* We need to copy these to an anonymous buffer as the simplest
-		 * method to avoid being overwritten by userspace.
-		 */
-		return i915_error_object_create(dev_priv, obj);
+			/* We need to copy these to an anonymous buffer as the simplest
+			 * method to avoid being overwritten by userspace.
+			 */
+			return i915_error_object_create(dev_priv, obj);
+		}
 	}
 
 	return NULL;
@@ -1863,11 +1867,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 				     struct drm_i915_error_state *error)
 {
 	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &vm->active_list, mm_list)
+	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 09/11] drm/i915: Update error capture for VMs
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (7 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 08/11] drm/i915: mm_list is per VMA Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  6:08 ` [PATCH 10/11] drm/i915: create an object_is_active() Ben Widawsky
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 4) - Error capture"

Since the active/inactive lists are per VM, we need to modify the error
capture code to be aware of this, and also extend it to capture the
buffers from all the VMs. For now all the code assumes only 1 VM, but it
will become more generic over the next few patches.

NOTE: If the number of VMs in a real world system grows significantly
we'll have to focus on only capturing the guilty VM, or else it's likely
there won't be enough space for error capture.
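
For illustration only (not part of the diff below), the shape of the
new capture path is roughly:

	struct i915_address_space *vm;
	int ndx = 0;

	/* One capture pass, and one slot in the per-VM arrays, per VM */
	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
		i915_gem_capture_vm(dev_priv, error, vm, ndx++);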

v2: Squashed in the "part 6" which had dependencies on the mm_list
change. Since I've moved the mm_list change to an earlier point in the
series, we were able to accomplish it here and now.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   8 +--
 drivers/gpu/drm/i915/i915_drv.h     |   4 +-
 drivers/gpu/drm/i915/i915_irq.c     | 115 ++++++++++++++++++++++++++----------
 3 files changed, 91 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 163ca6b..9a4acc2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -902,13 +902,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 
 	if (error->active_bo)
 		print_error_buffers(m, "Active",
-				    error->active_bo,
-				    error->active_bo_count);
+				    error->active_bo[0],
+				    error->active_bo_count[0]);
 
 	if (error->pinned_bo)
 		print_error_buffers(m, "Pinned",
-				    error->pinned_bo,
-				    error->pinned_bo_count);
+				    error->pinned_bo[0],
+				    error->pinned_bo_count[0]);
 
 	for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
 		struct drm_i915_error_object *obj;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48105f8..b98ad82 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -323,8 +323,8 @@ struct drm_i915_error_state {
 		u32 purgeable:1;
 		s32 ring:4;
 		u32 cache_level:2;
-	} *active_bo, *pinned_bo;
-	u32 active_bo_count, pinned_bo_count;
+	} **active_bo, **pinned_bo;
+	u32 *active_bo_count, *pinned_bo_count;
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
 };
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e065232..bc54d10 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1503,6 +1503,7 @@ static void i915_get_extra_instdone(struct drm_device *dev,
 static struct drm_i915_error_object *
 i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 			       struct drm_i915_gem_object *src,
+			       struct i915_address_space *vm,
 			       const int num_pages)
 {
 	struct drm_i915_error_object *dst;
@@ -1516,7 +1517,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 	if (dst == NULL)
 		return NULL;
 
-	reloc_offset = dst->gtt_offset = i915_gem_obj_ggtt_offset(src);
+	reloc_offset = dst->gtt_offset = i915_gem_obj_offset(src, vm);
 	for (i = 0; i < num_pages; i++) {
 		unsigned long flags;
 		void *d;
@@ -1577,8 +1578,9 @@ unwind:
 	kfree(dst);
 	return NULL;
 }
-#define i915_error_object_create(dev_priv, src) \
+#define i915_error_object_create(dev_priv, src, vm) \
 	i915_error_object_create_sized((dev_priv), (src), \
+				       vm, \
 				       (src)->base.size>>PAGE_SHIFT)
 
 static void
@@ -1609,19 +1611,24 @@ i915_error_state_free(struct kref *error_ref)
 		kfree(error->ring[i].requests);
 	}
 
+	/* FIXME: Assume always 1 VM for now */
+	kfree(error->active_bo[0]);
 	kfree(error->active_bo);
+	kfree(error->active_bo_count);
+	kfree(error->pinned_bo_count);
 	kfree(error->overlay);
 	kfree(error->display);
 	kfree(error);
 }
 static void capture_bo(struct drm_i915_error_buffer *err,
-		       struct drm_i915_gem_object *obj)
+		       struct drm_i915_gem_object *obj,
+		       struct i915_address_space *vm)
 {
 	err->size = obj->base.size;
 	err->name = obj->base.name;
 	err->rseqno = obj->last_read_seqno;
 	err->wseqno = obj->last_write_seqno;
-	err->gtt_offset = i915_gem_obj_ggtt_offset(obj);
+	err->gtt_offset = i915_gem_obj_offset(obj, vm);
 	err->read_domains = obj->base.read_domains;
 	err->write_domain = obj->base.write_domain;
 	err->fence_reg = obj->fence_reg;
@@ -1644,7 +1651,7 @@ static u32 capture_active_bo(struct drm_i915_error_buffer *err,
 	int i = 0;
 
 	list_for_each_entry(vma, head, mm_list) {
-		capture_bo(err++, vma->obj);
+		capture_bo(err++, vma->obj, vma->vm);
 		if (++i == count)
 			break;
 	}
@@ -1659,10 +1666,14 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
 	int i = 0;
 
 	list_for_each_entry(obj, head, global_list) {
+		struct i915_vma *vma;
 		if (obj->pin_count == 0)
 			continue;
 
-		capture_bo(err++, obj);
+		/* Object may be pinned in multiple VMs, just take first */
+		vma = list_first_entry(&obj->vma_list, struct i915_vma,
+				       vma_link);
+		capture_bo(err++, obj, vma->vm);
 		if (++i == count)
 			break;
 	}
@@ -1710,6 +1721,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	u32 seqno;
+	u32 pp_db;
 
 	if (!ring->get_seqno)
 		return NULL;
@@ -1726,11 +1738,19 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 		obj = ring->private;
 		if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
 		    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
-			return i915_error_object_create(dev_priv, obj);
+			return i915_error_object_create(dev_priv, obj,
+							&dev_priv->gtt.base);
 	}
 
+	pp_db = I915_READ(RING_PP_DIR_BASE(ring));
 	seqno = ring->get_seqno(ring, false);
+
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		struct i915_hw_ppgtt *ppgtt =
+			container_of(vm, struct i915_hw_ppgtt, base);
+		if (!i915_is_ggtt(vm) && pp_db >> 10 != ppgtt->pd_offset)
+			continue;
+
 		list_for_each_entry(vma, &vm->active_list, mm_list) {
 			obj = vma->obj;
 			if (obj->ring != ring)
@@ -1745,7 +1765,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			/* We need to copy these to an anonymous buffer as the simplest
 			 * method to avoid being overwritten by userspace.
 			 */
-			return i915_error_object_create(dev_priv, obj);
+			return i915_error_object_create(dev_priv, obj, vm);
 		}
 	}
 
@@ -1802,6 +1822,7 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 					   struct drm_i915_error_ring *ering)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct i915_address_space *ggtt = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 
 	/* Currently render ring is the only HW context user */
@@ -1809,11 +1830,15 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 		return;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		if (!i915_gem_obj_bound(obj, ggtt))
+			continue;
+
 		if ((error->ccid & PAGE_MASK) ==
 		    i915_gem_obj_ggtt_offset(obj)) {
 			ering->ctx = i915_error_object_create_sized(dev_priv,
-								    obj, 1);
-			break;
+								    obj,
+								    ggtt,
+								    1);
 		}
 	}
 }
@@ -1833,8 +1858,8 @@ static void i915_gem_record_rings(struct drm_device *dev,
 			i915_error_first_batchbuffer(dev_priv, ring);
 
 		error->ring[i].ringbuffer =
-			i915_error_object_create(dev_priv, ring->obj);
-
+			i915_error_object_create(dev_priv, ring->obj,
+						 &dev_priv->gtt.base);
 
 		i915_gem_record_active_context(ring, error, &error->ring[i]);
 
@@ -1863,42 +1888,72 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	}
 }
 
-static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
-				     struct drm_i915_error_state *error)
+/* FIXME: Since pin count/bound list is global, we duplicate what we capture per
+ * VM.
+ */
+static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
+				struct drm_i915_error_state *error,
+				struct i915_address_space *vm,
+				const int ndx)
 {
+	struct drm_i915_error_buffer *active_bo = NULL, *pinned_bo = NULL;
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	int i;
 
 	i = 0;
 	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
-	error->active_bo_count = i;
+	error->active_bo_count[ndx] = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
 		if (obj->pin_count)
 			i++;
-	error->pinned_bo_count = i - error->active_bo_count;
+	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
 	if (i) {
-		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
-					   GFP_ATOMIC);
-		if (error->active_bo)
-			error->pinned_bo =
-				error->active_bo + error->active_bo_count;
+		active_bo = kmalloc(sizeof(*active_bo)*i, GFP_ATOMIC);
+		if (active_bo)
+			pinned_bo = active_bo + error->active_bo_count[ndx];
 	}
 
-	if (error->active_bo)
-		error->active_bo_count =
-			capture_active_bo(error->active_bo,
-					  error->active_bo_count,
+	if (active_bo)
+		error->active_bo_count[ndx] =
+			capture_active_bo(active_bo,
+					  error->active_bo_count[ndx],
 					  &vm->active_list);
 
-	if (error->pinned_bo)
-		error->pinned_bo_count =
-			capture_pinned_bo(error->pinned_bo,
-					  error->pinned_bo_count,
+	if (pinned_bo)
+		error->pinned_bo_count[ndx] =
+			capture_pinned_bo(pinned_bo,
+					  error->pinned_bo_count[ndx],
 					  &dev_priv->mm.bound_list);
+	error->active_bo[ndx] = active_bo;
+	error->pinned_bo[ndx] = pinned_bo;
+}
+
+static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
+				     struct drm_i915_error_state *error)
+{
+	struct i915_address_space *vm;
+	int cnt = 0, i = 0;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		cnt++;
+
+	if (WARN(cnt > 1, "Multiple VMs not yet supported\n"))
+		cnt = 1;
+
+	vm = &dev_priv->gtt.base;
+
+	error->active_bo = kcalloc(cnt, sizeof(*error->active_bo), GFP_ATOMIC);
+	error->pinned_bo = kcalloc(cnt, sizeof(*error->pinned_bo), GFP_ATOMIC);
+	error->active_bo_count = kcalloc(cnt, sizeof(*error->active_bo_count),
+					 GFP_ATOMIC);
+	error->pinned_bo_count = kcalloc(cnt, sizeof(*error->pinned_bo_count),
+					 GFP_ATOMIC);
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		i915_gem_capture_vm(dev_priv, error, vm, i++);
 }
 
 /**
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 10/11] drm/i915: create an object_is_active()
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (8 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 09/11] drm/i915: Update error capture for VMs Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  6:08 ` [PATCH 11/11] drm/i915: Move active to vma Ben Widawsky
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This is simply obj->active for now, but will serve a purpose when we
track activity per vma.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  1 +
 drivers/gpu/drm/i915/i915_gem.c            | 18 ++++++++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b98ad82..38d07f2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1716,6 +1716,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
+bool i915_gem_object_is_active(struct drm_i915_gem_object *obj);
 void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 				    struct i915_address_space *vm,
 				    struct intel_ring_buffer *ring);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9a58363..c2ecb78 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -134,10 +134,16 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	return 0;
 }
 
+/* NB: Not the same as !i915_gem_object_is_inactive */
+bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
+{
+	return obj->active;
+}
+
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_bound_any(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !i915_gem_object_is_active(obj);
 }
 
 int
@@ -1894,7 +1900,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	obj->ring = ring;
 
 	/* Add a reference if we're newly entering the active list. */
-	if (!obj->active) {
+	if (!i915_gem_object_is_active(obj)) {
 		drm_gem_object_reference(&obj->base);
 		obj->active = 1;
 	}
@@ -1927,7 +1933,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
-	BUG_ON(!obj->active);
+	BUG_ON(!i915_gem_object_is_active(obj));
 
 	vma = i915_gem_obj_to_vma(obj, vm);
 	list_move_tail(&vma->mm_list, &vm->inactive_list);
@@ -2437,7 +2443,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 {
 	int ret;
 
-	if (obj->active) {
+	if (i915_gem_object_is_active(obj)) {
 		ret = i915_gem_check_olr(obj->ring, obj->last_read_seqno);
 		if (ret)
 			return ret;
@@ -2502,7 +2508,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (ret)
 		goto out;
 
-	if (obj->active) {
+	if (i915_gem_object_is_active(obj)) {
 		seqno = obj->last_read_seqno;
 		ring = obj->ring;
 	}
@@ -3872,7 +3878,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 	 */
 	ret = i915_gem_object_flush_active(obj);
 
-	args->busy = obj->active;
+	args->busy = i915_gem_object_is_active(obj);
 	if (obj->ring) {
 		BUILD_BUG_ON(I915_NUM_RINGS > 16);
 		args->busy |= intel_ring_flag(obj->ring) << 16;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e90182d..725dd7f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -251,7 +251,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	}
 
 	/* We can't wait for rendering with pagefaults disabled */
-	if (obj->active && in_atomic())
+	if (i915_gem_object_is_active(obj) && in_atomic())
 		return -EFAULT;
 
 	reloc->delta += target_offset;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 11/11] drm/i915: Move active to vma
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (9 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 10/11] drm/i915: create an object_is_active() Ben Widawsky
@ 2013-07-09  6:08 ` Ben Widawsky
  2013-07-09  7:45   ` Daniel Vetter
  2013-07-09  7:50 ` [PATCH 00/11] ppgtt: just the VMA Daniel Vetter
  2013-07-13  4:45 ` [PATCH 12/15] [RFC] create vm->bind,unbind Ben Widawsky
  12 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-09  6:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Probably need to squash whole thing, or just the inactive part, tbd...

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h | 14 ++++++------
 drivers/gpu/drm/i915/i915_gem.c | 47 ++++++++++++++++++++++++-----------------
 2 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 38d07f2..e6694ae 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -541,6 +541,13 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 
+	/**
+	 * This is set if the object is on the active lists (has pending
+	 * rendering and so a non-zero seqno), and is not set if it i s on
+	 * inactive (ready to be unbound) list.
+	 */
+	unsigned int active:1;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -1250,13 +1257,6 @@ struct drm_i915_gem_object {
 	struct list_head exec_list;
 
 	/**
-	 * This is set if the object is on the active lists (has pending
-	 * rendering and so a non-zero seqno), and is not set if it i s on
-	 * inactive (ready to be unbound) list.
-	 */
-	unsigned int active:1;
-
-	/**
 	 * This is set if the object has been written to since last bound
 	 * to the GTT
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c2ecb78..b87073b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -137,7 +137,13 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 /* NB: Not the same as !i915_gem_object_is_inactive */
 bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
 {
-	return obj->active;
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->active)
+			return true;
+
+	return false;
 }
 
 static inline bool
@@ -1899,14 +1905,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	BUG_ON(ring == NULL);
 	obj->ring = ring;
 
+	/* Move from whatever list we were on to the tail of execution. */
+	vma = i915_gem_obj_to_vma(obj, vm);
 	/* Add a reference if we're newly entering the active list. */
-	if (!i915_gem_object_is_active(obj)) {
+	if (!vma->active) {
 		drm_gem_object_reference(&obj->base);
-		obj->active = 1;
+		vma->active = 1;
 	}
 
-	/* Move from whatever list we were on to the tail of execution. */
-	vma = i915_gem_obj_to_vma(obj, vm);
 	list_move_tail(&vma->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
@@ -1927,16 +1933,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
-				 struct i915_address_space *vm)
+i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	struct i915_address_space *vm;
 	struct i915_vma *vma;
+	int i = 0;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
-	BUG_ON(!i915_gem_object_is_active(obj));
 
-	vma = i915_gem_obj_to_vma(obj, vm);
-	list_move_tail(&vma->mm_list, &vm->inactive_list);
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		vma = i915_gem_obj_to_vma(obj, vm);
+		if (!vma || !vma->active)
+			continue;
+		list_move_tail(&vma->mm_list, &vm->inactive_list);
+		vma->active = 0;
+		i++;
+	}
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -1948,8 +1961,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 	obj->last_fenced_seqno = 0;
 	obj->fenced_gpu_access = false;
 
-	obj->active = 0;
-	drm_gem_object_unreference(&obj->base);
+	while (i--)
+		drm_gem_object_unreference(&obj->base);
 
 	WARN_ON(i915_verify_lists(dev));
 }
@@ -2272,15 +2285,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 	}
 
 	while (!list_empty(&ring->active_list)) {
-		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
 				       struct drm_i915_gem_object,
 				       ring_list);
 
-		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-			i915_gem_object_move_to_inactive(obj, vm);
+		i915_gem_object_move_to_inactive(obj);
 	}
 }
 
@@ -2356,8 +2367,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	 * by the ringbuffer to the flushing/inactive lists as appropriate.
 	 */
 	while (!list_empty(&ring->active_list)) {
-		struct drm_i915_private *dev_priv = ring->dev->dev_private;
-		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
@@ -2367,8 +2376,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
-		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-			i915_gem_object_move_to_inactive(obj, vm);
+		BUG_ON(!i915_gem_object_is_active(obj));
+		i915_gem_object_move_to_inactive(obj);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 03/11] drm/i915: Create a global list of vms
  2013-07-09  6:08 ` [PATCH 03/11] drm/i915: Create a global list of vms Ben Widawsky
@ 2013-07-09  6:37   ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  6:37 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:34PM -0700, Ben Widawsky wrote:
> After we plumb our code to support multiple address spaces (VMs), there
> are a few situations where we want to be able to traverse the list of
> all address spaces in the system. Cases like eviction or error state
> collection are obvious examples.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_dma.c | 5 +++++
>  drivers/gpu/drm/i915/i915_drv.h | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 3ac9dcc..d13e21f 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1497,6 +1497,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  
>  	i915_dump_device_info(dev_priv);
>  
> +	INIT_LIST_HEAD(&dev_priv->vm_list);
> +	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> +	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> +
>  	if (i915_get_bridge_dev(dev)) {
>  		ret = -EIO;
>  		goto free_priv;
> @@ -1754,6 +1758,7 @@ int i915_driver_unload(struct drm_device *dev)
>  			i915_free_hws(dev);
>  	}
>  
> +	list_del(&dev_priv->vm_list);

Shouldn't this delete (dev_priv->gtt.base.global_link)?

Also I guess a ggtt_takedown function to wrap up the various bits would be
good (but maybe only at the end of this series as a cleanup).

Finally a WARN_ON(!list_empty(&dev_priv->vm_list)); at the very end of the
gem cleanup sounds like good paranoia.
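
Something along these lines, maybe (completely untested sketch,
i915_ggtt_takedown is just a name I made up for the wrapper):

	static void i915_ggtt_takedown(struct drm_device *dev)
	{
		struct drm_i915_private *dev_priv = dev->dev_private;

		/* Unlink the ggtt from the global list of address spaces ... */
		list_del(&dev_priv->gtt.base.global_link);
		/* ... and tear down its drm_mm range manager. */
		drm_mm_takedown(&dev_priv->gtt.base.mm);

		/* Paranoia: every other vm should be gone by now. */
		WARN_ON(!list_empty(&dev_priv->vm_list));
	}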

>  	drm_mm_takedown(&dev_priv->gtt.base.mm);
>  	if (dev_priv->regs != NULL)
>  		pci_iounmap(dev->pdev, dev_priv->regs);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1296565..997c9a5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -449,6 +449,7 @@ typedef uint32_t gen6_gtt_pte_t;
>  struct i915_address_space {
>  	struct drm_mm mm;
>  	struct drm_device *dev;
> +	struct list_head global_link;
>  	unsigned long start;		/* Start offset always 0 for dri2 */
>  	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
>  
> @@ -1120,6 +1121,7 @@ typedef struct drm_i915_private {
>  	enum modeset_restore modeset_restore;
>  	struct mutex modeset_restore_lock;
>  
> +	struct list_head vm_list; /* Global list of all address spaces */
>  	struct i915_gtt gtt; /* VMA representing the global address space */
>  
>  	struct i915_gem_mm mm;
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-09  6:08 ` [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
@ 2013-07-09  6:37   ` Daniel Vetter
  2013-07-10 16:36     ` Ben Widawsky
  2013-07-11 11:14   ` Imre Deak
  1 sibling, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  6:37 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:32PM -0700, Ben Widawsky wrote:
> The GTT and PPGTT can be thought of more generally as GPU address
> spaces. Many of their actions (insert entries), state (LRU lists) and
> many of their characteristics (size), can be shared. Do that.
> 
> The change itself doesn't actually impact most of the VMA/VM rework
> coming up, it just fits in with the grand scheme. GGTT will usually be a
> special case where we either know an object must be in the GGTT (display
> engine, workarounds, etc.).

Commit message cut off?
-Daniel

> 
> v2: Drop usage of i915_gtt_vm (Daniel)
> Make cleanup also part of the parent class (Ben)
> Modified commit msg
> Rebased
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
>  drivers/gpu/drm/i915/i915_dma.c     |   4 +-
>  drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
>  drivers/gpu/drm/i915/i915_gem.c     |   4 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
>  5 files changed, 121 insertions(+), 110 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index c8059f5..d870f27 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  		   count, size);
>  
>  	seq_printf(m, "%zu [%lu] gtt total\n",
> -		   dev_priv->gtt.total,
> -		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
> +		   dev_priv->gtt.base.total,
> +		   dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
>  
>  	seq_putc(m, '\n');
>  	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 0e22142..15bca96 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1669,7 +1669,7 @@ out_gem_unload:
>  out_mtrrfree:
>  	arch_phys_wc_del(dev_priv->gtt.mtrr);
>  	io_mapping_free(dev_priv->gtt.mappable);
> -	dev_priv->gtt.gtt_remove(dev);
> +	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
>  out_rmmap:
>  	pci_iounmap(dev->pdev, dev_priv->regs);
>  put_bridge:
> @@ -1764,7 +1764,7 @@ int i915_driver_unload(struct drm_device *dev)
>  	destroy_workqueue(dev_priv->wq);
>  	pm_qos_remove_request(&dev_priv->pm_qos);
>  
> -	dev_priv->gtt.gtt_remove(dev);
> +	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
>  
>  	if (dev_priv->slab)
>  		kmem_cache_destroy(dev_priv->slab);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c8d6104..d6d4d7d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -446,6 +446,29 @@ enum i915_cache_level {
>  
>  typedef uint32_t gen6_gtt_pte_t;
>  
> +struct i915_address_space {
> +	struct drm_device *dev;
> +	unsigned long start;		/* Start offset always 0 for dri2 */
> +	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> +
> +	struct {
> +		dma_addr_t addr;
> +		struct page *page;
> +	} scratch;
> +
> +	/* FIXME: Need a more generic return type */
> +	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> +				     enum i915_cache_level level);
> +	void (*clear_range)(struct i915_address_space *vm,
> +			    unsigned int first_entry,
> +			    unsigned int num_entries);
> +	void (*insert_entries)(struct i915_address_space *vm,
> +			       struct sg_table *st,
> +			       unsigned int first_entry,
> +			       enum i915_cache_level cache_level);
> +	void (*cleanup)(struct i915_address_space *vm);
> +};
> +
>  /* The Graphics Translation Table is the way in which GEN hardware translates a
>   * Graphics Virtual Address into a Physical Address. In addition to the normal
>   * collateral associated with any va->pa translations GEN hardware also has a
> @@ -454,8 +477,7 @@ typedef uint32_t gen6_gtt_pte_t;
>   * the spec.
>   */
>  struct i915_gtt {
> -	unsigned long start;		/* Start offset of used GTT */
> -	size_t total;			/* Total size GTT can map */
> +	struct i915_address_space base;
>  	size_t stolen_size;		/* Total size of stolen memory */
>  
>  	unsigned long mappable_end;	/* End offset that we can CPU map */
> @@ -466,10 +488,6 @@ struct i915_gtt {
>  	void __iomem *gsm;
>  
>  	bool do_idle_maps;
> -	struct {
> -		dma_addr_t addr;
> -		struct page *page;
> -	} scratch;
>  
>  	int mtrr;
>  
> @@ -477,38 +495,17 @@ struct i915_gtt {
>  	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
>  			  size_t *stolen, phys_addr_t *mappable_base,
>  			  unsigned long *mappable_end);
> -	void (*gtt_remove)(struct drm_device *dev);
> -	void (*gtt_clear_range)(struct drm_device *dev,
> -				unsigned int first_entry,
> -				unsigned int num_entries);
> -	void (*gtt_insert_entries)(struct drm_device *dev,
> -				   struct sg_table *st,
> -				   unsigned int pg_start,
> -				   enum i915_cache_level cache_level);
> -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> -				     enum i915_cache_level level);
>  };
> -#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
> +#define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
>  
>  struct i915_hw_ppgtt {
> -	struct drm_device *dev;
> +	struct i915_address_space base;
>  	unsigned num_pd_entries;
>  	struct page **pt_pages;
>  	uint32_t pd_offset;
>  	dma_addr_t *pt_dma_addr;
>  
> -	/* pte functions, mirroring the interface of the global gtt. */
> -	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
> -			    unsigned int first_entry,
> -			    unsigned int num_entries);
> -	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
> -			       struct sg_table *st,
> -			       unsigned int pg_start,
> -			       enum i915_cache_level cache_level);
> -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> -				     enum i915_cache_level level);
>  	int (*enable)(struct drm_device *dev);
> -	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
>  };
>  
>  struct i915_ctx_hang_stats {
> @@ -1124,7 +1121,7 @@ typedef struct drm_i915_private {
>  	enum modeset_restore modeset_restore;
>  	struct mutex modeset_restore_lock;
>  
> -	struct i915_gtt gtt;
> +	struct i915_gtt gtt; /* VMA representing the global address space */
>  
>  	struct i915_gem_mm mm;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index af61be8..3ecedfd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>  			pinned += i915_gem_obj_ggtt_size(obj);
>  	mutex_unlock(&dev->struct_mutex);
>  
> -	args->aper_size = dev_priv->gtt.total;
> +	args->aper_size = dev_priv->gtt.base.total;
>  	args->aper_available_size = args->aper_size - pinned;
>  
>  	return 0;
> @@ -3070,7 +3070,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  	u32 size, fence_size, fence_alignment, unfenced_alignment;
>  	bool mappable, fenceable;
>  	size_t gtt_max = map_and_fenceable ?
> -		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
> +		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
>  	int ret;
>  
>  	fence_size = i915_gem_get_gtt_size(dev,
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 242d0f9..693115a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>  
>  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
>  {
> -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
>  	gen6_gtt_pte_t __iomem *pd_addr;
>  	uint32_t pd_entry;
>  	int i;
> @@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
>  }
>  
>  /* PPGTT support for Sandybdrige/Gen6 and later */
> -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  				   unsigned first_entry,
>  				   unsigned num_entries)
>  {
> -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> +	struct i915_hw_ppgtt *ppgtt =
> +		container_of(vm, struct i915_hw_ppgtt, base);
>  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
>  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
>  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
>  	unsigned last_pte, i;
>  
> -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> -					I915_CACHE_LLC);
> +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
>  
>  	while (num_entries) {
>  		last_pte = first_pte + num_entries;
> @@ -212,11 +212,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
>  	}
>  }
>  
> -static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> +static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  				      struct sg_table *pages,
>  				      unsigned first_entry,
>  				      enum i915_cache_level cache_level)
>  {
> +	struct i915_hw_ppgtt *ppgtt =
> +		container_of(vm, struct i915_hw_ppgtt, base);
>  	gen6_gtt_pte_t *pt_vaddr;
>  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
>  	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> @@ -227,7 +229,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
>  		dma_addr_t page_addr;
>  
>  		page_addr = sg_page_iter_dma_address(&sg_iter);
> -		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
> +		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
>  		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
>  			kunmap_atomic(pt_vaddr);
>  			act_pt++;
> @@ -239,13 +241,15 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
>  	kunmap_atomic(pt_vaddr);
>  }
>  
> -static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
> +static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
>  {
> +	struct i915_hw_ppgtt *ppgtt =
> +		container_of(vm, struct i915_hw_ppgtt, base);
>  	int i;
>  
>  	if (ppgtt->pt_dma_addr) {
>  		for (i = 0; i < ppgtt->num_pd_entries; i++)
> -			pci_unmap_page(ppgtt->dev->pdev,
> +			pci_unmap_page(ppgtt->base.dev->pdev,
>  				       ppgtt->pt_dma_addr[i],
>  				       4096, PCI_DMA_BIDIRECTIONAL);
>  	}
> @@ -259,7 +263,7 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
>  
>  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  {
> -	struct drm_device *dev = ppgtt->dev;
> +	struct drm_device *dev = ppgtt->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	unsigned first_pd_entry_in_global_pt;
>  	int i;
> @@ -271,17 +275,17 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
>  
>  	if (IS_HASWELL(dev)) {
> -		ppgtt->pte_encode = hsw_pte_encode;
> +		ppgtt->base.pte_encode = hsw_pte_encode;
>  	} else if (IS_VALLEYVIEW(dev)) {
> -		ppgtt->pte_encode = byt_pte_encode;
> +		ppgtt->base.pte_encode = byt_pte_encode;
>  	} else {
> -		ppgtt->pte_encode = gen6_pte_encode;
> +		ppgtt->base.pte_encode = gen6_pte_encode;
>  	}
>  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
>  	ppgtt->enable = gen6_ppgtt_enable;
> -	ppgtt->clear_range = gen6_ppgtt_clear_range;
> -	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
> -	ppgtt->cleanup = gen6_ppgtt_cleanup;
> +	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> +	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> +	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
>  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
>  				  GFP_KERNEL);
>  	if (!ppgtt->pt_pages)
> @@ -312,8 +316,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  		ppgtt->pt_dma_addr[i] = pt_addr;
>  	}
>  
> -	ppgtt->clear_range(ppgtt, 0,
> -			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
> +	ppgtt->base.clear_range(&ppgtt->base, 0,
> +				ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
>  
>  	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
>  
> @@ -346,7 +350,7 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
>  	if (!ppgtt)
>  		return -ENOMEM;
>  
> -	ppgtt->dev = dev;
> +	ppgtt->base.dev = dev;
>  
>  	if (INTEL_INFO(dev)->gen < 8)
>  		ret = gen6_ppgtt_init(ppgtt);
> @@ -369,7 +373,7 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
>  	if (!ppgtt)
>  		return;
>  
> -	ppgtt->cleanup(ppgtt);
> +	ppgtt->base.cleanup(&ppgtt->base);
>  	dev_priv->mm.aliasing_ppgtt = NULL;
>  }
>  
> @@ -377,17 +381,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  			    struct drm_i915_gem_object *obj,
>  			    enum i915_cache_level cache_level)
>  {
> -	ppgtt->insert_entries(ppgtt, obj->pages,
> -			      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -			      cache_level);
> +	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> +				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> +				   cache_level);
>  }
>  
>  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  			      struct drm_i915_gem_object *obj)
>  {
> -	ppgtt->clear_range(ppgtt,
> -			   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -			   obj->base.size >> PAGE_SHIFT);
> +	ppgtt->base.clear_range(&ppgtt->base,
> +				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> +				obj->base.size >> PAGE_SHIFT);
>  }
>  
>  extern int intel_iommu_gfx_mapped;
> @@ -434,8 +438,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  	struct drm_i915_gem_object *obj;
>  
>  	/* First fill our portion of the GTT with scratch pages */
> -	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
> -				      dev_priv->gtt.total / PAGE_SIZE);
> +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> +				       dev_priv->gtt.base.start / PAGE_SIZE,
> +				       dev_priv->gtt.base.total / PAGE_SIZE);
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		i915_gem_clflush_object(obj);
> @@ -464,12 +469,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
>   * within the global GTT as well as accessible by the GPU through the GMADR
>   * mapped BAR (dev_priv->mm.gtt->gtt).
>   */
> -static void gen6_ggtt_insert_entries(struct drm_device *dev,
> +static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>  				     struct sg_table *st,
>  				     unsigned int first_entry,
>  				     enum i915_cache_level level)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
>  	gen6_gtt_pte_t __iomem *gtt_entries =
>  		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
>  	int i = 0;
> @@ -478,8 +483,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
>  
>  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
>  		addr = sg_page_iter_dma_address(&sg_iter);
> -		iowrite32(dev_priv->gtt.pte_encode(addr, level),
> -			  &gtt_entries[i]);
> +		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
>  		i++;
>  	}
>  
> @@ -490,8 +494,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
>  	 * hardware should work, we must keep this posting read for paranoia.
>  	 */
>  	if (i != 0)
> -		WARN_ON(readl(&gtt_entries[i-1])
> -			!= dev_priv->gtt.pte_encode(addr, level));
> +		WARN_ON(readl(&gtt_entries[i-1]) !=
> +			vm->pte_encode(addr, level));
>  
>  	/* This next bit makes the above posting read even more important. We
>  	 * want to flush the TLBs only after we're certain all the PTE updates
> @@ -501,11 +505,11 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
>  	POSTING_READ(GFX_FLSH_CNTL_GEN6);
>  }
>  
> -static void gen6_ggtt_clear_range(struct drm_device *dev,
> +static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  				  unsigned int first_entry,
>  				  unsigned int num_entries)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
>  	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
>  		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
>  	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
> @@ -516,15 +520,14 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
>  		 first_entry, num_entries, max_entries))
>  		num_entries = max_entries;
>  
> -	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
> -					       I915_CACHE_LLC);
> +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
>  	for (i = 0; i < num_entries; i++)
>  		iowrite32(scratch_pte, &gtt_base[i]);
>  	readl(gtt_base);
>  }
>  
>  
> -static void i915_ggtt_insert_entries(struct drm_device *dev,
> +static void i915_ggtt_insert_entries(struct i915_address_space *vm,
>  				     struct sg_table *st,
>  				     unsigned int pg_start,
>  				     enum i915_cache_level cache_level)
> @@ -536,7 +539,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
>  
>  }
>  
> -static void i915_ggtt_clear_range(struct drm_device *dev,
> +static void i915_ggtt_clear_range(struct i915_address_space *vm,
>  				  unsigned int first_entry,
>  				  unsigned int num_entries)
>  {
> @@ -549,10 +552,11 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
>  
> -	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
> -					 i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -					 cache_level);
> +	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> +					  entry,
> +					  cache_level);
>  
>  	obj->has_global_gtt_mapping = 1;
>  }
> @@ -561,10 +565,11 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
>  
> -	dev_priv->gtt.gtt_clear_range(obj->base.dev,
> -				      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				      obj->base.size >> PAGE_SHIFT);
> +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> +				       entry,
> +				       obj->base.size >> PAGE_SHIFT);
>  
>  	obj->has_global_gtt_mapping = 0;
>  }
> @@ -641,20 +646,23 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  		obj->has_global_gtt_mapping = 1;
>  	}
>  
> -	dev_priv->gtt.start = start;
> -	dev_priv->gtt.total = end - start;
> +	dev_priv->gtt.base.start = start;
> +	dev_priv->gtt.base.total = end - start;
>  
>  	/* Clear any non-preallocated blocks */
>  	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
>  			     hole_start, hole_end) {
> +		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
>  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
>  			      hole_start, hole_end);
> -		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
> -					      (hole_end-hole_start) / PAGE_SIZE);
> +		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> +					       hole_start / PAGE_SIZE,
> +					       count);
>  	}
>  
>  	/* And finally clear the reserved guard page */
> -	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
> +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> +				       end / PAGE_SIZE - 1, 1);
>  }
>  
>  static bool
> @@ -677,7 +685,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	unsigned long gtt_size, mappable_size;
>  
> -	gtt_size = dev_priv->gtt.total;
> +	gtt_size = dev_priv->gtt.base.total;
>  	mappable_size = dev_priv->gtt.mappable_end;
>  
>  	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
> @@ -722,8 +730,8 @@ static int setup_scratch_page(struct drm_device *dev)
>  #else
>  	dma_addr = page_to_phys(page);
>  #endif
> -	dev_priv->gtt.scratch.page = page;
> -	dev_priv->gtt.scratch.addr = dma_addr;
> +	dev_priv->gtt.base.scratch.page = page;
> +	dev_priv->gtt.base.scratch.addr = dma_addr;
>  
>  	return 0;
>  }
> @@ -731,11 +739,13 @@ static int setup_scratch_page(struct drm_device *dev)
>  static void teardown_scratch_page(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	set_pages_wb(dev_priv->gtt.scratch.page, 1);
> -	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
> +	struct page *page = dev_priv->gtt.base.scratch.page;
> +
> +	set_pages_wb(page, 1);
> +	pci_unmap_page(dev->pdev, dev_priv->gtt.base.scratch.addr,
>  		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> -	put_page(dev_priv->gtt.scratch.page);
> -	__free_page(dev_priv->gtt.scratch.page);
> +	put_page(page);
> +	__free_page(page);
>  }
>  
>  static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
> @@ -798,17 +808,18 @@ static int gen6_gmch_probe(struct drm_device *dev,
>  	if (ret)
>  		DRM_ERROR("Scratch setup failed\n");
>  
> -	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
> -	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
> +	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> +	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
>  
>  	return ret;
>  }
>  
> -static void gen6_gmch_remove(struct drm_device *dev)
> +static void gen6_gmch_remove(struct i915_address_space *vm)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	iounmap(dev_priv->gtt.gsm);
> -	teardown_scratch_page(dev_priv->dev);
> +
> +	struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
> +	iounmap(gtt->gsm);
> +	teardown_scratch_page(vm->dev);
>  }
>  
>  static int i915_gmch_probe(struct drm_device *dev,
> @@ -829,13 +840,13 @@ static int i915_gmch_probe(struct drm_device *dev,
>  	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
>  
>  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> -	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
> -	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
> +	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> +	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
>  
>  	return 0;
>  }
>  
> -static void i915_gmch_remove(struct drm_device *dev)
> +static void i915_gmch_remove(struct i915_address_space *vm)
>  {
>  	intel_gmch_remove();
>  }
> @@ -848,25 +859,28 @@ int i915_gem_gtt_init(struct drm_device *dev)
>  
>  	if (INTEL_INFO(dev)->gen <= 5) {
>  		gtt->gtt_probe = i915_gmch_probe;
> -		gtt->gtt_remove = i915_gmch_remove;
> +		gtt->base.cleanup = i915_gmch_remove;
>  	} else {
>  		gtt->gtt_probe = gen6_gmch_probe;
> -		gtt->gtt_remove = gen6_gmch_remove;
> +		gtt->base.cleanup = gen6_gmch_remove;
>  		if (IS_HASWELL(dev))
> -			gtt->pte_encode = hsw_pte_encode;
> +			gtt->base.pte_encode = hsw_pte_encode;
>  		else if (IS_VALLEYVIEW(dev))
> -			gtt->pte_encode = byt_pte_encode;
> +			gtt->base.pte_encode = byt_pte_encode;
>  		else
> -			gtt->pte_encode = gen6_pte_encode;
> +			gtt->base.pte_encode = gen6_pte_encode;
>  	}
>  
> -	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
> +	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
>  			     &gtt->mappable_base, &gtt->mappable_end);
>  	if (ret)
>  		return ret;
>  
> +	gtt->base.dev = dev;
> +
>  	/* GMADR is the PCI mmio aperture into the global GTT. */
> -	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
> +	DRM_INFO("Memory usable by graphics device = %zdM\n",
> +		 gtt->base.total >> 20);
>  	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
>  	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
>  
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-09  6:08 ` [PATCH 06/11] drm/i915: plumb VM into object operations Ben Widawsky
@ 2013-07-09  7:15   ` Daniel Vetter
  2013-07-10 16:37     ` Ben Widawsky
  2013-07-12  2:23     ` Ben Widawsky
  0 siblings, 2 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  7:15 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
> 
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need, them.
> 
> Some code will still need to be ported over after this.
> 
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
> 
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
> 
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
> 
> v5: Very large rebase
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

This one is a rather large beast. Any chance we could split it into
topics, e.g. convert execbuf code, convert shrinker code? Or does that get
messy, fast?

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  31 ++-
>  drivers/gpu/drm/i915/i915_dma.c            |   4 -
>  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++-----
>  drivers/gpu/drm/i915/i915_gem.c            | 316 +++++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |   6 +-
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
>  drivers/gpu/drm/i915/i915_irq.c            |   6 +-
>  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
>  drivers/gpu/drm/i915/intel_fb.c            |   1 -
>  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
>  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
>  16 files changed, 468 insertions(+), 239 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 16b2aaf..867ed07 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  		seq_printf(m, " (pinned x %d)", obj->pin_count);
>  	if (obj->fence_reg != I915_FENCE_REG_NONE)
>  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> -	if (i915_gem_obj_ggtt_bound(obj))
> -		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> -			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> +	if (i915_gem_obj_bound_any(obj)) {

list_for_each will short-circuit already, so this is redundant.
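
I.e. just (untested):

	struct i915_vma *vma;

	/* vma_list is empty for an unbound object, so the loop is a no-op then */
	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		if (!i915_is_ggtt(vma->vm))
			seq_puts(m, " (pp");
		else
			seq_puts(m, " (g");
		seq_printf(m, " gtt offset: %08lx, size: %08lx)",
			   i915_gem_obj_offset(obj, vma->vm),
			   i915_gem_obj_size(obj, vma->vm));
	}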

> +		struct i915_vma *vma;
> +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +			if (!i915_is_ggtt(vma->vm))
> +				seq_puts(m, " (pp");
> +			else
> +				seq_puts(m, " (g");
> +			seq_printf(m, " gtt offset: %08lx, size: %08lx)",

                                       ^ that space looks superfluous now

> +				   i915_gem_obj_offset(obj, vma->vm),
> +				   i915_gem_obj_size(obj, vma->vm));
> +		}
> +	}
>  	if (obj->stolen)
>  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
>  	if (obj->pin_mappable || obj->fault_mappable) {
> @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +/* FIXME: Support multiple VM? */
>  #define count_objects(list, member) do { \
>  	list_for_each_entry(obj, list, member) { \
>  		size += i915_gem_obj_ggtt_size(obj); \
> @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val)
>  
>  	if (val & DROP_BOUND) {
>  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -					 mm_list)
> -			if (obj->pin_count == 0) {
> -				ret = i915_gem_object_unbind(obj);
> -				if (ret)
> -					goto unlock;
> -			}
> +					 mm_list) {
> +			if (obj->pin_count)
> +				continue;
> +
> +			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> +			if (ret)
> +				goto unlock;
> +		}
>  	}
>  
>  	if (val & DROP_UNBOUND) {
>  		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
>  					 global_list)
>  			if (obj->pages_pin_count == 0) {
> +				/* FIXME: Do this for all vms? */
>  				ret = i915_gem_object_put_pages(obj);
>  				if (ret)
>  					goto unlock;
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index d13e21f..b190439 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  
>  	i915_dump_device_info(dev_priv);
>  
> -	INIT_LIST_HEAD(&dev_priv->vm_list);
> -	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> -	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> -
>  	if (i915_get_bridge_dev(dev)) {
>  		ret = -EIO;
>  		goto free_priv;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 38cccc8..48baccc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object {
>  
>  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
>  
> -/* This is a temporary define to help transition us to real VMAs. If you see
> - * this, you're either reviewing code, or bisecting it. */
> -static inline struct i915_vma *
> -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> -{
> -	if (list_empty(&obj->vma_list))
> -		return NULL;
> -	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> -}
> -
> -/* Whether or not this object is currently mapped by the translation tables */
> -static inline bool
> -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> -{
> -	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> -	if (vma == NULL)
> -		return false;
> -	return drm_mm_node_allocated(&vma->node);
> -}
> -
> -/* Offset of the first PTE pointing to this object */
> -static inline unsigned long
> -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> -{
> -	BUG_ON(list_empty(&o->vma_list));
> -	return __i915_gem_obj_to_vma(o)->node.start;
> -}
> -
> -/* The size used in the translation tables may be larger than the actual size of
> - * the object on GEN2/GEN3 because of the way tiling is handled. See
> - * i915_gem_get_gtt_size() for more details.
> - */
> -static inline unsigned long
> -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> -{
> -	BUG_ON(list_empty(&o->vma_list));
> -	return __i915_gem_obj_to_vma(o)->node.size;
> -}
> -
> -static inline void
> -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> -			    enum i915_cache_level color)
> -{
> -	__i915_gem_obj_to_vma(o)->node.color = color;
> -}
> -
>  /**
>   * Request queue structure.
>   *
> @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  void i915_gem_vma_destroy(struct i915_vma *vma);
>  
>  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> +				     struct i915_address_space *vm,
>  				     uint32_t alignment,
>  				     bool map_and_fenceable,
>  				     bool nonblocking);
>  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> +					struct i915_address_space *vm);
>  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
>  void i915_gem_lastclose(struct drm_device *dev);
> @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  			 struct intel_ring_buffer *to);
>  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm,
>  				    struct intel_ring_buffer *ring);
>  
>  int i915_gem_dumb_create(struct drm_file *file_priv,
> @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
>  			    int tiling_mode, bool fenced);
>  
>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm,
>  				    enum i915_cache_level cache_level);
>  
>  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
>  
>  void i915_gem_restore_fences(struct drm_device *dev);
>  
> +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> +				  struct i915_address_space *vm);
> +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> +			struct i915_address_space *vm);
> +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> +				struct i915_address_space *vm);
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +			    struct i915_address_space *vm,
> +			    enum i915_cache_level color);
> +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> +				     struct i915_address_space *vm);
> +/* Some GGTT VM helpers */
> +#define obj_to_ggtt(obj) \
> +	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> +{
> +	struct i915_address_space *ggtt =
> +		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> +	return vm == ggtt;
> +}
> +
> +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> +{
> +	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline unsigned long
> +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> +{
> +	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline unsigned long
> +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> +{
> +	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline int __must_check
> +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> +		  uint32_t alignment,
> +		  bool map_and_fenceable,
> +		  bool nonblocking)
> +{
> +	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> +				   map_and_fenceable, nonblocking);
> +}
> +#undef obj_to_ggtt
> +
>  /* i915_gem_context.c */
>  void i915_gem_context_init(struct drm_device *dev);
>  void i915_gem_context_fini(struct drm_device *dev);
> @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
>  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> +/* FIXME: this is never okay with full PPGTT */
>  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>  				enum i915_cache_level cache_level);
>  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
>  
>  
>  /* i915_gem_evict.c */
> -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> +int __must_check i915_gem_evict_something(struct drm_device *dev,
> +					  struct i915_address_space *vm,
> +					  int min_size,
>  					  unsigned alignment,
>  					  unsigned cache_level,
>  					  bool mappable,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 058ad44..21015cd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -38,10 +38,12 @@
>  
>  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
>  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -						    unsigned alignment,
> -						    bool map_and_fenceable,
> -						    bool nonblocking);
> +static __must_check int
> +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> +			    struct i915_address_space *vm,
> +			    unsigned alignment,
> +			    bool map_and_fenceable,
> +			    bool nonblocking);
>  static int i915_gem_phys_pwrite(struct drm_device *dev,
>  				struct drm_i915_gem_object *obj,
>  				struct drm_i915_gem_pwrite *args,
> @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  static inline bool
>  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
>  {
> -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> +	return i915_gem_obj_bound_any(obj) && !obj->active;
>  }
>  
>  int
> @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
>  		 * anyway again before the next pread happens. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush = 1;
> -		if (i915_gem_obj_ggtt_bound(obj)) {
> +		if (i915_gem_obj_bound_any(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, false);

This is essentially a very convoluted version of "if there's gpu rendering
outstanding, please wait for it". Maybe we should switch this to

	if (obj->active)
		wait_rendering(obj, true);

Same for the shmem_pwrite case below. Would be a separate patch to prep
things though. Can I volunteer you for that? The ugly part is to review
whether any of the lru list updating that set_domain does in addition to
wait_rendering is required, but on a quick read that's not the case.
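
Roughly, for the pread side (untested, assuming the existing
i915_gem_object_wait_rendering() helper fits as-is):

	if (obj->active) {
		/* Only wait for outstanding GPU writes, reads can keep going. */
		ret = i915_gem_object_wait_rendering(obj, true);
		if (ret)
			return ret;
	}

and the same with readonly=false for the pwrite path.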

>  			if (ret)
>  				return ret;
> @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
>  	char __user *user_data;
>  	int page_offset, page_length, ret;
>  
> -	ret = i915_gem_object_pin(obj, 0, true, true);
> +	ret = i915_gem_ggtt_pin(obj, 0, true, true);
>  	if (ret)
>  		goto out;
>  
> @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  		 * right away and we therefore have to clflush anyway. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush_after = 1;
> -		if (i915_gem_obj_ggtt_bound(obj)) {
> +		if (i915_gem_obj_bound_any(obj)) {

... see above.
>  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
>  			if (ret)
>  				return ret;
> @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>  	}
>  
>  	/* Now bind it into the GTT if needed */
> -	ret = i915_gem_object_pin(obj, 0, true, false);
> +	ret = i915_gem_ggtt_pin(obj,  0, true, false);
>  	if (ret)
>  		goto unlock;
>  
> @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>  	if (obj->pages == NULL)
>  		return 0;
>  
> -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> -
>  	if (obj->pages_pin_count)
>  		return -EBUSY;
>  
> +	BUG_ON(i915_gem_obj_bound_any(obj));
> +
>  	/* ->put_pages might need to allocate memory for the bit17 swizzle
>  	 * array, hence protect them from being reaped by removing them from gtt
>  	 * lists early. */
> @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		  bool purgeable_only)
>  {
>  	struct drm_i915_gem_object *obj, *next;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	long count = 0;
>  
>  	list_for_each_entry_safe(obj, next,
> @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		}
>  	}
>  
> -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> -		    i915_gem_object_unbind(obj) == 0 &&
> -		    i915_gem_object_put_pages(obj) == 0) {
> +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> +				 global_list) {
> +		struct i915_vma *vma, *v;
> +
> +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> +			continue;
> +
> +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> +			if (i915_gem_object_unbind(obj, vma->vm))
> +				break;
> +
> +		if (!i915_gem_object_put_pages(obj))
>  			count += obj->base.size >> PAGE_SHIFT;
> -			if (count >= target)
> -				return count;
> -		}
> +
> +		if (count >= target)
> +			return count;
>  	}
>  
>  	return count;
> @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  
>  void
>  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> +			       struct i915_address_space *vm,
>  			       struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 seqno = intel_ring_get_seqno(ring);
>  
>  	BUG_ON(ring == NULL);
> @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  }
>  
>  static void
> -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> +				 struct i915_address_space *vm)
>  {
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> -
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
>  	spin_unlock(&file_priv->mm.lock);
>  }
>  
> -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm)
>  {
> -	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> -	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> +	if (acthd >= i915_gem_obj_offset(obj, vm) &&
> +	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
>  		return true;
>  
>  	return false;
> @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
>  	return false;
>  }
>  
> +static struct i915_address_space *
> +request_to_vm(struct drm_i915_gem_request *request)
> +{
> +	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> +	struct i915_address_space *vm;
> +
> +	vm = &dev_priv->gtt.base;
> +
> +	return vm;
> +}
> +
>  static bool i915_request_guilty(struct drm_i915_gem_request *request,
>  				const u32 acthd, bool *inside)
>  {
> @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
>  	 * pointing inside the ring, matches the batch_obj address range.
>  	 * However this is extremely unlikely.
>  	 */
> -
>  	if (request->batch_obj) {
> -		if (i915_head_inside_object(acthd, request->batch_obj)) {
> +		if (i915_head_inside_object(acthd, request->batch_obj,
> +					    request_to_vm(request))) {
>  			*inside = true;
>  			return true;
>  		}
> @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
>  {
>  	struct i915_ctx_hang_stats *hs = NULL;
>  	bool inside, guilty;
> +	unsigned long offset = 0;
>  
>  	/* Innocent until proven guilty */
>  	guilty = false;
>  
> +	if (request->batch_obj)
> +		offset = i915_gem_obj_offset(request->batch_obj,
> +					     request_to_vm(request));
> +
>  	if (ring->hangcheck.action != wait &&
>  	    i915_request_guilty(request, acthd, &inside)) {
>  		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
>  			  ring->name,
>  			  inside ? "inside" : "flushing",
> -			  request->batch_obj ?
> -			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> +			  offset,
>  			  request->ctx ? request->ctx->id : 0,
>  			  acthd);
>  
> @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
>  	}
>  
>  	while (!list_empty(&ring->active_list)) {
> +		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
>  				       struct drm_i915_gem_object,
>  				       ring_list);
>  
> -		i915_gem_object_move_to_inactive(obj);
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +			i915_gem_object_move_to_inactive(obj, vm);
>  	}
>  }
>  
> @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
>  	int i;
> @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev)
>  	/* Move everything out of the GPU domains to ensure we do any
>  	 * necessary invalidation upon reuse.
>  	 */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
>  
>  	i915_gem_restore_fences(dev);
>  }
> @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
>  	 */
>  	while (!list_empty(&ring->active_list)) {
> +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
> @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>  			break;
>  
> -		i915_gem_object_move_to_inactive(obj);
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +			i915_gem_object_move_to_inactive(obj, vm);
>  	}
>  
>  	if (unlikely(ring->trace_irq_seqno &&
> @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
>   * Unbinds an object from the GTT aperture.
>   */
>  int
> -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> +		       struct i915_address_space *vm)
>  {
>  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
>  	struct i915_vma *vma;
>  	int ret;
>  
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound(obj, vm))
>  		return 0;
>  
>  	if (obj->pin_count)
> @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	if (ret)
>  		return ret;
>  
> -	trace_i915_gem_object_unbind(obj);
> +	trace_i915_gem_object_unbind(obj, vm);
>  
>  	if (obj->has_global_gtt_mapping)
>  		i915_gem_gtt_unbind_object(obj);
> @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	obj->map_and_fenceable = true;
>  
> -	vma = __i915_gem_obj_to_vma(obj);
> +	vma = i915_gem_obj_to_vma(obj, vm);
>  	list_del(&vma->vma_link);
>  	drm_mm_remove_node(&vma->node);
>  	i915_gem_vma_destroy(vma);
> @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
>  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
>  		     i915_gem_obj_ggtt_offset(obj), size);
>  
> +
>  		pitch_val = obj->stride / 128;
>  		pitch_val = ffs(pitch_val) - 1;
>  
> @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>   */
>  static int
>  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> +			    struct i915_address_space *vm,
>  			    unsigned alignment,
>  			    bool map_and_fenceable,
>  			    bool nonblocking)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 size, fence_size, fence_alignment, unfenced_alignment;
>  	bool mappable, fenceable;
> -	size_t gtt_max = map_and_fenceable ?
> -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> +	size_t gtt_max =
> +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
>  	struct i915_vma *vma;
>  	int ret;
>  
>  	if (WARN_ON(!list_empty(&obj->vma_list)))
>  		return -EBUSY;
>  
> +	BUG_ON(!i915_is_ggtt(vm));
> +
>  	fence_size = i915_gem_get_gtt_size(dev,
>  					   obj->base.size,
>  					   obj->tiling_mode);
> @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  	i915_gem_object_pin_pages(obj);
>  
>  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +	/* For now we only ever use 1 vma per object */
> +	WARN_ON(!list_empty(&obj->vma_list));
> +
> +	vma = i915_gem_vma_create(obj, vm);
>  	if (vma == NULL) {
>  		i915_gem_object_unpin_pages(obj);
>  		return -ENOMEM;
>  	}
>  
>  search_free:
> -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> -						  &vma->node,
> +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
>  						  size, alignment,
>  						  obj->cache_level, 0, gtt_max);
>  	if (ret) {
> -		ret = i915_gem_evict_something(dev, size, alignment,
> +		ret = i915_gem_evict_something(dev, vm, size, alignment,
>  					       obj->cache_level,
>  					       map_and_fenceable,
>  					       nonblocking);
> @@ -3162,18 +3197,25 @@ search_free:
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> -	list_add(&vma->vma_link, &obj->vma_list);
> +
> +	/* Keep GGTT vmas first to make debug easier */
> +	if (i915_is_ggtt(vm))
> +		list_add(&vma->vma_link, &obj->vma_list);
> +	else
> +		list_add_tail(&vma->vma_link, &obj->vma_list);
>  
>  	fenceable =
> +		i915_is_ggtt(vm) &&
>  		i915_gem_obj_ggtt_size(obj) == fence_size &&
>  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
>  
> -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> -		dev_priv->gtt.mappable_end;
> +	mappable =
> +		i915_is_ggtt(vm) &&
> +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>  
>  	obj->map_and_fenceable = mappable && fenceable;
>  
> -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> +	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
>  	i915_gem_verify_gtt(dev);
>  	return 0;
>  }
> @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  	int ret;
>  
>  	/* Not valid to be called on unbound objects. */
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound_any(obj))
>  		return -EINVAL;

If we're converting the shmem paths over to wait_rendering then there's
only the fault handler and the set_domain ioctl left. For the later it
would make sense to clflush even when an object is on the unbound list, to
allow userspace to optimize when the clflushing happens. But that would
only make sense in conjunction with Chris' create2 ioctl and a flag to
preallocate the storage (and so putting the object onto the unbound list).
So nothing to do here.

>  
>  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  }
>  
>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm,
>  				    enum i915_cache_level cache_level)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
>  	int ret;
>  
>  	if (obj->cache_level == cache_level)
> @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  	}
>  
>  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> -		ret = i915_gem_object_unbind(obj);
> +		ret = i915_gem_object_unbind(obj, vm);
>  		if (ret)
>  			return ret;
>  	}
>  
> -	if (i915_gem_obj_ggtt_bound(obj)) {
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		if (!i915_gem_obj_bound(obj, vm))
> +			continue;

Hm, shouldn't we have a per-object list of vmas? Or will that follow later
on?

Self-correction: It exists already ... why can't we use this here?
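
I.e. a rough (untested) sketch:

	struct i915_vma *vma;

	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		/* Every vma on this list is bound in vma->vm already. */
		ret = i915_gem_object_finish_gpu(obj);
		if (ret)
			return ret;
		...
	}

which would also drop the i915_gem_obj_bound() lookup on every iteration.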

> +
>  		ret = i915_gem_object_finish_gpu(obj);
>  		if (ret)
>  			return ret;
> @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
>  					       obj, cache_level);
>  
> -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> +		i915_gem_obj_set_color(obj, vm, cache_level);
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file)
>  {
>  	struct drm_i915_gem_caching *args = data;
> +	struct drm_i915_private *dev_priv;
>  	struct drm_i915_gem_object *obj;
>  	enum i915_cache_level level;
>  	int ret;
> @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  		ret = -ENOENT;
>  		goto unlock;
>  	}
> +	dev_priv = obj->base.dev->dev_private;
>  
> -	ret = i915_gem_object_set_cache_level(obj, level);
> +	/* FIXME: Add interface for specific VM? */
> +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
>  
>  	drm_gem_object_unreference(&obj->base);
>  unlock:
> @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  				     u32 alignment,
>  				     struct intel_ring_buffer *pipelined)
>  {
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  	u32 old_read_domains, old_write_domain;
>  	int ret;
>  
> @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  	 * of uncaching, which would allow us to flush all the LLC-cached data
>  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
>  	 */
> -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +					      I915_CACHE_NONE);
>  	if (ret)
>  		return ret;
>  
> @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  	 * (e.g. libkms for the bootup splash), we have to ensure that we
>  	 * always use map_and_fenceable for all scanout buffers.
>  	 */
> -	ret = i915_gem_object_pin(obj, alignment, true, false);
> +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
>  	if (ret)
>  		return ret;
>  
> @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  
>  int
>  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> +		    struct i915_address_space *vm,
>  		    uint32_t alignment,
>  		    bool map_and_fenceable,
>  		    bool nonblocking)
> @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>  		return -EBUSY;
>  
> -	if (i915_gem_obj_ggtt_bound(obj)) {
> -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));

WARN_ON, since presumably we can keep on going if we get this wrong
(albeit with slightly corrupted state, so render corruptions might
follow).
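
I.e. just (untested):

	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));

or, if we'd rather bail out than limp along with corrupted state:

	if (WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)))
		return -EINVAL;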

> +
> +	if (i915_gem_obj_bound(obj, vm)) {
> +		if ((alignment &&
> +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
>  		    (map_and_fenceable && !obj->map_and_fenceable)) {
>  			WARN(obj->pin_count,
>  			     "bo is already pinned with incorrect alignment:"
>  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>  			     " obj->map_and_fenceable=%d\n",
> -			     i915_gem_obj_ggtt_offset(obj), alignment,
> +			     i915_gem_obj_offset(obj, vm), alignment,
>  			     map_and_fenceable,
>  			     obj->map_and_fenceable);
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_gem_object_unbind(obj, vm);
>  			if (ret)
>  				return ret;
>  		}
>  	}
>  
> -	if (!i915_gem_obj_ggtt_bound(obj)) {
> +	if (!i915_gem_obj_bound(obj, vm)) {
>  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  
> -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
>  						  map_and_fenceable,
>  						  nonblocking);
>  		if (ret)
> @@ -3684,7 +3739,7 @@ void
>  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
>  {
>  	BUG_ON(obj->pin_count == 0);
> -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> +	BUG_ON(!i915_gem_obj_bound_any(obj));
>  
>  	if (--obj->pin_count == 0)
>  		obj->pin_mappable = false;
> @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
>  	}
>  
>  	if (obj->user_pin_count == 0) {
> -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
>  		if (ret)
>  			goto out;
>  	}
> @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> +	struct i915_vma *vma, *next;
>  
>  	trace_i915_gem_object_destroy(obj);
>  
> @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  		i915_gem_detach_phys_object(dev, obj);
>  
>  	obj->pin_count = 0;
> -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> -		bool was_interruptible;
> +	/* NB: 0 or 1 elements */
> +	WARN_ON(!list_empty(&obj->vma_list) &&
> +		!list_is_singular(&obj->vma_list));
> +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> +		int ret = i915_gem_object_unbind(obj, vma->vm);
> +		if (WARN_ON(ret == -ERESTARTSYS)) {
> +			bool was_interruptible;
>  
> -		was_interruptible = dev_priv->mm.interruptible;
> -		dev_priv->mm.interruptible = false;
> +			was_interruptible = dev_priv->mm.interruptible;
> +			dev_priv->mm.interruptible = false;
>  
> -		WARN_ON(i915_gem_object_unbind(obj));
> +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
>  
> -		dev_priv->mm.interruptible = was_interruptible;
> +			dev_priv->mm.interruptible = was_interruptible;
> +		}
>  	}
>  
>  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
>  	INIT_LIST_HEAD(&ring->request_list);
>  }
>  
> +static void i915_init_vm(struct drm_i915_private *dev_priv,
> +			 struct i915_address_space *vm)
> +{
> +	vm->dev = dev_priv->dev;
> +	INIT_LIST_HEAD(&vm->active_list);
> +	INIT_LIST_HEAD(&vm->inactive_list);
> +	INIT_LIST_HEAD(&vm->global_link);
> +	list_add(&vm->global_link, &dev_priv->vm_list);
> +}
> +
>  void
>  i915_gem_load(struct drm_device *dev)
>  {
> @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
>  				  SLAB_HWCACHE_ALIGN,
>  				  NULL);
>  
> -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> +	INIT_LIST_HEAD(&dev_priv->vm_list);
> +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> +
>  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  			     struct drm_i915_private,
>  			     mm.inactive_shrinker);
>  	struct drm_device *dev = dev_priv->dev;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj;
> -	int nr_to_scan = sc->nr_to_scan;
> +	int nr_to_scan;
>  	bool unlock = true;
>  	int cnt;
>  
> @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  		unlock = false;
>  	}
>  
> +	nr_to_scan = sc->nr_to_scan;
>  	if (nr_to_scan) {
>  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
>  		if (nr_to_scan > 0)
> @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
>  		if (obj->pages_pin_count == 0)
>  			cnt += obj->base.size >> PAGE_SHIFT;
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> -			cnt += obj->base.size >> PAGE_SHIFT;
> +
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> +				cnt += obj->base.size >> PAGE_SHIFT;

Isn't this now double-counting objects? In the shrinker we only care about
how much physical RAM an object occupies, not how much virtual space it
occupies. So just walking the bound list of objects here should be good
enough ...
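
I.e. something like this (sketch only, the pin_count check is my guess for
"not evictable right now"):

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;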

>  
>  	if (unlock)
>  		mutex_unlock(&dev->struct_mutex);
>  	return cnt;
>  }
> +
> +/* All the new VM stuff */
> +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> +				  struct i915_address_space *vm)
> +{
> +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +	struct i915_vma *vma;
> +
> +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> +		vm = &dev_priv->gtt.base;
> +
> +	BUG_ON(list_empty(&o->vma_list));
> +	list_for_each_entry(vma, &o->vma_list, vma_link) {

Imo the vma list walking here and in the other helpers below indicates
that we should deal more often in vmas instead of (object, vm) pairs. Or
is this again something that'll get fixed later on?

I just want to avoid diff churn, and it also makes reviewing easier if the
foreshadowing is correct ;-) So generally I'd vote for more liberal
sprinkling of obj_to_vma in callers.
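
E.g. with the i915_gem_obj_to_vma helper from below, the offset lookup
could shrink to roughly (sketch, aliasing ppgtt redirect left to callers):

unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
				  struct i915_address_space *vm)
{
	struct i915_vma *vma = i915_gem_obj_to_vma(o, vm);

	if (WARN_ON(!vma))
		return -1;

	return vma->node.start;
}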

> +		if (vma->vm == vm)
> +			return vma->node.start;
> +
> +	}
> +	return -1;
> +}
> +
> +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> +{
> +	return !list_empty(&o->vma_list);
> +}
> +
> +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> +			struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma;
> +
> +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> +		if (vma->vm == vm)
> +			return true;
> +	}
> +	return false;
> +}
> +
> +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> +				struct i915_address_space *vm)
> +{
> +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +	struct i915_vma *vma;
> +
> +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> +		vm = &dev_priv->gtt.base;
> +	BUG_ON(list_empty(&o->vma_list));
> +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> +		if (vma->vm == vm)
> +			return vma->node.size;
> +	}
> +
> +	return 0;
> +}
> +
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +			    struct i915_address_space *vm,
> +			    enum i915_cache_level color)
> +{
> +	struct i915_vma *vma;
> +	BUG_ON(list_empty(&o->vma_list));
> +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> +		if (vma->vm == vm) {
> +			vma->node.color = color;
> +			return;
> +		}
> +	}
> +
> +	WARN(1, "Couldn't set color for VM %p\n", vm);
> +}
> +
> +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> +				     struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma;
> +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> +		if (vma->vm == vm)
> +			return vma;
> +
> +	return NULL;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2074544..c92fd81 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
>  
>  	if (INTEL_INFO(dev)->gen >= 7) {
>  		ret = i915_gem_object_set_cache_level(ctx->obj,
> +						      &dev_priv->gtt.base,
>  						      I915_CACHE_LLC_MLC);
>  		/* Failure shouldn't ever happen this early */
>  		if (WARN_ON(ret))
> @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
>  	 * default context.
>  	 */
>  	dev_priv->ring[RCS].default_context = ctx;
> -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
>  	if (ret) {
>  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
>  		goto err_destroy;
> @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
>  static int do_switch(struct i915_hw_context *to)
>  {
>  	struct intel_ring_buffer *ring = to->ring;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct i915_hw_context *from = ring->last_context;
>  	u32 hw_flags = 0;
>  	int ret;
> @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
>  	if (from == to)
>  		return 0;
>  
> -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
>  	if (ret)
>  		return ret;
>  
> @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
>  	 */
>  	if (from != NULL) {
>  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> -		i915_gem_object_move_to_active(from->obj, ring);
> +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> +					       ring);
>  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
>  		 * whole damn pipeline, we don't need to explicitly mark the
>  		 * object dirty. The only exception is that the context must be
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index df61f33..32efdc0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -32,24 +32,21 @@
>  #include "i915_trace.h"
>  
>  static bool
> -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> +mark_free(struct i915_vma *vma, struct list_head *unwind)
>  {
> -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> -
> -	if (obj->pin_count)
> +	if (vma->obj->pin_count)
>  		return false;
>  
> -	list_add(&obj->exec_list, unwind);
> +	list_add(&vma->obj->exec_list, unwind);
>  	return drm_mm_scan_add_block(&vma->node);
>  }
>  
>  int
> -i915_gem_evict_something(struct drm_device *dev, int min_size,
> -			 unsigned alignment, unsigned cache_level,
> +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> +			 int min_size, unsigned alignment, unsigned cache_level,
>  			 bool mappable, bool nonblocking)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct list_head eviction_list, unwind_list;
>  	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
> @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  	 */
>  
>  	INIT_LIST_HEAD(&unwind_list);
> -	if (mappable)
> +	if (mappable) {
> +		BUG_ON(!i915_is_ggtt(vm));
>  		drm_mm_init_scan_with_range(&vm->mm, min_size,
>  					    alignment, cache_level, 0,
>  					    dev_priv->gtt.mappable_end);
> -	else
> +	} else
>  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
>  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -		if (mark_free(obj, &unwind_list))
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
>  
> @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  
>  	/* Now merge in the soon-to-be-expired objects... */
>  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		if (mark_free(obj, &unwind_list))
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
>  
> @@ -109,7 +109,7 @@ none:
>  		obj = list_first_entry(&unwind_list,
>  				       struct drm_i915_gem_object,
>  				       exec_list);
> -		vma = __i915_gem_obj_to_vma(obj);
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  		ret = drm_mm_scan_remove_block(&vma->node);
>  		BUG_ON(ret);
>  
> @@ -130,7 +130,7 @@ found:
>  		obj = list_first_entry(&unwind_list,
>  				       struct drm_i915_gem_object,
>  				       exec_list);
> -		vma = __i915_gem_obj_to_vma(obj);
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  		if (drm_mm_scan_remove_block(&vma->node)) {
>  			list_move(&obj->exec_list, &eviction_list);
>  			drm_gem_object_reference(&obj->base);
> @@ -145,7 +145,7 @@ found:
>  				       struct drm_i915_gem_object,
>  				       exec_list);
>  		if (ret == 0)
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_gem_object_unbind(obj, vm);
>  
>  		list_del_init(&obj->exec_list);
>  		drm_gem_object_unreference(&obj->base);
> @@ -158,13 +158,18 @@ int
>  i915_gem_evict_everything(struct drm_device *dev)

I suspect evict_everything eventually wants an address_space *vm argument
for those cases where we only want to evict everything in a given vm. Atm
we have two use-cases of this:
- Called from the shrinker as a last-ditch effort. For that it should move
  _every_ object onto the unbound list.
- Called from execbuf for badly-fragmented address spaces to clean up the
  mess. For that case we only care about one address space.
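
Rough sketch of the second case (the name i915_gem_evict_vm is just made up
here, and this skips the flush/retire dance evict_everything does first):

	int i915_gem_evict_vm(struct i915_address_space *vm)
	{
		struct drm_i915_gem_object *obj, *next;

		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
			if (obj->pin_count == 0)
				WARN_ON(i915_gem_object_unbind(obj, vm));

		return 0;
	}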

>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj, *next;
> -	bool lists_empty;
> +	bool lists_empty = true;
>  	int ret;
>  
> -	lists_empty = (list_empty(&vm->inactive_list) &&
> -		       list_empty(&vm->active_list));
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		lists_empty = (list_empty(&vm->inactive_list) &&
> +			       list_empty(&vm->active_list));
> +		if (!lists_empty)
> +			lists_empty = false;
> +	}
> +
>  	if (lists_empty)
>  		return -ENOSPC;
>  
> @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
>  	i915_gem_retire_requests(dev);
>  
>  	/* Having flushed everything, unbind() should never raise an error */
> -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -		if (obj->pin_count == 0)
> -			WARN_ON(i915_gem_object_unbind(obj));
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> +			if (obj->pin_count == 0)
> +				WARN_ON(i915_gem_object_unbind(obj, vm));
> +	}
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 5aeb447..e90182d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
>  }
>  
>  static void
> -eb_destroy(struct eb_objects *eb)
> +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
>  {
>  	while (!list_empty(&eb->objects)) {
>  		struct drm_i915_gem_object *obj;
> @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  				   struct eb_objects *eb,
> -				   struct drm_i915_gem_relocation_entry *reloc)
> +				   struct drm_i915_gem_relocation_entry *reloc,
> +				   struct i915_address_space *vm)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_gem_object *target_obj;
> @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  
>  static int
>  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> -				    struct eb_objects *eb)
> +				    struct eb_objects *eb,
> +				    struct i915_address_space *vm)
>  {
>  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
>  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  		do {
>  			u64 offset = r->presumed_offset;
>  
> -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> +								 vm);
>  			if (ret)
>  				return ret;
>  
> @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  static int
>  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>  					 struct eb_objects *eb,
> -					 struct drm_i915_gem_relocation_entry *relocs)
> +					 struct drm_i915_gem_relocation_entry *relocs,
> +					 struct i915_address_space *vm)
>  {
>  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>  	int i, ret;
>  
>  	for (i = 0; i < entry->relocation_count; i++) {
> -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> +							 vm);
>  		if (ret)
>  			return ret;
>  	}
> @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>  }
>  
>  static int
> -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> +			     struct i915_address_space *vm)
>  {
>  	struct drm_i915_gem_object *obj;
>  	int ret = 0;
> @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
>  	 */
>  	pagefault_disable();
>  	list_for_each_entry(obj, &eb->objects, exec_list) {
> -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
>  		if (ret)
>  			break;
>  	}
> @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  				   struct intel_ring_buffer *ring,
> +				   struct i915_address_space *vm,
>  				   bool *need_reloc)
>  {
>  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->tiling_mode != I915_TILING_NONE;
>  	need_mappable = need_fence || need_reloc_mappable(obj);
>  
> -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> +				  false);
>  	if (ret)
>  		return ret;
>  
> @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->has_aliasing_ppgtt_mapping = 1;
>  	}
>  
> -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> +		entry->offset = i915_gem_obj_offset(obj, vm);
>  		*need_reloc = true;
>  	}
>  
> @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_gem_exec_object2 *entry;
>  
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound_any(obj))
>  		return;
>  
>  	entry = obj->exec_entry;
> @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  			    struct list_head *objects,
> +			    struct i915_address_space *vm,
>  			    bool *need_relocs)
>  {
>  	struct drm_i915_gem_object *obj;
> @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  		list_for_each_entry(obj, objects, exec_list) {
>  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>  			bool need_fence, need_mappable;
> +			u32 obj_offset;
>  
> -			if (!i915_gem_obj_ggtt_bound(obj))
> +			if (!i915_gem_obj_bound(obj, vm))
>  				continue;

I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
here ... Maybe we should cache them in some pointer somewhere (either in
the eb object or by adding a new pointer to the object struct, e.g.
obj->eb_vma, similar to obj->eb_list).
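
Sketch of what I have in mind (the obj->eb_vma field is made up, analogous
to obj->exec_entry):

	/* once, while building up eb->objects: */
	obj->eb_vma = i915_gem_obj_to_vma(obj, vm);

	/* then in the reserve/relocate loops: */
	if (!obj->eb_vma)
		continue;
	obj_offset = obj->eb_vma->node.start;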

>  
> +			obj_offset = i915_gem_obj_offset(obj, vm);
>  			need_fence =
>  				has_fenced_gpu_access &&
>  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>  				obj->tiling_mode != I915_TILING_NONE;
>  			need_mappable = need_fence || need_reloc_mappable(obj);
>  
> +			BUG_ON((need_mappable || need_fence) &&
> +			       !i915_is_ggtt(vm));
> +
>  			if ((entry->alignment &&
> -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> +			     obj_offset & (entry->alignment - 1)) ||
>  			    (need_mappable && !obj->map_and_fenceable))
> -				ret = i915_gem_object_unbind(obj);
> +				ret = i915_gem_object_unbind(obj, vm);
>  			else
> -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>  			if (ret)
>  				goto err;
>  		}
>  
>  		/* Bind fresh objects */
>  		list_for_each_entry(obj, objects, exec_list) {
> -			if (i915_gem_obj_ggtt_bound(obj))
> +			if (i915_gem_obj_bound(obj, vm))
>  				continue;
>  
> -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>  			if (ret)
>  				goto err;
>  		}
> @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  				  struct drm_file *file,
>  				  struct intel_ring_buffer *ring,
>  				  struct eb_objects *eb,
> -				  struct drm_i915_gem_exec_object2 *exec)
> +				  struct drm_i915_gem_exec_object2 *exec,
> +				  struct i915_address_space *vm)
>  {
>  	struct drm_i915_gem_relocation_entry *reloc;
>  	struct drm_i915_gem_object *obj;
> @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  		goto err;
>  
>  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
>  	if (ret)
>  		goto err;
>  
>  	list_for_each_entry(obj, &eb->objects, exec_list) {
>  		int offset = obj->exec_entry - exec;
>  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> -							       reloc + reloc_offset[offset]);
> +							       reloc + reloc_offset[offset],
> +							       vm);
>  		if (ret)
>  			goto err;
>  	}
> @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
>  
>  static void
>  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> +				   struct i915_address_space *vm,
>  				   struct intel_ring_buffer *ring)
>  {
>  	struct drm_i915_gem_object *obj;
> @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
>  		obj->base.read_domains = obj->base.pending_read_domains;
>  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
>  
> -		i915_gem_object_move_to_active(obj, ring);
> +		i915_gem_object_move_to_active(obj, vm, ring);
>  		if (obj->base.write_domain) {
>  			obj->dirty = 1;
>  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> @@ -836,7 +853,8 @@ static int
>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		       struct drm_file *file,
>  		       struct drm_i915_gem_execbuffer2 *args,
> -		       struct drm_i915_gem_exec_object2 *exec)
> +		       struct drm_i915_gem_exec_object2 *exec,
> +		       struct i915_address_space *vm)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct eb_objects *eb;
> @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	/* Move the objects en-masse into the GTT, evicting if necessary. */
>  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
>  	if (ret)
>  		goto err;
>  
>  	/* The objects are in their final locations, apply the relocations. */
>  	if (need_relocs)
> -		ret = i915_gem_execbuffer_relocate(eb);
> +		ret = i915_gem_execbuffer_relocate(eb, vm);
>  	if (ret) {
>  		if (ret == -EFAULT) {
>  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> -								eb, exec);
> +								eb, exec, vm);
>  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>  		}
>  		if (ret)
> @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  			goto err;
>  	}
>  
> -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> +		args->batch_start_offset;
>  	exec_len = args->batch_len;
>  	if (cliprects) {
>  		for (i = 0; i < args->num_cliprects; i++) {
> @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
>  
> -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
>  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
>  
>  err:
> -	eb_destroy(eb);
> +	eb_destroy(eb, vm);
>  
>  	mutex_unlock(&dev->struct_mutex);
>  
> @@ -1105,6 +1124,7 @@ int
>  i915_gem_execbuffer(struct drm_device *dev, void *data,
>  		    struct drm_file *file)
>  {
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_execbuffer *args = data;
>  	struct drm_i915_gem_execbuffer2 exec2;
>  	struct drm_i915_gem_exec_object *exec_list = NULL;
> @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
>  	exec2.flags = I915_EXEC_RENDER;
>  	i915_execbuffer2_set_context_id(exec2, 0);
>  
> -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> +				     &dev_priv->gtt.base);
>  	if (!ret) {
>  		/* Copy the new buffer offsets back to the user's exec list. */
>  		for (i = 0; i < args->buffer_count; i++)
> @@ -1186,6 +1207,7 @@ int
>  i915_gem_execbuffer2(struct drm_device *dev, void *data,
>  		     struct drm_file *file)
>  {
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_execbuffer2 *args = data;
>  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
>  	int ret;
> @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
>  		return -EFAULT;
>  	}
>  
> -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> +				     &dev_priv->gtt.base);
>  	if (!ret) {
>  		/* Copy the new buffer offsets back to the user's exec list. */
>  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 298fc42..70ce2f6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
>  			    ppgtt->base.total);
>  	}
>  
> +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> +
>  	return ret;
>  }
>  
> @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  			    struct drm_i915_gem_object *obj,
>  			    enum i915_cache_level cache_level)
>  {
> -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				   cache_level);
> +	struct i915_address_space *vm = &ppgtt->base;
> +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> +
> +	vm->insert_entries(vm, obj->pages,
> +			   obj_offset >> PAGE_SHIFT,
> +			   cache_level);
>  }
>  
>  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  			      struct drm_i915_gem_object *obj)
>  {
> -	ppgtt->base.clear_range(&ppgtt->base,
> -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				obj->base.size >> PAGE_SHIFT);
> +	struct i915_address_space *vm = &ppgtt->base;
> +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> +
> +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> +			obj->base.size >> PAGE_SHIFT);
>  }
>  
>  extern int intel_iommu_gfx_mapped;
> @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  				       dev_priv->gtt.base.start / PAGE_SIZE,
>  				       dev_priv->gtt.base.total / PAGE_SIZE);
>  
> +	if (dev_priv->mm.aliasing_ppgtt)
> +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> +
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		i915_gem_clflush_object(obj);
>  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	 * aperture.  One page should be enough to keep any prefetching inside
>  	 * of the aperture.
>  	 */
> -	drm_i915_private_t *dev_priv = dev->dev_private;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
>  	struct drm_mm_node *entry;
>  	struct drm_i915_gem_object *obj;
>  	unsigned long hole_start, hole_end;
> @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	BUG_ON(mappable_end > end);
>  
>  	/* Subtract the guard page ... */
> -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
>  	if (!HAS_LLC(dev))
>  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
>  
>  	/* Mark any preallocated objects as occupied */
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
>  		int ret;
>  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
>  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
>  
>  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
>  		if (ret)
>  			DRM_DEBUG_KMS("Reservation failed\n");
>  		obj->has_global_gtt_mapping = 1;
> @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	dev_priv->gtt.base.total = end - start;
>  
>  	/* Clear any non-preallocated blocks */
> -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> -			     hole_start, hole_end) {
> +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
>  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
>  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
>  			      hole_start, hole_end);
> -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -					       hole_start / PAGE_SIZE,
> -					       count);
> +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
>  	}
>  
>  	/* And finally clear the reserved guard page */
> -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -				       end / PAGE_SIZE - 1, 1);
> +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
>  }
>  
>  static bool
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 245eb1d..bfe61fa 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	if (gtt_offset == I915_GTT_OFFSET_NONE)
>  		return obj;
>  
> -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +	vma = i915_gem_vma_create(obj, vm);
>  	if (!vma) {
>  		drm_gem_object_unreference(&obj->base);
>  		return NULL;
> @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	 */
>  	vma->node.start = gtt_offset;
>  	vma->node.size = size;
> -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> +	if (drm_mm_initialized(&vm->mm)) {
> +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);

These two hunks here for stolen look fishy - we only ever use the stolen
preallocated stuff for objects with mappings in the global gtt. So keeping
that explicit is imo the better approach. And tbh I'm confused where the
local variable vm is from ...
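
I.e. I'd expect the ggtt to stay explicit here, roughly:

	struct i915_address_space *ggtt = &dev_priv->gtt.base;

	vma = i915_gem_vma_create(obj, ggtt);
	/* ... */
	if (drm_mm_initialized(&ggtt->mm)) {
		ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);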

>  		if (ret) {
>  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
>  			goto unref_out;
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 92a8d27..808ca2a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  
>  		obj->map_and_fenceable =
>  			!i915_gem_obj_ggtt_bound(obj) ||
> -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> +			(i915_gem_obj_ggtt_offset(obj) +
> +			 obj->base.size <= dev_priv->gtt.mappable_end &&
>  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
>  
>  		/* Rebind if we need a change of alignment */
>  		if (!obj->map_and_fenceable) {
> -			u32 unfenced_alignment =
> +			struct i915_address_space *ggtt = &dev_priv->gtt.base;
> +			u32 unfenced_align =
>  				i915_gem_get_gtt_alignment(dev, obj->base.size,
>  							    args->tiling_mode,
>  							    false);
> -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> -				ret = i915_gem_object_unbind(obj);
> +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> +				ret = i915_gem_object_unbind(obj, ggtt);
>  		}
>  
>  		if (ret == 0) {
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 79fbb17..28fa0ff 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
>  		u32 acthd = I915_READ(ACTHD);
>  
> +		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
> +			return NULL;
> +
>  		if (WARN_ON(ring->id != RCS))
>  			return NULL;
>  
> @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
>  		return;
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> -		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
> +		if ((error->ccid & PAGE_MASK) ==
> +		    i915_gem_obj_ggtt_offset(obj)) {
>  			ering->ctx = i915_error_object_create_sized(dev_priv,
>  								    obj, 1);
>  			break;
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 7d283b5..3f019d3 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
>  );
>  
>  TRACE_EVENT(i915_gem_object_bind,
> -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> -	    TP_ARGS(obj, mappable),
> +	    TP_PROTO(struct drm_i915_gem_object *obj,
> +		     struct i915_address_space *vm, bool mappable),
> +	    TP_ARGS(obj, vm, mappable),
>  
>  	    TP_STRUCT__entry(
>  			     __field(struct drm_i915_gem_object *, obj)
> +			     __field(struct i915_address_space *, vm)
>  			     __field(u32, offset)
>  			     __field(u32, size)
>  			     __field(bool, mappable)
> @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
>  
>  	    TP_fast_assign(
>  			   __entry->obj = obj;
> -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> +			   __entry->size = i915_gem_obj_size(obj, vm);
>  			   __entry->mappable = mappable;
>  			   ),
>  
> @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
>  );
>  
>  TRACE_EVENT(i915_gem_object_unbind,
> -	    TP_PROTO(struct drm_i915_gem_object *obj),
> -	    TP_ARGS(obj),
> +	    TP_PROTO(struct drm_i915_gem_object *obj,
> +		     struct i915_address_space *vm),
> +	    TP_ARGS(obj, vm),
>  
>  	    TP_STRUCT__entry(
>  			     __field(struct drm_i915_gem_object *, obj)
> +			     __field(struct i915_address_space *, vm)
>  			     __field(u32, offset)
>  			     __field(u32, size)
>  			     ),
>  
>  	    TP_fast_assign(
>  			   __entry->obj = obj;
> -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> +			   __entry->size = i915_gem_obj_size(obj, vm);
>  			   ),
>  
>  	    TP_printk("obj=%p, offset=%08x size=%x",
> diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> index f3c97e0..b69cc63 100644
> --- a/drivers/gpu/drm/i915/intel_fb.c
> +++ b/drivers/gpu/drm/i915/intel_fb.c
> @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  		      fb->width, fb->height,
>  		      i915_gem_obj_ggtt_offset(obj), obj);
>  
> -
>  	mutex_unlock(&dev->struct_mutex);
>  	vga_switcheroo_client_fb_set(dev->pdev, info);
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 81c3ca1..517e278 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
>  		}
>  		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
>  	} else {
> -		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> +		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
>  		if (ret) {
>  			DRM_ERROR("failed to pin overlay register bo\n");
>  			goto out_free_bo;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 125a741..449e57c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev)
>  		return NULL;
>  	}
>  
> -	ret = i915_gem_object_pin(ctx, 4096, true, false);
> +	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
>  	if (ret) {
>  		DRM_ERROR("failed to pin power context: %d\n", ret);
>  		goto err_unref;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index bc4c11b..ebed61d 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -481,6 +481,7 @@ out:
>  static int
>  init_pipe_control(struct intel_ring_buffer *ring)
>  {
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct pipe_control *pc;
>  	struct drm_i915_gem_object *obj;
>  	int ret;
> @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
>  		goto err;
>  	}
>  
> -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +					I915_CACHE_LLC);
>  
> -	ret = i915_gem_object_pin(obj, 4096, true, false);
> +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
>  	if (ret)
>  		goto err_unref;
>  
> @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
>  static int init_status_page(struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_object *obj;
>  	int ret;
>  
> @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
>  		goto err;
>  	}
>  
> -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +					I915_CACHE_LLC);
>  
> -	ret = i915_gem_object_pin(obj, 4096, true, false);
> +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
>  	if (ret != 0) {
>  		goto err_unref;
>  	}
> @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>  
>  	ring->obj = obj;
>  
> -	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> +	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
>  	if (ret)
>  		goto err_unref;
>  
> @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  			return -ENOMEM;
>  		}
>  
> -		ret = i915_gem_object_pin(obj, 0, true, false);
> +		ret = i915_gem_ggtt_pin(obj, 0, true, false);
>  		if (ret != 0) {
>  			drm_gem_object_unreference(&obj->base);
>  			DRM_ERROR("Failed to ping batch bo\n");
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA
  2013-07-09  6:08 ` [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
@ 2013-07-09  7:16   ` Daniel Vetter
  2013-07-10 16:39     ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  7:16 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:38PM -0700, Ben Widawsky wrote:
> formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> tracking"
> 
> The map_and_fenceable tracking is per object. GTT mapping, and fences
> only apply to global GTT. As such, object operations which are not
> performed on the global GTT should not affect mappable or fenceable
> characteristics.
> 
> Functionally, this commit could very well be squashed into the previous
> patch which updated object operations to take a VM argument.  This
> commit is split out because it's a bit tricky (or at least it was for
> me).
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 21015cd..501c590 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2635,7 +2635,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  
>  	trace_i915_gem_object_unbind(obj, vm);
>  
> -	if (obj->has_global_gtt_mapping)
> +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
>  		i915_gem_gtt_unbind_object(obj);

Won't this part be done as part of the global gtt clear_range callback?

>  	if (obj->has_aliasing_ppgtt_mapping) {
>  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);

And I have a hunch that we should shovel the aliasing ppgtt clearing into
the ggtt write_ptes/clear_range callbacks, too. Once all this has settled
at least.
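
I.e. long-term I'd hope unbind boils down to something like (very rough,
with the ggtt clear_range also tearing down any aliasing ppgtt PTEs
internally when obj->has_aliasing_ppgtt_mapping is set):

	vma->vm->clear_range(vma->vm,
			     vma->node.start >> PAGE_SHIFT,
			     obj->base.size >> PAGE_SHIFT);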

> @@ -2646,7 +2646,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  
>  	list_del(&obj->mm_list);
>  	/* Avoid an unnecessary call to unbind on rebind. */
> -	obj->map_and_fenceable = true;
> +	if (i915_is_ggtt(vm))
> +		obj->map_and_fenceable = true;
>  
>  	vma = i915_gem_obj_to_vma(obj, vm);
>  	list_del(&vma->vma_link);
> @@ -3213,7 +3214,9 @@ search_free:
>  		i915_is_ggtt(vm) &&
>  		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>  
> -	obj->map_and_fenceable = mappable && fenceable;
> +	/* Map and fenceable only changes if the VM is the global GGTT */
> +	if (i915_is_ggtt(vm))
> +		obj->map_and_fenceable = mappable && fenceable;
>  
>  	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
>  	i915_gem_verify_gtt(dev);
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/11] drm/i915: mm_list is per VMA
  2013-07-09  6:08 ` [PATCH 08/11] drm/i915: mm_list is per VMA Ben Widawsky
@ 2013-07-09  7:18   ` Daniel Vetter
  2013-07-10 16:39     ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  7:18 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:39PM -0700, Ben Widawsky wrote:
> formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
> 
> The mm_list is used for the active/inactive LRUs. Since those LRUs are
> per address space, the link should be per VMA.
> 
> Because we'll only ever have 1 VMA before this point, it's not incorrect
> to defer this change until this point in the patch series, and doing it
> here makes the change much easier to understand.
> 
> v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
> 
> v3: Moved earlier in the series
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

The commit message seems to be missing the explanation (that I've written
for you) of why we move only some of the dev_priv->mm LRUs, not all of
them ...
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c    | 53 ++++++++++++++++++++++------------
>  drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
>  drivers/gpu/drm/i915/i915_gem.c        | 34 ++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem_evict.c  | 14 ++++-----
>  drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
>  drivers/gpu/drm/i915/i915_irq.c        | 37 ++++++++++++++----------
>  6 files changed, 87 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 867ed07..163ca6b 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -157,7 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	size_t total_obj_size, total_gtt_size;
>  	int count, ret;
>  
> @@ -165,6 +165,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	if (ret)
>  		return ret;
>  
> +	/* FIXME: the user of this interface might want more than just GGTT */
>  	switch (list) {
>  	case ACTIVE_LIST:
>  		seq_puts(m, "Active:\n");
> @@ -180,12 +181,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	}
>  
>  	total_obj_size = total_gtt_size = count = 0;
> -	list_for_each_entry(obj, head, mm_list) {
> -		seq_puts(m, "   ");
> -		describe_obj(m, obj);
> -		seq_putc(m, '\n');
> -		total_obj_size += obj->base.size;
> -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> +	list_for_each_entry(vma, head, mm_list) {
> +		seq_printf(m, "   ");
> +		describe_obj(m, vma->obj);
> +		seq_printf(m, "\n");
> +		total_obj_size += vma->obj->base.size;
> +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
>  		count++;
>  	}
>  	mutex_unlock(&dev->struct_mutex);
> @@ -233,7 +234,18 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	return 0;
>  }
>  
> -static int i915_gem_object_info(struct seq_file *m, void *data)
> +#define count_vmas(list, member) do { \
> +	list_for_each_entry(vma, list, member) { \
> +		size += i915_gem_obj_ggtt_size(vma->obj); \
> +		++count; \
> +		if (vma->obj->map_and_fenceable) { \
> +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> +			++mappable_count; \
> +		} \
> +	} \
> +} while (0)
> +
> +static int i915_gem_object_info(struct seq_file *m, void* data)
>  {
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
> @@ -243,6 +255,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct drm_file *file;
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -259,12 +272,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&vm->active_list, mm_list);
> +	count_vmas(&vm->active_list, mm_list);
>  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&vm->inactive_list, mm_list);
> +	count_vmas(&vm->inactive_list, mm_list);
>  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
> @@ -2037,7 +2050,8 @@ i915_drop_caches_set(void *data, u64 val)
>  	struct drm_device *dev = data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_object *obj, *next;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma, *x;
>  	int ret;
>  
>  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> @@ -2058,14 +2072,15 @@ i915_drop_caches_set(void *data, u64 val)
>  		i915_gem_retire_requests(dev);
>  
>  	if (val & DROP_BOUND) {
> -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -					 mm_list) {
> -			if (obj->pin_count)
> -				continue;
> -
> -			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> -			if (ret)
> -				goto unlock;
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> +						 mm_list)
> +				if (vma->obj->pin_count == 0) {
> +					ret = i915_gem_object_unbind(vma->obj,
> +								     vm);
> +					if (ret)
> +						goto unlock;
> +				}
>  		}
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 48baccc..48105f8 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -541,6 +541,9 @@ struct i915_vma {
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm;
>  
> +	/** This object's place on the active/inactive lists */
> +	struct list_head mm_list;
> +
>  	struct list_head vma_link; /* Link in the object's VMA list */
>  };
>  
> @@ -1242,9 +1245,7 @@ struct drm_i915_gem_object {
>  	struct drm_mm_node *stolen;
>  	struct list_head global_list;
>  
> -	/** This object's place on the active/inactive lists */
>  	struct list_head ring_list;
> -	struct list_head mm_list;
>  	/** This object's place in the batchbuffer or on the eviction list */
>  	struct list_head exec_list;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 501c590..9a58363 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1888,6 +1888,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	u32 seqno = intel_ring_get_seqno(ring);
> +	struct i915_vma *vma;
>  
>  	BUG_ON(ring == NULL);
>  	obj->ring = ring;
> @@ -1899,7 +1900,8 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	}
>  
>  	/* Move from whatever list we were on to the tail of execution. */
> -	list_move_tail(&obj->mm_list, &vm->active_list);
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +	list_move_tail(&vma->mm_list, &vm->active_list);
>  	list_move_tail(&obj->ring_list, &ring->active_list);
>  
>  	obj->last_read_seqno = seqno;
> @@ -1922,10 +1924,13 @@ static void
>  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
>  				 struct i915_address_space *vm)
>  {
> +	struct i915_vma *vma;
> +
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +	list_move_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	list_del_init(&obj->ring_list);
>  	obj->ring = NULL;
> @@ -2287,9 +2292,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm;
> -	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	int i;
>  
>  	for_each_ring(ring, dev_priv, i)
> @@ -2299,8 +2304,8 @@ void i915_gem_reset(struct drm_device *dev)
>  	 * necessary invalidation upon reuse.
>  	 */
>  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
>  
>  	i915_gem_restore_fences(dev);
>  }
> @@ -2644,12 +2649,12 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  	i915_gem_gtt_finish_object(obj);
>  	i915_gem_object_unpin_pages(obj);
>  
> -	list_del(&obj->mm_list);
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	if (i915_is_ggtt(vm))
>  		obj->map_and_fenceable = true;
>  
>  	vma = i915_gem_obj_to_vma(obj, vm);
> +	list_del(&vma->mm_list);
>  	list_del(&vma->vma_link);
>  	drm_mm_remove_node(&vma->node);
>  	i915_gem_vma_destroy(vma);
> @@ -3197,7 +3202,7 @@ search_free:
>  	}
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> +	list_add_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	/* Keep GGTT vmas first to make debug easier */
>  	if (i915_is_ggtt(vm))
> @@ -3354,9 +3359,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  					    old_write_domain);
>  
>  	/* And bump the LRU for this access */
> -	if (i915_gem_object_is_inactive(obj))
> -		list_move_tail(&obj->mm_list,
> -			       &dev_priv->gtt.base.inactive_list);
> +	if (i915_gem_object_is_inactive(obj)) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> +							   &dev_priv->gtt.base);
> +		if (vma)
> +			list_move_tail(&vma->mm_list,
> +				       &dev_priv->gtt.base.inactive_list);
> +
> +	}
>  
>  	return 0;
>  }
> @@ -3931,7 +3941,6 @@ unlock:
>  void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  			  const struct drm_i915_gem_object_ops *ops)
>  {
> -	INIT_LIST_HEAD(&obj->mm_list);
>  	INIT_LIST_HEAD(&obj->global_list);
>  	INIT_LIST_HEAD(&obj->ring_list);
>  	INIT_LIST_HEAD(&obj->exec_list);
> @@ -4071,6 +4080,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  		return ERR_PTR(-ENOMEM);
>  
>  	INIT_LIST_HEAD(&vma->vma_link);
> +	INIT_LIST_HEAD(&vma->mm_list);
>  	vma->vm = vm;
>  	vma->obj = obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 32efdc0..18a44a9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
>  		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
> @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  		goto none;
>  
>  	/* Now merge in the soon-to-be-expired objects... */
> -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +	list_for_each_entry(vma, &vm->active_list, mm_list) {
>  		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
> @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct i915_address_space *vm;
> -	struct drm_i915_gem_object *obj, *next;
> +	struct i915_vma *vma, *next;
>  	bool lists_empty = true;
>  	int ret;
>  
> @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
>  
>  	/* Having flushed everything, unbind() should never raise an error */
>  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -			if (obj->pin_count == 0)
> -				WARN_ON(i915_gem_object_unbind(obj, vm));
> +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> +			if (vma->obj->pin_count == 0)
> +				WARN_ON(i915_gem_object_unbind(vma->obj, vm));
>  	}
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index bfe61fa..58b2613 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -415,7 +415,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	obj->has_global_gtt_mapping = 1;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> +	list_add_tail(&vma->mm_list, &dev_priv->gtt.base.inactive_list);
>  
>  	return obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 28fa0ff..e065232 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1640,11 +1640,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
>  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
>  			     int count, struct list_head *head)
>  {
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	int i = 0;
>  
> -	list_for_each_entry(obj, head, mm_list) {
> -		capture_bo(err++, obj);
> +	list_for_each_entry(vma, head, mm_list) {
> +		capture_bo(err++, vma->obj);
>  		if (++i == count)
>  			break;
>  	}
> @@ -1706,8 +1706,9 @@ static struct drm_i915_error_object *
>  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  			     struct intel_ring_buffer *ring)
>  {
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 seqno;
>  
>  	if (!ring->get_seqno)
> @@ -1729,20 +1730,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  	}
>  
>  	seqno = ring->get_seqno(ring, false);
> -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		if (obj->ring != ring)
> -			continue;
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		list_for_each_entry(vma, &vm->active_list, mm_list) {
> +			obj = vma->obj;
> +			if (obj->ring != ring)
> +				continue;
>  
> -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> -			continue;
> +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> +				continue;
>  
> -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> -			continue;
> +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> +				continue;
>  
> -		/* We need to copy these to an anonymous buffer as the simplest
> -		 * method to avoid being overwritten by userspace.
> -		 */
> -		return i915_error_object_create(dev_priv, obj);
> +			/* We need to copy these to an anonymous buffer as the simplest
> +			 * method to avoid being overwritten by userspace.
> +			 */
> +			return i915_error_object_create(dev_priv, obj);
> +		}
>  	}
>  
>  	return NULL;
> @@ -1863,11 +1867,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
>  				     struct drm_i915_error_state *error)
>  {
>  	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	int i;
>  
>  	i = 0;
> -	list_for_each_entry(obj, &vm->active_list, mm_list)
> +	list_for_each_entry(vma, &vm->active_list, mm_list)
>  		i++;
>  	error->active_bo_count = i;
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/11] drm/i915: Move active to vma
  2013-07-09  6:08 ` [PATCH 11/11] drm/i915: Move active to vma Ben Widawsky
@ 2013-07-09  7:45   ` Daniel Vetter
  2013-07-10 16:39     ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  7:45 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:42PM -0700, Ben Widawsky wrote:
> Probably need to squash whole thing, or just the inactive part, tbd...
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

I agree that we need vma->active, but I'm not sold on removing
obj->active. Atm we have two use-cases for checking obj->active:
- In the evict/unbind code to check whether the gpu is still using this
  specific mapping. This use-case nicely fits into checking vma->active.
- In the shrinker code and everywhere we want to do cpu access we only
  care about whether the gpu is accessing the object, not at all through
  which mapping precisely. There a vma-independent obj->active sounds much
  saner.

Note though that just keeping track of vma->active isn't too useful, since
if some other vma is keeping the object busy we'll still stall on that one
for eviction. So we'd need a vma->ring and vma->last_rendering_seqno, too.
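
Rough sketch of where that would take struct i915_vma (field names purely
illustrative, not a concrete proposal):

	struct i915_vma {
		/* ... existing members: node, obj, vm, vma_link, ... */

		/* set while a request submitted through this mapping is
		 * still outstanding */
		unsigned int active:1;
		/* last access through this vma, so that eviction can stall
		 * on exactly this mapping instead of the whole object */
		struct intel_ring_buffer *ring;
		uint32_t last_rendering_seqno;
	};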

At that point I wonder a bit whether all this complexity is worth it ...

I need to ponder this some more.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 14 ++++++------
>  drivers/gpu/drm/i915/i915_gem.c | 47 ++++++++++++++++++++++++-----------------
>  2 files changed, 35 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 38d07f2..e6694ae 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -541,6 +541,13 @@ struct i915_vma {
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm;
>  
> +	/**
> +	 * This is set if the object is on the active lists (has pending
> +	 * rendering and so a non-zero seqno), and is not set if it i s on
> +	 * inactive (ready to be unbound) list.
> +	 */
> +	unsigned int active:1;
> +
>  	/** This object's place on the active/inactive lists */
>  	struct list_head mm_list;
>  
> @@ -1250,13 +1257,6 @@ struct drm_i915_gem_object {
>  	struct list_head exec_list;
>  
>  	/**
> -	 * This is set if the object is on the active lists (has pending
> -	 * rendering and so a non-zero seqno), and is not set if it i s on
> -	 * inactive (ready to be unbound) list.
> -	 */
> -	unsigned int active:1;
> -
> -	/**
>  	 * This is set if the object has been written to since last bound
>  	 * to the GTT
>  	 */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c2ecb78..b87073b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -137,7 +137,13 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  /* NB: Not the same as !i915_gem_object_is_inactive */
>  bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
>  {
> -	return obj->active;
> +	struct i915_vma *vma;
> +
> +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> +		if (vma->active)
> +			return true;
> +
> +	return false;
>  }
>  
>  static inline bool
> @@ -1899,14 +1905,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	BUG_ON(ring == NULL);
>  	obj->ring = ring;
>  
> +	/* Move from whatever list we were on to the tail of execution. */
> +	vma = i915_gem_obj_to_vma(obj, vm);
>  	/* Add a reference if we're newly entering the active list. */
> -	if (!i915_gem_object_is_active(obj)) {
> +	if (!vma->active) {
>  		drm_gem_object_reference(&obj->base);
> -		obj->active = 1;
> +		vma->active = 1;
>  	}
>  
> -	/* Move from whatever list we were on to the tail of execution. */
> -	vma = i915_gem_obj_to_vma(obj, vm);
>  	list_move_tail(&vma->mm_list, &vm->active_list);
>  	list_move_tail(&obj->ring_list, &ring->active_list);
>  
> @@ -1927,16 +1933,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  }
>  
>  static void
> -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> -				 struct i915_address_space *vm)
> +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  {
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> +	struct i915_address_space *vm;
>  	struct i915_vma *vma;
> +	int i = 0;
>  
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> -	BUG_ON(!i915_gem_object_is_active(obj));
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
> -	list_move_tail(&vma->mm_list, &vm->inactive_list);
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		vma = i915_gem_obj_to_vma(obj, vm);
For paranoia we might want to track the vm used to run a batch in its
request struct, then we

> +		if (!vma || !vma->active)
> +			continue;
> +		list_move_tail(&vma->mm_list, &vm->inactive_list);
> +		vma->active = 0;
> +		i++;
> +	}
>  
>  	list_del_init(&obj->ring_list);
>  	obj->ring = NULL;
> @@ -1948,8 +1961,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
>  	obj->last_fenced_seqno = 0;
>  	obj->fenced_gpu_access = false;
>  
> -	obj->active = 0;
> -	drm_gem_object_unreference(&obj->base);
> +	while (i--)
> +		drm_gem_object_unreference(&obj->base);
>  
>  	WARN_ON(i915_verify_lists(dev));
>  }
> @@ -2272,15 +2285,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
>  	}
>  
>  	while (!list_empty(&ring->active_list)) {
> -		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
>  				       struct drm_i915_gem_object,
>  				       ring_list);
>  
> -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -			i915_gem_object_move_to_inactive(obj, vm);
> +		i915_gem_object_move_to_inactive(obj);
>  	}
>  }
>  
> @@ -2356,8 +2367,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
>  	 */
>  	while (!list_empty(&ring->active_list)) {
> -		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> -		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
> @@ -2367,8 +2376,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>  			break;
>  
> -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -			i915_gem_object_move_to_inactive(obj, vm);
> +		BUG_ON(!i915_gem_object_is_active(obj));
> +		i915_gem_object_move_to_inactive(obj);
>  	}
>  
>  	if (unlikely(ring->trace_irq_seqno &&
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 00/11] ppgtt: just the VMA
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (10 preceding siblings ...)
  2013-07-09  6:08 ` [PATCH 11/11] drm/i915: Move active to vma Ben Widawsky
@ 2013-07-09  7:50 ` Daniel Vetter
  2013-07-13  4:45 ` [PATCH 12/15] [RFC] create vm->bind,unbind Ben Widawsky
  12 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-09  7:50 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 08, 2013 at 11:08:31PM -0700, Ben Widawsky wrote:
> By Daniel's request, to make the PPGTT merging more manageable, here are the
> patches associated with the VM/VMA infrastructure. They are not as well tested
> as the previous series, although I would hope that without actually changing
> address space, most of this series is just massaging code.
> 
> Even though these patches were all cherry picked from the original,
> working series, the amount of rework was not insignificant ie. there may
> be a lot of bugs present, or changes needed.
> 
> There should be little to no effect on the code, since there will only ever be
> one VM until the rest of the PPGTT series is merged.
> 
> Finally, Daniel, is this more or less what you wanted first?

Yeah, looks good. I think up to patch 5 I can merge it (only 2 tiny
bikesheds), but I'll volunteer someone to do an in-depth review. Later
patches I think need a bit more discussion and maybe split-out of prep
work. I've dropped my questions on them.
-Daniel

> 
> References:
> http://lists.freedesktop.org/archives/intel-gfx/2013-June/029408.html
> 
> Ben Widawsky (11):
>   drm/i915: Move gtt and ppgtt under address space umbrella
>   drm/i915: Put the mm in the parent address space
>   drm/i915: Create a global list of vms
>   drm/i915: Move active/inactive lists to new mm
>   drm/i915: Create VMAs
>   drm/i915: plumb VM into object operations
>   drm/i915: Fix up map and fenceable for VMA
>   drm/i915: mm_list is per VMA
>   drm/i915: Update error capture for VMs
>   drm/i915: create an object_is_active()
>   drm/i915: Move active to vma
> 
>  drivers/gpu/drm/i915/i915_debugfs.c        |  88 ++++--
>  drivers/gpu/drm/i915/i915_dma.c            |   9 +-
>  drivers/gpu/drm/i915/i915_drv.h            | 243 +++++++++-------
>  drivers/gpu/drm/i915/i915_gem.c            | 432 ++++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
>  drivers/gpu/drm/i915/i915_gem_debug.c      |   2 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  67 +++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  87 +++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 193 +++++++------
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |  19 +-
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
>  drivers/gpu/drm/i915/i915_irq.c            | 158 ++++++++---
>  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
>  drivers/gpu/drm/i915/intel_fb.c            |   1 -
>  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
>  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
>  17 files changed, 902 insertions(+), 456 deletions(-)
> 
> -- 
> 1.8.3.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-09  6:37   ` Daniel Vetter
@ 2013-07-10 16:36     ` Ben Widawsky
  2013-07-10 17:03       ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-10 16:36 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 08:37:45AM +0200, Daniel Vetter wrote:
> On Mon, Jul 08, 2013 at 11:08:32PM -0700, Ben Widawsky wrote:
> > The GTT and PPGTT can be thought of more generally as GPU address
> > spaces. Many of their actions (insert entries), state (LRU lists) and
> > many of their characteristics (size), can be shared. Do that.
> > 
> > The change itself doesn't actually impact most of the VMA/VM rework
> > coming up, it just fits in with the grand scheme. GGTT will usually be a
> > special case where we either know an object must be in the GGTT (display
> > engine, workarounds, etc.).
> 
> Commit message cut off?
> -Daniel

Maybe. I can't remember. Do you want me to add something else in
particular?

> 
> > 
> > v2: Drop usage of i915_gtt_vm (Daniel)
> > Make cleanup also part of the parent class (Ben)
> > Modified commit msg
> > Rebased
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
> >  drivers/gpu/drm/i915/i915_dma.c     |   4 +-
> >  drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
> >  drivers/gpu/drm/i915/i915_gem.c     |   4 +-
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
> >  5 files changed, 121 insertions(+), 110 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index c8059f5..d870f27 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> >  		   count, size);
> >  
> >  	seq_printf(m, "%zu [%lu] gtt total\n",
> > -		   dev_priv->gtt.total,
> > -		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
> > +		   dev_priv->gtt.base.total,
> > +		   dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
> >  
> >  	seq_putc(m, '\n');
> >  	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 0e22142..15bca96 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1669,7 +1669,7 @@ out_gem_unload:
> >  out_mtrrfree:
> >  	arch_phys_wc_del(dev_priv->gtt.mtrr);
> >  	io_mapping_free(dev_priv->gtt.mappable);
> > -	dev_priv->gtt.gtt_remove(dev);
> > +	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
> >  out_rmmap:
> >  	pci_iounmap(dev->pdev, dev_priv->regs);
> >  put_bridge:
> > @@ -1764,7 +1764,7 @@ int i915_driver_unload(struct drm_device *dev)
> >  	destroy_workqueue(dev_priv->wq);
> >  	pm_qos_remove_request(&dev_priv->pm_qos);
> >  
> > -	dev_priv->gtt.gtt_remove(dev);
> > +	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
> >  
> >  	if (dev_priv->slab)
> >  		kmem_cache_destroy(dev_priv->slab);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index c8d6104..d6d4d7d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -446,6 +446,29 @@ enum i915_cache_level {
> >  
> >  typedef uint32_t gen6_gtt_pte_t;
> >  
> > +struct i915_address_space {
> > +	struct drm_device *dev;
> > +	unsigned long start;		/* Start offset always 0 for dri2 */
> > +	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> > +
> > +	struct {
> > +		dma_addr_t addr;
> > +		struct page *page;
> > +	} scratch;
> > +
> > +	/* FIXME: Need a more generic return type */
> > +	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > +				     enum i915_cache_level level);
> > +	void (*clear_range)(struct i915_address_space *vm,
> > +			    unsigned int first_entry,
> > +			    unsigned int num_entries);
> > +	void (*insert_entries)(struct i915_address_space *vm,
> > +			       struct sg_table *st,
> > +			       unsigned int first_entry,
> > +			       enum i915_cache_level cache_level);
> > +	void (*cleanup)(struct i915_address_space *vm);
> > +};
> > +
> >  /* The Graphics Translation Table is the way in which GEN hardware translates a
> >   * Graphics Virtual Address into a Physical Address. In addition to the normal
> >   * collateral associated with any va->pa translations GEN hardware also has a
> > @@ -454,8 +477,7 @@ typedef uint32_t gen6_gtt_pte_t;
> >   * the spec.
> >   */
> >  struct i915_gtt {
> > -	unsigned long start;		/* Start offset of used GTT */
> > -	size_t total;			/* Total size GTT can map */
> > +	struct i915_address_space base;
> >  	size_t stolen_size;		/* Total size of stolen memory */
> >  
> >  	unsigned long mappable_end;	/* End offset that we can CPU map */
> > @@ -466,10 +488,6 @@ struct i915_gtt {
> >  	void __iomem *gsm;
> >  
> >  	bool do_idle_maps;
> > -	struct {
> > -		dma_addr_t addr;
> > -		struct page *page;
> > -	} scratch;
> >  
> >  	int mtrr;
> >  
> > @@ -477,38 +495,17 @@ struct i915_gtt {
> >  	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
> >  			  size_t *stolen, phys_addr_t *mappable_base,
> >  			  unsigned long *mappable_end);
> > -	void (*gtt_remove)(struct drm_device *dev);
> > -	void (*gtt_clear_range)(struct drm_device *dev,
> > -				unsigned int first_entry,
> > -				unsigned int num_entries);
> > -	void (*gtt_insert_entries)(struct drm_device *dev,
> > -				   struct sg_table *st,
> > -				   unsigned int pg_start,
> > -				   enum i915_cache_level cache_level);
> > -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > -				     enum i915_cache_level level);
> >  };
> > -#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
> > +#define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
> >  
> >  struct i915_hw_ppgtt {
> > -	struct drm_device *dev;
> > +	struct i915_address_space base;
> >  	unsigned num_pd_entries;
> >  	struct page **pt_pages;
> >  	uint32_t pd_offset;
> >  	dma_addr_t *pt_dma_addr;
> >  
> > -	/* pte functions, mirroring the interface of the global gtt. */
> > -	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
> > -			    unsigned int first_entry,
> > -			    unsigned int num_entries);
> > -	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
> > -			       struct sg_table *st,
> > -			       unsigned int pg_start,
> > -			       enum i915_cache_level cache_level);
> > -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > -				     enum i915_cache_level level);
> >  	int (*enable)(struct drm_device *dev);
> > -	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
> >  };
> >  
> >  struct i915_ctx_hang_stats {
> > @@ -1124,7 +1121,7 @@ typedef struct drm_i915_private {
> >  	enum modeset_restore modeset_restore;
> >  	struct mutex modeset_restore_lock;
> >  
> > -	struct i915_gtt gtt;
> > +	struct i915_gtt gtt; /* VMA representing the global address space */
> >  
> >  	struct i915_gem_mm mm;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index af61be8..3ecedfd 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
> >  			pinned += i915_gem_obj_ggtt_size(obj);
> >  	mutex_unlock(&dev->struct_mutex);
> >  
> > -	args->aper_size = dev_priv->gtt.total;
> > +	args->aper_size = dev_priv->gtt.base.total;
> >  	args->aper_available_size = args->aper_size - pinned;
> >  
> >  	return 0;
> > @@ -3070,7 +3070,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> >  	bool mappable, fenceable;
> >  	size_t gtt_max = map_and_fenceable ?
> > -		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
> > +		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> >  	int ret;
> >  
> >  	fence_size = i915_gem_get_gtt_size(dev,
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 242d0f9..693115a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
> >  
> >  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
> >  {
> > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
> >  	gen6_gtt_pte_t __iomem *pd_addr;
> >  	uint32_t pd_entry;
> >  	int i;
> > @@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> >  }
> >  
> >  /* PPGTT support for Sandybdrige/Gen6 and later */
> > -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> > +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
> >  				   unsigned first_entry,
> >  				   unsigned num_entries)
> >  {
> > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > +	struct i915_hw_ppgtt *ppgtt =
> > +		container_of(vm, struct i915_hw_ppgtt, base);
> >  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
> >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> >  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> >  	unsigned last_pte, i;
> >  
> > -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> > -					I915_CACHE_LLC);
> > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> >  
> >  	while (num_entries) {
> >  		last_pte = first_pte + num_entries;
> > @@ -212,11 +212,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> >  	}
> >  }
> >  
> > -static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> > +static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
> >  				      struct sg_table *pages,
> >  				      unsigned first_entry,
> >  				      enum i915_cache_level cache_level)
> >  {
> > +	struct i915_hw_ppgtt *ppgtt =
> > +		container_of(vm, struct i915_hw_ppgtt, base);
> >  	gen6_gtt_pte_t *pt_vaddr;
> >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> >  	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> > @@ -227,7 +229,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> >  		dma_addr_t page_addr;
> >  
> >  		page_addr = sg_page_iter_dma_address(&sg_iter);
> > -		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
> > +		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
> >  		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
> >  			kunmap_atomic(pt_vaddr);
> >  			act_pt++;
> > @@ -239,13 +241,15 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> >  	kunmap_atomic(pt_vaddr);
> >  }
> >  
> > -static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
> > +static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
> >  {
> > +	struct i915_hw_ppgtt *ppgtt =
> > +		container_of(vm, struct i915_hw_ppgtt, base);
> >  	int i;
> >  
> >  	if (ppgtt->pt_dma_addr) {
> >  		for (i = 0; i < ppgtt->num_pd_entries; i++)
> > -			pci_unmap_page(ppgtt->dev->pdev,
> > +			pci_unmap_page(ppgtt->base.dev->pdev,
> >  				       ppgtt->pt_dma_addr[i],
> >  				       4096, PCI_DMA_BIDIRECTIONAL);
> >  	}
> > @@ -259,7 +263,7 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
> >  
> >  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  {
> > -	struct drm_device *dev = ppgtt->dev;
> > +	struct drm_device *dev = ppgtt->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	unsigned first_pd_entry_in_global_pt;
> >  	int i;
> > @@ -271,17 +275,17 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> >  
> >  	if (IS_HASWELL(dev)) {
> > -		ppgtt->pte_encode = hsw_pte_encode;
> > +		ppgtt->base.pte_encode = hsw_pte_encode;
> >  	} else if (IS_VALLEYVIEW(dev)) {
> > -		ppgtt->pte_encode = byt_pte_encode;
> > +		ppgtt->base.pte_encode = byt_pte_encode;
> >  	} else {
> > -		ppgtt->pte_encode = gen6_pte_encode;
> > +		ppgtt->base.pte_encode = gen6_pte_encode;
> >  	}
> >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> >  	ppgtt->enable = gen6_ppgtt_enable;
> > -	ppgtt->clear_range = gen6_ppgtt_clear_range;
> > -	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
> > -	ppgtt->cleanup = gen6_ppgtt_cleanup;
> > +	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > +	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > +	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> >  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
> >  				  GFP_KERNEL);
> >  	if (!ppgtt->pt_pages)
> > @@ -312,8 +316,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  		ppgtt->pt_dma_addr[i] = pt_addr;
> >  	}
> >  
> > -	ppgtt->clear_range(ppgtt, 0,
> > -			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
> > +	ppgtt->base.clear_range(&ppgtt->base, 0,
> > +				ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
> >  
> >  	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
> >  
> > @@ -346,7 +350,7 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> >  	if (!ppgtt)
> >  		return -ENOMEM;
> >  
> > -	ppgtt->dev = dev;
> > +	ppgtt->base.dev = dev;
> >  
> >  	if (INTEL_INFO(dev)->gen < 8)
> >  		ret = gen6_ppgtt_init(ppgtt);
> > @@ -369,7 +373,7 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
> >  	if (!ppgtt)
> >  		return;
> >  
> > -	ppgtt->cleanup(ppgtt);
> > +	ppgtt->base.cleanup(&ppgtt->base);
> >  	dev_priv->mm.aliasing_ppgtt = NULL;
> >  }
> >  
> > @@ -377,17 +381,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  			    struct drm_i915_gem_object *obj,
> >  			    enum i915_cache_level cache_level)
> >  {
> > -	ppgtt->insert_entries(ppgtt, obj->pages,
> > -			      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -			      cache_level);
> > +	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > +				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > +				   cache_level);
> >  }
> >  
> >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  			      struct drm_i915_gem_object *obj)
> >  {
> > -	ppgtt->clear_range(ppgtt,
> > -			   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -			   obj->base.size >> PAGE_SHIFT);
> > +	ppgtt->base.clear_range(&ppgtt->base,
> > +				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > +				obj->base.size >> PAGE_SHIFT);
> >  }
> >  
> >  extern int intel_iommu_gfx_mapped;
> > @@ -434,8 +438,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  	struct drm_i915_gem_object *obj;
> >  
> >  	/* First fill our portion of the GTT with scratch pages */
> > -	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
> > -				      dev_priv->gtt.total / PAGE_SIZE);
> > +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > +				       dev_priv->gtt.base.start / PAGE_SIZE,
> > +				       dev_priv->gtt.base.total / PAGE_SIZE);
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> >  		i915_gem_clflush_object(obj);
> > @@ -464,12 +469,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
> >   * within the global GTT as well as accessible by the GPU through the GMADR
> >   * mapped BAR (dev_priv->mm.gtt->gtt).
> >   */
> > -static void gen6_ggtt_insert_entries(struct drm_device *dev,
> > +static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> >  				     struct sg_table *st,
> >  				     unsigned int first_entry,
> >  				     enum i915_cache_level level)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> >  	gen6_gtt_pte_t __iomem *gtt_entries =
> >  		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
> >  	int i = 0;
> > @@ -478,8 +483,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> >  
> >  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
> >  		addr = sg_page_iter_dma_address(&sg_iter);
> > -		iowrite32(dev_priv->gtt.pte_encode(addr, level),
> > -			  &gtt_entries[i]);
> > +		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
> >  		i++;
> >  	}
> >  
> > @@ -490,8 +494,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> >  	 * hardware should work, we must keep this posting read for paranoia.
> >  	 */
> >  	if (i != 0)
> > -		WARN_ON(readl(&gtt_entries[i-1])
> > -			!= dev_priv->gtt.pte_encode(addr, level));
> > +		WARN_ON(readl(&gtt_entries[i-1]) !=
> > +			vm->pte_encode(addr, level));
> >  
> >  	/* This next bit makes the above posting read even more important. We
> >  	 * want to flush the TLBs only after we're certain all the PTE updates
> > @@ -501,11 +505,11 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> >  	POSTING_READ(GFX_FLSH_CNTL_GEN6);
> >  }
> >  
> > -static void gen6_ggtt_clear_range(struct drm_device *dev,
> > +static void gen6_ggtt_clear_range(struct i915_address_space *vm,
> >  				  unsigned int first_entry,
> >  				  unsigned int num_entries)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> >  	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
> >  		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
> >  	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
> > @@ -516,15 +520,14 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
> >  		 first_entry, num_entries, max_entries))
> >  		num_entries = max_entries;
> >  
> > -	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
> > -					       I915_CACHE_LLC);
> > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> >  	for (i = 0; i < num_entries; i++)
> >  		iowrite32(scratch_pte, &gtt_base[i]);
> >  	readl(gtt_base);
> >  }
> >  
> >  
> > -static void i915_ggtt_insert_entries(struct drm_device *dev,
> > +static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> >  				     struct sg_table *st,
> >  				     unsigned int pg_start,
> >  				     enum i915_cache_level cache_level)
> > @@ -536,7 +539,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
> >  
> >  }
> >  
> > -static void i915_ggtt_clear_range(struct drm_device *dev,
> > +static void i915_ggtt_clear_range(struct i915_address_space *vm,
> >  				  unsigned int first_entry,
> >  				  unsigned int num_entries)
> >  {
> > @@ -549,10 +552,11 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> >  
> > -	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
> > -					 i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -					 cache_level);
> > +	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> > +					  entry,
> > +					  cache_level);
> >  
> >  	obj->has_global_gtt_mapping = 1;
> >  }
> > @@ -561,10 +565,11 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> >  
> > -	dev_priv->gtt.gtt_clear_range(obj->base.dev,
> > -				      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				      obj->base.size >> PAGE_SHIFT);
> > +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > +				       entry,
> > +				       obj->base.size >> PAGE_SHIFT);
> >  
> >  	obj->has_global_gtt_mapping = 0;
> >  }
> > @@ -641,20 +646,23 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  		obj->has_global_gtt_mapping = 1;
> >  	}
> >  
> > -	dev_priv->gtt.start = start;
> > -	dev_priv->gtt.total = end - start;
> > +	dev_priv->gtt.base.start = start;
> > +	dev_priv->gtt.base.total = end - start;
> >  
> >  	/* Clear any non-preallocated blocks */
> >  	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
> >  			     hole_start, hole_end) {
> > +		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> >  			      hole_start, hole_end);
> > -		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
> > -					      (hole_end-hole_start) / PAGE_SIZE);
> > +		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > +					       hole_start / PAGE_SIZE,
> > +					       count);
> >  	}
> >  
> >  	/* And finally clear the reserved guard page */
> > -	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
> > +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > +				       end / PAGE_SIZE - 1, 1);
> >  }
> >  
> >  static bool
> > @@ -677,7 +685,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	unsigned long gtt_size, mappable_size;
> >  
> > -	gtt_size = dev_priv->gtt.total;
> > +	gtt_size = dev_priv->gtt.base.total;
> >  	mappable_size = dev_priv->gtt.mappable_end;
> >  
> >  	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
> > @@ -722,8 +730,8 @@ static int setup_scratch_page(struct drm_device *dev)
> >  #else
> >  	dma_addr = page_to_phys(page);
> >  #endif
> > -	dev_priv->gtt.scratch.page = page;
> > -	dev_priv->gtt.scratch.addr = dma_addr;
> > +	dev_priv->gtt.base.scratch.page = page;
> > +	dev_priv->gtt.base.scratch.addr = dma_addr;
> >  
> >  	return 0;
> >  }
> > @@ -731,11 +739,13 @@ static int setup_scratch_page(struct drm_device *dev)
> >  static void teardown_scratch_page(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	set_pages_wb(dev_priv->gtt.scratch.page, 1);
> > -	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
> > +	struct page *page = dev_priv->gtt.base.scratch.page;
> > +
> > +	set_pages_wb(page, 1);
> > +	pci_unmap_page(dev->pdev, dev_priv->gtt.base.scratch.addr,
> >  		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> > -	put_page(dev_priv->gtt.scratch.page);
> > -	__free_page(dev_priv->gtt.scratch.page);
> > +	put_page(page);
> > +	__free_page(page);
> >  }
> >  
> >  static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
> > @@ -798,17 +808,18 @@ static int gen6_gmch_probe(struct drm_device *dev,
> >  	if (ret)
> >  		DRM_ERROR("Scratch setup failed\n");
> >  
> > -	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
> > -	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
> > +	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> > +	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
> >  
> >  	return ret;
> >  }
> >  
> > -static void gen6_gmch_remove(struct drm_device *dev)
> > +static void gen6_gmch_remove(struct i915_address_space *vm)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	iounmap(dev_priv->gtt.gsm);
> > -	teardown_scratch_page(dev_priv->dev);
> > +
> > +	struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
> > +	iounmap(gtt->gsm);
> > +	teardown_scratch_page(vm->dev);
> >  }
> >  
> >  static int i915_gmch_probe(struct drm_device *dev,
> > @@ -829,13 +840,13 @@ static int i915_gmch_probe(struct drm_device *dev,
> >  	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
> >  
> >  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> > -	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
> > -	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
> > +	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> > +	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
> >  
> >  	return 0;
> >  }
> >  
> > -static void i915_gmch_remove(struct drm_device *dev)
> > +static void i915_gmch_remove(struct i915_address_space *vm)
> >  {
> >  	intel_gmch_remove();
> >  }
> > @@ -848,25 +859,28 @@ int i915_gem_gtt_init(struct drm_device *dev)
> >  
> >  	if (INTEL_INFO(dev)->gen <= 5) {
> >  		gtt->gtt_probe = i915_gmch_probe;
> > -		gtt->gtt_remove = i915_gmch_remove;
> > +		gtt->base.cleanup = i915_gmch_remove;
> >  	} else {
> >  		gtt->gtt_probe = gen6_gmch_probe;
> > -		gtt->gtt_remove = gen6_gmch_remove;
> > +		gtt->base.cleanup = gen6_gmch_remove;
> >  		if (IS_HASWELL(dev))
> > -			gtt->pte_encode = hsw_pte_encode;
> > +			gtt->base.pte_encode = hsw_pte_encode;
> >  		else if (IS_VALLEYVIEW(dev))
> > -			gtt->pte_encode = byt_pte_encode;
> > +			gtt->base.pte_encode = byt_pte_encode;
> >  		else
> > -			gtt->pte_encode = gen6_pte_encode;
> > +			gtt->base.pte_encode = gen6_pte_encode;
> >  	}
> >  
> > -	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
> > +	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
> >  			     &gtt->mappable_base, &gtt->mappable_end);
> >  	if (ret)
> >  		return ret;
> >  
> > +	gtt->base.dev = dev;
> > +
> >  	/* GMADR is the PCI mmio aperture into the global GTT. */
> > -	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
> > +	DRM_INFO("Memory usable by graphics device = %zdM\n",
> > +		 gtt->base.total >> 20);
> >  	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
> >  	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
> >  
> > -- 
> > 1.8.3.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-09  7:15   ` Daniel Vetter
@ 2013-07-10 16:37     ` Ben Widawsky
  2013-07-10 17:05       ` Daniel Vetter
  2013-07-12  2:23     ` Ben Widawsky
  1 sibling, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-10 16:37 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> > This patch was formerly known as:
> > "drm/i915: Create VMAs (part 3) - plumbing"
> > 
> > This patch adds a VM argument, bind/unbind, and the object
> > offset/size/color getters/setters. It preserves the old ggtt helper
> > functions because things still need, and will continue to need them.
> > 
> > Some code will still need to be ported over after this.
> > 
> > v2: Fix purge to pick an object and unbind all vmas
> > This was doable because of the global bound list change.
> > 
> > v3: With the commit to actually pin/unpin pages in place, there is no
> > longer a need to check if unbind succeeded before calling put_pages().
> > Make put_pages only BUG() after checking pin count.
> > 
> > v4: Rebased on top of the new hangcheck work by Mika
> > plumbed eb_destroy also
> > Many checkpatch related fixes
> > 
> > v5: Very large rebase
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> This one is a rather large beast. Any chance we could split it into
> topics, e.g. convert execbuf code, convert shrinker code? Or does that get
> messy, fast?
> 

I've thought of this...

The one solution I came up with is to have two bind/unbind functions
(similar to what I did with pin, and indeed it was my original plan with
pin), and do the set_caching one separately.
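
Roughly what I'm picturing for that split, sketch only (the GGTT-only
wrapper mirrors i915_gem_ggtt_pin(); unbind would get the same treatment,
and none of the names are final):

	int __must_check i915_gem_object_bind(struct drm_i915_gem_object *obj,
					      struct i915_address_space *vm,
					      enum i915_cache_level cache_level);

	/* GGTT convenience wrapper so existing callers stay untouched */
	static inline int __must_check
	i915_gem_ggtt_bind(struct drm_i915_gem_object *obj,
			   enum i915_cache_level cache_level)
	{
		return i915_gem_object_bind(obj, obj_to_ggtt(obj), cache_level);
	}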

I think it won't be too messy, just a lot of typing, as Keith likes to
say.

However, my opinion was, since it's early in the merge cycle, we don't
yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's
not a big deal. At a functional level too, I felt this made more sense.

So I'll defer to your request on this and start splitting it up, unless
my email has changed your mind ;-).

> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c        |  31 ++-
> >  drivers/gpu/drm/i915/i915_dma.c            |   4 -
> >  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++-----
> >  drivers/gpu/drm/i915/i915_gem.c            | 316 +++++++++++++++++++++--------
> >  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
> >  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
> >  drivers/gpu/drm/i915/i915_gem_stolen.c     |   6 +-
> >  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
> >  drivers/gpu/drm/i915/i915_irq.c            |   6 +-
> >  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
> >  drivers/gpu/drm/i915/intel_fb.c            |   1 -
> >  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
> >  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
> >  16 files changed, 468 insertions(+), 239 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 16b2aaf..867ed07 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >  		seq_printf(m, " (pinned x %d)", obj->pin_count);
> >  	if (obj->fence_reg != I915_FENCE_REG_NONE)
> >  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> > -	if (i915_gem_obj_ggtt_bound(obj))
> > -		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> > -			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> > +	if (i915_gem_obj_bound_any(obj)) {
> 
> list_for_each will short-circuit already, so this is redundant.
> 
> > +		struct i915_vma *vma;
> > +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> > +			if (!i915_is_ggtt(vma->vm))
> > +				seq_puts(m, " (pp");
> > +			else
> > +				seq_puts(m, " (g");
> > +			seq_printf(m, " gtt offset: %08lx, size: %08lx)",
> 
>                                        ^ that space looks superflous now
> 
> > +				   i915_gem_obj_offset(obj, vma->vm),
> > +				   i915_gem_obj_size(obj, vma->vm));
> > +		}
> > +	}
> >  	if (obj->stolen)
> >  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
> >  	if (obj->pin_mappable || obj->fault_mappable) {
> > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	return 0;
> >  }
> >  
> > +/* FIXME: Support multiple VM? */
> >  #define count_objects(list, member) do { \
> >  	list_for_each_entry(obj, list, member) { \
> >  		size += i915_gem_obj_ggtt_size(obj); \
> > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val)
> >  
> >  	if (val & DROP_BOUND) {
> >  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > -					 mm_list)
> > -			if (obj->pin_count == 0) {
> > -				ret = i915_gem_object_unbind(obj);
> > -				if (ret)
> > -					goto unlock;
> > -			}
> > +					 mm_list) {
> > +			if (obj->pin_count)
> > +				continue;
> > +
> > +			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> > +			if (ret)
> > +				goto unlock;
> > +		}
> >  	}
> >  
> >  	if (val & DROP_UNBOUND) {
> >  		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
> >  					 global_list)
> >  			if (obj->pages_pin_count == 0) {
> > +				/* FIXME: Do this for all vms? */
> >  				ret = i915_gem_object_put_pages(obj);
> >  				if (ret)
> >  					goto unlock;
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index d13e21f..b190439 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  
> >  	i915_dump_device_info(dev_priv);
> >  
> > -	INIT_LIST_HEAD(&dev_priv->vm_list);
> > -	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> > -	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> > -
> >  	if (i915_get_bridge_dev(dev)) {
> >  		ret = -EIO;
> >  		goto free_priv;
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 38cccc8..48baccc 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object {
> >  
> >  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
> >  
> > -/* This is a temporary define to help transition us to real VMAs. If you see
> > - * this, you're either reviewing code, or bisecting it. */
> > -static inline struct i915_vma *
> > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> > -{
> > -	if (list_empty(&obj->vma_list))
> > -		return NULL;
> > -	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> > -}
> > -
> > -/* Whether or not this object is currently mapped by the translation tables */
> > -static inline bool
> > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> > -{
> > -	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> > -	if (vma == NULL)
> > -		return false;
> > -	return drm_mm_node_allocated(&vma->node);
> > -}
> > -
> > -/* Offset of the first PTE pointing to this object */
> > -static inline unsigned long
> > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> > -{
> > -	BUG_ON(list_empty(&o->vma_list));
> > -	return __i915_gem_obj_to_vma(o)->node.start;
> > -}
> > -
> > -/* The size used in the translation tables may be larger than the actual size of
> > - * the object on GEN2/GEN3 because of the way tiling is handled. See
> > - * i915_gem_get_gtt_size() for more details.
> > - */
> > -static inline unsigned long
> > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> > -{
> > -	BUG_ON(list_empty(&o->vma_list));
> > -	return __i915_gem_obj_to_vma(o)->node.size;
> > -}
> > -
> > -static inline void
> > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> > -			    enum i915_cache_level color)
> > -{
> > -	__i915_gem_obj_to_vma(o)->node.color = color;
> > -}
> > -
> >  /**
> >   * Request queue structure.
> >   *
> > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  void i915_gem_vma_destroy(struct i915_vma *vma);
> >  
> >  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > +				     struct i915_address_space *vm,
> >  				     uint32_t alignment,
> >  				     bool map_and_fenceable,
> >  				     bool nonblocking);
> >  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > +					struct i915_address_space *vm);
> >  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
> >  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
> >  void i915_gem_lastclose(struct drm_device *dev);
> > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> >  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> >  			 struct intel_ring_buffer *to);
> >  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm,
> >  				    struct intel_ring_buffer *ring);
> >  
> >  int i915_gem_dumb_create(struct drm_file *file_priv,
> > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
> >  			    int tiling_mode, bool fenced);
> >  
> >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm,
> >  				    enum i915_cache_level cache_level);
> >  
> >  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
> >  
> >  void i915_gem_restore_fences(struct drm_device *dev);
> >  
> > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > +				  struct i915_address_space *vm);
> > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > +			struct i915_address_space *vm);
> > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > +				struct i915_address_space *vm);
> > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > +			    struct i915_address_space *vm,
> > +			    enum i915_cache_level color);
> > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > +				     struct i915_address_space *vm);
> > +/* Some GGTT VM helpers */
> > +#define obj_to_ggtt(obj) \
> > +	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> > +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> > +{
> > +	struct i915_address_space *ggtt =
> > +		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> > +	return vm == ggtt;
> > +}
> > +
> > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> > +{
> > +	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> > +}
> > +
> > +static inline unsigned long
> > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> > +{
> > +	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> > +}
> > +
> > +static inline unsigned long
> > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> > +{
> > +	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> > +}
> > +
> > +static inline int __must_check
> > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> > +		  uint32_t alignment,
> > +		  bool map_and_fenceable,
> > +		  bool nonblocking)
> > +{
> > +	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> > +				   map_and_fenceable, nonblocking);
> > +}
> > +#undef obj_to_ggtt
> > +
> >  /* i915_gem_context.c */
> >  void i915_gem_context_init(struct drm_device *dev);
> >  void i915_gem_context_fini(struct drm_device *dev);
> > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  
> >  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
> >  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> > +/* FIXME: this is never okay with full PPGTT */
> >  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> >  				enum i915_cache_level cache_level);
> >  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
> >  
> >  
> >  /* i915_gem_evict.c */
> > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> > +int __must_check i915_gem_evict_something(struct drm_device *dev,
> > +					  struct i915_address_space *vm,
> > +					  int min_size,
> >  					  unsigned alignment,
> >  					  unsigned cache_level,
> >  					  bool mappable,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 058ad44..21015cd 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -38,10 +38,12 @@
> >  
> >  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
> >  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > -						    unsigned alignment,
> > -						    bool map_and_fenceable,
> > -						    bool nonblocking);
> > +static __must_check int
> > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > +			    struct i915_address_space *vm,
> > +			    unsigned alignment,
> > +			    bool map_and_fenceable,
> > +			    bool nonblocking);
> >  static int i915_gem_phys_pwrite(struct drm_device *dev,
> >  				struct drm_i915_gem_object *obj,
> >  				struct drm_i915_gem_pwrite *args,
> > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> >  static inline bool
> >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> >  {
> > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> >  }
> >  
> >  int
> > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> >  		 * anyway again before the next pread happens. */
> >  		if (obj->cache_level == I915_CACHE_NONE)
> >  			needs_clflush = 1;
> > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > +		if (i915_gem_obj_bound_any(obj)) {
> >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> 
> This is essentially a very convoluted version of "if there's gpu rendering
> outstanding, please wait for it". Maybe we should switch this to
> 
> 	if (obj->active)
> 		wait_rendering(obj, true);
> 
> Same for the shmem_pwrite case below. Would be a separate patch to prep
> things though. Can I volunteer you for that? The ugly part is to review
> whether any of the lru list updating that set_domain does in addition to
> wait_rendering is required, but on a quick read that's not the case.
> 
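Roughly what the pread side would then look like (a minimal, untested
sketch; it assumes the existing static i915_gem_object_wait_rendering(obj,
readonly) helper is the right thing to call directly here):

	if (obj->cache_level == I915_CACHE_NONE)
		needs_clflush = 1;
	if (obj->active) {
		/* Only wait for outstanding rendering; a CPU read through
		 * shmem doesn't need the full GTT domain transition. */
		ret = i915_gem_object_wait_rendering(obj, true);
		if (ret)
			return ret;
	}
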
> >  			if (ret)
> >  				return ret;
> > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> >  	char __user *user_data;
> >  	int page_offset, page_length, ret;
> >  
> > -	ret = i915_gem_object_pin(obj, 0, true, true);
> > +	ret = i915_gem_ggtt_pin(obj, 0, true, true);
> >  	if (ret)
> >  		goto out;
> >  
> > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
> >  		 * right away and we therefore have to clflush anyway. */
> >  		if (obj->cache_level == I915_CACHE_NONE)
> >  			needs_clflush_after = 1;
> > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > +		if (i915_gem_obj_bound_any(obj)) {
> 
> ... see above.
> >  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
> >  			if (ret)
> >  				return ret;
> > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	}
> >  
> >  	/* Now bind it into the GTT if needed */
> > -	ret = i915_gem_object_pin(obj, 0, true, false);
> > +	ret = i915_gem_ggtt_pin(obj,  0, true, false);
> >  	if (ret)
> >  		goto unlock;
> >  
> > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> >  	if (obj->pages == NULL)
> >  		return 0;
> >  
> > -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> > -
> >  	if (obj->pages_pin_count)
> >  		return -EBUSY;
> >  
> > +	BUG_ON(i915_gem_obj_bound_any(obj));
> > +
> >  	/* ->put_pages might need to allocate memory for the bit17 swizzle
> >  	 * array, hence protect them from being reaped by removing them from gtt
> >  	 * lists early. */
> > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> >  		  bool purgeable_only)
> >  {
> >  	struct drm_i915_gem_object *obj, *next;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	long count = 0;
> >  
> >  	list_for_each_entry_safe(obj, next,
> > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> >  		}
> >  	}
> >  
> > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> > -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> > -		    i915_gem_object_unbind(obj) == 0 &&
> > -		    i915_gem_object_put_pages(obj) == 0) {
> > +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> > +				 global_list) {
> > +		struct i915_vma *vma, *v;
> > +
> > +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> > +			continue;
> > +
> > +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> > +			if (i915_gem_object_unbind(obj, vma->vm))
> > +				break;
> > +
> > +		if (!i915_gem_object_put_pages(obj))
> >  			count += obj->base.size >> PAGE_SHIFT;
> > -			if (count >= target)
> > -				return count;
> > -		}
> > +
> > +		if (count >= target)
> > +			return count;
> >  	}
> >  
> >  	return count;
> > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> >  
> >  void
> >  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > +			       struct i915_address_space *vm,
> >  			       struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	u32 seqno = intel_ring_get_seqno(ring);
> >  
> >  	BUG_ON(ring == NULL);
> > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  }
> >  
> >  static void
> > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > +				 struct i915_address_space *vm)
> >  {
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > -
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> >  	BUG_ON(!obj->active);
> >  
> > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> >  	spin_unlock(&file_priv->mm.lock);
> >  }
> >  
> > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm)
> >  {
> > -	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> > -	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> > +	if (acthd >= i915_gem_obj_offset(obj, vm) &&
> > +	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
> >  		return true;
> >  
> >  	return false;
> > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
> >  	return false;
> >  }
> >  
> > +static struct i915_address_space *
> > +request_to_vm(struct drm_i915_gem_request *request)
> > +{
> > +	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> > +	struct i915_address_space *vm;
> > +
> > +	vm = &dev_priv->gtt.base;
> > +
> > +	return vm;
> > +}
> > +
> >  static bool i915_request_guilty(struct drm_i915_gem_request *request,
> >  				const u32 acthd, bool *inside)
> >  {
> > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
> >  	 * pointing inside the ring, matches the batch_obj address range.
> >  	 * However this is extremely unlikely.
> >  	 */
> > -
> >  	if (request->batch_obj) {
> > -		if (i915_head_inside_object(acthd, request->batch_obj)) {
> > +		if (i915_head_inside_object(acthd, request->batch_obj,
> > +					    request_to_vm(request))) {
> >  			*inside = true;
> >  			return true;
> >  		}
> > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
> >  {
> >  	struct i915_ctx_hang_stats *hs = NULL;
> >  	bool inside, guilty;
> > +	unsigned long offset = 0;
> >  
> >  	/* Innocent until proven guilty */
> >  	guilty = false;
> >  
> > +	if (request->batch_obj)
> > +		offset = i915_gem_obj_offset(request->batch_obj,
> > +					     request_to_vm(request));
> > +
> >  	if (ring->hangcheck.action != wait &&
> >  	    i915_request_guilty(request, acthd, &inside)) {
> >  		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
> >  			  ring->name,
> >  			  inside ? "inside" : "flushing",
> > -			  request->batch_obj ?
> > -			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> > +			  offset,
> >  			  request->ctx ? request->ctx->id : 0,
> >  			  acthd);
> >  
> > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	while (!list_empty(&ring->active_list)) {
> > +		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> >  				       struct drm_i915_gem_object,
> >  				       ring_list);
> >  
> > -		i915_gem_object_move_to_inactive(obj);
> > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +			i915_gem_object_move_to_inactive(obj, vm);
> >  	}
> >  }
> >  
> > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
> >  void i915_gem_reset(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj;
> >  	struct intel_ring_buffer *ring;
> >  	int i;
> > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev)
> >  	/* Move everything out of the GPU domains to ensure we do any
> >  	 * necessary invalidation upon reuse.
> >  	 */
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> >  
> >  	i915_gem_restore_fences(dev);
> >  }
> > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> >  	 */
> >  	while (!list_empty(&ring->active_list)) {
> > +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > +		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> >  			break;
> >  
> > -		i915_gem_object_move_to_inactive(obj);
> > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +			i915_gem_object_move_to_inactive(obj, vm);
> >  	}
> >  
> >  	if (unlikely(ring->trace_irq_seqno &&
> > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> >   * Unbinds an object from the GTT aperture.
> >   */
> >  int
> > -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > +		       struct i915_address_space *vm)
> >  {
> >  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> >  	struct i915_vma *vma;
> >  	int ret;
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound(obj, vm))
> >  		return 0;
> >  
> >  	if (obj->pin_count)
> > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  	if (ret)
> >  		return ret;
> >  
> > -	trace_i915_gem_object_unbind(obj);
> > +	trace_i915_gem_object_unbind(obj, vm);
> >  
> >  	if (obj->has_global_gtt_mapping)
> >  		i915_gem_gtt_unbind_object(obj);
> > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  	/* Avoid an unnecessary call to unbind on rebind. */
> >  	obj->map_and_fenceable = true;
> >  
> > -	vma = __i915_gem_obj_to_vma(obj);
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> >  	list_del(&vma->vma_link);
> >  	drm_mm_remove_node(&vma->node);
> >  	i915_gem_vma_destroy(vma);
> > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
> >  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
> >  		     i915_gem_obj_ggtt_offset(obj), size);
> >  
> > +
> >  		pitch_val = obj->stride / 128;
> >  		pitch_val = ffs(pitch_val) - 1;
> >  
> > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> >   */
> >  static int
> >  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > +			    struct i915_address_space *vm,
> >  			    unsigned alignment,
> >  			    bool map_and_fenceable,
> >  			    bool nonblocking)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> >  	bool mappable, fenceable;
> > -	size_t gtt_max = map_and_fenceable ?
> > -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> > +	size_t gtt_max =
> > +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
> >  	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	if (WARN_ON(!list_empty(&obj->vma_list)))
> >  		return -EBUSY;
> >  
> > +	BUG_ON(!i915_is_ggtt(vm));
> > +
> >  	fence_size = i915_gem_get_gtt_size(dev,
> >  					   obj->base.size,
> >  					   obj->tiling_mode);
> > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  	i915_gem_object_pin_pages(obj);
> >  
> >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > +	/* For now we only ever use 1 vma per object */
> > +	WARN_ON(!list_empty(&obj->vma_list));
> > +
> > +	vma = i915_gem_vma_create(obj, vm);
> >  	if (vma == NULL) {
> >  		i915_gem_object_unpin_pages(obj);
> >  		return -ENOMEM;
> >  	}
> >  
> >  search_free:
> > -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > -						  &vma->node,
> > +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
> >  						  size, alignment,
> >  						  obj->cache_level, 0, gtt_max);
> >  	if (ret) {
> > -		ret = i915_gem_evict_something(dev, size, alignment,
> > +		ret = i915_gem_evict_something(dev, vm, size, alignment,
> >  					       obj->cache_level,
> >  					       map_and_fenceable,
> >  					       nonblocking);
> > @@ -3162,18 +3197,25 @@ search_free:
> >  
> >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > -	list_add(&vma->vma_link, &obj->vma_list);
> > +
> > +	/* Keep GGTT vmas first to make debug easier */
> > +	if (i915_is_ggtt(vm))
> > +		list_add(&vma->vma_link, &obj->vma_list);
> > +	else
> > +		list_add_tail(&vma->vma_link, &obj->vma_list);
> >  
> >  	fenceable =
> > +		i915_is_ggtt(vm) &&
> >  		i915_gem_obj_ggtt_size(obj) == fence_size &&
> >  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
> >  
> > -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> > -		dev_priv->gtt.mappable_end;
> > +	mappable =
> > +		i915_is_ggtt(vm) &&
> > +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
> >  
> >  	obj->map_and_fenceable = mappable && fenceable;
> >  
> > -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> > +	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
> >  	i915_gem_verify_gtt(dev);
> >  	return 0;
> >  }
> > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  	int ret;
> >  
> >  	/* Not valid to be called on unbound objects. */
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound_any(obj))
> >  		return -EINVAL;
> 
> If we're converting the shmem paths over to wait_rendering then there's
> only the fault handler and the set_domain ioctl left. For the latter it
> would make sense to clflush even when an object is on the unbound list, to
> allow userspace to optimize when the clflushing happens. But that would
> only make sense in conjunction with Chris' create2 ioctl and a flag to
> preallocate the storage (and so putting the object onto the unbound list).
> So nothing to do here.
> 
> >  
> >  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  }
> >  
> >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm,
> >  				    enum i915_cache_level cache_level)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> >  	int ret;
> >  
> >  	if (obj->cache_level == cache_level)
> > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > -		ret = i915_gem_object_unbind(obj);
> > +		ret = i915_gem_object_unbind(obj, vm);
> >  		if (ret)
> >  			return ret;
> >  	}
> >  
> > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		if (!i915_gem_obj_bound(obj, vm))
> > +			continue;
> 
> Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> on?
> 
> Self-correction: It exists already ... why can't we use this here?
> 
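If the per-object list is used directly, this could become something like
the following (rough sketch, not compile-tested; the gtt/ppgtt rebind in
the middle stays as in the hunk above):

	struct i915_vma *vma;

	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		/* every vma on obj->vma_list is bound, so no extra
		 * i915_gem_obj_bound() check is needed */
		ret = i915_gem_object_finish_gpu(obj);
		if (ret)
			return ret;

		/* ... flush and rebind for vma->vm as in the patch ... */

		/* the vma is already in hand, no set_color() walk needed */
		vma->node.color = cache_level;
	}
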
> > +
> >  		ret = i915_gem_object_finish_gpu(obj);
> >  		if (ret)
> >  			return ret;
> > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> >  					       obj, cache_level);
> >  
> > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > +		i915_gem_obj_set_color(obj, vm, cache_level);
> >  	}
> >  
> >  	if (cache_level == I915_CACHE_NONE) {
> > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> >  			       struct drm_file *file)
> >  {
> >  	struct drm_i915_gem_caching *args = data;
> > +	struct drm_i915_private *dev_priv;
> >  	struct drm_i915_gem_object *obj;
> >  	enum i915_cache_level level;
> >  	int ret;
> > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> >  		ret = -ENOENT;
> >  		goto unlock;
> >  	}
> > +	dev_priv = obj->base.dev->dev_private;
> >  
> > -	ret = i915_gem_object_set_cache_level(obj, level);
> > +	/* FIXME: Add interface for specific VM? */
> > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> >  
> >  	drm_gem_object_unreference(&obj->base);
> >  unlock:
> > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  				     u32 alignment,
> >  				     struct intel_ring_buffer *pipelined)
> >  {
> > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> >  	u32 old_read_domains, old_write_domain;
> >  	int ret;
> >  
> > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> >  	 */
> > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > +					      I915_CACHE_NONE);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> >  	 * always use map_and_fenceable for all scanout buffers.
> >  	 */
> > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> >  
> >  int
> >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > +		    struct i915_address_space *vm,
> >  		    uint32_t alignment,
> >  		    bool map_and_fenceable,
> >  		    bool nonblocking)
> > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> >  		return -EBUSY;
> >  
> > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> 
> WARN_ON, since presumably we can keep on going if we get this wrong
> (albeit with slightly corrupted state, so render corruptions might
> follow).
> 
> > +
> > +	if (i915_gem_obj_bound(obj, vm)) {
> > +		if ((alignment &&
> > +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> >  			WARN(obj->pin_count,
> >  			     "bo is already pinned with incorrect alignment:"
> >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> >  			     " obj->map_and_fenceable=%d\n",
> > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > +			     i915_gem_obj_offset(obj, vm), alignment,
> >  			     map_and_fenceable,
> >  			     obj->map_and_fenceable);
> > -			ret = i915_gem_object_unbind(obj);
> > +			ret = i915_gem_object_unbind(obj, vm);
> >  			if (ret)
> >  				return ret;
> >  		}
> >  	}
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj)) {
> > +	if (!i915_gem_obj_bound(obj, vm)) {
> >  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> >  
> > -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> > +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
> >  						  map_and_fenceable,
> >  						  nonblocking);
> >  		if (ret)
> > @@ -3684,7 +3739,7 @@ void
> >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> >  {
> >  	BUG_ON(obj->pin_count == 0);
> > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> >  
> >  	if (--obj->pin_count == 0)
> >  		obj->pin_mappable = false;
> > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
> >  	}
> >  
> >  	if (obj->user_pin_count == 0) {
> > -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> > +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
> >  		if (ret)
> >  			goto out;
> >  	}
> > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> >  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> >  	struct drm_device *dev = obj->base.dev;
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	struct i915_vma *vma, *next;
> >  
> >  	trace_i915_gem_object_destroy(obj);
> >  
> > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> >  		i915_gem_detach_phys_object(dev, obj);
> >  
> >  	obj->pin_count = 0;
> > -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> > -		bool was_interruptible;
> > +	/* NB: 0 or 1 elements */
> > +	WARN_ON(!list_empty(&obj->vma_list) &&
> > +		!list_is_singular(&obj->vma_list));
> > +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > +		int ret = i915_gem_object_unbind(obj, vma->vm);
> > +		if (WARN_ON(ret == -ERESTARTSYS)) {
> > +			bool was_interruptible;
> >  
> > -		was_interruptible = dev_priv->mm.interruptible;
> > -		dev_priv->mm.interruptible = false;
> > +			was_interruptible = dev_priv->mm.interruptible;
> > +			dev_priv->mm.interruptible = false;
> >  
> > -		WARN_ON(i915_gem_object_unbind(obj));
> > +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
> >  
> > -		dev_priv->mm.interruptible = was_interruptible;
> > +			dev_priv->mm.interruptible = was_interruptible;
> > +		}
> >  	}
> >  
> >  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
> >  	INIT_LIST_HEAD(&ring->request_list);
> >  }
> >  
> > +static void i915_init_vm(struct drm_i915_private *dev_priv,
> > +			 struct i915_address_space *vm)
> > +{
> > +	vm->dev = dev_priv->dev;
> > +	INIT_LIST_HEAD(&vm->active_list);
> > +	INIT_LIST_HEAD(&vm->inactive_list);
> > +	INIT_LIST_HEAD(&vm->global_link);
> > +	list_add(&vm->global_link, &dev_priv->vm_list);
> > +}
> > +
> >  void
> >  i915_gem_load(struct drm_device *dev)
> >  {
> > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
> >  				  SLAB_HWCACHE_ALIGN,
> >  				  NULL);
> >  
> > -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> > -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> > +	INIT_LIST_HEAD(&dev_priv->vm_list);
> > +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> > +
> >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  			     struct drm_i915_private,
> >  			     mm.inactive_shrinker);
> >  	struct drm_device *dev = dev_priv->dev;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj;
> > -	int nr_to_scan = sc->nr_to_scan;
> > +	int nr_to_scan;
> >  	bool unlock = true;
> >  	int cnt;
> >  
> > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  		unlock = false;
> >  	}
> >  
> > +	nr_to_scan = sc->nr_to_scan;
> >  	if (nr_to_scan) {
> >  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
> >  		if (nr_to_scan > 0)
> > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> >  		if (obj->pages_pin_count == 0)
> >  			cnt += obj->base.size >> PAGE_SHIFT;
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > -			cnt += obj->base.size >> PAGE_SHIFT;
> > +
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > +				cnt += obj->base.size >> PAGE_SHIFT;
> 
> Isn't this now double-counting objects? In the shrinker we only care about
> how much physical RAM an object occupies, not how much virtual space it
> occupies. So just walking the bound list of objects here should be good
> enough ...
> 
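A single pass over the bound list would count each object's backing pages
exactly once, e.g. (sketch, mirrors the unbound_list loop just above):

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;
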
> >  
> >  	if (unlock)
> >  		mutex_unlock(&dev->struct_mutex);
> >  	return cnt;
> >  }
> > +
> > +/* All the new VM stuff */
> > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > +				  struct i915_address_space *vm)
> > +{
> > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > +	struct i915_vma *vma;
> > +
> > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > +		vm = &dev_priv->gtt.base;
> > +
> > +	BUG_ON(list_empty(&o->vma_list));
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> 
> Imo the vma list walking here and in the other helpers below indicates
> that we should deal more often in vmas instead of (object, vm) pairs. Or
> is this again something that'll get fixed later on?
> 
> I just want to avoid diff churn, and it also makes reviewing easier if the
> foreshadowing is correct ;-) So generally I'd vote for more liberal
> sprinkling of obj_to_vma in callers.
> 
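In callers that currently pair an obj_bound() check with an obj_offset()
or obj_size() call, that would boil down to a single lookup, e.g. in a
loop body like the execbuffer reserve pass (sketch only):

	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);

	if (vma == NULL)	/* not bound into this vm */
		continue;

	obj_offset = vma->node.start;
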
> > +		if (vma->vm == vm)
> > +			return vma->node.start;
> > +
> > +	}
> > +	return -1;
> > +}
> > +
> > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> > +{
> > +	return !list_empty(&o->vma_list);
> > +}
> > +
> > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > +			struct i915_address_space *vm)
> > +{
> > +	struct i915_vma *vma;
> > +
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > +		if (vma->vm == vm)
> > +			return true;
> > +	}
> > +	return false;
> > +}
> > +
> > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > +				struct i915_address_space *vm)
> > +{
> > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > +	struct i915_vma *vma;
> > +
> > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > +		vm = &dev_priv->gtt.base;
> > +	BUG_ON(list_empty(&o->vma_list));
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > +		if (vma->vm == vm)
> > +			return vma->node.size;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > +			    struct i915_address_space *vm,
> > +			    enum i915_cache_level color)
> > +{
> > +	struct i915_vma *vma;
> > +	BUG_ON(list_empty(&o->vma_list));
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > +		if (vma->vm == vm) {
> > +			vma->node.color = color;
> > +			return;
> > +		}
> > +	}
> > +
> > +	WARN(1, "Couldn't set color for VM %p\n", vm);
> > +}
> > +
> > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > +				     struct i915_address_space *vm)
> > +{
> > +	struct i915_vma *vma;
> > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > +		if (vma->vm == vm)
> > +			return vma;
> > +
> > +	return NULL;
> > +}
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 2074544..c92fd81 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
> >  
> >  	if (INTEL_INFO(dev)->gen >= 7) {
> >  		ret = i915_gem_object_set_cache_level(ctx->obj,
> > +						      &dev_priv->gtt.base,
> >  						      I915_CACHE_LLC_MLC);
> >  		/* Failure shouldn't ever happen this early */
> >  		if (WARN_ON(ret))
> > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> >  	 * default context.
> >  	 */
> >  	dev_priv->ring[RCS].default_context = ctx;
> > -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> >  	if (ret) {
> >  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> >  		goto err_destroy;
> > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
> >  static int do_switch(struct i915_hw_context *to)
> >  {
> >  	struct intel_ring_buffer *ring = to->ring;
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	struct i915_hw_context *from = ring->last_context;
> >  	u32 hw_flags = 0;
> >  	int ret;
> > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
> >  	if (from == to)
> >  		return 0;
> >  
> > -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> > +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
> >  	 */
> >  	if (from != NULL) {
> >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > -		i915_gem_object_move_to_active(from->obj, ring);
> > +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> > +					       ring);
> >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> >  		 * whole damn pipeline, we don't need to explicitly mark the
> >  		 * object dirty. The only exception is that the context must be
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index df61f33..32efdc0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -32,24 +32,21 @@
> >  #include "i915_trace.h"
> >  
> >  static bool
> > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> >  {
> > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > -
> > -	if (obj->pin_count)
> > +	if (vma->obj->pin_count)
> >  		return false;
> >  
> > -	list_add(&obj->exec_list, unwind);
> > +	list_add(&vma->obj->exec_list, unwind);
> >  	return drm_mm_scan_add_block(&vma->node);
> >  }
> >  
> >  int
> > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > -			 unsigned alignment, unsigned cache_level,
> > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > +			 int min_size, unsigned alignment, unsigned cache_level,
> >  			 bool mappable, bool nonblocking)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	struct list_head eviction_list, unwind_list;
> >  	struct i915_vma *vma;
> >  	struct drm_i915_gem_object *obj;
> > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  	 */
> >  
> >  	INIT_LIST_HEAD(&unwind_list);
> > -	if (mappable)
> > +	if (mappable) {
> > +		BUG_ON(!i915_is_ggtt(vm));
> >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> >  					    alignment, cache_level, 0,
> >  					    dev_priv->gtt.mappable_end);
> > -	else
> > +	} else
> >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> >  
> >  	/* First see if there is a large enough contiguous idle region... */
> >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > -		if (mark_free(obj, &unwind_list))
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> >  
> > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  
> >  	/* Now merge in the soon-to-be-expired objects... */
> >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		if (mark_free(obj, &unwind_list))
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> >  
> > @@ -109,7 +109,7 @@ none:
> >  		obj = list_first_entry(&unwind_list,
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> > -		vma = __i915_gem_obj_to_vma(obj);
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  		ret = drm_mm_scan_remove_block(&vma->node);
> >  		BUG_ON(ret);
> >  
> > @@ -130,7 +130,7 @@ found:
> >  		obj = list_first_entry(&unwind_list,
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> > -		vma = __i915_gem_obj_to_vma(obj);
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  		if (drm_mm_scan_remove_block(&vma->node)) {
> >  			list_move(&obj->exec_list, &eviction_list);
> >  			drm_gem_object_reference(&obj->base);
> > @@ -145,7 +145,7 @@ found:
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> >  		if (ret == 0)
> > -			ret = i915_gem_object_unbind(obj);
> > +			ret = i915_gem_object_unbind(obj, vm);
> >  
> >  		list_del_init(&obj->exec_list);
> >  		drm_gem_object_unreference(&obj->base);
> > @@ -158,13 +158,18 @@ int
> >  i915_gem_evict_everything(struct drm_device *dev)
> 
> I suspect evict_everything eventually wants an address_space *vm argument
> for those cases where we only want to evict everything in a given vm. Atm
> we have two use-cases of this:
> - Called from the shrinker as a last-ditch effort. For that it should move
>   _every_ object onto the unbound list.
> - Called from execbuf for badly-fragmented address spaces to clean up the
>   mess. For that case we only care about one address space.
> 
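A possible shape for the per-vm variant, with the shrinker keeping a
wrapper that loops over dev_priv->vm_list (the name i915_gem_evict_vm is
hypothetical, sketch only):

int
i915_gem_evict_vm(struct drm_device *dev, struct i915_address_space *vm)
{
	struct drm_i915_gem_object *obj, *next;
	int ret;

	ret = i915_gpu_idle(dev);
	if (ret)
		return ret;
	i915_gem_retire_requests(dev);

	/* everything is idle now, so unbind() should never fail */
	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
		if (obj->pin_count == 0)
			WARN_ON(i915_gem_object_unbind(obj, vm));

	return 0;
}
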
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj, *next;
> > -	bool lists_empty;
> > +	bool lists_empty = true;
> >  	int ret;
> >  
> > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > -		       list_empty(&vm->active_list));
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > +			       list_empty(&vm->active_list));
> > +		if (!lists_empty)
> > +			lists_empty = false;
> > +	}
> > +
> >  	if (lists_empty)
> >  		return -ENOSPC;
> >  
> > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  	i915_gem_retire_requests(dev);
> >  
> >  	/* Having flushed everything, unbind() should never raise an error */
> > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > -		if (obj->pin_count == 0)
> > -			WARN_ON(i915_gem_object_unbind(obj));
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > +			if (obj->pin_count == 0)
> > +				WARN_ON(i915_gem_object_unbind(obj, vm));
> > +	}
> >  
> >  	return 0;
> >  }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 5aeb447..e90182d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
> >  }
> >  
> >  static void
> > -eb_destroy(struct eb_objects *eb)
> > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
> >  {
> >  	while (!list_empty(&eb->objects)) {
> >  		struct drm_i915_gem_object *obj;
> > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
> >  static int
> >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  				   struct eb_objects *eb,
> > -				   struct drm_i915_gem_relocation_entry *reloc)
> > +				   struct drm_i915_gem_relocation_entry *reloc,
> > +				   struct i915_address_space *vm)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_gem_object *target_obj;
> > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  
> >  static int
> >  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > -				    struct eb_objects *eb)
> > +				    struct eb_objects *eb,
> > +				    struct i915_address_space *vm)
> >  {
> >  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
> >  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> >  		do {
> >  			u64 offset = r->presumed_offset;
> >  
> > -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> > +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> > +								 vm);
> >  			if (ret)
> >  				return ret;
> >  
> > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> >  static int
> >  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> >  					 struct eb_objects *eb,
> > -					 struct drm_i915_gem_relocation_entry *relocs)
> > +					 struct drm_i915_gem_relocation_entry *relocs,
> > +					 struct i915_address_space *vm)
> >  {
> >  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> >  	int i, ret;
> >  
> >  	for (i = 0; i < entry->relocation_count; i++) {
> > -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> > +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> > +							 vm);
> >  		if (ret)
> >  			return ret;
> >  	}
> > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> >  }
> >  
> >  static int
> > -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> > +			     struct i915_address_space *vm)
> >  {
> >  	struct drm_i915_gem_object *obj;
> >  	int ret = 0;
> > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
> >  	 */
> >  	pagefault_disable();
> >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> > +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> >  		if (ret)
> >  			break;
> >  	}
> > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
> >  static int
> >  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  				   struct intel_ring_buffer *ring,
> > +				   struct i915_address_space *vm,
> >  				   bool *need_reloc)
> >  {
> >  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  		obj->tiling_mode != I915_TILING_NONE;
> >  	need_mappable = need_fence || need_reloc_mappable(obj);
> >  
> > -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> > +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> > +				  false);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  		obj->has_aliasing_ppgtt_mapping = 1;
> >  	}
> >  
> > -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> > -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> > +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> > +		entry->offset = i915_gem_obj_offset(obj, vm);
> >  		*need_reloc = true;
> >  	}
> >  
> > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> >  {
> >  	struct drm_i915_gem_exec_object2 *entry;
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound_any(obj))
> >  		return;
> >  
> >  	entry = obj->exec_entry;
> > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> >  static int
> >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> >  			    struct list_head *objects,
> > +			    struct i915_address_space *vm,
> >  			    bool *need_relocs)
> >  {
> >  	struct drm_i915_gem_object *obj;
> > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> >  		list_for_each_entry(obj, objects, exec_list) {
> >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> >  			bool need_fence, need_mappable;
> > +			u32 obj_offset;
> >  
> > -			if (!i915_gem_obj_ggtt_bound(obj))
> > +			if (!i915_gem_obj_bound(obj, vm))
> >  				continue;
> 
> I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> here ... Maybe we should cache them in some pointer somewhere (either in
> the eb object or by adding a new pointer to the object struct, e.g.
> obj->eb_vma, similar to obj->eb_list).
> 
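Caching it would only be a couple of lines, e.g. (obj->eb_vma is a
hypothetical field, and the cached pointer would need refreshing after the
bind pass creates new vmas; sketch only):

	/* once per object when building the eb list */
	obj->eb_vma = i915_gem_obj_to_vma(obj, vm);

	/* later, in the reserve/relocate loops */
	obj_offset = obj->eb_vma->node.start;
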
> >  
> > +			obj_offset = i915_gem_obj_offset(obj, vm);
> >  			need_fence =
> >  				has_fenced_gpu_access &&
> >  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> >  				obj->tiling_mode != I915_TILING_NONE;
> >  			need_mappable = need_fence || need_reloc_mappable(obj);
> >  
> > +			BUG_ON((need_mappable || need_fence) &&
> > +			       !i915_is_ggtt(vm));
> > +
> >  			if ((entry->alignment &&
> > -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> > +			     obj_offset & (entry->alignment - 1)) ||
> >  			    (need_mappable && !obj->map_and_fenceable))
> > -				ret = i915_gem_object_unbind(obj);
> > +				ret = i915_gem_object_unbind(obj, vm);
> >  			else
> > -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> >  			if (ret)
> >  				goto err;
> >  		}
> >  
> >  		/* Bind fresh objects */
> >  		list_for_each_entry(obj, objects, exec_list) {
> > -			if (i915_gem_obj_ggtt_bound(obj))
> > +			if (i915_gem_obj_bound(obj, vm))
> >  				continue;
> >  
> > -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> >  			if (ret)
> >  				goto err;
> >  		}
> > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> >  				  struct drm_file *file,
> >  				  struct intel_ring_buffer *ring,
> >  				  struct eb_objects *eb,
> > -				  struct drm_i915_gem_exec_object2 *exec)
> > +				  struct drm_i915_gem_exec_object2 *exec,
> > +				  struct i915_address_space *vm)
> >  {
> >  	struct drm_i915_gem_relocation_entry *reloc;
> >  	struct drm_i915_gem_object *obj;
> > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> >  		goto err;
> >  
> >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> >  	if (ret)
> >  		goto err;
> >  
> >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> >  		int offset = obj->exec_entry - exec;
> >  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> > -							       reloc + reloc_offset[offset]);
> > +							       reloc + reloc_offset[offset],
> > +							       vm);
> >  		if (ret)
> >  			goto err;
> >  	}
> > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
> >  
> >  static void
> >  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > +				   struct i915_address_space *vm,
> >  				   struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_i915_gem_object *obj;
> > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> >  		obj->base.read_domains = obj->base.pending_read_domains;
> >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> >  
> > -		i915_gem_object_move_to_active(obj, ring);
> > +		i915_gem_object_move_to_active(obj, vm, ring);
> >  		if (obj->base.write_domain) {
> >  			obj->dirty = 1;
> >  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> > @@ -836,7 +853,8 @@ static int
> >  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  		       struct drm_file *file,
> >  		       struct drm_i915_gem_execbuffer2 *args,
> > -		       struct drm_i915_gem_exec_object2 *exec)
> > +		       struct drm_i915_gem_exec_object2 *exec,
> > +		       struct i915_address_space *vm)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	struct eb_objects *eb;
> > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  
> >  	/* Move the objects en-masse into the GTT, evicting if necessary. */
> >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> >  	if (ret)
> >  		goto err;
> >  
> >  	/* The objects are in their final locations, apply the relocations. */
> >  	if (need_relocs)
> > -		ret = i915_gem_execbuffer_relocate(eb);
> > +		ret = i915_gem_execbuffer_relocate(eb, vm);
> >  	if (ret) {
> >  		if (ret == -EFAULT) {
> >  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> > -								eb, exec);
> > +								eb, exec, vm);
> >  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> >  		}
> >  		if (ret)
> > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  			goto err;
> >  	}
> >  
> > -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> > +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > +		args->batch_start_offset;
> >  	exec_len = args->batch_len;
> >  	if (cliprects) {
> >  		for (i = 0; i < args->num_cliprects; i++) {
> > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  
> >  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> >  
> > -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> > +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> >  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> >  
> >  err:
> > -	eb_destroy(eb);
> > +	eb_destroy(eb, vm);
> >  
> >  	mutex_unlock(&dev->struct_mutex);
> >  
> > @@ -1105,6 +1124,7 @@ int
> >  i915_gem_execbuffer(struct drm_device *dev, void *data,
> >  		    struct drm_file *file)
> >  {
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_execbuffer *args = data;
> >  	struct drm_i915_gem_execbuffer2 exec2;
> >  	struct drm_i915_gem_exec_object *exec_list = NULL;
> > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> >  	exec2.flags = I915_EXEC_RENDER;
> >  	i915_execbuffer2_set_context_id(exec2, 0);
> >  
> > -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> > +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> > +				     &dev_priv->gtt.base);
> >  	if (!ret) {
> >  		/* Copy the new buffer offsets back to the user's exec list. */
> >  		for (i = 0; i < args->buffer_count; i++)
> > @@ -1186,6 +1207,7 @@ int
> >  i915_gem_execbuffer2(struct drm_device *dev, void *data,
> >  		     struct drm_file *file)
> >  {
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_execbuffer2 *args = data;
> >  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
> >  	int ret;
> > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> >  		return -EFAULT;
> >  	}
> >  
> > -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> > +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> > +				     &dev_priv->gtt.base);
> >  	if (!ret) {
> >  		/* Copy the new buffer offsets back to the user's exec list. */
> >  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 298fc42..70ce2f6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> >  			    ppgtt->base.total);
> >  	}
> >  
> > +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> > +
> >  	return ret;
> >  }
> >  
> > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  			    struct drm_i915_gem_object *obj,
> >  			    enum i915_cache_level cache_level)
> >  {
> > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				   cache_level);
> > +	struct i915_address_space *vm = &ppgtt->base;
> > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > +
> > +	vm->insert_entries(vm, obj->pages,
> > +			   obj_offset >> PAGE_SHIFT,
> > +			   cache_level);
> >  }
> >  
> >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  			      struct drm_i915_gem_object *obj)
> >  {
> > -	ppgtt->base.clear_range(&ppgtt->base,
> > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				obj->base.size >> PAGE_SHIFT);
> > +	struct i915_address_space *vm = &ppgtt->base;
> > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > +
> > +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > +			obj->base.size >> PAGE_SHIFT);
> >  }
> >  
> >  extern int intel_iommu_gfx_mapped;
> > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  				       dev_priv->gtt.base.start / PAGE_SIZE,
> >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> >  
> > +	if (dev_priv->mm.aliasing_ppgtt)
> > +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > +
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> >  		i915_gem_clflush_object(obj);
> >  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  	 * aperture.  One page should be enough to keep any prefetching inside
> >  	 * of the aperture.
> >  	 */
> > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> >  	struct drm_mm_node *entry;
> >  	struct drm_i915_gem_object *obj;
> >  	unsigned long hole_start, hole_end;
> > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  	BUG_ON(mappable_end > end);
> >  
> >  	/* Subtract the guard page ... */
> > -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> > +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> >  	if (!HAS_LLC(dev))
> >  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
> >  
> >  	/* Mark any preallocated objects as occupied */
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> >  		int ret;
> >  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> >  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
> >  
> >  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> >  		if (ret)
> >  			DRM_DEBUG_KMS("Reservation failed\n");
> >  		obj->has_global_gtt_mapping = 1;
> > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  	dev_priv->gtt.base.total = end - start;
> >  
> >  	/* Clear any non-preallocated blocks */
> > -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> > -			     hole_start, hole_end) {
> > +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> >  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> >  			      hole_start, hole_end);
> > -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > -					       hole_start / PAGE_SIZE,
> > -					       count);
> > +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> >  	}
> >  
> >  	/* And finally clear the reserved guard page */
> > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > -				       end / PAGE_SIZE - 1, 1);
> > +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> >  }
> >  
> >  static bool
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index 245eb1d..bfe61fa 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> >  		return obj;
> >  
> > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > +	vma = i915_gem_vma_create(obj, vm);
> >  	if (!vma) {
> >  		drm_gem_object_unreference(&obj->base);
> >  		return NULL;
> > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	 */
> >  	vma->node.start = gtt_offset;
> >  	vma->node.size = size;
> > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > +	if (drm_mm_initialized(&vm->mm)) {
> > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> 
> These two hunks here for stolen look fishy - we only ever use the stolen
> preallocated stuff for objects with mappings in the global gtt. So keeping
> that explicit is imo the better approach. And tbh I'm confused where the
> local variable vm is from ...
> 
> >  		if (ret) {
> >  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
> >  			goto unref_out;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > index 92a8d27..808ca2a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
> >  
> >  		obj->map_and_fenceable =
> >  			!i915_gem_obj_ggtt_bound(obj) ||
> > -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> > +			(i915_gem_obj_ggtt_offset(obj) +
> > +			 obj->base.size <= dev_priv->gtt.mappable_end &&
> >  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
> >  
> >  		/* Rebind if we need a change of alignment */
> >  		if (!obj->map_and_fenceable) {
> > -			u32 unfenced_alignment =
> > +			struct i915_address_space *ggtt = &dev_priv->gtt.base;
> > +			u32 unfenced_align =
> >  				i915_gem_get_gtt_alignment(dev, obj->base.size,
> >  							    args->tiling_mode,
> >  							    false);
> > -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> > -				ret = i915_gem_object_unbind(obj);
> > +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> > +				ret = i915_gem_object_unbind(obj, ggtt);
> >  		}
> >  
> >  		if (ret == 0) {
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 79fbb17..28fa0ff 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
> >  		u32 acthd = I915_READ(ACTHD);
> >  
> > +		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
> > +			return NULL;
> > +
> >  		if (WARN_ON(ring->id != RCS))
> >  			return NULL;
> >  
> > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
> >  		return;
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > -		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
> > +		if ((error->ccid & PAGE_MASK) ==
> > +		    i915_gem_obj_ggtt_offset(obj)) {
> >  			ering->ctx = i915_error_object_create_sized(dev_priv,
> >  								    obj, 1);
> >  			break;
> > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> > index 7d283b5..3f019d3 100644
> > --- a/drivers/gpu/drm/i915/i915_trace.h
> > +++ b/drivers/gpu/drm/i915/i915_trace.h
> > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
> >  );
> >  
> >  TRACE_EVENT(i915_gem_object_bind,
> > -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> > -	    TP_ARGS(obj, mappable),
> > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > +		     struct i915_address_space *vm, bool mappable),
> > +	    TP_ARGS(obj, vm, mappable),
> >  
> >  	    TP_STRUCT__entry(
> >  			     __field(struct drm_i915_gem_object *, obj)
> > +			     __field(struct i915_address_space *, vm)
> >  			     __field(u32, offset)
> >  			     __field(u32, size)
> >  			     __field(bool, mappable)
> > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
> >  
> >  	    TP_fast_assign(
> >  			   __entry->obj = obj;
> > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > +			   __entry->size = i915_gem_obj_size(obj, vm);
> >  			   __entry->mappable = mappable;
> >  			   ),
> >  
> > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
> >  );
> >  
> >  TRACE_EVENT(i915_gem_object_unbind,
> > -	    TP_PROTO(struct drm_i915_gem_object *obj),
> > -	    TP_ARGS(obj),
> > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > +		     struct i915_address_space *vm),
> > +	    TP_ARGS(obj, vm),
> >  
> >  	    TP_STRUCT__entry(
> >  			     __field(struct drm_i915_gem_object *, obj)
> > +			     __field(struct i915_address_space *, vm)
> >  			     __field(u32, offset)
> >  			     __field(u32, size)
> >  			     ),
> >  
> >  	    TP_fast_assign(
> >  			   __entry->obj = obj;
> > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > +			   __entry->size = i915_gem_obj_size(obj, vm);
> >  			   ),
> >  
> >  	    TP_printk("obj=%p, offset=%08x size=%x",
> > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> > index f3c97e0..b69cc63 100644
> > --- a/drivers/gpu/drm/i915/intel_fb.c
> > +++ b/drivers/gpu/drm/i915/intel_fb.c
> > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
> >  		      fb->width, fb->height,
> >  		      i915_gem_obj_ggtt_offset(obj), obj);
> >  
> > -
> >  	mutex_unlock(&dev->struct_mutex);
> >  	vga_switcheroo_client_fb_set(dev->pdev, info);
> >  	return 0;
> > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> > index 81c3ca1..517e278 100644
> > --- a/drivers/gpu/drm/i915/intel_overlay.c
> > +++ b/drivers/gpu/drm/i915/intel_overlay.c
> > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
> >  		}
> >  		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
> >  	} else {
> > -		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> > +		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
> >  		if (ret) {
> >  			DRM_ERROR("failed to pin overlay register bo\n");
> >  			goto out_free_bo;
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 125a741..449e57c 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev)
> >  		return NULL;
> >  	}
> >  
> > -	ret = i915_gem_object_pin(ctx, 4096, true, false);
> > +	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
> >  	if (ret) {
> >  		DRM_ERROR("failed to pin power context: %d\n", ret);
> >  		goto err_unref;
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index bc4c11b..ebed61d 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -481,6 +481,7 @@ out:
> >  static int
> >  init_pipe_control(struct intel_ring_buffer *ring)
> >  {
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	struct pipe_control *pc;
> >  	struct drm_i915_gem_object *obj;
> >  	int ret;
> > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
> >  		goto err;
> >  	}
> >  
> > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > +					I915_CACHE_LLC);
> >  
> > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> >  	if (ret)
> >  		goto err_unref;
> >  
> > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
> >  static int init_status_page(struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_device *dev = ring->dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_object *obj;
> >  	int ret;
> >  
> > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
> >  		goto err;
> >  	}
> >  
> > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > +					I915_CACHE_LLC);
> >  
> > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> >  	if (ret != 0) {
> >  		goto err_unref;
> >  	}
> > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
> >  
> >  	ring->obj = obj;
> >  
> > -	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
> >  	if (ret)
> >  		goto err_unref;
> >  
> > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> >  			return -ENOMEM;
> >  		}
> >  
> > -		ret = i915_gem_object_pin(obj, 0, true, false);
> > +		ret = i915_gem_ggtt_pin(obj, 0, true, false);
> >  		if (ret != 0) {
> >  			drm_gem_object_unreference(&obj->base);
> >  			DRM_ERROR("Failed to ping batch bo\n");
> > -- 
> > 1.8.3.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA
  2013-07-09  7:16   ` Daniel Vetter
@ 2013-07-10 16:39     ` Ben Widawsky
  2013-07-10 17:08       ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-10 16:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 09:16:54AM +0200, Daniel Vetter wrote:
> On Mon, Jul 08, 2013 at 11:08:38PM -0700, Ben Widawsky wrote:
> > formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> > tracking"
> > 
> > The map_and_fenceable tracking is per object. GTT mapping and fences
> > only apply to the global GTT. As such, object operations which are not
> > performed on the global GTT should not affect mappable or fenceable
> > characteristics.
> > 
> > Functionally, this commit could very well be squashed into the previous
> > patch, which updated object operations to take a VM argument.  This
> > commit is split out because it's a bit tricky (or at least it was for
> > me).
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 21015cd..501c590 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2635,7 +2635,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  
> >  	trace_i915_gem_object_unbind(obj, vm);
> >  
> > -	if (obj->has_global_gtt_mapping)
> > +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
> >  		i915_gem_gtt_unbind_object(obj);
> 
> Wont this part be done as part of the global gtt clear_range callback?
> 
> >  	if (obj->has_aliasing_ppgtt_mapping) {
> >  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> 
> And I have a hunch that we should shovel the aliasing ppgtt clearing into
> the ggtt write_ptes/clear_range callbacks, too. Once all this has settled
> at least.
> 

Addressing both comments at once:

First, this is a rebase mistake AFAICT because this hunk doesn't really
belong in this patch anyway.

Eventually, I'd want to kill i915_gem_gtt_unbind_object and
i915_ppgtt_unbind_object. In the 66-patch series, I killed the latter,
but decided to leave the former to make it clear that it is a special case.

In the original 66-patch series, I did not move clear_range, which is
probably why this was left like this. I believe bind was fixed to just
be vm->bleh().

If you're good with the idea, I'll add a new patch to remove those and
use the i915_address_space callbacks directly. I'll do the same in other
applicable places.
It's easiest if I do that as a patch 12, I think, if you don't mind?

I do think this hunk belongs in another patch until I do the above,
though; I'm not really sure where to put it.
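
Roughly what I have in mind, as a sketch only (the function name is
invented here, and it assumes the vma lookup from this series; the
aliasing ppgtt would still need its dev_priv->mm.aliasing_ppgtt special
case for now):

/* Hypothetical sketch, not part of this series: clear the PTEs through
 * the address space's own callback instead of the GGTT/PPGTT-specific
 * unbind helpers.
 */
static void i915_gem_vma_clear_ptes(struct drm_i915_gem_object *obj,
				    struct i915_address_space *vm)
{
	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);

	vm->clear_range(vm,
			vma->node.start >> PAGE_SHIFT,
			obj->base.size >> PAGE_SHIFT);
}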


> > @@ -2646,7 +2646,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  
> >  	list_del(&obj->mm_list);
> >  	/* Avoid an unnecessary call to unbind on rebind. */
> > -	obj->map_and_fenceable = true;
> > +	if (i915_is_ggtt(vm))
> > +		obj->map_and_fenceable = true;
> >  
> >  	vma = i915_gem_obj_to_vma(obj, vm);
> >  	list_del(&vma->vma_link);
> > @@ -3213,7 +3214,9 @@ search_free:
> >  		i915_is_ggtt(vm) &&
> >  		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
> >  
> > -	obj->map_and_fenceable = mappable && fenceable;
> > +	/* Map and fenceable only changes if the VM is the global GGTT */
> > +	if (i915_is_ggtt(vm))
> > +		obj->map_and_fenceable = mappable && fenceable;
> >  
> >  	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
> >  	i915_gem_verify_gtt(dev);
> > -- 
> > 1.8.3.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/11] drm/i915: mm_list is per VMA
  2013-07-09  7:18   ` Daniel Vetter
@ 2013-07-10 16:39     ` Ben Widawsky
  0 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-10 16:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 09:18:46AM +0200, Daniel Vetter wrote:
> On Mon, Jul 08, 2013 at 11:08:39PM -0700, Ben Widawsky wrote:
> > formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
> > 
> > The mm_list is used for the active/inactive LRUs. Since those LRUs are
> > per address space, the link should be per VMA.
> > 
> > Because we'll only ever have 1 VMA before this point, it's not incorrect
> > to defer this change until this point in the patch series, and doing it
> > here makes the change much easier to understand.
> > 
> > v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
> > 
> > v3: Moved earlier in the series
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> The commit message seems to miss the explanation (that I've written for
> you) why we move only some of the dev_priv->mm lrus, not all of them ...

Yes, this is a mistake. I had copied it into the commit message, prefixed
with "Shamelessly manipulated out of Daniel:", but I'm not sure where it
went. After I address the other issues, I'll decide whether to resubmit
the whole series or just this one fixed.

Sorry about that, it wasn't ignorance, just incompetence.

> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c    | 53 ++++++++++++++++++++++------------
> >  drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
> >  drivers/gpu/drm/i915/i915_gem.c        | 34 ++++++++++++++--------
> >  drivers/gpu/drm/i915/i915_gem_evict.c  | 14 ++++-----
> >  drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
> >  drivers/gpu/drm/i915/i915_irq.c        | 37 ++++++++++++++----------
> >  6 files changed, 87 insertions(+), 58 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 867ed07..163ca6b 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -157,7 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	struct drm_device *dev = node->minor->dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > -	struct drm_i915_gem_object *obj;
> > +	struct i915_vma *vma;
> >  	size_t total_obj_size, total_gtt_size;
> >  	int count, ret;
> >  
> > @@ -165,6 +165,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	if (ret)
> >  		return ret;
> >  
> > +	/* FIXME: the user of this interface might want more than just GGTT */
> >  	switch (list) {
> >  	case ACTIVE_LIST:
> >  		seq_puts(m, "Active:\n");
> > @@ -180,12 +181,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	}
> >  
> >  	total_obj_size = total_gtt_size = count = 0;
> > -	list_for_each_entry(obj, head, mm_list) {
> > -		seq_puts(m, "   ");
> > -		describe_obj(m, obj);
> > -		seq_putc(m, '\n');
> > -		total_obj_size += obj->base.size;
> > -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> > +	list_for_each_entry(vma, head, mm_list) {
> > +		seq_printf(m, "   ");
> > +		describe_obj(m, vma->obj);
> > +		seq_printf(m, "\n");
> > +		total_obj_size += vma->obj->base.size;
> > +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
> >  		count++;
> >  	}
> >  	mutex_unlock(&dev->struct_mutex);
> > @@ -233,7 +234,18 @@ static int per_file_stats(int id, void *ptr, void *data)
> >  	return 0;
> >  }
> >  
> > -static int i915_gem_object_info(struct seq_file *m, void *data)
> > +#define count_vmas(list, member) do { \
> > +	list_for_each_entry(vma, list, member) { \
> > +		size += i915_gem_obj_ggtt_size(vma->obj); \
> > +		++count; \
> > +		if (vma->obj->map_and_fenceable) { \
> > +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> > +			++mappable_count; \
> > +		} \
> > +	} \
> > +} while (0)
> > +
> > +static int i915_gem_object_info(struct seq_file *m, void* data)
> >  {
> >  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> >  	struct drm_device *dev = node->minor->dev;
> > @@ -243,6 +255,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> >  	struct drm_i915_gem_object *obj;
> >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	struct drm_file *file;
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> > @@ -259,12 +272,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> >  		   count, mappable_count, size, mappable_size);
> >  
> >  	size = count = mappable_size = mappable_count = 0;
> > -	count_objects(&vm->active_list, mm_list);
> > +	count_vmas(&vm->active_list, mm_list);
> >  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
> >  		   count, mappable_count, size, mappable_size);
> >  
> >  	size = count = mappable_size = mappable_count = 0;
> > -	count_objects(&vm->inactive_list, mm_list);
> > +	count_vmas(&vm->inactive_list, mm_list);
> >  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
> >  		   count, mappable_count, size, mappable_size);
> >  
> > @@ -2037,7 +2050,8 @@ i915_drop_caches_set(void *data, u64 val)
> >  	struct drm_device *dev = data;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_object *obj, *next;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> > +	struct i915_vma *vma, *x;
> >  	int ret;
> >  
> >  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> > @@ -2058,14 +2072,15 @@ i915_drop_caches_set(void *data, u64 val)
> >  		i915_gem_retire_requests(dev);
> >  
> >  	if (val & DROP_BOUND) {
> > -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > -					 mm_list) {
> > -			if (obj->pin_count)
> > -				continue;
> > -
> > -			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> > -			if (ret)
> > -				goto unlock;
> > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> > +						 mm_list)
> > +				if (vma->obj->pin_count == 0) {
> > +					ret = i915_gem_object_unbind(vma->obj,
> > +								     vm);
> > +					if (ret)
> > +						goto unlock;
> > +				}
> >  		}
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 48baccc..48105f8 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -541,6 +541,9 @@ struct i915_vma {
> >  	struct drm_i915_gem_object *obj;
> >  	struct i915_address_space *vm;
> >  
> > +	/** This object's place on the active/inactive lists */
> > +	struct list_head mm_list;
> > +
> >  	struct list_head vma_link; /* Link in the object's VMA list */
> >  };
> >  
> > @@ -1242,9 +1245,7 @@ struct drm_i915_gem_object {
> >  	struct drm_mm_node *stolen;
> >  	struct list_head global_list;
> >  
> > -	/** This object's place on the active/inactive lists */
> >  	struct list_head ring_list;
> > -	struct list_head mm_list;
> >  	/** This object's place in the batchbuffer or on the eviction list */
> >  	struct list_head exec_list;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 501c590..9a58363 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1888,6 +1888,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	u32 seqno = intel_ring_get_seqno(ring);
> > +	struct i915_vma *vma;
> >  
> >  	BUG_ON(ring == NULL);
> >  	obj->ring = ring;
> > @@ -1899,7 +1900,8 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	/* Move from whatever list we were on to the tail of execution. */
> > -	list_move_tail(&obj->mm_list, &vm->active_list);
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_move_tail(&vma->mm_list, &vm->active_list);
> >  	list_move_tail(&obj->ring_list, &ring->active_list);
> >  
> >  	obj->last_read_seqno = seqno;
> > @@ -1922,10 +1924,13 @@ static void
> >  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> >  				 struct i915_address_space *vm)
> >  {
> > +	struct i915_vma *vma;
> > +
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> >  	BUG_ON(!obj->active);
> >  
> > -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_move_tail(&vma->mm_list, &vm->inactive_list);
> >  
> >  	list_del_init(&obj->ring_list);
> >  	obj->ring = NULL;
> > @@ -2287,9 +2292,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
> >  void i915_gem_reset(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm;
> > -	struct drm_i915_gem_object *obj;
> >  	struct intel_ring_buffer *ring;
> > +	struct i915_address_space *vm;
> > +	struct i915_vma *vma;
> >  	int i;
> >  
> >  	for_each_ring(ring, dev_priv, i)
> > @@ -2299,8 +2304,8 @@ void i915_gem_reset(struct drm_device *dev)
> >  	 * necessary invalidation upon reuse.
> >  	 */
> >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> > +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> >  
> >  	i915_gem_restore_fences(dev);
> >  }
> > @@ -2644,12 +2649,12 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  	i915_gem_gtt_finish_object(obj);
> >  	i915_gem_object_unpin_pages(obj);
> >  
> > -	list_del(&obj->mm_list);
> >  	/* Avoid an unnecessary call to unbind on rebind. */
> >  	if (i915_is_ggtt(vm))
> >  		obj->map_and_fenceable = true;
> >  
> >  	vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_del(&vma->mm_list);
> >  	list_del(&vma->vma_link);
> >  	drm_mm_remove_node(&vma->node);
> >  	i915_gem_vma_destroy(vma);
> > @@ -3197,7 +3202,7 @@ search_free:
> >  	}
> >  
> >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > +	list_add_tail(&vma->mm_list, &vm->inactive_list);
> >  
> >  	/* Keep GGTT vmas first to make debug easier */
> >  	if (i915_is_ggtt(vm))
> > @@ -3354,9 +3359,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  					    old_write_domain);
> >  
> >  	/* And bump the LRU for this access */
> > -	if (i915_gem_object_is_inactive(obj))
> > -		list_move_tail(&obj->mm_list,
> > -			       &dev_priv->gtt.base.inactive_list);
> > +	if (i915_gem_object_is_inactive(obj)) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> > +							   &dev_priv->gtt.base);
> > +		if (vma)
> > +			list_move_tail(&vma->mm_list,
> > +				       &dev_priv->gtt.base.inactive_list);
> > +
> > +	}
> >  
> >  	return 0;
> >  }
> > @@ -3931,7 +3941,6 @@ unlock:
> >  void i915_gem_object_init(struct drm_i915_gem_object *obj,
> >  			  const struct drm_i915_gem_object_ops *ops)
> >  {
> > -	INIT_LIST_HEAD(&obj->mm_list);
> >  	INIT_LIST_HEAD(&obj->global_list);
> >  	INIT_LIST_HEAD(&obj->ring_list);
> >  	INIT_LIST_HEAD(&obj->exec_list);
> > @@ -4071,6 +4080,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  		return ERR_PTR(-ENOMEM);
> >  
> >  	INIT_LIST_HEAD(&vma->vma_link);
> > +	INIT_LIST_HEAD(&vma->mm_list);
> >  	vma->vm = vm;
> >  	vma->obj = obj;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index 32efdc0..18a44a9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> >  
> >  	/* First see if there is a large enough contiguous idle region... */
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
> >  		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> > @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> >  		goto none;
> >  
> >  	/* Now merge in the soon-to-be-expired objects... */
> > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_for_each_entry(vma, &vm->active_list, mm_list) {
> >  		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> > @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	struct i915_address_space *vm;
> > -	struct drm_i915_gem_object *obj, *next;
> > +	struct i915_vma *vma, *next;
> >  	bool lists_empty = true;
> >  	int ret;
> >  
> > @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  
> >  	/* Having flushed everything, unbind() should never raise an error */
> >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > -			if (obj->pin_count == 0)
> > -				WARN_ON(i915_gem_object_unbind(obj, vm));
> > +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> > +			if (vma->obj->pin_count == 0)
> > +				WARN_ON(i915_gem_object_unbind(vma->obj, vm));
> >  	}
> >  
> >  	return 0;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index bfe61fa..58b2613 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -415,7 +415,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	obj->has_global_gtt_mapping = 1;
> >  
> >  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > +	list_add_tail(&vma->mm_list, &dev_priv->gtt.base.inactive_list);
> >  
> >  	return obj;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 28fa0ff..e065232 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1640,11 +1640,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
> >  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
> >  			     int count, struct list_head *head)
> >  {
> > -	struct drm_i915_gem_object *obj;
> > +	struct i915_vma *vma;
> >  	int i = 0;
> >  
> > -	list_for_each_entry(obj, head, mm_list) {
> > -		capture_bo(err++, obj);
> > +	list_for_each_entry(vma, head, mm_list) {
> > +		capture_bo(err++, vma->obj);
> >  		if (++i == count)
> >  			break;
> >  	}
> > @@ -1706,8 +1706,9 @@ static struct drm_i915_error_object *
> >  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  			     struct intel_ring_buffer *ring)
> >  {
> > +	struct i915_address_space *vm;
> > +	struct i915_vma *vma;
> >  	struct drm_i915_gem_object *obj;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	u32 seqno;
> >  
> >  	if (!ring->get_seqno)
> > @@ -1729,20 +1730,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	seqno = ring->get_seqno(ring, false);
> > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		if (obj->ring != ring)
> > -			continue;
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		list_for_each_entry(vma, &vm->active_list, mm_list) {
> > +			obj = vma->obj;
> > +			if (obj->ring != ring)
> > +				continue;
> >  
> > -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > -			continue;
> > +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > +				continue;
> >  
> > -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > -			continue;
> > +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > +				continue;
> >  
> > -		/* We need to copy these to an anonymous buffer as the simplest
> > -		 * method to avoid being overwritten by userspace.
> > -		 */
> > -		return i915_error_object_create(dev_priv, obj);
> > +			/* We need to copy these to an anonymous buffer as the simplest
> > +			 * method to avoid being overwritten by userspace.
> > +			 */
> > +			return i915_error_object_create(dev_priv, obj);
> > +		}
> >  	}
> >  
> >  	return NULL;
> > @@ -1863,11 +1867,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> >  				     struct drm_i915_error_state *error)
> >  {
> >  	struct drm_i915_gem_object *obj;
> > +	struct i915_vma *vma;
> >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	int i;
> >  
> >  	i = 0;
> > -	list_for_each_entry(obj, &vm->active_list, mm_list)
> > +	list_for_each_entry(vma, &vm->active_list, mm_list)
> >  		i++;
> >  	error->active_bo_count = i;
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> > -- 
> > 1.8.3.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/11] drm/i915: Move active to vma
  2013-07-09  7:45   ` Daniel Vetter
@ 2013-07-10 16:39     ` Ben Widawsky
  2013-07-10 17:13       ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-10 16:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 09:45:09AM +0200, Daniel Vetter wrote:
> On Mon, Jul 08, 2013 at 11:08:42PM -0700, Ben Widawsky wrote:
> > Probably need to squash whole thing, or just the inactive part, tbd...
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> I agree that we need vma->active, but I'm not sold on removing
> obj->active. Atm we have two use-cases for checking obj->active:
> - In the evict/unbind code to check whether the gpu is still using this
>   specific mapping. This use-case nicely fits into checking vma->active.
> - In the shrinker code and everywhere we want to do cpu access we only
>   care about whether the gpu is accessing the object, not at all through
>   which mapping precisely. There, a vma-independent obj->active sounds much
>   saner.
> 
> Note though that just keeping track of vma->active isn't too useful, since
> if some other vma is keeping the object busy we'll still stall on that one
> for eviction. So we'd need a vma->ring and vma->last_rendering_seqno, too.
> 
> At that point I wonder a bit whether all this complexity is worth it ...
> 
> I need to ponder this some more.
> -Daniel

I think eventually the complexity might prove worthwhile, or it might not.

In the meantime, I see vma->active as just a bookkeeping thing, and not
really useful in determining what we actually care about. As you mention,
obj->active is really what we care about, and I used the getter
i915_gem_object_is_active() as a way to avoid the confusion of having
two active members.

I think we're in the same state of mind on this, and I've picked what I
consider to be a less offensive solution which is easy to clean up
later.

Let me know when you make a decision.
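
For reference, if we do go the whole way with the per-VMA tracking you
describe (vma->ring and a per-VMA seqno), I'd picture something like the
sketch below (hypothetical only; the extra field names are invented, on
top of the i915_vma from this series):

/* Hypothetical sketch, not part of this series. */
struct i915_vma {
	struct drm_mm_node node;
	struct drm_i915_gem_object *obj;
	struct i915_address_space *vm;

	/* Which ring last used this mapping, and up to which seqno, so
	 * eviction could wait on just this VM's use of the object.
	 */
	unsigned int active:1;
	struct intel_ring_buffer *ring;
	uint32_t last_rendering_seqno;

	/** This VMA's place on the VM's active/inactive lists */
	struct list_head mm_list;

	struct list_head vma_link; /* Link in the object's VMA list */
};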

> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h | 14 ++++++------
> >  drivers/gpu/drm/i915/i915_gem.c | 47 ++++++++++++++++++++++++-----------------
> >  2 files changed, 35 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 38d07f2..e6694ae 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -541,6 +541,13 @@ struct i915_vma {
> >  	struct drm_i915_gem_object *obj;
> >  	struct i915_address_space *vm;
> >  
> > +	/**
> > +	 * This is set if the object is on the active lists (has pending
> > +	 * rendering and so a non-zero seqno), and is not set if it i s on
> > +	 * inactive (ready to be unbound) list.
> > +	 */
> > +	unsigned int active:1;
> > +
> >  	/** This object's place on the active/inactive lists */
> >  	struct list_head mm_list;
> >  
> > @@ -1250,13 +1257,6 @@ struct drm_i915_gem_object {
> >  	struct list_head exec_list;
> >  
> >  	/**
> > -	 * This is set if the object is on the active lists (has pending
> > -	 * rendering and so a non-zero seqno), and is not set if it i s on
> > -	 * inactive (ready to be unbound) list.
> > -	 */
> > -	unsigned int active:1;
> > -
> > -	/**
> >  	 * This is set if the object has been written to since last bound
> >  	 * to the GTT
> >  	 */
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index c2ecb78..b87073b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -137,7 +137,13 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> >  /* NB: Not the same as !i915_gem_object_is_inactive */
> >  bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
> >  {
> > -	return obj->active;
> > +	struct i915_vma *vma;
> > +
> > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > +		if (vma->active)
> > +			return true;
> > +
> > +	return false;
> >  }
> >  
> >  static inline bool
> > @@ -1899,14 +1905,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  	BUG_ON(ring == NULL);
> >  	obj->ring = ring;
> >  
> > +	/* Move from whatever list we were on to the tail of execution. */
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> >  	/* Add a reference if we're newly entering the active list. */
> > -	if (!i915_gem_object_is_active(obj)) {
> > +	if (!vma->active) {
> >  		drm_gem_object_reference(&obj->base);
> > -		obj->active = 1;
> > +		vma->active = 1;
> >  	}
> >  
> > -	/* Move from whatever list we were on to the tail of execution. */
> > -	vma = i915_gem_obj_to_vma(obj, vm);
> >  	list_move_tail(&vma->mm_list, &vm->active_list);
> >  	list_move_tail(&obj->ring_list, &ring->active_list);
> >  
> > @@ -1927,16 +1933,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  }
> >  
> >  static void
> > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > -				 struct i915_address_space *vm)
> > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> >  {
> > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > +	struct i915_address_space *vm;
> >  	struct i915_vma *vma;
> > +	int i = 0;
> >  
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > -	BUG_ON(!i915_gem_object_is_active(obj));
> >  
> > -	vma = i915_gem_obj_to_vma(obj, vm);
> > -	list_move_tail(&vma->mm_list, &vm->inactive_list);
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> For paranoia we might want to track the vm used to run a batch in its
> request struct, then we
> 
> > +		if (!vma || !vma->active)
> > +			continue;
> > +		list_move_tail(&vma->mm_list, &vm->inactive_list);
> > +		vma->active = 0;
> > +		i++;
> > +	}
> >  
> >  	list_del_init(&obj->ring_list);
> >  	obj->ring = NULL;
> > @@ -1948,8 +1961,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> >  	obj->last_fenced_seqno = 0;
> >  	obj->fenced_gpu_access = false;
> >  
> > -	obj->active = 0;
> > -	drm_gem_object_unreference(&obj->base);
> > +	while (i--)
> > +		drm_gem_object_unreference(&obj->base);
> >  
> >  	WARN_ON(i915_verify_lists(dev));
> >  }
> > @@ -2272,15 +2285,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	while (!list_empty(&ring->active_list)) {
> > -		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> >  				       struct drm_i915_gem_object,
> >  				       ring_list);
> >  
> > -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > -			i915_gem_object_move_to_inactive(obj, vm);
> > +		i915_gem_object_move_to_inactive(obj);
> >  	}
> >  }
> >  
> > @@ -2356,8 +2367,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> >  	 */
> >  	while (!list_empty(&ring->active_list)) {
> > -		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > -		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> > @@ -2367,8 +2376,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> >  			break;
> >  
> > -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > -			i915_gem_object_move_to_inactive(obj, vm);
> > +		BUG_ON(!i915_gem_object_is_active(obj));
> > +		i915_gem_object_move_to_inactive(obj);
> >  	}
> >  
> >  	if (unlikely(ring->trace_irq_seqno &&
> > -- 
> > 1.8.3.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-10 16:36     ` Ben Widawsky
@ 2013-07-10 17:03       ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-10 17:03 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 10, 2013 at 09:36:58AM -0700, Ben Widawsky wrote:
> On Tue, Jul 09, 2013 at 08:37:45AM +0200, Daniel Vetter wrote:
> > On Mon, Jul 08, 2013 at 11:08:32PM -0700, Ben Widawsky wrote:
> > > The GTT and PPGTT can be thought of more generally as GPU address
> > > spaces. Many of their actions (insert entries), state (LRU lists) and
> > > many of their characteristics (size), can be shared. Do that.
> > > 
> > > The change itself doesn't actually impact most of the VMA/VM rework
> > > coming up, it just fits in with the grand scheme. GGTT will usually be a
> > > special case where we either know an object must be in the GGTT (dislay
> > > engine, workarounds, etc.).
> > 
> > Commit message cut off?
> > -Daniel
> 
> Maybe. I can't remember. Do you want me to add something else in
> particular?

I was just wondering since after the "either" I'd expect an "or". My
parser never found it though ;-)
-Daniel

> 
> > 
> > > 
> > > v2: Drop usage of i915_gtt_vm (Daniel)
> > > Make cleanup also part of the parent class (Ben)
> > > Modified commit msg
> > > Rebased
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
> > >  drivers/gpu/drm/i915/i915_dma.c     |   4 +-
> > >  drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
> > >  drivers/gpu/drm/i915/i915_gem.c     |   4 +-
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
> > >  5 files changed, 121 insertions(+), 110 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > index c8059f5..d870f27 100644
> > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > @@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> > >  		   count, size);
> > >  
> > >  	seq_printf(m, "%zu [%lu] gtt total\n",
> > > -		   dev_priv->gtt.total,
> > > -		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
> > > +		   dev_priv->gtt.base.total,
> > > +		   dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
> > >  
> > >  	seq_putc(m, '\n');
> > >  	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index 0e22142..15bca96 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -1669,7 +1669,7 @@ out_gem_unload:
> > >  out_mtrrfree:
> > >  	arch_phys_wc_del(dev_priv->gtt.mtrr);
> > >  	io_mapping_free(dev_priv->gtt.mappable);
> > > -	dev_priv->gtt.gtt_remove(dev);
> > > +	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
> > >  out_rmmap:
> > >  	pci_iounmap(dev->pdev, dev_priv->regs);
> > >  put_bridge:
> > > @@ -1764,7 +1764,7 @@ int i915_driver_unload(struct drm_device *dev)
> > >  	destroy_workqueue(dev_priv->wq);
> > >  	pm_qos_remove_request(&dev_priv->pm_qos);
> > >  
> > > -	dev_priv->gtt.gtt_remove(dev);
> > > +	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
> > >  
> > >  	if (dev_priv->slab)
> > >  		kmem_cache_destroy(dev_priv->slab);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index c8d6104..d6d4d7d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -446,6 +446,29 @@ enum i915_cache_level {
> > >  
> > >  typedef uint32_t gen6_gtt_pte_t;
> > >  
> > > +struct i915_address_space {
> > > +	struct drm_device *dev;
> > > +	unsigned long start;		/* Start offset always 0 for dri2 */
> > > +	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> > > +
> > > +	struct {
> > > +		dma_addr_t addr;
> > > +		struct page *page;
> > > +	} scratch;
> > > +
> > > +	/* FIXME: Need a more generic return type */
> > > +	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > > +				     enum i915_cache_level level);
> > > +	void (*clear_range)(struct i915_address_space *vm,
> > > +			    unsigned int first_entry,
> > > +			    unsigned int num_entries);
> > > +	void (*insert_entries)(struct i915_address_space *vm,
> > > +			       struct sg_table *st,
> > > +			       unsigned int first_entry,
> > > +			       enum i915_cache_level cache_level);
> > > +	void (*cleanup)(struct i915_address_space *vm);
> > > +};
> > > +
> > >  /* The Graphics Translation Table is the way in which GEN hardware translates a
> > >   * Graphics Virtual Address into a Physical Address. In addition to the normal
> > >   * collateral associated with any va->pa translations GEN hardware also has a
> > > @@ -454,8 +477,7 @@ typedef uint32_t gen6_gtt_pte_t;
> > >   * the spec.
> > >   */
> > >  struct i915_gtt {
> > > -	unsigned long start;		/* Start offset of used GTT */
> > > -	size_t total;			/* Total size GTT can map */
> > > +	struct i915_address_space base;
> > >  	size_t stolen_size;		/* Total size of stolen memory */
> > >  
> > >  	unsigned long mappable_end;	/* End offset that we can CPU map */
> > > @@ -466,10 +488,6 @@ struct i915_gtt {
> > >  	void __iomem *gsm;
> > >  
> > >  	bool do_idle_maps;
> > > -	struct {
> > > -		dma_addr_t addr;
> > > -		struct page *page;
> > > -	} scratch;
> > >  
> > >  	int mtrr;
> > >  
> > > @@ -477,38 +495,17 @@ struct i915_gtt {
> > >  	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
> > >  			  size_t *stolen, phys_addr_t *mappable_base,
> > >  			  unsigned long *mappable_end);
> > > -	void (*gtt_remove)(struct drm_device *dev);
> > > -	void (*gtt_clear_range)(struct drm_device *dev,
> > > -				unsigned int first_entry,
> > > -				unsigned int num_entries);
> > > -	void (*gtt_insert_entries)(struct drm_device *dev,
> > > -				   struct sg_table *st,
> > > -				   unsigned int pg_start,
> > > -				   enum i915_cache_level cache_level);
> > > -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > > -				     enum i915_cache_level level);
> > >  };
> > > -#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
> > > +#define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
> > >  
> > >  struct i915_hw_ppgtt {
> > > -	struct drm_device *dev;
> > > +	struct i915_address_space base;
> > >  	unsigned num_pd_entries;
> > >  	struct page **pt_pages;
> > >  	uint32_t pd_offset;
> > >  	dma_addr_t *pt_dma_addr;
> > >  
> > > -	/* pte functions, mirroring the interface of the global gtt. */
> > > -	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
> > > -			    unsigned int first_entry,
> > > -			    unsigned int num_entries);
> > > -	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
> > > -			       struct sg_table *st,
> > > -			       unsigned int pg_start,
> > > -			       enum i915_cache_level cache_level);
> > > -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > > -				     enum i915_cache_level level);
> > >  	int (*enable)(struct drm_device *dev);
> > > -	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
> > >  };
> > >  
> > >  struct i915_ctx_hang_stats {
> > > @@ -1124,7 +1121,7 @@ typedef struct drm_i915_private {
> > >  	enum modeset_restore modeset_restore;
> > >  	struct mutex modeset_restore_lock;
> > >  
> > > -	struct i915_gtt gtt;
> > > +	struct i915_gtt gtt; /* VMA representing the global address space */
> > >  
> > >  	struct i915_gem_mm mm;
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index af61be8..3ecedfd 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
> > >  			pinned += i915_gem_obj_ggtt_size(obj);
> > >  	mutex_unlock(&dev->struct_mutex);
> > >  
> > > -	args->aper_size = dev_priv->gtt.total;
> > > +	args->aper_size = dev_priv->gtt.base.total;
> > >  	args->aper_available_size = args->aper_size - pinned;
> > >  
> > >  	return 0;
> > > @@ -3070,7 +3070,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> > >  	bool mappable, fenceable;
> > >  	size_t gtt_max = map_and_fenceable ?
> > > -		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
> > > +		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> > >  	int ret;
> > >  
> > >  	fence_size = i915_gem_get_gtt_size(dev,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 242d0f9..693115a 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
> > >  
> > >  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
> > >  {
> > > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > > +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
> > >  	gen6_gtt_pte_t __iomem *pd_addr;
> > >  	uint32_t pd_entry;
> > >  	int i;
> > > @@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> > >  }
> > >  
> > >  /* PPGTT support for Sandybdrige/Gen6 and later */
> > > -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> > > +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
> > >  				   unsigned first_entry,
> > >  				   unsigned num_entries)
> > >  {
> > > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > > +	struct i915_hw_ppgtt *ppgtt =
> > > +		container_of(vm, struct i915_hw_ppgtt, base);
> > >  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
> > >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> > >  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> > >  	unsigned last_pte, i;
> > >  
> > > -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> > > -					I915_CACHE_LLC);
> > > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> > >  
> > >  	while (num_entries) {
> > >  		last_pte = first_pte + num_entries;
> > > @@ -212,11 +212,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> > >  	}
> > >  }
> > >  
> > > -static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> > > +static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
> > >  				      struct sg_table *pages,
> > >  				      unsigned first_entry,
> > >  				      enum i915_cache_level cache_level)
> > >  {
> > > +	struct i915_hw_ppgtt *ppgtt =
> > > +		container_of(vm, struct i915_hw_ppgtt, base);
> > >  	gen6_gtt_pte_t *pt_vaddr;
> > >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> > >  	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> > > @@ -227,7 +229,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> > >  		dma_addr_t page_addr;
> > >  
> > >  		page_addr = sg_page_iter_dma_address(&sg_iter);
> > > -		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
> > > +		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
> > >  		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
> > >  			kunmap_atomic(pt_vaddr);
> > >  			act_pt++;
> > > @@ -239,13 +241,15 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> > >  	kunmap_atomic(pt_vaddr);
> > >  }
> > >  
> > > -static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
> > > +static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
> > >  {
> > > +	struct i915_hw_ppgtt *ppgtt =
> > > +		container_of(vm, struct i915_hw_ppgtt, base);
> > >  	int i;
> > >  
> > >  	if (ppgtt->pt_dma_addr) {
> > >  		for (i = 0; i < ppgtt->num_pd_entries; i++)
> > > -			pci_unmap_page(ppgtt->dev->pdev,
> > > +			pci_unmap_page(ppgtt->base.dev->pdev,
> > >  				       ppgtt->pt_dma_addr[i],
> > >  				       4096, PCI_DMA_BIDIRECTIONAL);
> > >  	}
> > > @@ -259,7 +263,7 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
> > >  
> > >  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  {
> > > -	struct drm_device *dev = ppgtt->dev;
> > > +	struct drm_device *dev = ppgtt->base.dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	unsigned first_pd_entry_in_global_pt;
> > >  	int i;
> > > @@ -271,17 +275,17 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> > >  
> > >  	if (IS_HASWELL(dev)) {
> > > -		ppgtt->pte_encode = hsw_pte_encode;
> > > +		ppgtt->base.pte_encode = hsw_pte_encode;
> > >  	} else if (IS_VALLEYVIEW(dev)) {
> > > -		ppgtt->pte_encode = byt_pte_encode;
> > > +		ppgtt->base.pte_encode = byt_pte_encode;
> > >  	} else {
> > > -		ppgtt->pte_encode = gen6_pte_encode;
> > > +		ppgtt->base.pte_encode = gen6_pte_encode;
> > >  	}
> > >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> > >  	ppgtt->enable = gen6_ppgtt_enable;
> > > -	ppgtt->clear_range = gen6_ppgtt_clear_range;
> > > -	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
> > > -	ppgtt->cleanup = gen6_ppgtt_cleanup;
> > > +	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > > +	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > > +	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> > >  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
> > >  				  GFP_KERNEL);
> > >  	if (!ppgtt->pt_pages)
> > > @@ -312,8 +316,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  		ppgtt->pt_dma_addr[i] = pt_addr;
> > >  	}
> > >  
> > > -	ppgtt->clear_range(ppgtt, 0,
> > > -			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
> > > +	ppgtt->base.clear_range(&ppgtt->base, 0,
> > > +				ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
> > >  
> > >  	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
> > >  
> > > @@ -346,7 +350,7 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> > >  	if (!ppgtt)
> > >  		return -ENOMEM;
> > >  
> > > -	ppgtt->dev = dev;
> > > +	ppgtt->base.dev = dev;
> > >  
> > >  	if (INTEL_INFO(dev)->gen < 8)
> > >  		ret = gen6_ppgtt_init(ppgtt);
> > > @@ -369,7 +373,7 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
> > >  	if (!ppgtt)
> > >  		return;
> > >  
> > > -	ppgtt->cleanup(ppgtt);
> > > +	ppgtt->base.cleanup(&ppgtt->base);
> > >  	dev_priv->mm.aliasing_ppgtt = NULL;
> > >  }
> > >  
> > > @@ -377,17 +381,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			    struct drm_i915_gem_object *obj,
> > >  			    enum i915_cache_level cache_level)
> > >  {
> > > -	ppgtt->insert_entries(ppgtt, obj->pages,
> > > -			      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -			      cache_level);
> > > +	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > > +				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > +				   cache_level);
> > >  }
> > >  
> > >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			      struct drm_i915_gem_object *obj)
> > >  {
> > > -	ppgtt->clear_range(ppgtt,
> > > -			   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -			   obj->base.size >> PAGE_SHIFT);
> > > +	ppgtt->base.clear_range(&ppgtt->base,
> > > +				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > +				obj->base.size >> PAGE_SHIFT);
> > >  }
> > >  
> > >  extern int intel_iommu_gfx_mapped;
> > > @@ -434,8 +438,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > >  	struct drm_i915_gem_object *obj;
> > >  
> > >  	/* First fill our portion of the GTT with scratch pages */
> > > -	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
> > > -				      dev_priv->gtt.total / PAGE_SIZE);
> > > +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > +				       dev_priv->gtt.base.start / PAGE_SIZE,
> > > +				       dev_priv->gtt.base.total / PAGE_SIZE);
> > >  
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > >  		i915_gem_clflush_object(obj);
> > > @@ -464,12 +469,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
> > >   * within the global GTT as well as accessible by the GPU through the GMADR
> > >   * mapped BAR (dev_priv->mm.gtt->gtt).
> > >   */
> > > -static void gen6_ggtt_insert_entries(struct drm_device *dev,
> > > +static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> > >  				     struct sg_table *st,
> > >  				     unsigned int first_entry,
> > >  				     enum i915_cache_level level)
> > >  {
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> > >  	gen6_gtt_pte_t __iomem *gtt_entries =
> > >  		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
> > >  	int i = 0;
> > > @@ -478,8 +483,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> > >  
> > >  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
> > >  		addr = sg_page_iter_dma_address(&sg_iter);
> > > -		iowrite32(dev_priv->gtt.pte_encode(addr, level),
> > > -			  &gtt_entries[i]);
> > > +		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
> > >  		i++;
> > >  	}
> > >  
> > > @@ -490,8 +494,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> > >  	 * hardware should work, we must keep this posting read for paranoia.
> > >  	 */
> > >  	if (i != 0)
> > > -		WARN_ON(readl(&gtt_entries[i-1])
> > > -			!= dev_priv->gtt.pte_encode(addr, level));
> > > +		WARN_ON(readl(&gtt_entries[i-1]) !=
> > > +			vm->pte_encode(addr, level));
> > >  
> > >  	/* This next bit makes the above posting read even more important. We
> > >  	 * want to flush the TLBs only after we're certain all the PTE updates
> > > @@ -501,11 +505,11 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> > >  	POSTING_READ(GFX_FLSH_CNTL_GEN6);
> > >  }
> > >  
> > > -static void gen6_ggtt_clear_range(struct drm_device *dev,
> > > +static void gen6_ggtt_clear_range(struct i915_address_space *vm,
> > >  				  unsigned int first_entry,
> > >  				  unsigned int num_entries)
> > >  {
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> > >  	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
> > >  		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
> > >  	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
> > > @@ -516,15 +520,14 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
> > >  		 first_entry, num_entries, max_entries))
> > >  		num_entries = max_entries;
> > >  
> > > -	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
> > > -					       I915_CACHE_LLC);
> > > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> > >  	for (i = 0; i < num_entries; i++)
> > >  		iowrite32(scratch_pte, &gtt_base[i]);
> > >  	readl(gtt_base);
> > >  }
> > >  
> > >  
> > > -static void i915_ggtt_insert_entries(struct drm_device *dev,
> > > +static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> > >  				     struct sg_table *st,
> > >  				     unsigned int pg_start,
> > >  				     enum i915_cache_level cache_level)
> > > @@ -536,7 +539,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
> > >  
> > >  }
> > >  
> > > -static void i915_ggtt_clear_range(struct drm_device *dev,
> > > +static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > >  				  unsigned int first_entry,
> > >  				  unsigned int num_entries)
> > >  {
> > > @@ -549,10 +552,11 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> > >  
> > > -	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
> > > -					 i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -					 cache_level);
> > > +	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> > > +					  entry,
> > > +					  cache_level);
> > >  
> > >  	obj->has_global_gtt_mapping = 1;
> > >  }
> > > @@ -561,10 +565,11 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> > >  
> > > -	dev_priv->gtt.gtt_clear_range(obj->base.dev,
> > > -				      i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -				      obj->base.size >> PAGE_SHIFT);
> > > +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > +				       entry,
> > > +				       obj->base.size >> PAGE_SHIFT);
> > >  
> > >  	obj->has_global_gtt_mapping = 0;
> > >  }
> > > @@ -641,20 +646,23 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  		obj->has_global_gtt_mapping = 1;
> > >  	}
> > >  
> > > -	dev_priv->gtt.start = start;
> > > -	dev_priv->gtt.total = end - start;
> > > +	dev_priv->gtt.base.start = start;
> > > +	dev_priv->gtt.base.total = end - start;
> > >  
> > >  	/* Clear any non-preallocated blocks */
> > >  	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
> > >  			     hole_start, hole_end) {
> > > +		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> > >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> > >  			      hole_start, hole_end);
> > > -		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
> > > -					      (hole_end-hole_start) / PAGE_SIZE);
> > > +		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > +					       hole_start / PAGE_SIZE,
> > > +					       count);
> > >  	}
> > >  
> > >  	/* And finally clear the reserved guard page */
> > > -	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
> > > +	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > +				       end / PAGE_SIZE - 1, 1);
> > >  }
> > >  
> > >  static bool
> > > @@ -677,7 +685,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	unsigned long gtt_size, mappable_size;
> > >  
> > > -	gtt_size = dev_priv->gtt.total;
> > > +	gtt_size = dev_priv->gtt.base.total;
> > >  	mappable_size = dev_priv->gtt.mappable_end;
> > >  
> > >  	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
> > > @@ -722,8 +730,8 @@ static int setup_scratch_page(struct drm_device *dev)
> > >  #else
> > >  	dma_addr = page_to_phys(page);
> > >  #endif
> > > -	dev_priv->gtt.scratch.page = page;
> > > -	dev_priv->gtt.scratch.addr = dma_addr;
> > > +	dev_priv->gtt.base.scratch.page = page;
> > > +	dev_priv->gtt.base.scratch.addr = dma_addr;
> > >  
> > >  	return 0;
> > >  }
> > > @@ -731,11 +739,13 @@ static int setup_scratch_page(struct drm_device *dev)
> > >  static void teardown_scratch_page(struct drm_device *dev)
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	set_pages_wb(dev_priv->gtt.scratch.page, 1);
> > > -	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
> > > +	struct page *page = dev_priv->gtt.base.scratch.page;
> > > +
> > > +	set_pages_wb(page, 1);
> > > +	pci_unmap_page(dev->pdev, dev_priv->gtt.base.scratch.addr,
> > >  		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> > > -	put_page(dev_priv->gtt.scratch.page);
> > > -	__free_page(dev_priv->gtt.scratch.page);
> > > +	put_page(page);
> > > +	__free_page(page);
> > >  }
> > >  
> > >  static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
> > > @@ -798,17 +808,18 @@ static int gen6_gmch_probe(struct drm_device *dev,
> > >  	if (ret)
> > >  		DRM_ERROR("Scratch setup failed\n");
> > >  
> > > -	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
> > > -	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
> > > +	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> > > +	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
> > >  
> > >  	return ret;
> > >  }
> > >  
> > > -static void gen6_gmch_remove(struct drm_device *dev)
> > > +static void gen6_gmch_remove(struct i915_address_space *vm)
> > >  {
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	iounmap(dev_priv->gtt.gsm);
> > > -	teardown_scratch_page(dev_priv->dev);
> > > +
> > > +	struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
> > > +	iounmap(gtt->gsm);
> > > +	teardown_scratch_page(vm->dev);
> > >  }
> > >  
> > >  static int i915_gmch_probe(struct drm_device *dev,
> > > @@ -829,13 +840,13 @@ static int i915_gmch_probe(struct drm_device *dev,
> > >  	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
> > >  
> > >  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> > > -	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
> > > -	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
> > > +	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> > > +	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
> > >  
> > >  	return 0;
> > >  }
> > >  
> > > -static void i915_gmch_remove(struct drm_device *dev)
> > > +static void i915_gmch_remove(struct i915_address_space *vm)
> > >  {
> > >  	intel_gmch_remove();
> > >  }
> > > @@ -848,25 +859,28 @@ int i915_gem_gtt_init(struct drm_device *dev)
> > >  
> > >  	if (INTEL_INFO(dev)->gen <= 5) {
> > >  		gtt->gtt_probe = i915_gmch_probe;
> > > -		gtt->gtt_remove = i915_gmch_remove;
> > > +		gtt->base.cleanup = i915_gmch_remove;
> > >  	} else {
> > >  		gtt->gtt_probe = gen6_gmch_probe;
> > > -		gtt->gtt_remove = gen6_gmch_remove;
> > > +		gtt->base.cleanup = gen6_gmch_remove;
> > >  		if (IS_HASWELL(dev))
> > > -			gtt->pte_encode = hsw_pte_encode;
> > > +			gtt->base.pte_encode = hsw_pte_encode;
> > >  		else if (IS_VALLEYVIEW(dev))
> > > -			gtt->pte_encode = byt_pte_encode;
> > > +			gtt->base.pte_encode = byt_pte_encode;
> > >  		else
> > > -			gtt->pte_encode = gen6_pte_encode;
> > > +			gtt->base.pte_encode = gen6_pte_encode;
> > >  	}
> > >  
> > > -	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
> > > +	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
> > >  			     &gtt->mappable_base, &gtt->mappable_end);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > +	gtt->base.dev = dev;
> > > +
> > >  	/* GMADR is the PCI mmio aperture into the global GTT. */
> > > -	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
> > > +	DRM_INFO("Memory usable by graphics device = %zdM\n",
> > > +		 gtt->base.total >> 20);
> > >  	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
> > >  	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
> > >  
> > > -- 
> > > 1.8.3.2
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-10 16:37     ` Ben Widawsky
@ 2013-07-10 17:05       ` Daniel Vetter
  2013-07-10 22:23         ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-10 17:05 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 10, 2013 at 09:37:10AM -0700, Ben Widawsky wrote:
> On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> > > This patch was formerly known as:
> > > "drm/i915: Create VMAs (part 3) - plumbing"
> > > 
> > > This patch adds a VM argument, bind/unbind, and the object
> > > offset/size/color getters/setters. It preserves the old ggtt helper
> > > functions because things still need, and will continue to need them.
> > > 
> > > Some code will still need to be ported over after this.
> > > 
> > > v2: Fix purge to pick an object and unbind all vmas
> > > This was doable because of the global bound list change.
> > > 
> > > v3: With the commit to actually pin/unpin pages in place, there is no
> > > longer a need to check if unbind succeeded before calling put_pages().
> > > Make put_pages only BUG() after checking pin count.
> > > 
> > > v4: Rebased on top of the new hangcheck work by Mika
> > > plumbed eb_destroy also
> > > Many checkpatch related fixes
> > > 
> > > v5: Very large rebase
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > This one is a rather large beast. Any chance we could split it into
> > topics, e.g. convert execbuf code, convert shrinker code? Or does that get
> > messy, fast?
> > 
> 
> I've thought of this...
> 
> The one solution I came up with is to have two bind/unbind functions
> (similar to what I did with pin, and indeed it was my original plan with
> pin), and do the set_caching one separately.
> 
> I think it won't be too messy, just a lot of typing, as Keith likes to
> say.
> 
> However, my opinion was, since it's early in the merge cycle, we don't
> yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's
> not a big deal. At a functional level too, I felt this made more sense.
> 
> So I'll defer to your request on this and start splitting it up, unless
> my email has changed your mind ;-).

Well, my concern is mostly in reviewing since we need to think about each
case and whether it makes sense to talk in terms of vma or objects in
that function. And what exactly to test.

If you've played around and concluded it'll be a mess then I don't think
it'll help in reviewing. So pointless.

Still, there's a bunch of questions on this patch that we need to discuss
;-)

Cheers, Daniel

> 
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c        |  31 ++-
> > >  drivers/gpu/drm/i915/i915_dma.c            |   4 -
> > >  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++-----
> > >  drivers/gpu/drm/i915/i915_gem.c            | 316 +++++++++++++++++++++--------
> > >  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
> > >  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
> > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
> > >  drivers/gpu/drm/i915/i915_gem_stolen.c     |   6 +-
> > >  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
> > >  drivers/gpu/drm/i915/i915_irq.c            |   6 +-
> > >  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
> > >  drivers/gpu/drm/i915/intel_fb.c            |   1 -
> > >  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
> > >  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
> > >  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
> > >  16 files changed, 468 insertions(+), 239 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > index 16b2aaf..867ed07 100644
> > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> > >  		seq_printf(m, " (pinned x %d)", obj->pin_count);
> > >  	if (obj->fence_reg != I915_FENCE_REG_NONE)
> > >  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> > > -	if (i915_gem_obj_ggtt_bound(obj))
> > > -		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> > > -			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> > > +	if (i915_gem_obj_bound_any(obj)) {
> > 
> > list_for_each will short-circuit already, so this is redundant.
> > 
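Roughly, the whole guard could then go away (illustrative sketch only, reusing
the helpers this patch introduces):

	struct i915_vma *vma;

	/* An empty vma_list just means the loop body never runs, so no
	 * separate bound_any() check is needed. */
	list_for_each_entry(vma, &obj->vma_list, vma_link)
		seq_printf(m, " (%s offset: %08lx, size: %08lx)",
			   i915_is_ggtt(vma->vm) ? "ggtt" : "ppgtt",
			   i915_gem_obj_offset(obj, vma->vm),
			   i915_gem_obj_size(obj, vma->vm));
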
> > > +		struct i915_vma *vma;
> > > +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> > > +			if (!i915_is_ggtt(vma->vm))
> > > +				seq_puts(m, " (pp");
> > > +			else
> > > +				seq_puts(m, " (g");
> > > +			seq_printf(m, " gtt offset: %08lx, size: %08lx)",
> > 
> >                                        ^ that space looks superfluous now
> > 
> > > +				   i915_gem_obj_offset(obj, vma->vm),
> > > +				   i915_gem_obj_size(obj, vma->vm));
> > > +		}
> > > +	}
> > >  	if (obj->stolen)
> > >  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
> > >  	if (obj->pin_mappable || obj->fault_mappable) {
> > > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > >  	return 0;
> > >  }
> > >  
> > > +/* FIXME: Support multiple VM? */
> > >  #define count_objects(list, member) do { \
> > >  	list_for_each_entry(obj, list, member) { \
> > >  		size += i915_gem_obj_ggtt_size(obj); \
> > > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val)
> > >  
> > >  	if (val & DROP_BOUND) {
> > >  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > > -					 mm_list)
> > > -			if (obj->pin_count == 0) {
> > > -				ret = i915_gem_object_unbind(obj);
> > > -				if (ret)
> > > -					goto unlock;
> > > -			}
> > > +					 mm_list) {
> > > +			if (obj->pin_count)
> > > +				continue;
> > > +
> > > +			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> > > +			if (ret)
> > > +				goto unlock;
> > > +		}
> > >  	}
> > >  
> > >  	if (val & DROP_UNBOUND) {
> > >  		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
> > >  					 global_list)
> > >  			if (obj->pages_pin_count == 0) {
> > > +				/* FIXME: Do this for all vms? */
> > >  				ret = i915_gem_object_put_pages(obj);
> > >  				if (ret)
> > >  					goto unlock;
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index d13e21f..b190439 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> > >  
> > >  	i915_dump_device_info(dev_priv);
> > >  
> > > -	INIT_LIST_HEAD(&dev_priv->vm_list);
> > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> > > -	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> > > -
> > >  	if (i915_get_bridge_dev(dev)) {
> > >  		ret = -EIO;
> > >  		goto free_priv;
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 38cccc8..48baccc 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object {
> > >  
> > >  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
> > >  
> > > -/* This is a temporary define to help transition us to real VMAs. If you see
> > > - * this, you're either reviewing code, or bisecting it. */
> > > -static inline struct i915_vma *
> > > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> > > -{
> > > -	if (list_empty(&obj->vma_list))
> > > -		return NULL;
> > > -	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> > > -}
> > > -
> > > -/* Whether or not this object is currently mapped by the translation tables */
> > > -static inline bool
> > > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> > > -{
> > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> > > -	if (vma == NULL)
> > > -		return false;
> > > -	return drm_mm_node_allocated(&vma->node);
> > > -}
> > > -
> > > -/* Offset of the first PTE pointing to this object */
> > > -static inline unsigned long
> > > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> > > -{
> > > -	BUG_ON(list_empty(&o->vma_list));
> > > -	return __i915_gem_obj_to_vma(o)->node.start;
> > > -}
> > > -
> > > -/* The size used in the translation tables may be larger than the actual size of
> > > - * the object on GEN2/GEN3 because of the way tiling is handled. See
> > > - * i915_gem_get_gtt_size() for more details.
> > > - */
> > > -static inline unsigned long
> > > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> > > -{
> > > -	BUG_ON(list_empty(&o->vma_list));
> > > -	return __i915_gem_obj_to_vma(o)->node.size;
> > > -}
> > > -
> > > -static inline void
> > > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> > > -			    enum i915_cache_level color)
> > > -{
> > > -	__i915_gem_obj_to_vma(o)->node.color = color;
> > > -}
> > > -
> > >  /**
> > >   * Request queue structure.
> > >   *
> > > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > >  void i915_gem_vma_destroy(struct i915_vma *vma);
> > >  
> > >  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > +				     struct i915_address_space *vm,
> > >  				     uint32_t alignment,
> > >  				     bool map_and_fenceable,
> > >  				     bool nonblocking);
> > >  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> > > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> > > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > > +					struct i915_address_space *vm);
> > >  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
> > >  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
> > >  void i915_gem_lastclose(struct drm_device *dev);
> > > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> > >  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> > >  			 struct intel_ring_buffer *to);
> > >  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > +				    struct i915_address_space *vm,
> > >  				    struct intel_ring_buffer *ring);
> > >  
> > >  int i915_gem_dumb_create(struct drm_file *file_priv,
> > > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
> > >  			    int tiling_mode, bool fenced);
> > >  
> > >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > +				    struct i915_address_space *vm,
> > >  				    enum i915_cache_level cache_level);
> > >  
> > >  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> > > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
> > >  
> > >  void i915_gem_restore_fences(struct drm_device *dev);
> > >  
> > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > +				  struct i915_address_space *vm);
> > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > +			struct i915_address_space *vm);
> > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > +				struct i915_address_space *vm);
> > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > > +			    struct i915_address_space *vm,
> > > +			    enum i915_cache_level color);
> > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > +				     struct i915_address_space *vm);
> > > +/* Some GGTT VM helpers */
> > > +#define obj_to_ggtt(obj) \
> > > +	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> > > +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> > > +{
> > > +	struct i915_address_space *ggtt =
> > > +		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> > > +	return vm == ggtt;
> > > +}
> > > +
> > > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> > > +{
> > > +	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> > > +}
> > > +
> > > +static inline unsigned long
> > > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> > > +{
> > > +	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> > > +}
> > > +
> > > +static inline unsigned long
> > > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> > > +{
> > > +	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> > > +}
> > > +
> > > +static inline int __must_check
> > > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> > > +		  uint32_t alignment,
> > > +		  bool map_and_fenceable,
> > > +		  bool nonblocking)
> > > +{
> > > +	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> > > +				   map_and_fenceable, nonblocking);
> > > +}
> > > +#undef obj_to_ggtt
> > > +
> > >  /* i915_gem_context.c */
> > >  void i915_gem_context_init(struct drm_device *dev);
> > >  void i915_gem_context_fini(struct drm_device *dev);
> > > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > >  
> > >  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
> > >  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> > > +/* FIXME: this is never okay with full PPGTT */
> > >  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > >  				enum i915_cache_level cache_level);
> > >  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> > > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
> > >  
> > >  
> > >  /* i915_gem_evict.c */
> > > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > +int __must_check i915_gem_evict_something(struct drm_device *dev,
> > > +					  struct i915_address_space *vm,
> > > +					  int min_size,
> > >  					  unsigned alignment,
> > >  					  unsigned cache_level,
> > >  					  bool mappable,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 058ad44..21015cd 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -38,10 +38,12 @@
> > >  
> > >  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
> > >  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > -						    unsigned alignment,
> > > -						    bool map_and_fenceable,
> > > -						    bool nonblocking);
> > > +static __must_check int
> > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > +			    struct i915_address_space *vm,
> > > +			    unsigned alignment,
> > > +			    bool map_and_fenceable,
> > > +			    bool nonblocking);
> > >  static int i915_gem_phys_pwrite(struct drm_device *dev,
> > >  				struct drm_i915_gem_object *obj,
> > >  				struct drm_i915_gem_pwrite *args,
> > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> > >  static inline bool
> > >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> > >  {
> > > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> > >  }
> > >  
> > >  int
> > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> > >  		 * anyway again before the next pread happens. */
> > >  		if (obj->cache_level == I915_CACHE_NONE)
> > >  			needs_clflush = 1;
> > > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > > +		if (i915_gem_obj_bound_any(obj)) {
> > >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> > 
> > This is essentially a very convoluted version of "if there's gpu rendering
> > outstanding, please wait for it". Maybe we should switch this to
> > 
> > 	if (obj->active)
> > 		wait_rendering(obj, true);
> > 
> > Same for the shmem_pwrite case below. Would be a separate patch to prep
> > things though. Can I volunteer you for that? The ugly part is to review
> > whether any of the lru list updating that set_domain does in addition to
> > wait_rendering is required, but on a quick read that's not the case.
> > 
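Spelled out a bit more (rough sketch, untested; assumes the existing
i915_gem_object_wait_rendering(obj, readonly) helper), the pread side would
then be:

	/* Just wait for outstanding rendering instead of doing a full
	 * GTT domain transition. */
	if (obj->active) {
		ret = i915_gem_object_wait_rendering(obj, true);
		if (ret)
			return ret;
	}
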
> > >  			if (ret)
> > >  				return ret;
> > > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> > >  	char __user *user_data;
> > >  	int page_offset, page_length, ret;
> > >  
> > > -	ret = i915_gem_object_pin(obj, 0, true, true);
> > > +	ret = i915_gem_ggtt_pin(obj, 0, true, true);
> > >  	if (ret)
> > >  		goto out;
> > >  
> > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
> > >  		 * right away and we therefore have to clflush anyway. */
> > >  		if (obj->cache_level == I915_CACHE_NONE)
> > >  			needs_clflush_after = 1;
> > > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > > +		if (i915_gem_obj_bound_any(obj)) {
> > 
> > ... see above.
> > >  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
> > >  			if (ret)
> > >  				return ret;
> > > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > >  	}
> > >  
> > >  	/* Now bind it into the GTT if needed */
> > > -	ret = i915_gem_object_pin(obj, 0, true, false);
> > > +	ret = i915_gem_ggtt_pin(obj,  0, true, false);
> > >  	if (ret)
> > >  		goto unlock;
> > >  
> > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> > >  	if (obj->pages == NULL)
> > >  		return 0;
> > >  
> > > -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> > > -
> > >  	if (obj->pages_pin_count)
> > >  		return -EBUSY;
> > >  
> > > +	BUG_ON(i915_gem_obj_bound_any(obj));
> > > +
> > >  	/* ->put_pages might need to allocate memory for the bit17 swizzle
> > >  	 * array, hence protect them from being reaped by removing them from gtt
> > >  	 * lists early. */
> > > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> > >  		  bool purgeable_only)
> > >  {
> > >  	struct drm_i915_gem_object *obj, *next;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	long count = 0;
> > >  
> > >  	list_for_each_entry_safe(obj, next,
> > > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> > >  		}
> > >  	}
> > >  
> > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> > > -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> > > -		    i915_gem_object_unbind(obj) == 0 &&
> > > -		    i915_gem_object_put_pages(obj) == 0) {
> > > +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> > > +				 global_list) {
> > > +		struct i915_vma *vma, *v;
> > > +
> > > +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> > > +			continue;
> > > +
> > > +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> > > +			if (i915_gem_object_unbind(obj, vma->vm))
> > > +				break;
> > > +
> > > +		if (!i915_gem_object_put_pages(obj))
> > >  			count += obj->base.size >> PAGE_SHIFT;
> > > -			if (count >= target)
> > > -				return count;
> > > -		}
> > > +
> > > +		if (count >= target)
> > > +			return count;
> > >  	}
> > >  
> > >  	return count;
> > > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > >  
> > >  void
> > >  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > +			       struct i915_address_space *vm,
> > >  			       struct intel_ring_buffer *ring)
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	u32 seqno = intel_ring_get_seqno(ring);
> > >  
> > >  	BUG_ON(ring == NULL);
> > > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > >  }
> > >  
> > >  static void
> > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > > +				 struct i915_address_space *vm)
> > >  {
> > > -	struct drm_device *dev = obj->base.dev;
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > -
> > >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > >  	BUG_ON(!obj->active);
> > >  
> > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> > >  	spin_unlock(&file_priv->mm.lock);
> > >  }
> > >  
> > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> > > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> > > +				    struct i915_address_space *vm)
> > >  {
> > > -	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> > > -	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> > > +	if (acthd >= i915_gem_obj_offset(obj, vm) &&
> > > +	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
> > >  		return true;
> > >  
> > >  	return false;
> > > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
> > >  	return false;
> > >  }
> > >  
> > > +static struct i915_address_space *
> > > +request_to_vm(struct drm_i915_gem_request *request)
> > > +{
> > > +	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> > > +	struct i915_address_space *vm;
> > > +
> > > +	vm = &dev_priv->gtt.base;
> > > +
> > > +	return vm;
> > > +}
> > > +
> > >  static bool i915_request_guilty(struct drm_i915_gem_request *request,
> > >  				const u32 acthd, bool *inside)
> > >  {
> > > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
> > >  	 * pointing inside the ring, matches the batch_obj address range.
> > >  	 * However this is extremely unlikely.
> > >  	 */
> > > -
> > >  	if (request->batch_obj) {
> > > -		if (i915_head_inside_object(acthd, request->batch_obj)) {
> > > +		if (i915_head_inside_object(acthd, request->batch_obj,
> > > +					    request_to_vm(request))) {
> > >  			*inside = true;
> > >  			return true;
> > >  		}
> > > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
> > >  {
> > >  	struct i915_ctx_hang_stats *hs = NULL;
> > >  	bool inside, guilty;
> > > +	unsigned long offset = 0;
> > >  
> > >  	/* Innocent until proven guilty */
> > >  	guilty = false;
> > >  
> > > +	if (request->batch_obj)
> > > +		offset = i915_gem_obj_offset(request->batch_obj,
> > > +					     request_to_vm(request));
> > > +
> > >  	if (ring->hangcheck.action != wait &&
> > >  	    i915_request_guilty(request, acthd, &inside)) {
> > >  		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
> > >  			  ring->name,
> > >  			  inside ? "inside" : "flushing",
> > > -			  request->batch_obj ?
> > > -			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> > > +			  offset,
> > >  			  request->ctx ? request->ctx->id : 0,
> > >  			  acthd);
> > >  
> > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> > >  	}
> > >  
> > >  	while (!list_empty(&ring->active_list)) {
> > > +		struct i915_address_space *vm;
> > >  		struct drm_i915_gem_object *obj;
> > >  
> > >  		obj = list_first_entry(&ring->active_list,
> > >  				       struct drm_i915_gem_object,
> > >  				       ring_list);
> > >  
> > > -		i915_gem_object_move_to_inactive(obj);
> > > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > +			i915_gem_object_move_to_inactive(obj, vm);
> > >  	}
> > >  }
> > >  
> > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
> > >  void i915_gem_reset(struct drm_device *dev)
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > >  	struct drm_i915_gem_object *obj;
> > >  	struct intel_ring_buffer *ring;
> > >  	int i;
> > > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev)
> > >  	/* Move everything out of the GPU domains to ensure we do any
> > >  	 * necessary invalidation upon reuse.
> > >  	 */
> > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > >  
> > >  	i915_gem_restore_fences(dev);
> > >  }
> > > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> > >  	 */
> > >  	while (!list_empty(&ring->active_list)) {
> > > +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > +		struct i915_address_space *vm;
> > >  		struct drm_i915_gem_object *obj;
> > >  
> > >  		obj = list_first_entry(&ring->active_list,
> > > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> > >  			break;
> > >  
> > > -		i915_gem_object_move_to_inactive(obj);
> > > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > +			i915_gem_object_move_to_inactive(obj, vm);
> > >  	}
> > >  
> > >  	if (unlikely(ring->trace_irq_seqno &&
> > > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> > >   * Unbinds an object from the GTT aperture.
> > >   */
> > >  int
> > > -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > > +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > > +		       struct i915_address_space *vm)
> > >  {
> > >  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> > >  	struct i915_vma *vma;
> > >  	int ret;
> > >  
> > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > +	if (!i915_gem_obj_bound(obj, vm))
> > >  		return 0;
> > >  
> > >  	if (obj->pin_count)
> > > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > -	trace_i915_gem_object_unbind(obj);
> > > +	trace_i915_gem_object_unbind(obj, vm);
> > >  
> > >  	if (obj->has_global_gtt_mapping)
> > >  		i915_gem_gtt_unbind_object(obj);
> > > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > >  	/* Avoid an unnecessary call to unbind on rebind. */
> > >  	obj->map_and_fenceable = true;
> > >  
> > > -	vma = __i915_gem_obj_to_vma(obj);
> > > +	vma = i915_gem_obj_to_vma(obj, vm);
> > >  	list_del(&vma->vma_link);
> > >  	drm_mm_remove_node(&vma->node);
> > >  	i915_gem_vma_destroy(vma);
> > > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
> > >  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
> > >  		     i915_gem_obj_ggtt_offset(obj), size);
> > >  
> > > +
> > >  		pitch_val = obj->stride / 128;
> > >  		pitch_val = ffs(pitch_val) - 1;
> > >  
> > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> > >   */
> > >  static int
> > >  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > +			    struct i915_address_space *vm,
> > >  			    unsigned alignment,
> > >  			    bool map_and_fenceable,
> > >  			    bool nonblocking)
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> > >  	bool mappable, fenceable;
> > > -	size_t gtt_max = map_and_fenceable ?
> > > -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> > > +	size_t gtt_max =
> > > +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
> > >  	struct i915_vma *vma;
> > >  	int ret;
> > >  
> > >  	if (WARN_ON(!list_empty(&obj->vma_list)))
> > >  		return -EBUSY;
> > >  
> > > +	BUG_ON(!i915_is_ggtt(vm));
> > > +
> > >  	fence_size = i915_gem_get_gtt_size(dev,
> > >  					   obj->base.size,
> > >  					   obj->tiling_mode);
> > > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > >  	i915_gem_object_pin_pages(obj);
> > >  
> > >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > +	/* For now we only ever use 1 vma per object */
> > > +	WARN_ON(!list_empty(&obj->vma_list));
> > > +
> > > +	vma = i915_gem_vma_create(obj, vm);
> > >  	if (vma == NULL) {
> > >  		i915_gem_object_unpin_pages(obj);
> > >  		return -ENOMEM;
> > >  	}
> > >  
> > >  search_free:
> > > -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > > -						  &vma->node,
> > > +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
> > >  						  size, alignment,
> > >  						  obj->cache_level, 0, gtt_max);
> > >  	if (ret) {
> > > -		ret = i915_gem_evict_something(dev, size, alignment,
> > > +		ret = i915_gem_evict_something(dev, vm, size, alignment,
> > >  					       obj->cache_level,
> > >  					       map_and_fenceable,
> > >  					       nonblocking);
> > > @@ -3162,18 +3197,25 @@ search_free:
> > >  
> > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > > -	list_add(&vma->vma_link, &obj->vma_list);
> > > +
> > > +	/* Keep GGTT vmas first to make debug easier */
> > > +	if (i915_is_ggtt(vm))
> > > +		list_add(&vma->vma_link, &obj->vma_list);
> > > +	else
> > > +		list_add_tail(&vma->vma_link, &obj->vma_list);
> > >  
> > >  	fenceable =
> > > +		i915_is_ggtt(vm) &&
> > >  		i915_gem_obj_ggtt_size(obj) == fence_size &&
> > >  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
> > >  
> > > -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> > > -		dev_priv->gtt.mappable_end;
> > > +	mappable =
> > > +		i915_is_ggtt(vm) &&
> > > +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
> > >  
> > >  	obj->map_and_fenceable = mappable && fenceable;
> > >  
> > > -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> > > +	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
> > >  	i915_gem_verify_gtt(dev);
> > >  	return 0;
> > >  }
> > > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > >  	int ret;
> > >  
> > >  	/* Not valid to be called on unbound objects. */
> > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > +	if (!i915_gem_obj_bound_any(obj))
> > >  		return -EINVAL;
> > 
> > If we're converting the shmem paths over to wait_rendering then there's
> > only the fault handler and the set_domain ioctl left. For the latter it
> > would make sense to clflush even when an object is on the unbound list, to
> > allow userspace to optimize when the clflushing happens. But that would
> > only make sense in conjunction with Chris' create2 ioctl and a flag to
> > preallocate the storage (and so putting the object onto the unbound list).
> > So nothing to do here.
> > 
> > >  
> > >  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> > > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > >  }
> > >  
> > >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > +				    struct i915_address_space *vm,
> > >  				    enum i915_cache_level cache_level)
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > >  	int ret;
> > >  
> > >  	if (obj->cache_level == cache_level)
> > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >  	}
> > >  
> > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > -		ret = i915_gem_object_unbind(obj);
> > > +		ret = i915_gem_object_unbind(obj, vm);
> > >  		if (ret)
> > >  			return ret;
> > >  	}
> > >  
> > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		if (!i915_gem_obj_bound(obj, vm))
> > > +			continue;
> > 
> > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > on?
> > 
> > Self-correction: It exists already ... why can't we use this here?
> > 
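I.e. the loop here could walk the object's own vma list directly (rough
sketch, untested), which also makes the bound check implicit:

	struct i915_vma *vma;

	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		/* existing body unchanged; a vma on this list implies the
		 * object is bound into that vm, so the i915_gem_obj_bound()
		 * check disappears */
	}
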
> > > +
> > >  		ret = i915_gem_object_finish_gpu(obj);
> > >  		if (ret)
> > >  			return ret;
> > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > >  					       obj, cache_level);
> > >  
> > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > >  	}
> > >  
> > >  	if (cache_level == I915_CACHE_NONE) {
> > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > >  			       struct drm_file *file)
> > >  {
> > >  	struct drm_i915_gem_caching *args = data;
> > > +	struct drm_i915_private *dev_priv;
> > >  	struct drm_i915_gem_object *obj;
> > >  	enum i915_cache_level level;
> > >  	int ret;
> > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > >  		ret = -ENOENT;
> > >  		goto unlock;
> > >  	}
> > > +	dev_priv = obj->base.dev->dev_private;
> > >  
> > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > +	/* FIXME: Add interface for specific VM? */
> > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > >  
> > >  	drm_gem_object_unreference(&obj->base);
> > >  unlock:
> > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  				     u32 alignment,
> > >  				     struct intel_ring_buffer *pipelined)
> > >  {
> > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > >  	u32 old_read_domains, old_write_domain;
> > >  	int ret;
> > >  
> > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > >  	 */
> > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > +					      I915_CACHE_NONE);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > >  	 * always use map_and_fenceable for all scanout buffers.
> > >  	 */
> > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > >  
> > >  int
> > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > +		    struct i915_address_space *vm,
> > >  		    uint32_t alignment,
> > >  		    bool map_and_fenceable,
> > >  		    bool nonblocking)
> > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > >  		return -EBUSY;
> > >  
> > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > 
> > WARN_ON, since presumably we can keep on going if we get this wrong
> > (albeit with slightly corrupted state, so render corruptions might
> > follow).
> > 
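I.e. (sketch):

	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
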
> > > +
> > > +	if (i915_gem_obj_bound(obj, vm)) {
> > > +		if ((alignment &&
> > > +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> > >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> > >  			WARN(obj->pin_count,
> > >  			     "bo is already pinned with incorrect alignment:"
> > >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> > >  			     " obj->map_and_fenceable=%d\n",
> > > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > > +			     i915_gem_obj_offset(obj, vm), alignment,
> > >  			     map_and_fenceable,
> > >  			     obj->map_and_fenceable);
> > > -			ret = i915_gem_object_unbind(obj);
> > > +			ret = i915_gem_object_unbind(obj, vm);
> > >  			if (ret)
> > >  				return ret;
> > >  		}
> > >  	}
> > >  
> > > -	if (!i915_gem_obj_ggtt_bound(obj)) {
> > > +	if (!i915_gem_obj_bound(obj, vm)) {
> > >  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > >  
> > > -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> > > +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
> > >  						  map_and_fenceable,
> > >  						  nonblocking);
> > >  		if (ret)
> > > @@ -3684,7 +3739,7 @@ void
> > >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> > >  {
> > >  	BUG_ON(obj->pin_count == 0);
> > > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> > >  
> > >  	if (--obj->pin_count == 0)
> > >  		obj->pin_mappable = false;
> > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
> > >  	}
> > >  
> > >  	if (obj->user_pin_count == 0) {
> > > -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> > > +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
> > >  		if (ret)
> > >  			goto out;
> > >  	}
> > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > >  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > +	struct i915_vma *vma, *next;
> > >  
> > >  	trace_i915_gem_object_destroy(obj);
> > >  
> > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > >  		i915_gem_detach_phys_object(dev, obj);
> > >  
> > >  	obj->pin_count = 0;
> > > -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> > > -		bool was_interruptible;
> > > +	/* NB: 0 or 1 elements */
> > > +	WARN_ON(!list_empty(&obj->vma_list) &&
> > > +		!list_is_singular(&obj->vma_list));
> > > +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > > +		int ret = i915_gem_object_unbind(obj, vma->vm);
> > > +		if (WARN_ON(ret == -ERESTARTSYS)) {
> > > +			bool was_interruptible;
> > >  
> > > -		was_interruptible = dev_priv->mm.interruptible;
> > > -		dev_priv->mm.interruptible = false;
> > > +			was_interruptible = dev_priv->mm.interruptible;
> > > +			dev_priv->mm.interruptible = false;
> > >  
> > > -		WARN_ON(i915_gem_object_unbind(obj));
> > > +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
> > >  
> > > -		dev_priv->mm.interruptible = was_interruptible;
> > > +			dev_priv->mm.interruptible = was_interruptible;
> > > +		}
> > >  	}
> > >  
> > >  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
> > >  	INIT_LIST_HEAD(&ring->request_list);
> > >  }
> > >  
> > > +static void i915_init_vm(struct drm_i915_private *dev_priv,
> > > +			 struct i915_address_space *vm)
> > > +{
> > > +	vm->dev = dev_priv->dev;
> > > +	INIT_LIST_HEAD(&vm->active_list);
> > > +	INIT_LIST_HEAD(&vm->inactive_list);
> > > +	INIT_LIST_HEAD(&vm->global_link);
> > > +	list_add(&vm->global_link, &dev_priv->vm_list);
> > > +}
> > > +
> > >  void
> > >  i915_gem_load(struct drm_device *dev)
> > >  {
> > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
> > >  				  SLAB_HWCACHE_ALIGN,
> > >  				  NULL);
> > >  
> > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> > > +	INIT_LIST_HEAD(&dev_priv->vm_list);
> > > +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> > > +
> > >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> > >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> > >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > >  			     struct drm_i915_private,
> > >  			     mm.inactive_shrinker);
> > >  	struct drm_device *dev = dev_priv->dev;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > >  	struct drm_i915_gem_object *obj;
> > > -	int nr_to_scan = sc->nr_to_scan;
> > > +	int nr_to_scan;
> > >  	bool unlock = true;
> > >  	int cnt;
> > >  
> > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > >  		unlock = false;
> > >  	}
> > >  
> > > +	nr_to_scan = sc->nr_to_scan;
> > >  	if (nr_to_scan) {
> > >  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
> > >  		if (nr_to_scan > 0)
> > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > >  		if (obj->pages_pin_count == 0)
> > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > +
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > 
> > Isn't this now double-counting objects? In the shrinker we only care about
> > how much physical RAM an object occupies, not how much virtual space it
> > occupies. So just walking the bound list of objects here should be good
> > enough ...
> > 
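Something like this, counting each object exactly once via the global bound
list (illustrative, untested):

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;
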
> > >  
> > >  	if (unlock)
> > >  		mutex_unlock(&dev->struct_mutex);
> > >  	return cnt;
> > >  }
> > > +
> > > +/* All the new VM stuff */
> > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > +				  struct i915_address_space *vm)
> > > +{
> > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > +	struct i915_vma *vma;
> > > +
> > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > +		vm = &dev_priv->gtt.base;
> > > +
> > > +	BUG_ON(list_empty(&o->vma_list));
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > 
> > Imo the vma list walking here and in the other helpers below indicates
> > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > is this again something that'll get fixed later on?
> > 
> > I just want to avoid diff churn, and it also makes reviewing easier if the
> > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > sprinkling of obj_to_vma in callers.
> > 
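E.g. with the lookup pushed into obj_to_vma, the offset helper shrinks to
something like (sketch, ignoring the aliasing ppgtt remap for brevity):

unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
				  struct i915_address_space *vm)
{
	struct i915_vma *vma = i915_gem_obj_to_vma(o, vm);

	return vma ? vma->node.start : -1;
}
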
> > > +		if (vma->vm == vm)
> > > +			return vma->node.start;
> > > +
> > > +	}
> > > +	return -1;
> > > +}
> > > +
> > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> > > +{
> > > +	return !list_empty(&o->vma_list);
> > > +}
> > > +
> > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > +			struct i915_address_space *vm)
> > > +{
> > > +	struct i915_vma *vma;
> > > +
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > +		if (vma->vm == vm)
> > > +			return true;
> > > +	}
> > > +	return false;
> > > +}
> > > +
> > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > +				struct i915_address_space *vm)
> > > +{
> > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > +	struct i915_vma *vma;
> > > +
> > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > +		vm = &dev_priv->gtt.base;
> > > +	BUG_ON(list_empty(&o->vma_list));
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > +		if (vma->vm == vm)
> > > +			return vma->node.size;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > > +			    struct i915_address_space *vm,
> > > +			    enum i915_cache_level color)
> > > +{
> > > +	struct i915_vma *vma;
> > > +	BUG_ON(list_empty(&o->vma_list));
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > +		if (vma->vm == vm) {
> > > +			vma->node.color = color;
> > > +			return;
> > > +		}
> > > +	}
> > > +
> > > +	WARN(1, "Couldn't set color for VM %p\n", vm);
> > > +}
> > > +
> > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > +				     struct i915_address_space *vm)
> > > +{
> > > +	struct i915_vma *vma;
> > > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > +		if (vma->vm == vm)
> > > +			return vma;
> > > +
> > > +	return NULL;
> > > +}
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index 2074544..c92fd81 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
> > >  
> > >  	if (INTEL_INFO(dev)->gen >= 7) {
> > >  		ret = i915_gem_object_set_cache_level(ctx->obj,
> > > +						      &dev_priv->gtt.base,
> > >  						      I915_CACHE_LLC_MLC);
> > >  		/* Failure shouldn't ever happen this early */
> > >  		if (WARN_ON(ret))
> > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> > >  	 * default context.
> > >  	 */
> > >  	dev_priv->ring[RCS].default_context = ctx;
> > > -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > > +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > >  	if (ret) {
> > >  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> > >  		goto err_destroy;
> > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
> > >  static int do_switch(struct i915_hw_context *to)
> > >  {
> > >  	struct intel_ring_buffer *ring = to->ring;
> > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > >  	struct i915_hw_context *from = ring->last_context;
> > >  	u32 hw_flags = 0;
> > >  	int ret;
> > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
> > >  	if (from == to)
> > >  		return 0;
> > >  
> > > -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
> > >  	 */
> > >  	if (from != NULL) {
> > >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > > -		i915_gem_object_move_to_active(from->obj, ring);
> > > +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> > > +					       ring);
> > >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> > >  		 * whole damn pipeline, we don't need to explicitly mark the
> > >  		 * object dirty. The only exception is that the context must be
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > index df61f33..32efdc0 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > @@ -32,24 +32,21 @@
> > >  #include "i915_trace.h"
> > >  
> > >  static bool
> > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> > >  {
> > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > -
> > > -	if (obj->pin_count)
> > > +	if (vma->obj->pin_count)
> > >  		return false;
> > >  
> > > -	list_add(&obj->exec_list, unwind);
> > > +	list_add(&vma->obj->exec_list, unwind);
> > >  	return drm_mm_scan_add_block(&vma->node);
> > >  }
> > >  
> > >  int
> > > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > -			 unsigned alignment, unsigned cache_level,
> > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > > +			 int min_size, unsigned alignment, unsigned cache_level,
> > >  			 bool mappable, bool nonblocking)
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	struct list_head eviction_list, unwind_list;
> > >  	struct i915_vma *vma;
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > >  	 */
> > >  
> > >  	INIT_LIST_HEAD(&unwind_list);
> > > -	if (mappable)
> > > +	if (mappable) {
> > > +		BUG_ON(!i915_is_ggtt(vm));
> > >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> > >  					    alignment, cache_level, 0,
> > >  					    dev_priv->gtt.mappable_end);
> > > -	else
> > > +	} else
> > >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> > >  
> > >  	/* First see if there is a large enough contiguous idle region... */
> > >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > > -		if (mark_free(obj, &unwind_list))
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +		if (mark_free(vma, &unwind_list))
> > >  			goto found;
> > >  	}
> > >  
> > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > >  
> > >  	/* Now merge in the soon-to-be-expired objects... */
> > >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > -		if (mark_free(obj, &unwind_list))
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +		if (mark_free(vma, &unwind_list))
> > >  			goto found;
> > >  	}
> > >  
> > > @@ -109,7 +109,7 @@ none:
> > >  		obj = list_first_entry(&unwind_list,
> > >  				       struct drm_i915_gem_object,
> > >  				       exec_list);
> > > -		vma = __i915_gem_obj_to_vma(obj);
> > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > >  		ret = drm_mm_scan_remove_block(&vma->node);
> > >  		BUG_ON(ret);
> > >  
> > > @@ -130,7 +130,7 @@ found:
> > >  		obj = list_first_entry(&unwind_list,
> > >  				       struct drm_i915_gem_object,
> > >  				       exec_list);
> > > -		vma = __i915_gem_obj_to_vma(obj);
> > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > >  		if (drm_mm_scan_remove_block(&vma->node)) {
> > >  			list_move(&obj->exec_list, &eviction_list);
> > >  			drm_gem_object_reference(&obj->base);
> > > @@ -145,7 +145,7 @@ found:
> > >  				       struct drm_i915_gem_object,
> > >  				       exec_list);
> > >  		if (ret == 0)
> > > -			ret = i915_gem_object_unbind(obj);
> > > +			ret = i915_gem_object_unbind(obj, vm);
> > >  
> > >  		list_del_init(&obj->exec_list);
> > >  		drm_gem_object_unreference(&obj->base);
> > > @@ -158,13 +158,18 @@ int
> > >  i915_gem_evict_everything(struct drm_device *dev)
> > 
> > I suspect evict_everything eventually wants an address_space *vm argument
> > for those cases where we only want to evict everything in a given vm. Atm
> > we have two use-cases of this:
> > - Called from the shrinker as a last-ditch effort. For that it should move
> >   _every_ object onto the unbound list.
> > - Called from execbuf for badly-fragmented address spaces to clean up the
> >   mess. For that case we only care about one address space.
> > 
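Roughly, that could look like (sketch only, name made up; idling and the
active list handling left out):

int i915_gem_evict_vm(struct drm_device *dev, struct i915_address_space *vm)
{
	struct drm_i915_gem_object *obj, *next;

	/* Unbind everything currently living in just this vm. */
	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
		if (obj->pin_count == 0)
			WARN_ON(i915_gem_object_unbind(obj, vm));

	return 0;
}

with the shrinker looping over dev_priv->vm_list and execbuf passing only the
address space it is trying to defragment.
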
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > >  	struct drm_i915_gem_object *obj, *next;
> > > -	bool lists_empty;
> > > +	bool lists_empty = true;
> > >  	int ret;
> > >  
> > > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > > -		       list_empty(&vm->active_list));
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > > +			       list_empty(&vm->active_list));
> > > +		if (!lists_empty)
> > > +			break;
> > > +	}
> > > +
> > >  	if (lists_empty)
> > >  		return -ENOSPC;
> > >  
> > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> > >  	i915_gem_retire_requests(dev);
> > >  
> > >  	/* Having flushed everything, unbind() should never raise an error */
> > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > -		if (obj->pin_count == 0)
> > > -			WARN_ON(i915_gem_object_unbind(obj));
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > +			if (obj->pin_count == 0)
> > > +				WARN_ON(i915_gem_object_unbind(obj, vm));
> > > +	}
> > >  
> > >  	return 0;
> > >  }
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index 5aeb447..e90182d 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
> > >  }
> > >  
> > >  static void
> > > -eb_destroy(struct eb_objects *eb)
> > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
> > >  {
> > >  	while (!list_empty(&eb->objects)) {
> > >  		struct drm_i915_gem_object *obj;
> > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
> > >  static int
> > >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > >  				   struct eb_objects *eb,
> > > -				   struct drm_i915_gem_relocation_entry *reloc)
> > > +				   struct drm_i915_gem_relocation_entry *reloc,
> > > +				   struct i915_address_space *vm)
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct drm_gem_object *target_obj;
> > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > >  
> > >  static int
> > >  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > -				    struct eb_objects *eb)
> > > +				    struct eb_objects *eb,
> > > +				    struct i915_address_space *vm)
> > >  {
> > >  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
> > >  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > >  		do {
> > >  			u64 offset = r->presumed_offset;
> > >  
> > > -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> > > +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> > > +								 vm);
> > >  			if (ret)
> > >  				return ret;
> > >  
> > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > >  static int
> > >  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > >  					 struct eb_objects *eb,
> > > -					 struct drm_i915_gem_relocation_entry *relocs)
> > > +					 struct drm_i915_gem_relocation_entry *relocs,
> > > +					 struct i915_address_space *vm)
> > >  {
> > >  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > >  	int i, ret;
> > >  
> > >  	for (i = 0; i < entry->relocation_count; i++) {
> > > -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> > > +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> > > +							 vm);
> > >  		if (ret)
> > >  			return ret;
> > >  	}
> > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > >  }
> > >  
> > >  static int
> > > -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > > +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> > > +			     struct i915_address_space *vm)
> > >  {
> > >  	struct drm_i915_gem_object *obj;
> > >  	int ret = 0;
> > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > >  	 */
> > >  	pagefault_disable();
> > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > > -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> > > +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> > >  		if (ret)
> > >  			break;
> > >  	}
> > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
> > >  static int
> > >  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > >  				   struct intel_ring_buffer *ring,
> > > +				   struct i915_address_space *vm,
> > >  				   bool *need_reloc)
> > >  {
> > >  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > >  		obj->tiling_mode != I915_TILING_NONE;
> > >  	need_mappable = need_fence || need_reloc_mappable(obj);
> > >  
> > > -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> > > +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> > > +				  false);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > >  		obj->has_aliasing_ppgtt_mapping = 1;
> > >  	}
> > >  
> > > -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> > > -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> > > +		entry->offset = i915_gem_obj_offset(obj, vm);
> > >  		*need_reloc = true;
> > >  	}
> > >  
> > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > >  {
> > >  	struct drm_i915_gem_exec_object2 *entry;
> > >  
> > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > +	if (!i915_gem_obj_bound_any(obj))
> > >  		return;
> > >  
> > >  	entry = obj->exec_entry;
> > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > >  static int
> > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > >  			    struct list_head *objects,
> > > +			    struct i915_address_space *vm,
> > >  			    bool *need_relocs)
> > >  {
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > >  		list_for_each_entry(obj, objects, exec_list) {
> > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > >  			bool need_fence, need_mappable;
> > > +			u32 obj_offset;
> > >  
> > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > +			if (!i915_gem_obj_bound(obj, vm))
> > >  				continue;
> > 
> > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> > here ... Maybe we should cache them in some pointer somewhere (either in
> > the eb object or by adding a new pointer to the object struct, e.g.
> > obj->eb_vma, similar to obj->eb_list).
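> > 
> > Something like this, say, filled in once per execbuf pass (obj->eb_vma
> > being the hypothetical cached pointer):
> > 
> > 	if (obj->eb_vma == NULL)
> > 		obj->eb_vma = i915_gem_obj_to_vma(obj, vm);
> > 	vma = obj->eb_vma;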
> > 
> > >  
> > > +			obj_offset = i915_gem_obj_offset(obj, vm);
> > >  			need_fence =
> > >  				has_fenced_gpu_access &&
> > >  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> > >  				obj->tiling_mode != I915_TILING_NONE;
> > >  			need_mappable = need_fence || need_reloc_mappable(obj);
> > >  
> > > +			BUG_ON((need_mappable || need_fence) &&
> > > +			       !i915_is_ggtt(vm));
> > > +
> > >  			if ((entry->alignment &&
> > > -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> > > +			     obj_offset & (entry->alignment - 1)) ||
> > >  			    (need_mappable && !obj->map_and_fenceable))
> > > -				ret = i915_gem_object_unbind(obj);
> > > +				ret = i915_gem_object_unbind(obj, vm);
> > >  			else
> > > -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > >  			if (ret)
> > >  				goto err;
> > >  		}
> > >  
> > >  		/* Bind fresh objects */
> > >  		list_for_each_entry(obj, objects, exec_list) {
> > > -			if (i915_gem_obj_ggtt_bound(obj))
> > > +			if (i915_gem_obj_bound(obj, vm))
> > >  				continue;
> > >  
> > > -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > >  			if (ret)
> > >  				goto err;
> > >  		}
> > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > >  				  struct drm_file *file,
> > >  				  struct intel_ring_buffer *ring,
> > >  				  struct eb_objects *eb,
> > > -				  struct drm_i915_gem_exec_object2 *exec)
> > > +				  struct drm_i915_gem_exec_object2 *exec,
> > > +				  struct i915_address_space *vm)
> > >  {
> > >  	struct drm_i915_gem_relocation_entry *reloc;
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > >  		goto err;
> > >  
> > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > >  	if (ret)
> > >  		goto err;
> > >  
> > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > >  		int offset = obj->exec_entry - exec;
> > >  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> > > -							       reloc + reloc_offset[offset]);
> > > +							       reloc + reloc_offset[offset],
> > > +							       vm);
> > >  		if (ret)
> > >  			goto err;
> > >  	}
> > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
> > >  
> > >  static void
> > >  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > +				   struct i915_address_space *vm,
> > >  				   struct intel_ring_buffer *ring)
> > >  {
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > >  		obj->base.read_domains = obj->base.pending_read_domains;
> > >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> > >  
> > > -		i915_gem_object_move_to_active(obj, ring);
> > > +		i915_gem_object_move_to_active(obj, vm, ring);
> > >  		if (obj->base.write_domain) {
> > >  			obj->dirty = 1;
> > >  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> > > @@ -836,7 +853,8 @@ static int
> > >  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  		       struct drm_file *file,
> > >  		       struct drm_i915_gem_execbuffer2 *args,
> > > -		       struct drm_i915_gem_exec_object2 *exec)
> > > +		       struct drm_i915_gem_exec_object2 *exec,
> > > +		       struct i915_address_space *vm)
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > >  	struct eb_objects *eb;
> > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  
> > >  	/* Move the objects en-masse into the GTT, evicting if necessary. */
> > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > >  	if (ret)
> > >  		goto err;
> > >  
> > >  	/* The objects are in their final locations, apply the relocations. */
> > >  	if (need_relocs)
> > > -		ret = i915_gem_execbuffer_relocate(eb);
> > > +		ret = i915_gem_execbuffer_relocate(eb, vm);
> > >  	if (ret) {
> > >  		if (ret == -EFAULT) {
> > >  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> > > -								eb, exec);
> > > +								eb, exec, vm);
> > >  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> > >  		}
> > >  		if (ret)
> > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  			goto err;
> > >  	}
> > >  
> > > -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> > > +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > +		args->batch_start_offset;
> > >  	exec_len = args->batch_len;
> > >  	if (cliprects) {
> > >  		for (i = 0; i < args->num_cliprects; i++) {
> > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  
> > >  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> > >  
> > > -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> > > +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> > >  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> > >  
> > >  err:
> > > -	eb_destroy(eb);
> > > +	eb_destroy(eb, vm);
> > >  
> > >  	mutex_unlock(&dev->struct_mutex);
> > >  
> > > @@ -1105,6 +1124,7 @@ int
> > >  i915_gem_execbuffer(struct drm_device *dev, void *data,
> > >  		    struct drm_file *file)
> > >  {
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_gem_execbuffer *args = data;
> > >  	struct drm_i915_gem_execbuffer2 exec2;
> > >  	struct drm_i915_gem_exec_object *exec_list = NULL;
> > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> > >  	exec2.flags = I915_EXEC_RENDER;
> > >  	i915_execbuffer2_set_context_id(exec2, 0);
> > >  
> > > -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> > > +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> > > +				     &dev_priv->gtt.base);
> > >  	if (!ret) {
> > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > >  		for (i = 0; i < args->buffer_count; i++)
> > > @@ -1186,6 +1207,7 @@ int
> > >  i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > >  		     struct drm_file *file)
> > >  {
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_gem_execbuffer2 *args = data;
> > >  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
> > >  	int ret;
> > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > >  		return -EFAULT;
> > >  	}
> > >  
> > > -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> > > +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> > > +				     &dev_priv->gtt.base);
> > >  	if (!ret) {
> > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > >  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 298fc42..70ce2f6 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> > >  			    ppgtt->base.total);
> > >  	}
> > >  
> > > +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> > > +
> > >  	return ret;
> > >  }
> > >  
> > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			    struct drm_i915_gem_object *obj,
> > >  			    enum i915_cache_level cache_level)
> > >  {
> > > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -				   cache_level);
> > > +	struct i915_address_space *vm = &ppgtt->base;
> > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > +
> > > +	vm->insert_entries(vm, obj->pages,
> > > +			   obj_offset >> PAGE_SHIFT,
> > > +			   cache_level);
> > >  }
> > >  
> > >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			      struct drm_i915_gem_object *obj)
> > >  {
> > > -	ppgtt->base.clear_range(&ppgtt->base,
> > > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -				obj->base.size >> PAGE_SHIFT);
> > > +	struct i915_address_space *vm = &ppgtt->base;
> > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > +
> > > +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > > +			obj->base.size >> PAGE_SHIFT);
> > >  }
> > >  
> > >  extern int intel_iommu_gfx_mapped;
> > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > >  				       dev_priv->gtt.base.start / PAGE_SIZE,
> > >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> > >  
> > > +	if (dev_priv->mm.aliasing_ppgtt)
> > > +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > > +
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > >  		i915_gem_clflush_object(obj);
> > >  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  	 * aperture.  One page should be enough to keep any prefetching inside
> > >  	 * of the aperture.
> > >  	 */
> > > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > >  	struct drm_mm_node *entry;
> > >  	struct drm_i915_gem_object *obj;
> > >  	unsigned long hole_start, hole_end;
> > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  	BUG_ON(mappable_end > end);
> > >  
> > >  	/* Subtract the guard page ... */
> > > -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> > > +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> > >  	if (!HAS_LLC(dev))
> > >  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
> > >  
> > >  	/* Mark any preallocated objects as occupied */
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> > >  		int ret;
> > >  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> > >  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
> > >  
> > >  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> > >  		if (ret)
> > >  			DRM_DEBUG_KMS("Reservation failed\n");
> > >  		obj->has_global_gtt_mapping = 1;
> > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  	dev_priv->gtt.base.total = end - start;
> > >  
> > >  	/* Clear any non-preallocated blocks */
> > > -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> > > -			     hole_start, hole_end) {
> > > +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> > >  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> > >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> > >  			      hole_start, hole_end);
> > > -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > -					       hole_start / PAGE_SIZE,
> > > -					       count);
> > > +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> > >  	}
> > >  
> > >  	/* And finally clear the reserved guard page */
> > > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > -				       end / PAGE_SIZE - 1, 1);
> > > +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> > >  }
> > >  
> > >  static bool
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > index 245eb1d..bfe61fa 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > >  		return obj;
> > >  
> > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > +	vma = i915_gem_vma_create(obj, vm);
> > >  	if (!vma) {
> > >  		drm_gem_object_unreference(&obj->base);
> > >  		return NULL;
> > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > >  	 */
> > >  	vma->node.start = gtt_offset;
> > >  	vma->node.size = size;
> > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > +	if (drm_mm_initialized(&vm->mm)) {
> > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > 
> > These two hunks here for stolen look fishy - we only ever use the stolen
> > preallocated stuff for objects with mappings in the global gtt. So keeping
> > that explicit is imo the better approach. And tbh I'm confused where the
> > local variable vm is from ...
> > 
> > >  		if (ret) {
> > >  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
> > >  			goto unref_out;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > > index 92a8d27..808ca2a 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
> > >  
> > >  		obj->map_and_fenceable =
> > >  			!i915_gem_obj_ggtt_bound(obj) ||
> > > -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> > > +			(i915_gem_obj_ggtt_offset(obj) +
> > > +			 obj->base.size <= dev_priv->gtt.mappable_end &&
> > >  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
> > >  
> > >  		/* Rebind if we need a change of alignment */
> > >  		if (!obj->map_and_fenceable) {
> > > -			u32 unfenced_alignment =
> > > +			struct i915_address_space *ggtt = &dev_priv->gtt.base;
> > > +			u32 unfenced_align =
> > >  				i915_gem_get_gtt_alignment(dev, obj->base.size,
> > >  							    args->tiling_mode,
> > >  							    false);
> > > -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> > > -				ret = i915_gem_object_unbind(obj);
> > > +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> > > +				ret = i915_gem_object_unbind(obj, ggtt);
> > >  		}
> > >  
> > >  		if (ret == 0) {
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index 79fbb17..28fa0ff 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > >  	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
> > >  		u32 acthd = I915_READ(ACTHD);
> > >  
> > > +		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
> > > +			return NULL;
> > > +
> > >  		if (WARN_ON(ring->id != RCS))
> > >  			return NULL;
> > >  
> > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
> > >  		return;
> > >  
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > -		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
> > > +		if ((error->ccid & PAGE_MASK) ==
> > > +		    i915_gem_obj_ggtt_offset(obj)) {
> > >  			ering->ctx = i915_error_object_create_sized(dev_priv,
> > >  								    obj, 1);
> > >  			break;
> > > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> > > index 7d283b5..3f019d3 100644
> > > --- a/drivers/gpu/drm/i915/i915_trace.h
> > > +++ b/drivers/gpu/drm/i915/i915_trace.h
> > > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
> > >  );
> > >  
> > >  TRACE_EVENT(i915_gem_object_bind,
> > > -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> > > -	    TP_ARGS(obj, mappable),
> > > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > > +		     struct i915_address_space *vm, bool mappable),
> > > +	    TP_ARGS(obj, vm, mappable),
> > >  
> > >  	    TP_STRUCT__entry(
> > >  			     __field(struct drm_i915_gem_object *, obj)
> > > +			     __field(struct i915_address_space *, vm)
> > >  			     __field(u32, offset)
> > >  			     __field(u32, size)
> > >  			     __field(bool, mappable)
> > > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
> > >  
> > >  	    TP_fast_assign(
> > >  			   __entry->obj = obj;
> > > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > > +			   __entry->size = i915_gem_obj_size(obj, vm);
> > >  			   __entry->mappable = mappable;
> > >  			   ),
> > >  
> > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
> > >  );
> > >  
> > >  TRACE_EVENT(i915_gem_object_unbind,
> > > -	    TP_PROTO(struct drm_i915_gem_object *obj),
> > > -	    TP_ARGS(obj),
> > > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > > +		     struct i915_address_space *vm),
> > > +	    TP_ARGS(obj, vm),
> > >  
> > >  	    TP_STRUCT__entry(
> > >  			     __field(struct drm_i915_gem_object *, obj)
> > > +			     __field(struct i915_address_space *, vm)
> > >  			     __field(u32, offset)
> > >  			     __field(u32, size)
> > >  			     ),
> > >  
> > >  	    TP_fast_assign(
> > >  			   __entry->obj = obj;
> > > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > > +			   __entry->size = i915_gem_obj_size(obj, vm);
> > >  			   ),
> > >  
> > >  	    TP_printk("obj=%p, offset=%08x size=%x",
> > > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> > > index f3c97e0..b69cc63 100644
> > > --- a/drivers/gpu/drm/i915/intel_fb.c
> > > +++ b/drivers/gpu/drm/i915/intel_fb.c
> > > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > >  		      fb->width, fb->height,
> > >  		      i915_gem_obj_ggtt_offset(obj), obj);
> > >  
> > > -
> > >  	mutex_unlock(&dev->struct_mutex);
> > >  	vga_switcheroo_client_fb_set(dev->pdev, info);
> > >  	return 0;
> > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> > > index 81c3ca1..517e278 100644
> > > --- a/drivers/gpu/drm/i915/intel_overlay.c
> > > +++ b/drivers/gpu/drm/i915/intel_overlay.c
> > > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
> > >  		}
> > >  		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
> > >  	} else {
> > > -		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> > > +		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
> > >  		if (ret) {
> > >  			DRM_ERROR("failed to pin overlay register bo\n");
> > >  			goto out_free_bo;
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > > index 125a741..449e57c 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev)
> > >  		return NULL;
> > >  	}
> > >  
> > > -	ret = i915_gem_object_pin(ctx, 4096, true, false);
> > > +	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
> > >  	if (ret) {
> > >  		DRM_ERROR("failed to pin power context: %d\n", ret);
> > >  		goto err_unref;
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index bc4c11b..ebed61d 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -481,6 +481,7 @@ out:
> > >  static int
> > >  init_pipe_control(struct intel_ring_buffer *ring)
> > >  {
> > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > >  	struct pipe_control *pc;
> > >  	struct drm_i915_gem_object *obj;
> > >  	int ret;
> > > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
> > >  		goto err;
> > >  	}
> > >  
> > > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > +					I915_CACHE_LLC);
> > >  
> > > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> > >  	if (ret)
> > >  		goto err_unref;
> > >  
> > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
> > >  static int init_status_page(struct intel_ring_buffer *ring)
> > >  {
> > >  	struct drm_device *dev = ring->dev;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_gem_object *obj;
> > >  	int ret;
> > >  
> > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
> > >  		goto err;
> > >  	}
> > >  
> > > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > +					I915_CACHE_LLC);
> > >  
> > > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> > >  	if (ret != 0) {
> > >  		goto err_unref;
> > >  	}
> > > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
> > >  
> > >  	ring->obj = obj;
> > >  
> > > -	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> > > +	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
> > >  	if (ret)
> > >  		goto err_unref;
> > >  
> > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> > >  			return -ENOMEM;
> > >  		}
> > >  
> > > -		ret = i915_gem_object_pin(obj, 0, true, false);
> > > +		ret = i915_gem_ggtt_pin(obj, 0, true, false);
> > >  		if (ret != 0) {
> > >  			drm_gem_object_unreference(&obj->base);
> > >  			DRM_ERROR("Failed to ping batch bo\n");
> > > -- 
> > > 1.8.3.2
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA
  2013-07-10 16:39     ` Ben Widawsky
@ 2013-07-10 17:08       ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-10 17:08 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 10, 2013 at 09:39:00AM -0700, Ben Widawsky wrote:
> On Tue, Jul 09, 2013 at 09:16:54AM +0200, Daniel Vetter wrote:
> > On Mon, Jul 08, 2013 at 11:08:38PM -0700, Ben Widawsky wrote:
> > > formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> > > tracking"
> > > 
> > > The map_and_fenceable tracking is per object. GTT mappings and fences
> > > only apply to the global GTT. As such, object operations which are not
> > > performed on the global GTT should not affect mappable or fenceable
> > > characteristics.
> > > 
> > > Functionally, this commit could very well be squashed into the previous
> > > patch which updated object operations to take a VM argument.  This
> > > commit is split out because it's a bit tricky (or at least it was for
> > > me).
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 21015cd..501c590 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2635,7 +2635,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > >  
> > >  	trace_i915_gem_object_unbind(obj, vm);
> > >  
> > > -	if (obj->has_global_gtt_mapping)
> > > +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
> > >  		i915_gem_gtt_unbind_object(obj);
> > 
> > Won't this part be done as part of the global gtt clear_range callback?
> > 
> > >  	if (obj->has_aliasing_ppgtt_mapping) {
> > >  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> > 
> > And I have a hunch that we should shovel the aliasing ppgtt clearing into
> > the ggtt write_ptes/clear_range callbacks, too. Once all this has settled
> > at least.
> > 
> 
> Addressing both comments at once:
> 
> First, this is a rebase mistake AFAICT because this hunk doesn't really
> belong in this patch anyway.
> 
> Eventually, I'd want to kill i915_gem_gtt_unbind_object, and
> i915_ppgtt_unbind_object. In the 66 patch series, I killed the latter,
> but decided to leave the former to make it clear that it is a special case.
> 
> In the original 66 patch series, I did not move clear_range which is
> probably why this was left like this. I believe bind was fixed to just
> be vm->bleh()
> 
> If you're good with the idea, I'll add a new patch to remove those and
> use the i915_address_space. I'll do the same in other applicable places.
> It's easiest if I do that as a patch 12, I think, if you don't mind?

Yeah, I'm ok with that, since the above hunk that started my thinking will
go away anyway.

> I do think this hunk belongs in another patch though until I do the
> above. I'm not really sure where to put that.

If you want to keep it, please add a comment explaining why and how
exactly it will get fixed up. In case of doubt just create a new patch to
highlight the special case imo.
-Daniel

> 
> 
> > > @@ -2646,7 +2646,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > >  
> > >  	list_del(&obj->mm_list);
> > >  	/* Avoid an unnecessary call to unbind on rebind. */
> > > -	obj->map_and_fenceable = true;
> > > +	if (i915_is_ggtt(vm))
> > > +		obj->map_and_fenceable = true;
> > >  
> > >  	vma = i915_gem_obj_to_vma(obj, vm);
> > >  	list_del(&vma->vma_link);
> > > @@ -3213,7 +3214,9 @@ search_free:
> > >  		i915_is_ggtt(vm) &&
> > >  		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
> > >  
> > > -	obj->map_and_fenceable = mappable && fenceable;
> > > +	/* Map and fenceable only changes if the VM is the global GGTT */
> > > +	if (i915_is_ggtt(vm))
> > > +		obj->map_and_fenceable = mappable && fenceable;
> > >  
> > >  	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
> > >  	i915_gem_verify_gtt(dev);
> > > -- 
> > > 1.8.3.2
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/11] drm/i915: Move active to vma
  2013-07-10 16:39     ` Ben Widawsky
@ 2013-07-10 17:13       ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-10 17:13 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 10, 2013 at 09:39:30AM -0700, Ben Widawsky wrote:
> On Tue, Jul 09, 2013 at 09:45:09AM +0200, Daniel Vetter wrote:
> > On Mon, Jul 08, 2013 at 11:08:42PM -0700, Ben Widawsky wrote:
> > > Probably need to squash whole thing, or just the inactive part, tbd...
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > I agree that we need vma->active, but I'm not sold on removing
> > obj->active. Atm we have to use-cases for checking obj->active:
> > - In the evict/unbind code to check whether the gpu is still using this
> >   specific mapping. This use-case nicely fits into checking vma->active.
> > - In the shrinker code and everywhere we want to do cpu access we only
> >   care about whether the gpu is accessing the object, not at all through
> >   which mapping precisely. There, a vma-independent obj->active sounds much
> >   saner.
> > 
> > Note though that just keeping track of vma->active isn't too useful, since
> > if some other vma is keeping the object busy we'll still stall on that one
> > for eviction. So we'd need a vma->ring and vma->last_rendering_seqno, too.
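> > 
> > I.e. struct i915_vma would have to grow roughly (just sketching the
> > fields):
> > 
> > 	unsigned int active:1;
> > 	struct intel_ring_buffer *ring;
> > 	u32 last_rendering_seqno;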
> > 
> > At that point I wonder a bit whether all this complexity is worth it ...
> > 
> > I need to ponder this some more.
> > -Daniel
> 
> I think eventually the complexity might prove worthwhile, it might not.
> 
> In the meanwhile, I see vma->active as just a bookkeeping thing, and not
> really useful in determining what we actually care about. As you mention
> obj->active is really what we care about, and I used the getter
> i915_gem_object_is_active() as a way to avoid the confusion of having
> two active members.
> 
> I think we're in the same state of mind on this, and I've picked what I
> consider to be a less offensive solution which is easy to clean up
> later.
> 
> Let me know when you make a decision.

Since you don't seem to be too keen on defending it for now I'd say let's
keep it in obj->active. This way obj->active still reflects accurately
whether the object is active on a ring or not.

That leaves us with updating the vm->active list. I'm leaning somewhat
towards simply merging the inactive and active vm lists since in the only
place we care about those lists (eviction) we treat them essentially as one
big lru. That would cut down on complexity in retire_requests to keep all
the vmas on the right per-vm active/inactive list, too.
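
Just to sketch the idea, the eviction scan would then be a single walk
(vm->bound_list is a made-up name for the merged list):

	list_for_each_entry(obj, &vm->bound_list, mm_list) {
		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
		if (mark_free(vma, &unwind_list))
			goto found;
	}

instead of the back-to-back inactive_list and active_list walks we have now.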

Optional cleanup after-the-fact imo (or upfront if it makes request
retiring much simpler). Whatever suits you.
-Daniel

> 
> > 
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h | 14 ++++++------
> > >  drivers/gpu/drm/i915/i915_gem.c | 47 ++++++++++++++++++++++++-----------------
> > >  2 files changed, 35 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 38d07f2..e6694ae 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -541,6 +541,13 @@ struct i915_vma {
> > >  	struct drm_i915_gem_object *obj;
> > >  	struct i915_address_space *vm;
> > >  
> > > +	/**
> > > +	 * This is set if the object is on the active lists (has pending
> > > +	 * rendering and so a non-zero seqno), and is not set if it i s on
> > > +	 * rendering and so a non-zero seqno), and is not set if it is on
> > > +	 */
> > > +	unsigned int active:1;
> > > +
> > >  	/** This object's place on the active/inactive lists */
> > >  	struct list_head mm_list;
> > >  
> > > @@ -1250,13 +1257,6 @@ struct drm_i915_gem_object {
> > >  	struct list_head exec_list;
> > >  
> > >  	/**
> > > -	 * This is set if the object is on the active lists (has pending
> > > -	 * rendering and so a non-zero seqno), and is not set if it is on
> > > -	 * inactive (ready to be unbound) list.
> > > -	 */
> > > -	unsigned int active:1;
> > > -
> > > -	/**
> > >  	 * This is set if the object has been written to since last bound
> > >  	 * to the GTT
> > >  	 */
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index c2ecb78..b87073b 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -137,7 +137,13 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> > >  /* NB: Not the same as !i915_gem_object_is_inactive */
> > >  bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
> > >  {
> > > -	return obj->active;
> > > +	struct i915_vma *vma;
> > > +
> > > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > +		if (vma->active)
> > > +			return true;
> > > +
> > > +	return false;
> > >  }
> > >  
> > >  static inline bool
> > > @@ -1899,14 +1905,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > >  	BUG_ON(ring == NULL);
> > >  	obj->ring = ring;
> > >  
> > > +	/* Move from whatever list we were on to the tail of execution. */
> > > +	vma = i915_gem_obj_to_vma(obj, vm);
> > >  	/* Add a reference if we're newly entering the active list. */
> > > -	if (!i915_gem_object_is_active(obj)) {
> > > +	if (!vma->active) {
> > >  		drm_gem_object_reference(&obj->base);
> > > -		obj->active = 1;
> > > +		vma->active = 1;
> > >  	}
> > >  
> > > -	/* Move from whatever list we were on to the tail of execution. */
> > > -	vma = i915_gem_obj_to_vma(obj, vm);
> > >  	list_move_tail(&vma->mm_list, &vm->active_list);
> > >  	list_move_tail(&obj->ring_list, &ring->active_list);
> > >  
> > > @@ -1927,16 +1933,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > >  }
> > >  
> > >  static void
> > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > > -				 struct i915_address_space *vm)
> > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > >  {
> > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > +	struct i915_address_space *vm;
> > >  	struct i915_vma *vma;
> > > +	int i = 0;
> > >  
> > >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > > -	BUG_ON(!i915_gem_object_is_active(obj));
> > >  
> > > -	vma = i915_gem_obj_to_vma(obj, vm);
> > > -	list_move_tail(&vma->mm_list, &vm->inactive_list);
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > For paranoia we might want to track the vm used to run a batch in its
> > request struct, then we
> > 
> > > +		if (!vma || !vma->active)
> > > +			continue;
> > > +		list_move_tail(&vma->mm_list, &vm->inactive_list);
> > > +		vma->active = 0;
> > > +		i++;
> > > +	}
> > >  
> > >  	list_del_init(&obj->ring_list);
> > >  	obj->ring = NULL;
> > > @@ -1948,8 +1961,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > >  	obj->last_fenced_seqno = 0;
> > >  	obj->fenced_gpu_access = false;
> > >  
> > > -	obj->active = 0;
> > > -	drm_gem_object_unreference(&obj->base);
> > > +	while (i--)
> > > +		drm_gem_object_unreference(&obj->base);
> > >  
> > >  	WARN_ON(i915_verify_lists(dev));
> > >  }
> > > @@ -2272,15 +2285,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> > >  	}
> > >  
> > >  	while (!list_empty(&ring->active_list)) {
> > > -		struct i915_address_space *vm;
> > >  		struct drm_i915_gem_object *obj;
> > >  
> > >  		obj = list_first_entry(&ring->active_list,
> > >  				       struct drm_i915_gem_object,
> > >  				       ring_list);
> > >  
> > > -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > -			i915_gem_object_move_to_inactive(obj, vm);
> > > +		i915_gem_object_move_to_inactive(obj);
> > >  	}
> > >  }
> > >  
> > > @@ -2356,8 +2367,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> > >  	 */
> > >  	while (!list_empty(&ring->active_list)) {
> > > -		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > -		struct i915_address_space *vm;
> > >  		struct drm_i915_gem_object *obj;
> > >  
> > >  		obj = list_first_entry(&ring->active_list,
> > > @@ -2367,8 +2376,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> > >  			break;
> > >  
> > > -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > -			i915_gem_object_move_to_inactive(obj, vm);
> > > +		BUG_ON(!i915_gem_object_is_active(obj));
> > > +		i915_gem_object_move_to_inactive(obj);
> > >  	}
> > >  
> > >  	if (unlikely(ring->trace_irq_seqno &&
> > > -- 
> > > 1.8.3.2
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-10 17:05       ` Daniel Vetter
@ 2013-07-10 22:23         ` Ben Widawsky
  2013-07-11  6:01           ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-10 22:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Wed, Jul 10, 2013 at 07:05:52PM +0200, Daniel Vetter wrote:
> On Wed, Jul 10, 2013 at 09:37:10AM -0700, Ben Widawsky wrote:
> > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> > > > This patch was formerly known as:
> > > > "drm/i915: Create VMAs (part 3) - plumbing"
> > > > 
> > > > This patch adds a VM argument, bind/unbind, and the object
> > > > offset/size/color getters/setters. It preserves the old ggtt helper
> > > > functions because things still need, and will continue to need them.
> > > > 
> > > > Some code will still need to be ported over after this.
> > > > 
> > > > v2: Fix purge to pick an object and unbind all vmas
> > > > This was doable because of the global bound list change.
> > > > 
> > > > v3: With the commit to actually pin/unpin pages in place, there is no
> > > > longer a need to check if unbind succeeded before calling put_pages().
> > > > Make put_pages only BUG() after checking pin count.
> > > > 
> > > > v4: Rebased on top of the new hangcheck work by Mika
> > > > plumbed eb_destroy also
> > > > Many checkpatch related fixes
> > > > 
> > > > v5: Very large rebase
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > This one is a rather large beast. Any chance we could split it into
> > > topics, e.g. convert execbuf code, convert shrinker code? Or does that get
> > > messy, fast?
> > > 
> > 
> > I've thought of this...
> > 
> > The one solution I came up with is to have two bind/unbind functions
> > (similar to what I did with pin, and indeed it was my original plan with
> > pin), and do the set_caching one separately.
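> > 
> > I.e. mirror what this patch already does for pin (sketch, with the
> > remaining arguments elided):
> > 
> > 	int i915_gem_object_bind_to_vm(obj, vm, ...);
> > 
> > 	static inline int i915_gem_ggtt_bind(obj, ...)
> > 	{
> > 		return i915_gem_object_bind_to_vm(obj, obj_to_ggtt(obj), ...);
> > 	}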
> > 
> > I think it won't be too messy, just a lot of typing, as Keith likes to
> > say.
> > 
> > However, my opinion was, since it's early in the merge cycle, we don't
> > yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's
> > not a big deal. At a functional level too, I felt this made more sense.
> > 
> > So I'll defer to your request on this and start splitting it up, unless
> > my email has changed your mind ;-).
> 
> Well, my concern is mostly in reviewing since we need to think about each
> case and whether it makes sense to talk in terms of vma or objects in
> that function. And what exactly to test.
> 
> If you've played around and concluded it'll be a mess then I don't think
> it'll help in reviewing. So pointless.

I said I don't think it will be a mess, though I feel it won't really
help review too much. Can you take a crack at the review and poke me if you
want me to try it? I'd rather not do it if I can avoid it, so I can try
to go back to my 15 patch maximum rule.

> 
> Still, there's a bunch of questions on this patch that we need to discuss
> ;-)

Ready whenever.

> 
> Cheers, Daniel
> 
> > 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_debugfs.c        |  31 ++-
> > > >  drivers/gpu/drm/i915/i915_dma.c            |   4 -
> > > >  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++-----
> > > >  drivers/gpu/drm/i915/i915_gem.c            | 316 +++++++++++++++++++++--------
> > > >  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
> > > >  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
> > > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
> > > >  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
> > > >  drivers/gpu/drm/i915/i915_gem_stolen.c     |   6 +-
> > > >  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
> > > >  drivers/gpu/drm/i915/i915_irq.c            |   6 +-
> > > >  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
> > > >  drivers/gpu/drm/i915/intel_fb.c            |   1 -
> > > >  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
> > > >  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
> > > >  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
> > > >  16 files changed, 468 insertions(+), 239 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > index 16b2aaf..867ed07 100644
> > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> > > >  		seq_printf(m, " (pinned x %d)", obj->pin_count);
> > > >  	if (obj->fence_reg != I915_FENCE_REG_NONE)
> > > >  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> > > > -	if (i915_gem_obj_ggtt_bound(obj))
> > > > -		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> > > > -			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> > > > +	if (i915_gem_obj_bound_any(obj)) {
> > > 
> > > list_for_each will short-circuit already, so this is redundant.
> > > 
> > > > +		struct i915_vma *vma;
> > > > +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> > > > +			if (!i915_is_ggtt(vma->vm))
> > > > +				seq_puts(m, " (pp");
> > > > +			else
> > > > +				seq_puts(m, " (g");
> > > > +			seq_printf(m, " gtt offset: %08lx, size: %08lx)",
> > > 
> > >                                        ^ that space looks superfluous now
> > > 
> > > > +				   i915_gem_obj_offset(obj, vma->vm),
> > > > +				   i915_gem_obj_size(obj, vma->vm));
> > > > +		}
> > > > +	}
> > > >  	if (obj->stolen)
> > > >  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
> > > >  	if (obj->pin_mappable || obj->fault_mappable) {
> > > > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +/* FIXME: Support multiple VM? */
> > > >  #define count_objects(list, member) do { \
> > > >  	list_for_each_entry(obj, list, member) { \
> > > >  		size += i915_gem_obj_ggtt_size(obj); \
> > > > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val)
> > > >  
> > > >  	if (val & DROP_BOUND) {
> > > >  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > > > -					 mm_list)
> > > > -			if (obj->pin_count == 0) {
> > > > -				ret = i915_gem_object_unbind(obj);
> > > > -				if (ret)
> > > > -					goto unlock;
> > > > -			}
> > > > +					 mm_list) {
> > > > +			if (obj->pin_count)
> > > > +				continue;
> > > > +
> > > > +			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> > > > +			if (ret)
> > > > +				goto unlock;
> > > > +		}
> > > >  	}
> > > >  
> > > >  	if (val & DROP_UNBOUND) {
> > > >  		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
> > > >  					 global_list)
> > > >  			if (obj->pages_pin_count == 0) {
> > > > +				/* FIXME: Do this for all vms? */
> > > >  				ret = i915_gem_object_put_pages(obj);
> > > >  				if (ret)
> > > >  					goto unlock;
> > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > > index d13e21f..b190439 100644
> > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> > > >  
> > > >  	i915_dump_device_info(dev_priv);
> > > >  
> > > > -	INIT_LIST_HEAD(&dev_priv->vm_list);
> > > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> > > > -	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> > > > -
> > > >  	if (i915_get_bridge_dev(dev)) {
> > > >  		ret = -EIO;
> > > >  		goto free_priv;
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 38cccc8..48baccc 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object {
> > > >  
> > > >  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
> > > >  
> > > > -/* This is a temporary define to help transition us to real VMAs. If you see
> > > > - * this, you're either reviewing code, or bisecting it. */
> > > > -static inline struct i915_vma *
> > > > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> > > > -{
> > > > -	if (list_empty(&obj->vma_list))
> > > > -		return NULL;
> > > > -	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> > > > -}
> > > > -
> > > > -/* Whether or not this object is currently mapped by the translation tables */
> > > > -static inline bool
> > > > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> > > > -{
> > > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> > > > -	if (vma == NULL)
> > > > -		return false;
> > > > -	return drm_mm_node_allocated(&vma->node);
> > > > -}
> > > > -
> > > > -/* Offset of the first PTE pointing to this object */
> > > > -static inline unsigned long
> > > > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> > > > -{
> > > > -	BUG_ON(list_empty(&o->vma_list));
> > > > -	return __i915_gem_obj_to_vma(o)->node.start;
> > > > -}
> > > > -
> > > > -/* The size used in the translation tables may be larger than the actual size of
> > > > - * the object on GEN2/GEN3 because of the way tiling is handled. See
> > > > - * i915_gem_get_gtt_size() for more details.
> > > > - */
> > > > -static inline unsigned long
> > > > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> > > > -{
> > > > -	BUG_ON(list_empty(&o->vma_list));
> > > > -	return __i915_gem_obj_to_vma(o)->node.size;
> > > > -}
> > > > -
> > > > -static inline void
> > > > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> > > > -			    enum i915_cache_level color)
> > > > -{
> > > > -	__i915_gem_obj_to_vma(o)->node.color = color;
> > > > -}
> > > > -
> > > >  /**
> > > >   * Request queue structure.
> > > >   *
> > > > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > > >  void i915_gem_vma_destroy(struct i915_vma *vma);
> > > >  
> > > >  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > +				     struct i915_address_space *vm,
> > > >  				     uint32_t alignment,
> > > >  				     bool map_and_fenceable,
> > > >  				     bool nonblocking);
> > > >  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> > > > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> > > > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > > > +					struct i915_address_space *vm);
> > > >  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
> > > >  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
> > > >  void i915_gem_lastclose(struct drm_device *dev);
> > > > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> > > >  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> > > >  			 struct intel_ring_buffer *to);
> > > >  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > > +				    struct i915_address_space *vm,
> > > >  				    struct intel_ring_buffer *ring);
> > > >  
> > > >  int i915_gem_dumb_create(struct drm_file *file_priv,
> > > > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
> > > >  			    int tiling_mode, bool fenced);
> > > >  
> > > >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > +				    struct i915_address_space *vm,
> > > >  				    enum i915_cache_level cache_level);
> > > >  
> > > >  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> > > > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
> > > >  
> > > >  void i915_gem_restore_fences(struct drm_device *dev);
> > > >  
> > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > > +				  struct i915_address_space *vm);
> > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > > +			struct i915_address_space *vm);
> > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > > +				struct i915_address_space *vm);
> > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > > > +			    struct i915_address_space *vm,
> > > > +			    enum i915_cache_level color);
> > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > > +				     struct i915_address_space *vm);
> > > > +/* Some GGTT VM helpers */
> > > > +#define obj_to_ggtt(obj) \
> > > > +	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> > > > +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> > > > +{
> > > > +	struct i915_address_space *ggtt =
> > > > +		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> > > > +	return vm == ggtt;
> > > > +}
> > > > +
> > > > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> > > > +{
> > > > +	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> > > > +}
> > > > +
> > > > +static inline unsigned long
> > > > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> > > > +{
> > > > +	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> > > > +}
> > > > +
> > > > +static inline unsigned long
> > > > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> > > > +{
> > > > +	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> > > > +}
> > > > +
> > > > +static inline int __must_check
> > > > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> > > > +		  uint32_t alignment,
> > > > +		  bool map_and_fenceable,
> > > > +		  bool nonblocking)
> > > > +{
> > > > +	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> > > > +				   map_and_fenceable, nonblocking);
> > > > +}
> > > > +#undef obj_to_ggtt
> > > > +
> > > >  /* i915_gem_context.c */
> > > >  void i915_gem_context_init(struct drm_device *dev);
> > > >  void i915_gem_context_fini(struct drm_device *dev);
> > > > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > > >  
> > > >  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
> > > >  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> > > > +/* FIXME: this is never okay with full PPGTT */
> > > >  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > > >  				enum i915_cache_level cache_level);
> > > >  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> > > > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
> > > >  
> > > >  
> > > >  /* i915_gem_evict.c */
> > > > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > > +int __must_check i915_gem_evict_something(struct drm_device *dev,
> > > > +					  struct i915_address_space *vm,
> > > > +					  int min_size,
> > > >  					  unsigned alignment,
> > > >  					  unsigned cache_level,
> > > >  					  bool mappable,
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 058ad44..21015cd 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -38,10 +38,12 @@
> > > >  
> > > >  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
> > > >  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> > > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > > -						    unsigned alignment,
> > > > -						    bool map_and_fenceable,
> > > > -						    bool nonblocking);
> > > > +static __must_check int
> > > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > > +			    struct i915_address_space *vm,
> > > > +			    unsigned alignment,
> > > > +			    bool map_and_fenceable,
> > > > +			    bool nonblocking);
> > > >  static int i915_gem_phys_pwrite(struct drm_device *dev,
> > > >  				struct drm_i915_gem_object *obj,
> > > >  				struct drm_i915_gem_pwrite *args,
> > > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> > > >  static inline bool
> > > >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> > > >  {
> > > > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > > > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> > > >  }
> > > >  
> > > >  int
> > > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> > > >  		 * anyway again before the next pread happens. */
> > > >  		if (obj->cache_level == I915_CACHE_NONE)
> > > >  			needs_clflush = 1;
> > > > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > > > +		if (i915_gem_obj_bound_any(obj)) {
> > > >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> > > 
> > > This is essentially a very convoluted version of "if there's gpu rendering
> > > outstanding, please wait for it". Maybe we should switch this to
> > > 
> > > 	if (obj->active)
> > > 		wait_rendering(obj, true);
> > > 
> > > Same for the shmem_pwrite case below. Would be a separate patch to prep
> > > things though. Can I volunteer you for that? The ugly part is to review
> > > whether any of the lru list updating that set_domain does in addition to
> > > wait_rendering is required, but on a quick read that's not the case.
> > > 
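> > > Spelled out a bit more (just a sketch, with wait_rendering() standing
> > > in for whatever helper gets factored out of set_to_gtt_domain), the
> > > pread side would then read:
> > > 
> > > 	if (obj->cache_level == I915_CACHE_NONE)
> > > 		needs_clflush = 1;
> > > 	if (obj->active) {
> > > 		ret = wait_rendering(obj, true);
> > > 		if (ret)
> > > 			return ret;
> > > 	}
> > > 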
> > > >  			if (ret)
> > > >  				return ret;
> > > > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> > > >  	char __user *user_data;
> > > >  	int page_offset, page_length, ret;
> > > >  
> > > > -	ret = i915_gem_object_pin(obj, 0, true, true);
> > > > +	ret = i915_gem_ggtt_pin(obj, 0, true, true);
> > > >  	if (ret)
> > > >  		goto out;
> > > >  
> > > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
> > > >  		 * right away and we therefore have to clflush anyway. */
> > > >  		if (obj->cache_level == I915_CACHE_NONE)
> > > >  			needs_clflush_after = 1;
> > > > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > > > +		if (i915_gem_obj_bound_any(obj)) {
> > > 
> > > ... see above.
> > > >  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
> > > >  			if (ret)
> > > >  				return ret;
> > > > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > >  	}
> > > >  
> > > >  	/* Now bind it into the GTT if needed */
> > > > -	ret = i915_gem_object_pin(obj, 0, true, false);
> > > > +	ret = i915_gem_ggtt_pin(obj,  0, true, false);
> > > >  	if (ret)
> > > >  		goto unlock;
> > > >  
> > > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> > > >  	if (obj->pages == NULL)
> > > >  		return 0;
> > > >  
> > > > -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> > > > -
> > > >  	if (obj->pages_pin_count)
> > > >  		return -EBUSY;
> > > >  
> > > > +	BUG_ON(i915_gem_obj_bound_any(obj));
> > > > +
> > > >  	/* ->put_pages might need to allocate memory for the bit17 swizzle
> > > >  	 * array, hence protect them from being reaped by removing them from gtt
> > > >  	 * lists early. */
> > > > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> > > >  		  bool purgeable_only)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj, *next;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	long count = 0;
> > > >  
> > > >  	list_for_each_entry_safe(obj, next,
> > > > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> > > >  		}
> > > >  	}
> > > >  
> > > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> > > > -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> > > > -		    i915_gem_object_unbind(obj) == 0 &&
> > > > -		    i915_gem_object_put_pages(obj) == 0) {
> > > > +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> > > > +				 global_list) {
> > > > +		struct i915_vma *vma, *v;
> > > > +
> > > > +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> > > > +			continue;
> > > > +
> > > > +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> > > > +			if (i915_gem_object_unbind(obj, vma->vm))
> > > > +				break;
> > > > +
> > > > +		if (!i915_gem_object_put_pages(obj))
> > > >  			count += obj->base.size >> PAGE_SHIFT;
> > > > -			if (count >= target)
> > > > -				return count;
> > > > -		}
> > > > +
> > > > +		if (count >= target)
> > > > +			return count;
> > > >  	}
> > > >  
> > > >  	return count;
> > > > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > > >  
> > > >  void
> > > >  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > > +			       struct i915_address_space *vm,
> > > >  			       struct intel_ring_buffer *ring)
> > > >  {
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	u32 seqno = intel_ring_get_seqno(ring);
> > > >  
> > > >  	BUG_ON(ring == NULL);
> > > > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > >  }
> > > >  
> > > >  static void
> > > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > > > +				 struct i915_address_space *vm)
> > > >  {
> > > > -	struct drm_device *dev = obj->base.dev;
> > > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > -
> > > >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > > >  	BUG_ON(!obj->active);
> > > >  
> > > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> > > >  	spin_unlock(&file_priv->mm.lock);
> > > >  }
> > > >  
> > > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> > > > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> > > > +				    struct i915_address_space *vm)
> > > >  {
> > > > -	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> > > > -	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> > > > +	if (acthd >= i915_gem_obj_offset(obj, vm) &&
> > > > +	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
> > > >  		return true;
> > > >  
> > > >  	return false;
> > > > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
> > > >  	return false;
> > > >  }
> > > >  
> > > > +static struct i915_address_space *
> > > > +request_to_vm(struct drm_i915_gem_request *request)
> > > > +{
> > > > +	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> > > > +	struct i915_address_space *vm;
> > > > +
> > > > +	vm = &dev_priv->gtt.base;
> > > > +
> > > > +	return vm;
> > > > +}
> > > > +
> > > >  static bool i915_request_guilty(struct drm_i915_gem_request *request,
> > > >  				const u32 acthd, bool *inside)
> > > >  {
> > > > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
> > > >  	 * pointing inside the ring, matches the batch_obj address range.
> > > >  	 * However this is extremely unlikely.
> > > >  	 */
> > > > -
> > > >  	if (request->batch_obj) {
> > > > -		if (i915_head_inside_object(acthd, request->batch_obj)) {
> > > > +		if (i915_head_inside_object(acthd, request->batch_obj,
> > > > +					    request_to_vm(request))) {
> > > >  			*inside = true;
> > > >  			return true;
> > > >  		}
> > > > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
> > > >  {
> > > >  	struct i915_ctx_hang_stats *hs = NULL;
> > > >  	bool inside, guilty;
> > > > +	unsigned long offset = 0;
> > > >  
> > > >  	/* Innocent until proven guilty */
> > > >  	guilty = false;
> > > >  
> > > > +	if (request->batch_obj)
> > > > +		offset = i915_gem_obj_offset(request->batch_obj,
> > > > +					     request_to_vm(request));
> > > > +
> > > >  	if (ring->hangcheck.action != wait &&
> > > >  	    i915_request_guilty(request, acthd, &inside)) {
> > > >  		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
> > > >  			  ring->name,
> > > >  			  inside ? "inside" : "flushing",
> > > > -			  request->batch_obj ?
> > > > -			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> > > > +			  offset,
> > > >  			  request->ctx ? request->ctx->id : 0,
> > > >  			  acthd);
> > > >  
> > > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> > > >  	}
> > > >  
> > > >  	while (!list_empty(&ring->active_list)) {
> > > > +		struct i915_address_space *vm;
> > > >  		struct drm_i915_gem_object *obj;
> > > >  
> > > >  		obj = list_first_entry(&ring->active_list,
> > > >  				       struct drm_i915_gem_object,
> > > >  				       ring_list);
> > > >  
> > > > -		i915_gem_object_move_to_inactive(obj);
> > > > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > +			i915_gem_object_move_to_inactive(obj, vm);
> > > >  	}
> > > >  }
> > > >  
> > > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
> > > >  void i915_gem_reset(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	struct intel_ring_buffer *ring;
> > > >  	int i;
> > > > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev)
> > > >  	/* Move everything out of the GPU domains to ensure we do any
> > > >  	 * necessary invalidation upon reuse.
> > > >  	 */
> > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > >  
> > > >  	i915_gem_restore_fences(dev);
> > > >  }
> > > > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > > >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> > > >  	 */
> > > >  	while (!list_empty(&ring->active_list)) {
> > > > +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > > +		struct i915_address_space *vm;
> > > >  		struct drm_i915_gem_object *obj;
> > > >  
> > > >  		obj = list_first_entry(&ring->active_list,
> > > > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > > >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> > > >  			break;
> > > >  
> > > > -		i915_gem_object_move_to_inactive(obj);
> > > > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > +			i915_gem_object_move_to_inactive(obj, vm);
> > > >  	}
> > > >  
> > > >  	if (unlikely(ring->trace_irq_seqno &&
> > > > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> > > >   * Unbinds an object from the GTT aperture.
> > > >   */
> > > >  int
> > > > -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > > > +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > > > +		       struct i915_address_space *vm)
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> > > >  	struct i915_vma *vma;
> > > >  	int ret;
> > > >  
> > > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > > +	if (!i915_gem_obj_bound(obj, vm))
> > > >  		return 0;
> > > >  
> > > >  	if (obj->pin_count)
> > > > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > -	trace_i915_gem_object_unbind(obj);
> > > > +	trace_i915_gem_object_unbind(obj, vm);
> > > >  
> > > >  	if (obj->has_global_gtt_mapping)
> > > >  		i915_gem_gtt_unbind_object(obj);
> > > > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > > >  	/* Avoid an unnecessary call to unbind on rebind. */
> > > >  	obj->map_and_fenceable = true;
> > > >  
> > > > -	vma = __i915_gem_obj_to_vma(obj);
> > > > +	vma = i915_gem_obj_to_vma(obj, vm);
> > > >  	list_del(&vma->vma_link);
> > > >  	drm_mm_remove_node(&vma->node);
> > > >  	i915_gem_vma_destroy(vma);
> > > > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
> > > >  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
> > > >  		     i915_gem_obj_ggtt_offset(obj), size);
> > > >  
> > > > +
> > > >  		pitch_val = obj->stride / 128;
> > > >  		pitch_val = ffs(pitch_val) - 1;
> > > >  
> > > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> > > >   */
> > > >  static int
> > > >  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > > +			    struct i915_address_space *vm,
> > > >  			    unsigned alignment,
> > > >  			    bool map_and_fenceable,
> > > >  			    bool nonblocking)
> > > >  {
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> > > >  	bool mappable, fenceable;
> > > > -	size_t gtt_max = map_and_fenceable ?
> > > > -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> > > > +	size_t gtt_max =
> > > > +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
> > > >  	struct i915_vma *vma;
> > > >  	int ret;
> > > >  
> > > >  	if (WARN_ON(!list_empty(&obj->vma_list)))
> > > >  		return -EBUSY;
> > > >  
> > > > +	BUG_ON(!i915_is_ggtt(vm));
> > > > +
> > > >  	fence_size = i915_gem_get_gtt_size(dev,
> > > >  					   obj->base.size,
> > > >  					   obj->tiling_mode);
> > > > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > >  	i915_gem_object_pin_pages(obj);
> > > >  
> > > >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > > +	/* For now we only ever use 1 vma per object */
> > > > +	WARN_ON(!list_empty(&obj->vma_list));
> > > > +
> > > > +	vma = i915_gem_vma_create(obj, vm);
> > > >  	if (vma == NULL) {
> > > >  		i915_gem_object_unpin_pages(obj);
> > > >  		return -ENOMEM;
> > > >  	}
> > > >  
> > > >  search_free:
> > > > -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > > > -						  &vma->node,
> > > > +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
> > > >  						  size, alignment,
> > > >  						  obj->cache_level, 0, gtt_max);
> > > >  	if (ret) {
> > > > -		ret = i915_gem_evict_something(dev, size, alignment,
> > > > +		ret = i915_gem_evict_something(dev, vm, size, alignment,
> > > >  					       obj->cache_level,
> > > >  					       map_and_fenceable,
> > > >  					       nonblocking);
> > > > @@ -3162,18 +3197,25 @@ search_free:
> > > >  
> > > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > > > -	list_add(&vma->vma_link, &obj->vma_list);
> > > > +
> > > > +	/* Keep GGTT vmas first to make debug easier */
> > > > +	if (i915_is_ggtt(vm))
> > > > +		list_add(&vma->vma_link, &obj->vma_list);
> > > > +	else
> > > > +		list_add_tail(&vma->vma_link, &obj->vma_list);
> > > >  
> > > >  	fenceable =
> > > > +		i915_is_ggtt(vm) &&
> > > >  		i915_gem_obj_ggtt_size(obj) == fence_size &&
> > > >  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
> > > >  
> > > > -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> > > > -		dev_priv->gtt.mappable_end;
> > > > +	mappable =
> > > > +		i915_is_ggtt(vm) &&
> > > > +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
> > > >  
> > > >  	obj->map_and_fenceable = mappable && fenceable;
> > > >  
> > > > -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> > > > +	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
> > > >  	i915_gem_verify_gtt(dev);
> > > >  	return 0;
> > > >  }
> > > > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > > >  	int ret;
> > > >  
> > > >  	/* Not valid to be called on unbound objects. */
> > > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > > +	if (!i915_gem_obj_bound_any(obj))
> > > >  		return -EINVAL;
> > > 
> > > If we're converting the shmem paths over to wait_rendering then there's
> > > only the fault handler and the set_domain ioctl left. For the latter it
> > > would make sense to clflush even when an object is on the unbound list, to
> > > allow userspace to optimize when the clflushing happens. But that would
> > > only make sense in conjunction with Chris' create2 ioctl and a flag to
> > > preallocate the storage (and so putting the object onto the unbound list).
> > > So nothing to do here.
> > > 
> > > >  
> > > >  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> > > > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > > >  }
> > > >  
> > > >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > +				    struct i915_address_space *vm,
> > > >  				    enum i915_cache_level cache_level)
> > > >  {
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > >  	int ret;
> > > >  
> > > >  	if (obj->cache_level == cache_level)
> > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > >  	}
> > > >  
> > > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > > -		ret = i915_gem_object_unbind(obj);
> > > > +		ret = i915_gem_object_unbind(obj, vm);
> > > >  		if (ret)
> > > >  			return ret;
> > > >  	}
> > > >  
> > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		if (!i915_gem_obj_bound(obj, vm))
> > > > +			continue;
> > > 
> > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > > on?
> > > 
> > > Self-correction: It exists already ... why can't we use this here?
> > > 
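> > > I.e. instead of walking dev_priv->vm_list and skipping the vms the
> > > object isn't bound into, just walk the object's own vma list (sketch
> > > only, per-vm body unchanged apart from using vma->vm):
> > > 
> > > 	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> > > 		ret = i915_gem_object_finish_gpu(obj);
> > > 		if (ret)
> > > 			return ret;
> > > 		/* ... unbind/rebind and set_color on vma->vm ... */
> > > 	}
> > > 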
> > > > +
> > > >  		ret = i915_gem_object_finish_gpu(obj);
> > > >  		if (ret)
> > > >  			return ret;
> > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > > >  					       obj, cache_level);
> > > >  
> > > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > > >  	}
> > > >  
> > > >  	if (cache_level == I915_CACHE_NONE) {
> > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > >  			       struct drm_file *file)
> > > >  {
> > > >  	struct drm_i915_gem_caching *args = data;
> > > > +	struct drm_i915_private *dev_priv;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	enum i915_cache_level level;
> > > >  	int ret;
> > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > >  		ret = -ENOENT;
> > > >  		goto unlock;
> > > >  	}
> > > > +	dev_priv = obj->base.dev->dev_private;
> > > >  
> > > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > > +	/* FIXME: Add interface for specific VM? */
> > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > > >  
> > > >  	drm_gem_object_unreference(&obj->base);
> > > >  unlock:
> > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  				     u32 alignment,
> > > >  				     struct intel_ring_buffer *pipelined)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > >  	u32 old_read_domains, old_write_domain;
> > > >  	int ret;
> > > >  
> > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > > >  	 */
> > > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > +					      I915_CACHE_NONE);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > >  	 * always use map_and_fenceable for all scanout buffers.
> > > >  	 */
> > > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > > >  
> > > >  int
> > > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > +		    struct i915_address_space *vm,
> > > >  		    uint32_t alignment,
> > > >  		    bool map_and_fenceable,
> > > >  		    bool nonblocking)
> > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > > >  		return -EBUSY;
> > > >  
> > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > > 
> > > WARN_ON, since presumably we can keep on going if we get this wrong
> > > (albeit with slightly corrupted state, so render corruptions might
> > > follow).
> > > 
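> > > I.e. simply (sketch):
> > > 
> > > 	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > > 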
> > > > +
> > > > +	if (i915_gem_obj_bound(obj, vm)) {
> > > > +		if ((alignment &&
> > > > +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> > > >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> > > >  			WARN(obj->pin_count,
> > > >  			     "bo is already pinned with incorrect alignment:"
> > > >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> > > >  			     " obj->map_and_fenceable=%d\n",
> > > > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > > > +			     i915_gem_obj_offset(obj, vm), alignment,
> > > >  			     map_and_fenceable,
> > > >  			     obj->map_and_fenceable);
> > > > -			ret = i915_gem_object_unbind(obj);
> > > > +			ret = i915_gem_object_unbind(obj, vm);
> > > >  			if (ret)
> > > >  				return ret;
> > > >  		}
> > > >  	}
> > > >  
> > > > -	if (!i915_gem_obj_ggtt_bound(obj)) {
> > > > +	if (!i915_gem_obj_bound(obj, vm)) {
> > > >  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > >  
> > > > -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> > > > +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
> > > >  						  map_and_fenceable,
> > > >  						  nonblocking);
> > > >  		if (ret)
> > > > @@ -3684,7 +3739,7 @@ void
> > > >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> > > >  {
> > > >  	BUG_ON(obj->pin_count == 0);
> > > > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > > > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> > > >  
> > > >  	if (--obj->pin_count == 0)
> > > >  		obj->pin_mappable = false;
> > > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
> > > >  	}
> > > >  
> > > >  	if (obj->user_pin_count == 0) {
> > > > -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> > > > +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
> > > >  		if (ret)
> > > >  			goto out;
> > > >  	}
> > > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > > >  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > +	struct i915_vma *vma, *next;
> > > >  
> > > >  	trace_i915_gem_object_destroy(obj);
> > > >  
> > > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > > >  		i915_gem_detach_phys_object(dev, obj);
> > > >  
> > > >  	obj->pin_count = 0;
> > > > -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> > > > -		bool was_interruptible;
> > > > +	/* NB: 0 or 1 elements */
> > > > +	WARN_ON(!list_empty(&obj->vma_list) &&
> > > > +		!list_is_singular(&obj->vma_list));
> > > > +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > > > +		int ret = i915_gem_object_unbind(obj, vma->vm);
> > > > +		if (WARN_ON(ret == -ERESTARTSYS)) {
> > > > +			bool was_interruptible;
> > > >  
> > > > -		was_interruptible = dev_priv->mm.interruptible;
> > > > -		dev_priv->mm.interruptible = false;
> > > > +			was_interruptible = dev_priv->mm.interruptible;
> > > > +			dev_priv->mm.interruptible = false;
> > > >  
> > > > -		WARN_ON(i915_gem_object_unbind(obj));
> > > > +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
> > > >  
> > > > -		dev_priv->mm.interruptible = was_interruptible;
> > > > +			dev_priv->mm.interruptible = was_interruptible;
> > > > +		}
> > > >  	}
> > > >  
> > > >  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> > > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
> > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > >  }
> > > >  
> > > > +static void i915_init_vm(struct drm_i915_private *dev_priv,
> > > > +			 struct i915_address_space *vm)
> > > > +{
> > > > +	vm->dev = dev_priv->dev;
> > > > +	INIT_LIST_HEAD(&vm->active_list);
> > > > +	INIT_LIST_HEAD(&vm->inactive_list);
> > > > +	INIT_LIST_HEAD(&vm->global_link);
> > > > +	list_add(&vm->global_link, &dev_priv->vm_list);
> > > > +}
> > > > +
> > > >  void
> > > >  i915_gem_load(struct drm_device *dev)
> > > >  {
> > > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
> > > >  				  SLAB_HWCACHE_ALIGN,
> > > >  				  NULL);
> > > >  
> > > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> > > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> > > > +	INIT_LIST_HEAD(&dev_priv->vm_list);
> > > > +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> > > > +
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  			     struct drm_i915_private,
> > > >  			     mm.inactive_shrinker);
> > > >  	struct drm_device *dev = dev_priv->dev;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > >  	struct drm_i915_gem_object *obj;
> > > > -	int nr_to_scan = sc->nr_to_scan;
> > > > +	int nr_to_scan;
> > > >  	bool unlock = true;
> > > >  	int cnt;
> > > >  
> > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  		unlock = false;
> > > >  	}
> > > >  
> > > > +	nr_to_scan = sc->nr_to_scan;
> > > >  	if (nr_to_scan) {
> > > >  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
> > > >  		if (nr_to_scan > 0)
> > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > > >  		if (obj->pages_pin_count == 0)
> > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > > +
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > > 
> > > Isn't this now double-counting objects? In the shrinker we only care about
> > > how much physical RAM an object occupies, not how much virtual space it
> > > occupies. So just walking the bound list of objects here should be good
> > > enough ...
> > > 
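> > > Roughly (sketch), counting each object once off the bound list:
> > > 
> > > 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> > > 		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > 			cnt += obj->base.size >> PAGE_SHIFT;
> > > 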
> > > >  
> > > >  	if (unlock)
> > > >  		mutex_unlock(&dev->struct_mutex);
> > > >  	return cnt;
> > > >  }
> > > > +
> > > > +/* All the new VM stuff */
> > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > > +				  struct i915_address_space *vm)
> > > > +{
> > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > +	struct i915_vma *vma;
> > > > +
> > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > +		vm = &dev_priv->gtt.base;
> > > > +
> > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > 
> > > Imo the vma list walking here and in the other helpers below indicates
> > > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > > is this again something that'll get fixed later on?
> > > 
> > > I just want to avoid diff churn, and it also makes reviewing easier if the
> > > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > > sprinkling of obj_to_vma in callers.
> > > 
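> > > Most of these helpers could then collapse into thin wrappers around
> > > obj_to_vma, e.g. (sketch, ignoring the aliasing ppgtt remap above):
> > > 
> > > unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > 				  struct i915_address_space *vm)
> > > {
> > > 	struct i915_vma *vma = i915_gem_obj_to_vma(o, vm);
> > > 
> > > 	return vma ? vma->node.start : -1;
> > > }
> > > 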
> > > > +		if (vma->vm == vm)
> > > > +			return vma->node.start;
> > > > +
> > > > +	}
> > > > +	return -1;
> > > > +}
> > > > +
> > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> > > > +{
> > > > +	return !list_empty(&o->vma_list);
> > > > +}
> > > > +
> > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > > +			struct i915_address_space *vm)
> > > > +{
> > > > +	struct i915_vma *vma;
> > > > +
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > +		if (vma->vm == vm)
> > > > +			return true;
> > > > +	}
> > > > +	return false;
> > > > +}
> > > > +
> > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > > +				struct i915_address_space *vm)
> > > > +{
> > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > +	struct i915_vma *vma;
> > > > +
> > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > +		vm = &dev_priv->gtt.base;
> > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > +		if (vma->vm == vm)
> > > > +			return vma->node.size;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > > > +			    struct i915_address_space *vm,
> > > > +			    enum i915_cache_level color)
> > > > +{
> > > > +	struct i915_vma *vma;
> > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > +		if (vma->vm == vm) {
> > > > +			vma->node.color = color;
> > > > +			return;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	WARN(1, "Couldn't set color for VM %p\n", vm);
> > > > +}
> > > > +
> > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > > +				     struct i915_address_space *vm)
> > > > +{
> > > > +	struct i915_vma *vma;
> > > > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > > +		if (vma->vm == vm)
> > > > +			return vma;
> > > > +
> > > > +	return NULL;
> > > > +}
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > index 2074544..c92fd81 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
> > > >  
> > > >  	if (INTEL_INFO(dev)->gen >= 7) {
> > > >  		ret = i915_gem_object_set_cache_level(ctx->obj,
> > > > +						      &dev_priv->gtt.base,
> > > >  						      I915_CACHE_LLC_MLC);
> > > >  		/* Failure shouldn't ever happen this early */
> > > >  		if (WARN_ON(ret))
> > > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> > > >  	 * default context.
> > > >  	 */
> > > >  	dev_priv->ring[RCS].default_context = ctx;
> > > > -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > > > +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > > >  	if (ret) {
> > > >  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> > > >  		goto err_destroy;
> > > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
> > > >  static int do_switch(struct i915_hw_context *to)
> > > >  {
> > > >  	struct intel_ring_buffer *ring = to->ring;
> > > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > >  	struct i915_hw_context *from = ring->last_context;
> > > >  	u32 hw_flags = 0;
> > > >  	int ret;
> > > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
> > > >  	if (from == to)
> > > >  		return 0;
> > > >  
> > > > -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > > +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
> > > >  	 */
> > > >  	if (from != NULL) {
> > > >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > > > -		i915_gem_object_move_to_active(from->obj, ring);
> > > > +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> > > > +					       ring);
> > > >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> > > >  		 * whole damn pipeline, we don't need to explicitly mark the
> > > >  		 * object dirty. The only exception is that the context must be
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > index df61f33..32efdc0 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > @@ -32,24 +32,21 @@
> > > >  #include "i915_trace.h"
> > > >  
> > > >  static bool
> > > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > > > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> > > >  {
> > > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > > -
> > > > -	if (obj->pin_count)
> > > > +	if (vma->obj->pin_count)
> > > >  		return false;
> > > >  
> > > > -	list_add(&obj->exec_list, unwind);
> > > > +	list_add(&vma->obj->exec_list, unwind);
> > > >  	return drm_mm_scan_add_block(&vma->node);
> > > >  }
> > > >  
> > > >  int
> > > > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > > -			 unsigned alignment, unsigned cache_level,
> > > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > > > +			 int min_size, unsigned alignment, unsigned cache_level,
> > > >  			 bool mappable, bool nonblocking)
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	struct list_head eviction_list, unwind_list;
> > > >  	struct i915_vma *vma;
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > >  	 */
> > > >  
> > > >  	INIT_LIST_HEAD(&unwind_list);
> > > > -	if (mappable)
> > > > +	if (mappable) {
> > > > +		BUG_ON(!i915_is_ggtt(vm));
> > > >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> > > >  					    alignment, cache_level, 0,
> > > >  					    dev_priv->gtt.mappable_end);
> > > > -	else
> > > > +	} else
> > > >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> > > >  
> > > >  	/* First see if there is a large enough contiguous idle region... */
> > > >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > > > -		if (mark_free(obj, &unwind_list))
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > > +		if (mark_free(vma, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > >  
> > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > >  
> > > >  	/* Now merge in the soon-to-be-expired objects... */
> > > >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > > -		if (mark_free(obj, &unwind_list))
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > > +		if (mark_free(vma, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > >  
> > > > @@ -109,7 +109,7 @@ none:
> > > >  		obj = list_first_entry(&unwind_list,
> > > >  				       struct drm_i915_gem_object,
> > > >  				       exec_list);
> > > > -		vma = __i915_gem_obj_to_vma(obj);
> > > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > > >  		ret = drm_mm_scan_remove_block(&vma->node);
> > > >  		BUG_ON(ret);
> > > >  
> > > > @@ -130,7 +130,7 @@ found:
> > > >  		obj = list_first_entry(&unwind_list,
> > > >  				       struct drm_i915_gem_object,
> > > >  				       exec_list);
> > > > -		vma = __i915_gem_obj_to_vma(obj);
> > > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > > >  		if (drm_mm_scan_remove_block(&vma->node)) {
> > > >  			list_move(&obj->exec_list, &eviction_list);
> > > >  			drm_gem_object_reference(&obj->base);
> > > > @@ -145,7 +145,7 @@ found:
> > > >  				       struct drm_i915_gem_object,
> > > >  				       exec_list);
> > > >  		if (ret == 0)
> > > > -			ret = i915_gem_object_unbind(obj);
> > > > +			ret = i915_gem_object_unbind(obj, vm);
> > > >  
> > > >  		list_del_init(&obj->exec_list);
> > > >  		drm_gem_object_unreference(&obj->base);
> > > > @@ -158,13 +158,18 @@ int
> > > >  i915_gem_evict_everything(struct drm_device *dev)
> > > 
> > > I suspect evict_everything eventually wants an address_space *vm argument
> > > for those cases where we only want to evict everything in a given vm. Atm
> > > we have two use-cases of this:
> > > - Called from the shrinker as a last-ditch effort. For that it should move
> > >   _every_ object onto the unbound list.
> > > - Called from execbuf for badly-fragmented address spaces to clean up the
> > >   mess. For that case we only care about one address space.
> > > 
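> > > Rough sketch of the split I have in mind (names made up):
> > > 
> > > int i915_gem_evict_vm(struct drm_device *dev,
> > > 		      struct i915_address_space *vm);
> > > int i915_gem_evict_everything(struct drm_device *dev);
> > > 
> > > where evict_everything just loops over dev_priv->vm_list calling the
> > > per-vm variant, and execbuf calls the per-vm variant for its own vm.
> > > 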
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > >  	struct drm_i915_gem_object *obj, *next;
> > > > -	bool lists_empty;
> > > > +	bool lists_empty = true;
> > > >  	int ret;
> > > >  
> > > > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > > > -		       list_empty(&vm->active_list));
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > > > +			       list_empty(&vm->active_list));
> > > > +		if (!lists_empty)
> > > > +			lists_empty = false;
> > > > +	}
> > > > +
> > > >  	if (lists_empty)
> > > >  		return -ENOSPC;
> > > >  
> > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  	i915_gem_retire_requests(dev);
> > > >  
> > > >  	/* Having flushed everything, unbind() should never raise an error */
> > > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > -		if (obj->pin_count == 0)
> > > > -			WARN_ON(i915_gem_object_unbind(obj));
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > +			if (obj->pin_count == 0)
> > > > +				WARN_ON(i915_gem_object_unbind(obj, vm));
> > > > +	}
> > > >  
> > > >  	return 0;
> > > >  }
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > index 5aeb447..e90182d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
> > > >  }
> > > >  
> > > >  static void
> > > > -eb_destroy(struct eb_objects *eb)
> > > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
> > > >  {
> > > >  	while (!list_empty(&eb->objects)) {
> > > >  		struct drm_i915_gem_object *obj;
> > > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > > >  				   struct eb_objects *eb,
> > > > -				   struct drm_i915_gem_relocation_entry *reloc)
> > > > +				   struct drm_i915_gem_relocation_entry *reloc,
> > > > +				   struct i915_address_space *vm)
> > > >  {
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	struct drm_gem_object *target_obj;
> > > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > > >  
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > > -				    struct eb_objects *eb)
> > > > +				    struct eb_objects *eb,
> > > > +				    struct i915_address_space *vm)
> > > >  {
> > > >  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
> > > >  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> > > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > >  		do {
> > > >  			u64 offset = r->presumed_offset;
> > > >  
> > > > -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> > > > +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> > > > +								 vm);
> > > >  			if (ret)
> > > >  				return ret;
> > > >  
> > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > > >  					 struct eb_objects *eb,
> > > > -					 struct drm_i915_gem_relocation_entry *relocs)
> > > > +					 struct drm_i915_gem_relocation_entry *relocs,
> > > > +					 struct i915_address_space *vm)
> > > >  {
> > > >  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > >  	int i, ret;
> > > >  
> > > >  	for (i = 0; i < entry->relocation_count; i++) {
> > > > -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> > > > +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> > > > +							 vm);
> > > >  		if (ret)
> > > >  			return ret;
> > > >  	}
> > > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > > >  }
> > > >  
> > > >  static int
> > > > -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > > > +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> > > > +			     struct i915_address_space *vm)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	int ret = 0;
> > > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > > >  	 */
> > > >  	pagefault_disable();
> > > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > > > -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> > > > +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> > > >  		if (ret)
> > > >  			break;
> > > >  	}
> > > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
> > > >  static int
> > > >  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > > >  				   struct intel_ring_buffer *ring,
> > > > +				   struct i915_address_space *vm,
> > > >  				   bool *need_reloc)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > > >  		obj->tiling_mode != I915_TILING_NONE;
> > > >  	need_mappable = need_fence || need_reloc_mappable(obj);
> > > >  
> > > > -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> > > > +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> > > > +				  false);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > > >  		obj->has_aliasing_ppgtt_mapping = 1;
> > > >  	}
> > > >  
> > > > -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> > > > -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > > +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> > > > +		entry->offset = i915_gem_obj_offset(obj, vm);
> > > >  		*need_reloc = true;
> > > >  	}
> > > >  
> > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > >  {
> > > >  	struct drm_i915_gem_exec_object2 *entry;
> > > >  
> > > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > > +	if (!i915_gem_obj_bound_any(obj))
> > > >  		return;
> > > >  
> > > >  	entry = obj->exec_entry;
> > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > >  static int
> > > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > >  			    struct list_head *objects,
> > > > +			    struct i915_address_space *vm,
> > > >  			    bool *need_relocs)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > >  			bool need_fence, need_mappable;
> > > > +			u32 obj_offset;
> > > >  
> > > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > > +			if (!i915_gem_obj_bound(obj, vm))
> > > >  				continue;
> > > 
> > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> > > here ... Maybe we should cache them in some pointer somewhere (either in
> > > the eb object or by adding a new pointer to the object struct, e.g.
> > > obj->eb_vma, similar to obj->eb_list).
> > > 
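> > > For this loop the bound check plus offset lookup would then boil down
> > > to a single lookup (sketch):
> > > 
> > > 	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > 
> > > 	if (!vma)
> > > 		continue;
> > > 	obj_offset = vma->node.start;
> > > 
> > > and caching that vma pointer when the eb list is built would get rid
> > > of even that walk.
> > > 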
> > > >  
> > > > +			obj_offset = i915_gem_obj_offset(obj, vm);
> > > >  			need_fence =
> > > >  				has_fenced_gpu_access &&
> > > >  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> > > >  				obj->tiling_mode != I915_TILING_NONE;
> > > >  			need_mappable = need_fence || need_reloc_mappable(obj);
> > > >  
> > > > +			BUG_ON((need_mappable || need_fence) &&
> > > > +			       !i915_is_ggtt(vm));
> > > > +
> > > >  			if ((entry->alignment &&
> > > > -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> > > > +			     obj_offset & (entry->alignment - 1)) ||
> > > >  			    (need_mappable && !obj->map_and_fenceable))
> > > > -				ret = i915_gem_object_unbind(obj);
> > > > +				ret = i915_gem_object_unbind(obj, vm);
> > > >  			else
> > > > -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > > +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > > >  			if (ret)
> > > >  				goto err;
> > > >  		}
> > > >  
> > > >  		/* Bind fresh objects */
> > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > > -			if (i915_gem_obj_ggtt_bound(obj))
> > > > +			if (i915_gem_obj_bound(obj, vm))
> > > >  				continue;
> > > >  
> > > > -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > > +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > > >  			if (ret)
> > > >  				goto err;
> > > >  		}
> > > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > > >  				  struct drm_file *file,
> > > >  				  struct intel_ring_buffer *ring,
> > > >  				  struct eb_objects *eb,
> > > > -				  struct drm_i915_gem_exec_object2 *exec)
> > > > +				  struct drm_i915_gem_exec_object2 *exec,
> > > > +				  struct i915_address_space *vm)
> > > >  {
> > > >  	struct drm_i915_gem_relocation_entry *reloc;
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > > >  		goto err;
> > > >  
> > > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > > >  	if (ret)
> > > >  		goto err;
> > > >  
> > > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > > >  		int offset = obj->exec_entry - exec;
> > > >  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> > > > -							       reloc + reloc_offset[offset]);
> > > > +							       reloc + reloc_offset[offset],
> > > > +							       vm);
> > > >  		if (ret)
> > > >  			goto err;
> > > >  	}
> > > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
> > > >  
> > > >  static void
> > > >  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > > +				   struct i915_address_space *vm,
> > > >  				   struct intel_ring_buffer *ring)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > >  		obj->base.read_domains = obj->base.pending_read_domains;
> > > >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> > > >  
> > > > -		i915_gem_object_move_to_active(obj, ring);
> > > > +		i915_gem_object_move_to_active(obj, vm, ring);
> > > >  		if (obj->base.write_domain) {
> > > >  			obj->dirty = 1;
> > > >  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> > > > @@ -836,7 +853,8 @@ static int
> > > >  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  		       struct drm_file *file,
> > > >  		       struct drm_i915_gem_execbuffer2 *args,
> > > > -		       struct drm_i915_gem_exec_object2 *exec)
> > > > +		       struct drm_i915_gem_exec_object2 *exec,
> > > > +		       struct i915_address_space *vm)
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > >  	struct eb_objects *eb;
> > > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  
> > > >  	/* Move the objects en-masse into the GTT, evicting if necessary. */
> > > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > > >  	if (ret)
> > > >  		goto err;
> > > >  
> > > >  	/* The objects are in their final locations, apply the relocations. */
> > > >  	if (need_relocs)
> > > > -		ret = i915_gem_execbuffer_relocate(eb);
> > > > +		ret = i915_gem_execbuffer_relocate(eb, vm);
> > > >  	if (ret) {
> > > >  		if (ret == -EFAULT) {
> > > >  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> > > > -								eb, exec);
> > > > +								eb, exec, vm);
> > > >  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> > > >  		}
> > > >  		if (ret)
> > > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  			goto err;
> > > >  	}
> > > >  
> > > > -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> > > > +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > +		args->batch_start_offset;
> > > >  	exec_len = args->batch_len;
> > > >  	if (cliprects) {
> > > >  		for (i = 0; i < args->num_cliprects; i++) {
> > > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  
> > > >  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> > > >  
> > > > -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> > > > +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> > > >  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> > > >  
> > > >  err:
> > > > -	eb_destroy(eb);
> > > > +	eb_destroy(eb, vm);
> > > >  
> > > >  	mutex_unlock(&dev->struct_mutex);
> > > >  
> > > > @@ -1105,6 +1124,7 @@ int
> > > >  i915_gem_execbuffer(struct drm_device *dev, void *data,
> > > >  		    struct drm_file *file)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct drm_i915_gem_execbuffer *args = data;
> > > >  	struct drm_i915_gem_execbuffer2 exec2;
> > > >  	struct drm_i915_gem_exec_object *exec_list = NULL;
> > > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> > > >  	exec2.flags = I915_EXEC_RENDER;
> > > >  	i915_execbuffer2_set_context_id(exec2, 0);
> > > >  
> > > > -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> > > > +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> > > > +				     &dev_priv->gtt.base);
> > > >  	if (!ret) {
> > > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > > >  		for (i = 0; i < args->buffer_count; i++)
> > > > @@ -1186,6 +1207,7 @@ int
> > > >  i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > > >  		     struct drm_file *file)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct drm_i915_gem_execbuffer2 *args = data;
> > > >  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
> > > >  	int ret;
> > > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > > >  		return -EFAULT;
> > > >  	}
> > > >  
> > > > -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> > > > +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> > > > +				     &dev_priv->gtt.base);
> > > >  	if (!ret) {
> > > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > > >  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > > index 298fc42..70ce2f6 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> > > >  			    ppgtt->base.total);
> > > >  	}
> > > >  
> > > > +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> > > > +
> > > >  	return ret;
> > > >  }
> > > >  
> > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > > >  			    struct drm_i915_gem_object *obj,
> > > >  			    enum i915_cache_level cache_level)
> > > >  {
> > > > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > > > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > > -				   cache_level);
> > > > +	struct i915_address_space *vm = &ppgtt->base;
> > > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > > +
> > > > +	vm->insert_entries(vm, obj->pages,
> > > > +			   obj_offset >> PAGE_SHIFT,
> > > > +			   cache_level);
> > > >  }
> > > >  
> > > >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > > >  			      struct drm_i915_gem_object *obj)
> > > >  {
> > > > -	ppgtt->base.clear_range(&ppgtt->base,
> > > > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > > -				obj->base.size >> PAGE_SHIFT);
> > > > +	struct i915_address_space *vm = &ppgtt->base;
> > > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > > +
> > > > +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > > > +			obj->base.size >> PAGE_SHIFT);
> > > >  }
> > > >  
> > > >  extern int intel_iommu_gfx_mapped;
> > > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > > >  				       dev_priv->gtt.base.start / PAGE_SIZE,
> > > >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> > > >  
> > > > +	if (dev_priv->mm.aliasing_ppgtt)
> > > > +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > > > +
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > >  		i915_gem_clflush_object(obj);
> > > >  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > > >  	 * aperture.  One page should be enough to keep any prefetching inside
> > > >  	 * of the aperture.
> > > >  	 */
> > > > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > > >  	struct drm_mm_node *entry;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	unsigned long hole_start, hole_end;
> > > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > > >  	BUG_ON(mappable_end > end);
> > > >  
> > > >  	/* Subtract the guard page ... */
> > > > -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> > > > +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> > > >  	if (!HAS_LLC(dev))
> > > >  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
> > > >  
> > > >  	/* Mark any preallocated objects as occupied */
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > > -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> > > >  		int ret;
> > > >  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> > > >  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
> > > >  
> > > >  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> > > >  		if (ret)
> > > >  			DRM_DEBUG_KMS("Reservation failed\n");
> > > >  		obj->has_global_gtt_mapping = 1;
> > > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > > >  	dev_priv->gtt.base.total = end - start;
> > > >  
> > > >  	/* Clear any non-preallocated blocks */
> > > > -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> > > > -			     hole_start, hole_end) {
> > > > +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> > > >  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> > > >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> > > >  			      hole_start, hole_end);
> > > > -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > > -					       hole_start / PAGE_SIZE,
> > > > -					       count);
> > > > +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> > > >  	}
> > > >  
> > > >  	/* And finally clear the reserved guard page */
> > > > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > > -				       end / PAGE_SIZE - 1, 1);
> > > > +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> > > >  }
> > > >  
> > > >  static bool
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > index 245eb1d..bfe61fa 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > > >  		return obj;
> > > >  
> > > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > > +	vma = i915_gem_vma_create(obj, vm);
> > > >  	if (!vma) {
> > > >  		drm_gem_object_unreference(&obj->base);
> > > >  		return NULL;
> > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > >  	 */
> > > >  	vma->node.start = gtt_offset;
> > > >  	vma->node.size = size;
> > > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > +	if (drm_mm_initialized(&vm->mm)) {
> > > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > > 
> > > These two hunks here for stolen look fishy - we only ever use the stolen
> > > preallocated stuff for objects with mappings in the global gtt. So keeping
> > > that explicit is imo the better approach. And tbh I'm confused where the
> > > local variable vm is from ...
> > > 
> > > >  		if (ret) {
> > > >  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
> > > >  			goto unref_out;
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > > > index 92a8d27..808ca2a 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > > > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
> > > >  
> > > >  		obj->map_and_fenceable =
> > > >  			!i915_gem_obj_ggtt_bound(obj) ||
> > > > -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> > > > +			(i915_gem_obj_ggtt_offset(obj) +
> > > > +			 obj->base.size <= dev_priv->gtt.mappable_end &&
> > > >  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
> > > >  
> > > >  		/* Rebind if we need a change of alignment */
> > > >  		if (!obj->map_and_fenceable) {
> > > > -			u32 unfenced_alignment =
> > > > +			struct i915_address_space *ggtt = &dev_priv->gtt.base;
> > > > +			u32 unfenced_align =
> > > >  				i915_gem_get_gtt_alignment(dev, obj->base.size,
> > > >  							    args->tiling_mode,
> > > >  							    false);
> > > > -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> > > > -				ret = i915_gem_object_unbind(obj);
> > > > +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> > > > +				ret = i915_gem_object_unbind(obj, ggtt);
> > > >  		}
> > > >  
> > > >  		if (ret == 0) {
> > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > index 79fbb17..28fa0ff 100644
> > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > > >  	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
> > > >  		u32 acthd = I915_READ(ACTHD);
> > > >  
> > > > +		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
> > > > +			return NULL;
> > > > +
> > > >  		if (WARN_ON(ring->id != RCS))
> > > >  			return NULL;
> > > >  
> > > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
> > > >  		return;
> > > >  
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > > -		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
> > > > +		if ((error->ccid & PAGE_MASK) ==
> > > > +		    i915_gem_obj_ggtt_offset(obj)) {
> > > >  			ering->ctx = i915_error_object_create_sized(dev_priv,
> > > >  								    obj, 1);
> > > >  			break;
> > > > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> > > > index 7d283b5..3f019d3 100644
> > > > --- a/drivers/gpu/drm/i915/i915_trace.h
> > > > +++ b/drivers/gpu/drm/i915/i915_trace.h
> > > > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
> > > >  );
> > > >  
> > > >  TRACE_EVENT(i915_gem_object_bind,
> > > > -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> > > > -	    TP_ARGS(obj, mappable),
> > > > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > > > +		     struct i915_address_space *vm, bool mappable),
> > > > +	    TP_ARGS(obj, vm, mappable),
> > > >  
> > > >  	    TP_STRUCT__entry(
> > > >  			     __field(struct drm_i915_gem_object *, obj)
> > > > +			     __field(struct i915_address_space *, vm)
> > > >  			     __field(u32, offset)
> > > >  			     __field(u32, size)
> > > >  			     __field(bool, mappable)
> > > > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
> > > >  
> > > >  	    TP_fast_assign(
> > > >  			   __entry->obj = obj;
> > > > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > > > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > > > +			   __entry->size = i915_gem_obj_size(obj, vm);
> > > >  			   __entry->mappable = mappable;
> > > >  			   ),
> > > >  
> > > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
> > > >  );
> > > >  
> > > >  TRACE_EVENT(i915_gem_object_unbind,
> > > > -	    TP_PROTO(struct drm_i915_gem_object *obj),
> > > > -	    TP_ARGS(obj),
> > > > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > > > +		     struct i915_address_space *vm),
> > > > +	    TP_ARGS(obj, vm),
> > > >  
> > > >  	    TP_STRUCT__entry(
> > > >  			     __field(struct drm_i915_gem_object *, obj)
> > > > +			     __field(struct i915_address_space *, vm)
> > > >  			     __field(u32, offset)
> > > >  			     __field(u32, size)
> > > >  			     ),
> > > >  
> > > >  	    TP_fast_assign(
> > > >  			   __entry->obj = obj;
> > > > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > > > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > > > +			   __entry->size = i915_gem_obj_size(obj, vm);
> > > >  			   ),
> > > >  
> > > >  	    TP_printk("obj=%p, offset=%08x size=%x",
> > > > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> > > > index f3c97e0..b69cc63 100644
> > > > --- a/drivers/gpu/drm/i915/intel_fb.c
> > > > +++ b/drivers/gpu/drm/i915/intel_fb.c
> > > > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > > >  		      fb->width, fb->height,
> > > >  		      i915_gem_obj_ggtt_offset(obj), obj);
> > > >  
> > > > -
> > > >  	mutex_unlock(&dev->struct_mutex);
> > > >  	vga_switcheroo_client_fb_set(dev->pdev, info);
> > > >  	return 0;
> > > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> > > > index 81c3ca1..517e278 100644
> > > > --- a/drivers/gpu/drm/i915/intel_overlay.c
> > > > +++ b/drivers/gpu/drm/i915/intel_overlay.c
> > > > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
> > > >  		}
> > > >  		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
> > > >  	} else {
> > > > -		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> > > > +		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
> > > >  		if (ret) {
> > > >  			DRM_ERROR("failed to pin overlay register bo\n");
> > > >  			goto out_free_bo;
> > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > > > index 125a741..449e57c 100644
> > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev)
> > > >  		return NULL;
> > > >  	}
> > > >  
> > > > -	ret = i915_gem_object_pin(ctx, 4096, true, false);
> > > > +	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
> > > >  	if (ret) {
> > > >  		DRM_ERROR("failed to pin power context: %d\n", ret);
> > > >  		goto err_unref;
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > index bc4c11b..ebed61d 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -481,6 +481,7 @@ out:
> > > >  static int
> > > >  init_pipe_control(struct intel_ring_buffer *ring)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > >  	struct pipe_control *pc;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	int ret;
> > > > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
> > > >  		goto err;
> > > >  	}
> > > >  
> > > > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > > > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > +					I915_CACHE_LLC);
> > > >  
> > > > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > > > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> > > >  	if (ret)
> > > >  		goto err_unref;
> > > >  
> > > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
> > > >  static int init_status_page(struct intel_ring_buffer *ring)
> > > >  {
> > > >  	struct drm_device *dev = ring->dev;
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	int ret;
> > > >  
> > > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
> > > >  		goto err;
> > > >  	}
> > > >  
> > > > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > > > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > +					I915_CACHE_LLC);
> > > >  
> > > > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > > > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> > > >  	if (ret != 0) {
> > > >  		goto err_unref;
> > > >  	}
> > > > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
> > > >  
> > > >  	ring->obj = obj;
> > > >  
> > > > -	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> > > > +	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
> > > >  	if (ret)
> > > >  		goto err_unref;
> > > >  
> > > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> > > >  			return -ENOMEM;
> > > >  		}
> > > >  
> > > > -		ret = i915_gem_object_pin(obj, 0, true, false);
> > > > +		ret = i915_gem_ggtt_pin(obj, 0, true, false);
> > > >  		if (ret != 0) {
> > > >  			drm_gem_object_unreference(&obj->base);
> > > >  			DRM_ERROR("Failed to ping batch bo\n");
> > > > -- 
> > > > 1.8.3.2
> > > > 
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > 
> > > -- 
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > 
> > -- 
> > Ben Widawsky, Intel Open Source Technology Center
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center


* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-10 22:23         ` Ben Widawsky
@ 2013-07-11  6:01           ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-11  6:01 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 11, 2013 at 12:23 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Wed, Jul 10, 2013 at 07:05:52PM +0200, Daniel Vetter wrote:
>> On Wed, Jul 10, 2013 at 09:37:10AM -0700, Ben Widawsky wrote:
>> > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
>> > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
>> > > > This patch was formerly known as:
>> > > > "drm/i915: Create VMAs (part 3) - plumbing"
>> > > >
>> > > > This patch adds a VM argument, bind/unbind, and the object
>> > > > offset/size/color getters/setters. It preserves the old ggtt helper
>> > > > functions because things still need, and will continue to need them.
>> > > >
>> > > > Some code will still need to be ported over after this.
>> > > >
>> > > > v2: Fix purge to pick an object and unbind all vmas
>> > > > This was doable because of the global bound list change.
>> > > >
>> > > > v3: With the commit to actually pin/unpin pages in place, there is no
>> > > > longer a need to check if unbind succeeded before calling put_pages().
>> > > > Make put_pages only BUG() after checking pin count.
>> > > >
>> > > > v4: Rebased on top of the new hangcheck work by Mika
>> > > > plumbed eb_destroy also
>> > > > Many checkpatch related fixes
>> > > >
>> > > > v5: Very large rebase
>> > > >
>> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>> > >
>> > > This one is a rather large beast. Any chance we could split it into
>> > > topics, e.g. convert execbuf code, convert shrinker code? Or does that get
>> > > messy, fast?
>> > >
>> >
>> > I've thought of this...
>> >
>> > The one solution I came up with is to have two bind/unbind functions
>> > (similar to what I did with pin, and indeed it was my original plan with
>> > pin), and do the set_caching one separately.
>> >
>> > I think it won't be too messy, just a lot of typing, as Keith likes to
>> > say.
>> >
>> > However, my opinion was, since it's early in the merge cycle, we don't
>> > yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's
>> > not a big deal. At a functional level too, I felt this made more sense.
>> >
>> > So I'll defer to your request on this and start splitting it up, unless
>> > my email has changed your mind ;-).
>>
>> Well, my concern is mostly in reviewing since we need to think about each
>> case and whether it makes sense to talk in terms of vma or objects in
>> that function. And what exactly to test.
>>
>> If you've played around and concluded it'll be a mess then I don't think
>> it'll help in reviewing. So pointless.
>
> I said I don't think it will be a mess, though I feel it won't really
> help review too much. Can you take a crack at the review and poke me if you
> want me to try it? I'd rather not do it if I can avoid it, so I can try
> to go back to my 15 patch maximum rule.
>
>>
>> Still, there's a bunch of questions on this patch that we need to discuss
>> ;-)
>
> Ready whenever.

It's waiting for you in my first reply, just scroll down a bit ;-)
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-09  6:08 ` [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
  2013-07-09  6:37   ` Daniel Vetter
@ 2013-07-11 11:14   ` Imre Deak
  2013-07-11 23:57     ` Ben Widawsky
  1 sibling, 1 reply; 50+ messages in thread
From: Imre Deak @ 2013-07-11 11:14 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX


On Mon, 2013-07-08 at 23:08 -0700, Ben Widawsky wrote:
> The GTT and PPGTT can be thought of more generally as GPU address
> spaces. Many of their actions (insert entries), state (LRU lists) and
> many of their characteristics (size), can be shared. Do that.
> 
> The change itself doesn't actually impact most of the VMA/VM rework
> coming up, it just fits in with the grand scheme. GGTT will usually be a
> special case where we either know an object must be in the GGTT (display
> engine, workarounds, etc.).
> 
> v2: Drop usage of i915_gtt_vm (Daniel)
> Make cleanup also part of the parent class (Ben)
> Modified commit msg
> Rebased
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
>  drivers/gpu/drm/i915/i915_dma.c     |   4 +-
>  drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
>  drivers/gpu/drm/i915/i915_gem.c     |   4 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
>  5 files changed, 121 insertions(+), 110 deletions(-)
> 
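
The sharing amounts to C-style base-struct embedding: the GGTT and each PPGTT
embed a common i915_address_space, and the per-space callbacks recover their
container with container_of(). A rough sketch using only names visible in the
hunks below (layout abbreviated, not the full definitions; to_ppgtt() is just
an illustrative wrapper, the patch open-codes the container_of()):

	struct i915_address_space {
		struct drm_mm mm;
		struct drm_device *dev;
		unsigned long start;
		size_t total;
		struct {
			dma_addr_t addr;	/* scratch page */
		} scratch;
		/* hooks: pte_encode(), clear_range(), insert_entries(),
		 * bind_object(), cleanup() */
	};

	struct i915_hw_ppgtt {
		struct i915_address_space base;
		/* PPGTT-only state: pt_pages, num_pd_entries, ... */
	};

	static inline struct i915_hw_ppgtt *to_ppgtt(struct i915_address_space *vm)
	{
		return container_of(vm, struct i915_hw_ppgtt, base);
	}
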
>[...]
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 242d0f9..693115a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>  
>  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
>  {
> -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
>  	gen6_gtt_pte_t __iomem *pd_addr;
>  	uint32_t pd_entry;
>  	int i;
> @@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
>  }
>  
>  /* PPGTT support for Sandybdrige/Gen6 and later */
> -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  				   unsigned first_entry,
>  				   unsigned num_entries)
>  {
> -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> +	struct i915_hw_ppgtt *ppgtt =
> +		container_of(vm, struct i915_hw_ppgtt, base);
>  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
>  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
>  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
>  	unsigned last_pte, i;
>  
> -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> -					I915_CACHE_LLC);
> +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);

I only see ggtt's scratch page being initialized, but can't find the
corresponding init/teardown for ppgtt. Btw, why do we need separate
global/per-process scratch pages? (would be nice to add it to the commit
message)

--Imre



* Re: [PATCH 05/11] drm/i915: Create VMAs
  2013-07-09  6:08 ` [PATCH 05/11] drm/i915: Create VMAs Ben Widawsky
@ 2013-07-11 11:20   ` Imre Deak
  2013-07-12  2:23     ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Imre Deak @ 2013-07-11 11:20 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX


On Mon, 2013-07-08 at 23:08 -0700, Ben Widawsky wrote:
> Formerly: "drm/i915: Create VMAs (part 1)"
> 
> In a previous patch, the notion of a VM was introduced. A VMA describes
> a part of the VM address space. A VMA is similar to the concept
> in the linux mm. However, instead of representing regular memory, a VMA
> is backed by a GEM BO. There may be many VMAs for a given object, one
> for each VM the object is to be used in. This may occur through flink,
> dma-buf, or a number of other transient states.
> 
> Currently the code depends on only 1 VMA per object, for the global GTT
> (and aliasing PPGTT). The following patches will address this and make
> the rest of the infrastructure more suited
> 
> v2: s/i915_obj/i915_gem_obj (Chris)
> 
> v3: Only move an object to the now global unbound list if there are no
> more VMAs for the object which are bound into a VM (ie. the list is
> empty).
> 
> v4: killed obj->gtt_space
> some reworks due to rebase
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h        | 48 ++++++++++++++++++++++------
>  drivers/gpu/drm/i915/i915_gem.c        | 57 +++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem_evict.c  | 12 ++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c    |  5 +--
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 14 ++++++---
>  5 files changed, 110 insertions(+), 26 deletions(-)
> 
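
In code terms, the relationship the commit message describes looks roughly
like this (a sketch built from the names used across the series; field layout
is illustrative, not the exact definitions):

	/* one VMA per (object, address space) pair */
	struct i915_vma {
		struct drm_mm_node node;	/* offset/size inside the VM */
		struct drm_i915_gem_object *obj;
		struct i915_address_space *vm;
		struct list_head vma_link;	/* entry in obj->vma_list */
	};

	struct drm_i915_gem_object {
		/* ... existing members ... */
		struct list_head vma_list;	/* one vma per VM the BO is bound in */
	};

	/* i915_gem_obj_to_vma(obj, vm) then walks obj->vma_list and returns
	 * the vma whose vma->vm matches, or NULL if the object has no
	 * binding in that address space. */
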
> [...]
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 525aa8f..058ad44 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2578,6 +2578,7 @@ int
>  i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  {
>  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	if (!i915_gem_obj_ggtt_bound(obj))
> @@ -2615,11 +2616,20 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	i915_gem_object_unpin_pages(obj);
>  
>  	list_del(&obj->mm_list);
> -	list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	obj->map_and_fenceable = true;
>  
> -	drm_mm_remove_node(&obj->gtt_space);
> +	vma = __i915_gem_obj_to_vma(obj);
> +	list_del(&vma->vma_link);
> +	drm_mm_remove_node(&vma->node);
> +	i915_gem_vma_destroy(vma);
> +
> +	/* Since the unbound list is global, only move to that list if
> +	 * no more VMAs exist.
> +	 * NB: Until we have real VMAs there will only ever be one */
> +	WARN_ON(!list_empty(&obj->vma_list));
> +	if (list_empty(&obj->vma_list))
> +		list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
>  
>  	return 0;
>  }
> @@ -3070,8 +3080,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  	bool mappable, fenceable;
>  	size_t gtt_max = map_and_fenceable ?
>  		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> +	struct i915_vma *vma;
>  	int ret;
>  
> +	if (WARN_ON(!list_empty(&obj->vma_list)))
> +		return -EBUSY;
> +
>  	fence_size = i915_gem_get_gtt_size(dev,
>  					   obj->base.size,
>  					   obj->tiling_mode);
> @@ -3110,9 +3124,15 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  
>  	i915_gem_object_pin_pages(obj);
>  
> +	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +	if (vma == NULL) {
> +		i915_gem_object_unpin_pages(obj);
> +		return -ENOMEM;
> +	}
> +
>  search_free:
>  	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> -						  &obj->gtt_space,
> +						  &vma->node,
>  						  size, alignment,
>  						  obj->cache_level, 0, gtt_max);
>  	if (ret) {
> @@ -3126,22 +3146,23 @@ search_free:
>  		i915_gem_object_unpin_pages(obj);
>  		return ret;
>  	}
> -	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &obj->gtt_space,
> +	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
>  					      obj->cache_level))) {
>  		i915_gem_object_unpin_pages(obj);
> -		drm_mm_remove_node(&obj->gtt_space);
> +		drm_mm_remove_node(&vma->node);
>  		return -EINVAL;
>  	}
>  
>  	ret = i915_gem_gtt_prepare_object(obj);
>  	if (ret) {
>  		i915_gem_object_unpin_pages(obj);
> -		drm_mm_remove_node(&obj->gtt_space);
> +		drm_mm_remove_node(&vma->node);
>  		return ret;
>  	}

Freeing vma on the error path is missing.
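
A minimal sketch of the missing cleanup, using the names from the hunk above
(illustrative only, not the actual fix):

	ret = i915_gem_gtt_prepare_object(obj);
	if (ret) {
		i915_gem_object_unpin_pages(obj);
		drm_mm_remove_node(&vma->node);
		i915_gem_vma_destroy(vma);	/* don't leak the vma */
		return ret;
	}

The other early returns after i915_gem_vma_create() succeeds would want the
same treatment.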

With this and the issue in 1/5 addressed things look good to me, so on
1-5:

Reviewed-by: Imre Deak <imre.deak@intel.com>

--Imre


* Re: [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-11 11:14   ` Imre Deak
@ 2013-07-11 23:57     ` Ben Widawsky
  2013-07-12 15:59       ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-11 23:57 UTC (permalink / raw)
  To: Imre Deak; +Cc: Intel GFX

On Thu, Jul 11, 2013 at 02:14:06PM +0300, Imre Deak wrote:
> On Mon, 2013-07-08 at 23:08 -0700, Ben Widawsky wrote:
> > The GTT and PPGTT can be thought of more generally as GPU address
> > spaces. Many of their actions (insert entries), state (LRU lists) and
> > many of their characteristics (size), can be shared. Do that.
> > 
> > The change itself doesn't actually impact most of the VMA/VM rework
> > coming up, it just fits in with the grand scheme. GGTT will usually be a
> > special case where we either know an object must be in the GGTT (display
> > engine, workarounds, etc.).
> > 
> > v2: Drop usage of i915_gtt_vm (Daniel)
> > Make cleanup also part of the parent class (Ben)
> > Modified commit msg
> > Rebased
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
> >  drivers/gpu/drm/i915/i915_dma.c     |   4 +-
> >  drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
> >  drivers/gpu/drm/i915/i915_gem.c     |   4 +-
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
> >  5 files changed, 121 insertions(+), 110 deletions(-)
> > 
> >[...]
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 242d0f9..693115a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
> >  
> >  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
> >  {
> > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
> >  	gen6_gtt_pte_t __iomem *pd_addr;
> >  	uint32_t pd_entry;
> >  	int i;
> > @@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> >  }
> >  
> >  /* PPGTT support for Sandybdrige/Gen6 and later */
> > -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> > +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
> >  				   unsigned first_entry,
> >  				   unsigned num_entries)
> >  {
> > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > +	struct i915_hw_ppgtt *ppgtt =
> > +		container_of(vm, struct i915_hw_ppgtt, base);
> >  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
> >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> >  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> >  	unsigned last_pte, i;
> >  
> > -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> > -					I915_CACHE_LLC);
> > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> 
> I only see ggtt's scratch page being initialized, but can't find the
> corresponding init/teardown for ppgtt. Btw, why do we need separate
> global/per-process scratch pages? (would be nice to add it to the commit
> message)
> 
> --Imre
> 

There is indeed a bug here; the fix existed somewhere in the original series
and I mistakenly dropped it. Here is my local fix, which is what I had done
previously.

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 552e4cb..c8130db 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -295,6 +295,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
        ppgtt->base.clear_range = gen6_ppgtt_clear_range;
        ppgtt->base.bind_object = gen6_ppgtt_bind_object;
        ppgtt->base.cleanup = gen6_ppgtt_cleanup;
+       ppgtt->base.scratch = dev_priv->gtt.base.scratch;
        ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
                                  GFP_KERNEL);
        if (!ppgtt->pt_pages)


Not sure what you mean; there should be only one scratch page now.

-- 
Ben Widawsky, Intel Open Source Technology Center


* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-09  7:15   ` Daniel Vetter
  2013-07-10 16:37     ` Ben Widawsky
@ 2013-07-12  2:23     ` Ben Widawsky
  2013-07-12  6:26       ` Daniel Vetter
  1 sibling, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-12  2:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> > This patch was formerly known as:
> > "drm/i915: Create VMAs (part 3) - plumbing"
> > 
> > This patch adds a VM argument, bind/unbind, and the object
> > offset/size/color getters/setters. It preserves the old ggtt helper
> > functions because things still need, and will continue to need them.
> > 
> > Some code will still need to be ported over after this.
> > 
> > v2: Fix purge to pick an object and unbind all vmas
> > This was doable because of the global bound list change.
> > 
> > v3: With the commit to actually pin/unpin pages in place, there is no
> > longer a need to check if unbind succeeded before calling put_pages().
> > Make put_pages only BUG() after checking pin count.
> > 
> > v4: Rebased on top of the new hangcheck work by Mika
> > plumbed eb_destroy also
> > Many checkpatch related fixes
> > 
> > v5: Very large rebase
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> This one is a rather large beast. Any chance we could split it into
> topics, e.g. convert execbuf code, convert shrinker code? Or does that get
> messy, fast?
> 
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c        |  31 ++-
> >  drivers/gpu/drm/i915/i915_dma.c            |   4 -
> >  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++-----
> >  drivers/gpu/drm/i915/i915_gem.c            | 316 +++++++++++++++++++++--------
> >  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
> >  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
> >  drivers/gpu/drm/i915/i915_gem_stolen.c     |   6 +-
> >  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
> >  drivers/gpu/drm/i915/i915_irq.c            |   6 +-
> >  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
> >  drivers/gpu/drm/i915/intel_fb.c            |   1 -
> >  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
> >  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
> >  16 files changed, 468 insertions(+), 239 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 16b2aaf..867ed07 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> >  		seq_printf(m, " (pinned x %d)", obj->pin_count);
> >  	if (obj->fence_reg != I915_FENCE_REG_NONE)
> >  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> > -	if (i915_gem_obj_ggtt_bound(obj))
> > -		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> > -			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> > +	if (i915_gem_obj_bound_any(obj)) {
> 
> list_for_each will short-circuit already, so this is redundant.
> 

Got it.

> > +		struct i915_vma *vma;
> > +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> > +			if (!i915_is_ggtt(vma->vm))
> > +				seq_puts(m, " (pp");
> > +			else
> > +				seq_puts(m, " (g");
> > +			seq_printf(m, " gtt offset: %08lx, size: %08lx)",
> 
>                                        ^ that space looks superfluous now
Got it.

> 
> > +				   i915_gem_obj_offset(obj, vma->vm),
> > +				   i915_gem_obj_size(obj, vma->vm));
> > +		}
> > +	}
> >  	if (obj->stolen)
> >  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
> >  	if (obj->pin_mappable || obj->fault_mappable) {
> > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	return 0;
> >  }
> >  
> > +/* FIXME: Support multiple VM? */
> >  #define count_objects(list, member) do { \
> >  	list_for_each_entry(obj, list, member) { \
> >  		size += i915_gem_obj_ggtt_size(obj); \
> > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val)
> >  
> >  	if (val & DROP_BOUND) {
> >  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > -					 mm_list)
> > -			if (obj->pin_count == 0) {
> > -				ret = i915_gem_object_unbind(obj);
> > -				if (ret)
> > -					goto unlock;
> > -			}
> > +					 mm_list) {
> > +			if (obj->pin_count)
> > +				continue;
> > +
> > +			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> > +			if (ret)
> > +				goto unlock;
> > +		}
> >  	}
> >  
> >  	if (val & DROP_UNBOUND) {
> >  		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
> >  					 global_list)
> >  			if (obj->pages_pin_count == 0) {
> > +				/* FIXME: Do this for all vms? */
> >  				ret = i915_gem_object_put_pages(obj);
> >  				if (ret)
> >  					goto unlock;
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index d13e21f..b190439 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  
> >  	i915_dump_device_info(dev_priv);
> >  
> > -	INIT_LIST_HEAD(&dev_priv->vm_list);
> > -	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> > -	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> > -
> >  	if (i915_get_bridge_dev(dev)) {
> >  		ret = -EIO;
> >  		goto free_priv;
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 38cccc8..48baccc 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object {
> >  
> >  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
> >  
> > -/* This is a temporary define to help transition us to real VMAs. If you see
> > - * this, you're either reviewing code, or bisecting it. */
> > -static inline struct i915_vma *
> > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> > -{
> > -	if (list_empty(&obj->vma_list))
> > -		return NULL;
> > -	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> > -}
> > -
> > -/* Whether or not this object is currently mapped by the translation tables */
> > -static inline bool
> > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> > -{
> > -	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> > -	if (vma == NULL)
> > -		return false;
> > -	return drm_mm_node_allocated(&vma->node);
> > -}
> > -
> > -/* Offset of the first PTE pointing to this object */
> > -static inline unsigned long
> > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> > -{
> > -	BUG_ON(list_empty(&o->vma_list));
> > -	return __i915_gem_obj_to_vma(o)->node.start;
> > -}
> > -
> > -/* The size used in the translation tables may be larger than the actual size of
> > - * the object on GEN2/GEN3 because of the way tiling is handled. See
> > - * i915_gem_get_gtt_size() for more details.
> > - */
> > -static inline unsigned long
> > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> > -{
> > -	BUG_ON(list_empty(&o->vma_list));
> > -	return __i915_gem_obj_to_vma(o)->node.size;
> > -}
> > -
> > -static inline void
> > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> > -			    enum i915_cache_level color)
> > -{
> > -	__i915_gem_obj_to_vma(o)->node.color = color;
> > -}
> > -
> >  /**
> >   * Request queue structure.
> >   *
> > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  void i915_gem_vma_destroy(struct i915_vma *vma);
> >  
> >  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > +				     struct i915_address_space *vm,
> >  				     uint32_t alignment,
> >  				     bool map_and_fenceable,
> >  				     bool nonblocking);
> >  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > +					struct i915_address_space *vm);
> >  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
> >  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
> >  void i915_gem_lastclose(struct drm_device *dev);
> > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> >  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> >  			 struct intel_ring_buffer *to);
> >  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm,
> >  				    struct intel_ring_buffer *ring);
> >  
> >  int i915_gem_dumb_create(struct drm_file *file_priv,
> > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
> >  			    int tiling_mode, bool fenced);
> >  
> >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm,
> >  				    enum i915_cache_level cache_level);
> >  
> >  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
> >  
> >  void i915_gem_restore_fences(struct drm_device *dev);
> >  
> > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > +				  struct i915_address_space *vm);
> > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > +			struct i915_address_space *vm);
> > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > +				struct i915_address_space *vm);
> > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > +			    struct i915_address_space *vm,
> > +			    enum i915_cache_level color);
> > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > +				     struct i915_address_space *vm);
> > +/* Some GGTT VM helpers */
> > +#define obj_to_ggtt(obj) \
> > +	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> > +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> > +{
> > +	struct i915_address_space *ggtt =
> > +		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> > +	return vm == ggtt;
> > +}
> > +
> > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> > +{
> > +	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> > +}
> > +
> > +static inline unsigned long
> > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> > +{
> > +	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> > +}
> > +
> > +static inline unsigned long
> > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> > +{
> > +	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> > +}
> > +
> > +static inline int __must_check
> > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> > +		  uint32_t alignment,
> > +		  bool map_and_fenceable,
> > +		  bool nonblocking)
> > +{
> > +	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> > +				   map_and_fenceable, nonblocking);
> > +}
> > +#undef obj_to_ggtt
> > +
> >  /* i915_gem_context.c */
> >  void i915_gem_context_init(struct drm_device *dev);
> >  void i915_gem_context_fini(struct drm_device *dev);
> > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  
> >  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
> >  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> > +/* FIXME: this is never okay with full PPGTT */
> >  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> >  				enum i915_cache_level cache_level);
> >  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
> >  
> >  
> >  /* i915_gem_evict.c */
> > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> > +int __must_check i915_gem_evict_something(struct drm_device *dev,
> > +					  struct i915_address_space *vm,
> > +					  int min_size,
> >  					  unsigned alignment,
> >  					  unsigned cache_level,
> >  					  bool mappable,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 058ad44..21015cd 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -38,10 +38,12 @@
> >  
> >  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
> >  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > -						    unsigned alignment,
> > -						    bool map_and_fenceable,
> > -						    bool nonblocking);
> > +static __must_check int
> > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > +			    struct i915_address_space *vm,
> > +			    unsigned alignment,
> > +			    bool map_and_fenceable,
> > +			    bool nonblocking);
> >  static int i915_gem_phys_pwrite(struct drm_device *dev,
> >  				struct drm_i915_gem_object *obj,
> >  				struct drm_i915_gem_pwrite *args,
> > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> >  static inline bool
> >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> >  {
> > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> >  }
> >  
> >  int
> > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> >  		 * anyway again before the next pread happens. */
> >  		if (obj->cache_level == I915_CACHE_NONE)
> >  			needs_clflush = 1;
> > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > +		if (i915_gem_obj_bound_any(obj)) {
> >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> 
> This is essentially a very convoluted version of "if there's gpu rendering
> outstanding, please wait for it". Maybe we should switch this to
> 
> 	if (obj->active)
> 		wait_rendering(obj, true);
> 
> Same for the shmem_pwrite case below. Would be a separate patch to prep
> things though. Can I volunteer you for that? The ugly part is to review
> whether any of the lru list updating that set_domain does in addition to
> wait_rendering is required, but on a quick read that's not the case.

Just reading the comment above, it says we need the clflush. I don't
actually understand why we do that even after reading the comment, but
meh. You tell me; I don't mind doing this as a prep first.
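
As a sketch of the prep Daniel suggests, the pread path's set_to_gtt_domain()
call would become something like the following (assuming his wait_rendering()
refers to the driver's existing i915_gem_object_wait_rendering(obj, readonly)
helper; the pwrite path would pass readonly = false):

	if (obj->active) {
		ret = i915_gem_object_wait_rendering(obj, true);
		if (ret)
			return ret;
	}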

> 
> >  			if (ret)
> >  				return ret;
> > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> >  	char __user *user_data;
> >  	int page_offset, page_length, ret;
> >  
> > -	ret = i915_gem_object_pin(obj, 0, true, true);
> > +	ret = i915_gem_ggtt_pin(obj, 0, true, true);
> >  	if (ret)
> >  		goto out;
> >  
> > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
> >  		 * right away and we therefore have to clflush anyway. */
> >  		if (obj->cache_level == I915_CACHE_NONE)
> >  			needs_clflush_after = 1;
> > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > +		if (i915_gem_obj_bound_any(obj)) {
> 
> ... see above.
> >  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
> >  			if (ret)
> >  				return ret;
> > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	}
> >  
> >  	/* Now bind it into the GTT if needed */
> > -	ret = i915_gem_object_pin(obj, 0, true, false);
> > +	ret = i915_gem_ggtt_pin(obj,  0, true, false);
> >  	if (ret)
> >  		goto unlock;
> >  
> > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> >  	if (obj->pages == NULL)
> >  		return 0;
> >  
> > -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> > -
> >  	if (obj->pages_pin_count)
> >  		return -EBUSY;
> >  
> > +	BUG_ON(i915_gem_obj_bound_any(obj));
> > +
> >  	/* ->put_pages might need to allocate memory for the bit17 swizzle
> >  	 * array, hence protect them from being reaped by removing them from gtt
> >  	 * lists early. */
> > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> >  		  bool purgeable_only)
> >  {
> >  	struct drm_i915_gem_object *obj, *next;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	long count = 0;
> >  
> >  	list_for_each_entry_safe(obj, next,
> > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> >  		}
> >  	}
> >  
> > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> > -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> > -		    i915_gem_object_unbind(obj) == 0 &&
> > -		    i915_gem_object_put_pages(obj) == 0) {
> > +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> > +				 global_list) {
> > +		struct i915_vma *vma, *v;
> > +
> > +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> > +			continue;
> > +
> > +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> > +			if (i915_gem_object_unbind(obj, vma->vm))
> > +				break;
> > +
> > +		if (!i915_gem_object_put_pages(obj))
> >  			count += obj->base.size >> PAGE_SHIFT;
> > -			if (count >= target)
> > -				return count;
> > -		}
> > +
> > +		if (count >= target)
> > +			return count;
> >  	}
> >  
> >  	return count;
> > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> >  
> >  void
> >  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > +			       struct i915_address_space *vm,
> >  			       struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	u32 seqno = intel_ring_get_seqno(ring);
> >  
> >  	BUG_ON(ring == NULL);
> > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  }
> >  
> >  static void
> > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > +				 struct i915_address_space *vm)
> >  {
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > -
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> >  	BUG_ON(!obj->active);
> >  
> > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> >  	spin_unlock(&file_priv->mm.lock);
> >  }
> >  
> > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm)
> >  {
> > -	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> > -	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> > +	if (acthd >= i915_gem_obj_offset(obj, vm) &&
> > +	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
> >  		return true;
> >  
> >  	return false;
> > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
> >  	return false;
> >  }
> >  
> > +static struct i915_address_space *
> > +request_to_vm(struct drm_i915_gem_request *request)
> > +{
> > +	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> > +	struct i915_address_space *vm;
> > +
> > +	vm = &dev_priv->gtt.base;
> > +
> > +	return vm;
> > +}
> > +
> >  static bool i915_request_guilty(struct drm_i915_gem_request *request,
> >  				const u32 acthd, bool *inside)
> >  {
> > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
> >  	 * pointing inside the ring, matches the batch_obj address range.
> >  	 * However this is extremely unlikely.
> >  	 */
> > -
> >  	if (request->batch_obj) {
> > -		if (i915_head_inside_object(acthd, request->batch_obj)) {
> > +		if (i915_head_inside_object(acthd, request->batch_obj,
> > +					    request_to_vm(request))) {
> >  			*inside = true;
> >  			return true;
> >  		}
> > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
> >  {
> >  	struct i915_ctx_hang_stats *hs = NULL;
> >  	bool inside, guilty;
> > +	unsigned long offset = 0;
> >  
> >  	/* Innocent until proven guilty */
> >  	guilty = false;
> >  
> > +	if (request->batch_obj)
> > +		offset = i915_gem_obj_offset(request->batch_obj,
> > +					     request_to_vm(request));
> > +
> >  	if (ring->hangcheck.action != wait &&
> >  	    i915_request_guilty(request, acthd, &inside)) {
> >  		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
> >  			  ring->name,
> >  			  inside ? "inside" : "flushing",
> > -			  request->batch_obj ?
> > -			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> > +			  offset,
> >  			  request->ctx ? request->ctx->id : 0,
> >  			  acthd);
> >  
> > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	while (!list_empty(&ring->active_list)) {
> > +		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> >  				       struct drm_i915_gem_object,
> >  				       ring_list);
> >  
> > -		i915_gem_object_move_to_inactive(obj);
> > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +			i915_gem_object_move_to_inactive(obj, vm);
> >  	}
> >  }
> >  
> > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
> >  void i915_gem_reset(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj;
> >  	struct intel_ring_buffer *ring;
> >  	int i;
> > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev)
> >  	/* Move everything out of the GPU domains to ensure we do any
> >  	 * necessary invalidation upon reuse.
> >  	 */
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> >  
> >  	i915_gem_restore_fences(dev);
> >  }
> > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> >  	 */
> >  	while (!list_empty(&ring->active_list)) {
> > +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > +		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> >  			break;
> >  
> > -		i915_gem_object_move_to_inactive(obj);
> > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +			i915_gem_object_move_to_inactive(obj, vm);
> >  	}
> >  
> >  	if (unlikely(ring->trace_irq_seqno &&
> > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> >   * Unbinds an object from the GTT aperture.
> >   */
> >  int
> > -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> > +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > +		       struct i915_address_space *vm)
> >  {
> >  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> >  	struct i915_vma *vma;
> >  	int ret;
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound(obj, vm))
> >  		return 0;
> >  
> >  	if (obj->pin_count)
> > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  	if (ret)
> >  		return ret;
> >  
> > -	trace_i915_gem_object_unbind(obj);
> > +	trace_i915_gem_object_unbind(obj, vm);
> >  
> >  	if (obj->has_global_gtt_mapping)
> >  		i915_gem_gtt_unbind_object(obj);
> > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  	/* Avoid an unnecessary call to unbind on rebind. */
> >  	obj->map_and_fenceable = true;
> >  
> > -	vma = __i915_gem_obj_to_vma(obj);
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> >  	list_del(&vma->vma_link);
> >  	drm_mm_remove_node(&vma->node);
> >  	i915_gem_vma_destroy(vma);
> > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
> >  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
> >  		     i915_gem_obj_ggtt_offset(obj), size);
> >  
> > +
> >  		pitch_val = obj->stride / 128;
> >  		pitch_val = ffs(pitch_val) - 1;
> >  
> > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> >   */
> >  static int
> >  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > +			    struct i915_address_space *vm,
> >  			    unsigned alignment,
> >  			    bool map_and_fenceable,
> >  			    bool nonblocking)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> >  	bool mappable, fenceable;
> > -	size_t gtt_max = map_and_fenceable ?
> > -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> > +	size_t gtt_max =
> > +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
> >  	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	if (WARN_ON(!list_empty(&obj->vma_list)))
> >  		return -EBUSY;
> >  
> > +	BUG_ON(!i915_is_ggtt(vm));
> > +
> >  	fence_size = i915_gem_get_gtt_size(dev,
> >  					   obj->base.size,
> >  					   obj->tiling_mode);
> > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  	i915_gem_object_pin_pages(obj);
> >  
> >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > +	/* For now we only ever use 1 vma per object */
> > +	WARN_ON(!list_empty(&obj->vma_list));
> > +
> > +	vma = i915_gem_vma_create(obj, vm);
> >  	if (vma == NULL) {
> >  		i915_gem_object_unpin_pages(obj);
> >  		return -ENOMEM;
> >  	}
> >  
> >  search_free:
> > -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > -						  &vma->node,
> > +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
> >  						  size, alignment,
> >  						  obj->cache_level, 0, gtt_max);
> >  	if (ret) {
> > -		ret = i915_gem_evict_something(dev, size, alignment,
> > +		ret = i915_gem_evict_something(dev, vm, size, alignment,
> >  					       obj->cache_level,
> >  					       map_and_fenceable,
> >  					       nonblocking);
> > @@ -3162,18 +3197,25 @@ search_free:
> >  
> >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > -	list_add(&vma->vma_link, &obj->vma_list);
> > +
> > +	/* Keep GGTT vmas first to make debug easier */
> > +	if (i915_is_ggtt(vm))
> > +		list_add(&vma->vma_link, &obj->vma_list);
> > +	else
> > +		list_add_tail(&vma->vma_link, &obj->vma_list);
> >  
> >  	fenceable =
> > +		i915_is_ggtt(vm) &&
> >  		i915_gem_obj_ggtt_size(obj) == fence_size &&
> >  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
> >  
> > -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> > -		dev_priv->gtt.mappable_end;
> > +	mappable =
> > +		i915_is_ggtt(vm) &&
> > +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
> >  
> >  	obj->map_and_fenceable = mappable && fenceable;
> >  
> > -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> > +	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
> >  	i915_gem_verify_gtt(dev);
> >  	return 0;
> >  }
> > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  	int ret;
> >  
> >  	/* Not valid to be called on unbound objects. */
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound_any(obj))
> >  		return -EINVAL;
> 
> If we're converting the shmem paths over to wait_rendering then there's
> only the fault handler and the set_domain ioctl left. For the latter it
> would make sense to clflush even when an object is on the unbound list, to
> allow userspace to optimize when the clflushing happens. But that would
> only make sense in conjunction with Chris' create2 ioctl and a flag to
> preallocate the storage (and so putting the object onto the unbound list).
> So nothing to do here.
> 
> >  
> >  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  }
> >  
> >  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > +				    struct i915_address_space *vm,
> >  				    enum i915_cache_level cache_level)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> >  	int ret;
> >  
> >  	if (obj->cache_level == cache_level)
> > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > -		ret = i915_gem_object_unbind(obj);
> > +		ret = i915_gem_object_unbind(obj, vm);
> >  		if (ret)
> >  			return ret;
> >  	}
> >  
> > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		if (!i915_gem_obj_bound(obj, vm))
> > +			continue;
> 
> Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> on?
> 
> Self-correction: It exists already ... why can't we use this here?

Yes. That should work; I'll fix it and test it. It looks slightly worse
IMO in terms of code clarity, but I don't mind the change.
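
Roughly, I picture the rebind loop walking the object's own vma list
(untested sketch; identifiers as added in this patch):

	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		ret = i915_gem_object_finish_gpu(obj);
		if (ret)
			return ret;

		i915_gem_object_finish_gtt(obj);

		/* gtt/ppgtt rebinding exactly as in the hunk above */

		vma->node.color = cache_level;
	}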

> 
> > +
> >  		ret = i915_gem_object_finish_gpu(obj);
> >  		if (ret)
> >  			return ret;
> > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> >  					       obj, cache_level);
> >  
> > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > +		i915_gem_obj_set_color(obj, vm, cache_level);
> >  	}
> >  
> >  	if (cache_level == I915_CACHE_NONE) {
> > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> >  			       struct drm_file *file)
> >  {
> >  	struct drm_i915_gem_caching *args = data;
> > +	struct drm_i915_private *dev_priv;
> >  	struct drm_i915_gem_object *obj;
> >  	enum i915_cache_level level;
> >  	int ret;
> > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> >  		ret = -ENOENT;
> >  		goto unlock;
> >  	}
> > +	dev_priv = obj->base.dev->dev_private;
> >  
> > -	ret = i915_gem_object_set_cache_level(obj, level);
> > +	/* FIXME: Add interface for specific VM? */
> > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> >  
> >  	drm_gem_object_unreference(&obj->base);
> >  unlock:
> > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  				     u32 alignment,
> >  				     struct intel_ring_buffer *pipelined)
> >  {
> > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> >  	u32 old_read_domains, old_write_domain;
> >  	int ret;
> >  
> > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> >  	 */
> > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > +					      I915_CACHE_NONE);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> >  	 * always use map_and_fenceable for all scanout buffers.
> >  	 */
> > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> >  
> >  int
> >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > +		    struct i915_address_space *vm,
> >  		    uint32_t alignment,
> >  		    bool map_and_fenceable,
> >  		    bool nonblocking)
> > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> >  		return -EBUSY;
> >  
> > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> 
> WARN_ON, since presumably we can keep on going if we get this wrong
> (albeit with slightly corrupted state, so render corruptions might
> follow).

Can we make a deal: can we leave this as a BUG_ON, with a FIXME to convert
it at the end of merging?
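
Concretely, the deal would look like this (sketch):

	/* FIXME: convert to WARN_ON once the PPGTT merge has settled */
	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));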

> 
> > +
> > +	if (i915_gem_obj_bound(obj, vm)) {
> > +		if ((alignment &&
> > +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> >  			WARN(obj->pin_count,
> >  			     "bo is already pinned with incorrect alignment:"
> >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> >  			     " obj->map_and_fenceable=%d\n",
> > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > +			     i915_gem_obj_offset(obj, vm), alignment,
> >  			     map_and_fenceable,
> >  			     obj->map_and_fenceable);
> > -			ret = i915_gem_object_unbind(obj);
> > +			ret = i915_gem_object_unbind(obj, vm);
> >  			if (ret)
> >  				return ret;
> >  		}
> >  	}
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj)) {
> > +	if (!i915_gem_obj_bound(obj, vm)) {
> >  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> >  
> > -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> > +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
> >  						  map_and_fenceable,
> >  						  nonblocking);
> >  		if (ret)
> > @@ -3684,7 +3739,7 @@ void
> >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> >  {
> >  	BUG_ON(obj->pin_count == 0);
> > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> >  
> >  	if (--obj->pin_count == 0)
> >  		obj->pin_mappable = false;
> > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
> >  	}
> >  
> >  	if (obj->user_pin_count == 0) {
> > -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> > +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
> >  		if (ret)
> >  			goto out;
> >  	}
> > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> >  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> >  	struct drm_device *dev = obj->base.dev;
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	struct i915_vma *vma, *next;
> >  
> >  	trace_i915_gem_object_destroy(obj);
> >  
> > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> >  		i915_gem_detach_phys_object(dev, obj);
> >  
> >  	obj->pin_count = 0;
> > -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> > -		bool was_interruptible;
> > +	/* NB: 0 or 1 elements */
> > +	WARN_ON(!list_empty(&obj->vma_list) &&
> > +		!list_is_singular(&obj->vma_list));
> > +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > +		int ret = i915_gem_object_unbind(obj, vma->vm);
> > +		if (WARN_ON(ret == -ERESTARTSYS)) {
> > +			bool was_interruptible;
> >  
> > -		was_interruptible = dev_priv->mm.interruptible;
> > -		dev_priv->mm.interruptible = false;
> > +			was_interruptible = dev_priv->mm.interruptible;
> > +			dev_priv->mm.interruptible = false;
> >  
> > -		WARN_ON(i915_gem_object_unbind(obj));
> > +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
> >  
> > -		dev_priv->mm.interruptible = was_interruptible;
> > +			dev_priv->mm.interruptible = was_interruptible;
> > +		}
> >  	}
> >  
> >  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
> >  	INIT_LIST_HEAD(&ring->request_list);
> >  }
> >  
> > +static void i915_init_vm(struct drm_i915_private *dev_priv,
> > +			 struct i915_address_space *vm)
> > +{
> > +	vm->dev = dev_priv->dev;
> > +	INIT_LIST_HEAD(&vm->active_list);
> > +	INIT_LIST_HEAD(&vm->inactive_list);
> > +	INIT_LIST_HEAD(&vm->global_link);
> > +	list_add(&vm->global_link, &dev_priv->vm_list);
> > +}
> > +
> >  void
> >  i915_gem_load(struct drm_device *dev)
> >  {
> > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
> >  				  SLAB_HWCACHE_ALIGN,
> >  				  NULL);
> >  
> > -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> > -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> > +	INIT_LIST_HEAD(&dev_priv->vm_list);
> > +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> > +
> >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  			     struct drm_i915_private,
> >  			     mm.inactive_shrinker);
> >  	struct drm_device *dev = dev_priv->dev;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj;
> > -	int nr_to_scan = sc->nr_to_scan;
> > +	int nr_to_scan;
> >  	bool unlock = true;
> >  	int cnt;
> >  
> > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  		unlock = false;
> >  	}
> >  
> > +	nr_to_scan = sc->nr_to_scan;
> >  	if (nr_to_scan) {
> >  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
> >  		if (nr_to_scan > 0)
> > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> >  		if (obj->pages_pin_count == 0)
> >  			cnt += obj->base.size >> PAGE_SHIFT;
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > -			cnt += obj->base.size >> PAGE_SHIFT;
> > +
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > +				cnt += obj->base.size >> PAGE_SHIFT;
> 
> Isn't this now double-counting objects? In the shrinker we only care about
> how much physical RAM an object occupies, not how much virtual space it
> occupies. So just walking the bound list of objects here should be good
> enough ...
> 

Maybe I've misunderstood you. My code is wrong, but I think your idea
requires a prep patch because it changes functionality, right?

So let me know if I've understood you.
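
If I've understood you, the inactive part of the count would instead come
straight from the bound list, roughly (untested):

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;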

> >  
> >  	if (unlock)
> >  		mutex_unlock(&dev->struct_mutex);
> >  	return cnt;
> >  }
> > +
> > +/* All the new VM stuff */
> > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > +				  struct i915_address_space *vm)
> > +{
> > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > +	struct i915_vma *vma;
> > +
> > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > +		vm = &dev_priv->gtt.base;
> > +
> > +	BUG_ON(list_empty(&o->vma_list));
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> 
> Imo the vma list walking here and in the other helpers below indicates
> that we should deal more often in vmas instead of (object, vm) pairs. Or
> is this again something that'll get fixed later on?
> 
> I just want to avoid diff churn, and it also makes reviewing easier if the
> foreshadowing is correct ;-) So generally I'd vote for more liberal
> sprinkling of obj_to_vma in callers.

It's not something I fixed throughout the series. I think it makes sense
conceptually to keep some things as <obj,vm> pairs and others as direct vmas.

If you want me to change something, you need to be more specific since
no action specifically comes to mind at this point in the series.
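
For callers that already hold a vma, the shape I'd picture is a tiny
vma-first helper (hypothetical name, sketch only):

	static inline unsigned long i915_vma_offset(struct i915_vma *vma)
	{
		return vma->node.start;
	}

i.e. do the i915_gem_obj_to_vma() lookup once in the caller and reuse the
vma, instead of repeating the (obj, vm) list walk in every helper.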

> 
> > +		if (vma->vm == vm)
> > +			return vma->node.start;
> > +
> > +	}
> > +	return -1;
> > +}
> > +
> > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> > +{
> > +	return !list_empty(&o->vma_list);
> > +}
> > +
> > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > +			struct i915_address_space *vm)
> > +{
> > +	struct i915_vma *vma;
> > +
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > +		if (vma->vm == vm)
> > +			return true;
> > +	}
> > +	return false;
> > +}
> > +
> > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > +				struct i915_address_space *vm)
> > +{
> > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > +	struct i915_vma *vma;
> > +
> > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > +		vm = &dev_priv->gtt.base;
> > +	BUG_ON(list_empty(&o->vma_list));
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > +		if (vma->vm == vm)
> > +			return vma->node.size;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > +			    struct i915_address_space *vm,
> > +			    enum i915_cache_level color)
> > +{
> > +	struct i915_vma *vma;
> > +	BUG_ON(list_empty(&o->vma_list));
> > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > +		if (vma->vm == vm) {
> > +			vma->node.color = color;
> > +			return;
> > +		}
> > +	}
> > +
> > +	WARN(1, "Couldn't set color for VM %p\n", vm);
> > +}
> > +
> > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > +				     struct i915_address_space *vm)
> > +{
> > +	struct i915_vma *vma;
> > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > +		if (vma->vm == vm)
> > +			return vma;
> > +
> > +	return NULL;
> > +}
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 2074544..c92fd81 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
> >  
> >  	if (INTEL_INFO(dev)->gen >= 7) {
> >  		ret = i915_gem_object_set_cache_level(ctx->obj,
> > +						      &dev_priv->gtt.base,
> >  						      I915_CACHE_LLC_MLC);
> >  		/* Failure shouldn't ever happen this early */
> >  		if (WARN_ON(ret))
> > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> >  	 * default context.
> >  	 */
> >  	dev_priv->ring[RCS].default_context = ctx;
> > -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> >  	if (ret) {
> >  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> >  		goto err_destroy;
> > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
> >  static int do_switch(struct i915_hw_context *to)
> >  {
> >  	struct intel_ring_buffer *ring = to->ring;
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	struct i915_hw_context *from = ring->last_context;
> >  	u32 hw_flags = 0;
> >  	int ret;
> > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
> >  	if (from == to)
> >  		return 0;
> >  
> > -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> > +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
> >  	 */
> >  	if (from != NULL) {
> >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > -		i915_gem_object_move_to_active(from->obj, ring);
> > +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> > +					       ring);
> >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> >  		 * whole damn pipeline, we don't need to explicitly mark the
> >  		 * object dirty. The only exception is that the context must be
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index df61f33..32efdc0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -32,24 +32,21 @@
> >  #include "i915_trace.h"
> >  
> >  static bool
> > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> >  {
> > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > -
> > -	if (obj->pin_count)
> > +	if (vma->obj->pin_count)
> >  		return false;
> >  
> > -	list_add(&obj->exec_list, unwind);
> > +	list_add(&vma->obj->exec_list, unwind);
> >  	return drm_mm_scan_add_block(&vma->node);
> >  }
> >  
> >  int
> > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > -			 unsigned alignment, unsigned cache_level,
> > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > +			 int min_size, unsigned alignment, unsigned cache_level,
> >  			 bool mappable, bool nonblocking)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	struct list_head eviction_list, unwind_list;
> >  	struct i915_vma *vma;
> >  	struct drm_i915_gem_object *obj;
> > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  	 */
> >  
> >  	INIT_LIST_HEAD(&unwind_list);
> > -	if (mappable)
> > +	if (mappable) {
> > +		BUG_ON(!i915_is_ggtt(vm));
> >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> >  					    alignment, cache_level, 0,
> >  					    dev_priv->gtt.mappable_end);
> > -	else
> > +	} else
> >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> >  
> >  	/* First see if there is a large enough contiguous idle region... */
> >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > -		if (mark_free(obj, &unwind_list))
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> >  
> > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  
> >  	/* Now merge in the soon-to-be-expired objects... */
> >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		if (mark_free(obj, &unwind_list))
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> >  
> > @@ -109,7 +109,7 @@ none:
> >  		obj = list_first_entry(&unwind_list,
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> > -		vma = __i915_gem_obj_to_vma(obj);
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  		ret = drm_mm_scan_remove_block(&vma->node);
> >  		BUG_ON(ret);
> >  
> > @@ -130,7 +130,7 @@ found:
> >  		obj = list_first_entry(&unwind_list,
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> > -		vma = __i915_gem_obj_to_vma(obj);
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  		if (drm_mm_scan_remove_block(&vma->node)) {
> >  			list_move(&obj->exec_list, &eviction_list);
> >  			drm_gem_object_reference(&obj->base);
> > @@ -145,7 +145,7 @@ found:
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> >  		if (ret == 0)
> > -			ret = i915_gem_object_unbind(obj);
> > +			ret = i915_gem_object_unbind(obj, vm);
> >  
> >  		list_del_init(&obj->exec_list);
> >  		drm_gem_object_unreference(&obj->base);
> > @@ -158,13 +158,18 @@ int
> >  i915_gem_evict_everything(struct drm_device *dev)
> 
> I suspect evict_everything eventually wants an address_space *vm argument
> for those cases where we only want to evict everything in a given vm. Atm
> we have two use-cases of this:
> - Called from the shrinker as a last-ditch effort. For that it should move
>   _every_ object onto the unbound list.
> - Called from execbuf for badly-fragmented address spaces to clean up the
>   mess. For that case we only care about one address space.

The current thing is more or less a result of Chris' suggestions. A
non-posted iteration did plumb the vm, and after reworking to the
suggestion made by Chris, the vm didn't make much sense anymore.

For point #1, it requires VM prioritization I think. I don't really see
any other way to fairly manage it.

For point #2, I agree it might be useful, but we can easily create
a new function (not the shrinker) to do it.
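
For the execbuf case, that new function could look roughly like this
(untested sketch, name made up; assumes the caller has already idled the
GPU and retired requests the way evict_everything does):

	static int i915_gem_evict_vm(struct i915_address_space *vm)
	{
		struct drm_i915_gem_object *obj, *next;

		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
			if (obj->pin_count == 0)
				WARN_ON(i915_gem_object_unbind(obj, vm));

		return 0;
	}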


> 
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj, *next;
> > -	bool lists_empty;
> > +	bool lists_empty = true;
> >  	int ret;
> >  
> > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > -		       list_empty(&vm->active_list));
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > +			       list_empty(&vm->active_list));
> > +		if (!lists_empty)
> > +			lists_empty = false;
> > +	}
> > +
> >  	if (lists_empty)
> >  		return -ENOSPC;
> >  
> > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  	i915_gem_retire_requests(dev);
> >  
> >  	/* Having flushed everything, unbind() should never raise an error */
> > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > -		if (obj->pin_count == 0)
> > -			WARN_ON(i915_gem_object_unbind(obj));
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > +			if (obj->pin_count == 0)
> > +				WARN_ON(i915_gem_object_unbind(obj, vm));
> > +	}
> >  
> >  	return 0;
> >  }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 5aeb447..e90182d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
> >  }
> >  
> >  static void
> > -eb_destroy(struct eb_objects *eb)
> > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
> >  {
> >  	while (!list_empty(&eb->objects)) {
> >  		struct drm_i915_gem_object *obj;
> > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
> >  static int
> >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  				   struct eb_objects *eb,
> > -				   struct drm_i915_gem_relocation_entry *reloc)
> > +				   struct drm_i915_gem_relocation_entry *reloc,
> > +				   struct i915_address_space *vm)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_gem_object *target_obj;
> > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  
> >  static int
> >  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > -				    struct eb_objects *eb)
> > +				    struct eb_objects *eb,
> > +				    struct i915_address_space *vm)
> >  {
> >  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
> >  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> >  		do {
> >  			u64 offset = r->presumed_offset;
> >  
> > -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> > +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> > +								 vm);
> >  			if (ret)
> >  				return ret;
> >  
> > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> >  static int
> >  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> >  					 struct eb_objects *eb,
> > -					 struct drm_i915_gem_relocation_entry *relocs)
> > +					 struct drm_i915_gem_relocation_entry *relocs,
> > +					 struct i915_address_space *vm)
> >  {
> >  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> >  	int i, ret;
> >  
> >  	for (i = 0; i < entry->relocation_count; i++) {
> > -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> > +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> > +							 vm);
> >  		if (ret)
> >  			return ret;
> >  	}
> > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> >  }
> >  
> >  static int
> > -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> > +			     struct i915_address_space *vm)
> >  {
> >  	struct drm_i915_gem_object *obj;
> >  	int ret = 0;
> > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
> >  	 */
> >  	pagefault_disable();
> >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> > +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> >  		if (ret)
> >  			break;
> >  	}
> > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
> >  static int
> >  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  				   struct intel_ring_buffer *ring,
> > +				   struct i915_address_space *vm,
> >  				   bool *need_reloc)
> >  {
> >  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  		obj->tiling_mode != I915_TILING_NONE;
> >  	need_mappable = need_fence || need_reloc_mappable(obj);
> >  
> > -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> > +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> > +				  false);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  		obj->has_aliasing_ppgtt_mapping = 1;
> >  	}
> >  
> > -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> > -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> > +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> > +		entry->offset = i915_gem_obj_offset(obj, vm);
> >  		*need_reloc = true;
> >  	}
> >  
> > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> >  {
> >  	struct drm_i915_gem_exec_object2 *entry;
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound_any(obj))
> >  		return;
> >  
> >  	entry = obj->exec_entry;
> > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> >  static int
> >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> >  			    struct list_head *objects,
> > +			    struct i915_address_space *vm,
> >  			    bool *need_relocs)
> >  {
> >  	struct drm_i915_gem_object *obj;
> > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> >  		list_for_each_entry(obj, objects, exec_list) {
> >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> >  			bool need_fence, need_mappable;
> > +			u32 obj_offset;
> >  
> > -			if (!i915_gem_obj_ggtt_bound(obj))
> > +			if (!i915_gem_obj_bound(obj, vm))
> >  				continue;
> 
> I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> here ... Maybe we should cache them in some pointer somewhere (either in
> the eb object or by adding a new pointer to the object struct, e.g.
> obj->eb_vma, similar to obj->eb_list).
> 

I agree, and even did this in one unposted patch too. However, I think
it's a premature optimization which risks code correctness. So I think
a FIXME somewhere is needed to address that issue. (Or we can revisit it
if Chris complains bitterly about some perf hit).
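
The shape would be something like stashing the vma next to the exec entry
at reserve time (exec_vma is a hypothetical field, sketch only):

	/* in i915_gem_execbuffer_reserve_object(), once the pin succeeded */
	obj->exec_vma = i915_gem_obj_to_vma(obj, vm);

	/* later users then read the cached pointer instead of re-walking */
	entry->offset = obj->exec_vma->node.start;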

> >  
> > +			obj_offset = i915_gem_obj_offset(obj, vm);
> >  			need_fence =
> >  				has_fenced_gpu_access &&
> >  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> >  				obj->tiling_mode != I915_TILING_NONE;
> >  			need_mappable = need_fence || need_reloc_mappable(obj);
> >  
> > +			BUG_ON((need_mappable || need_fence) &&
> > +			       !i915_is_ggtt(vm));
> > +
> >  			if ((entry->alignment &&
> > -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> > +			     obj_offset & (entry->alignment - 1)) ||
> >  			    (need_mappable && !obj->map_and_fenceable))
> > -				ret = i915_gem_object_unbind(obj);
> > +				ret = i915_gem_object_unbind(obj, vm);
> >  			else
> > -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> >  			if (ret)
> >  				goto err;
> >  		}
> >  
> >  		/* Bind fresh objects */
> >  		list_for_each_entry(obj, objects, exec_list) {
> > -			if (i915_gem_obj_ggtt_bound(obj))
> > +			if (i915_gem_obj_bound(obj, vm))
> >  				continue;
> >  
> > -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> >  			if (ret)
> >  				goto err;
> >  		}
> > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> >  				  struct drm_file *file,
> >  				  struct intel_ring_buffer *ring,
> >  				  struct eb_objects *eb,
> > -				  struct drm_i915_gem_exec_object2 *exec)
> > +				  struct drm_i915_gem_exec_object2 *exec,
> > +				  struct i915_address_space *vm)
> >  {
> >  	struct drm_i915_gem_relocation_entry *reloc;
> >  	struct drm_i915_gem_object *obj;
> > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> >  		goto err;
> >  
> >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> >  	if (ret)
> >  		goto err;
> >  
> >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> >  		int offset = obj->exec_entry - exec;
> >  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> > -							       reloc + reloc_offset[offset]);
> > +							       reloc + reloc_offset[offset],
> > +							       vm);
> >  		if (ret)
> >  			goto err;
> >  	}
> > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
> >  
> >  static void
> >  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > +				   struct i915_address_space *vm,
> >  				   struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_i915_gem_object *obj;
> > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> >  		obj->base.read_domains = obj->base.pending_read_domains;
> >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> >  
> > -		i915_gem_object_move_to_active(obj, ring);
> > +		i915_gem_object_move_to_active(obj, vm, ring);
> >  		if (obj->base.write_domain) {
> >  			obj->dirty = 1;
> >  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> > @@ -836,7 +853,8 @@ static int
> >  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  		       struct drm_file *file,
> >  		       struct drm_i915_gem_execbuffer2 *args,
> > -		       struct drm_i915_gem_exec_object2 *exec)
> > +		       struct drm_i915_gem_exec_object2 *exec,
> > +		       struct i915_address_space *vm)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	struct eb_objects *eb;
> > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  
> >  	/* Move the objects en-masse into the GTT, evicting if necessary. */
> >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> >  	if (ret)
> >  		goto err;
> >  
> >  	/* The objects are in their final locations, apply the relocations. */
> >  	if (need_relocs)
> > -		ret = i915_gem_execbuffer_relocate(eb);
> > +		ret = i915_gem_execbuffer_relocate(eb, vm);
> >  	if (ret) {
> >  		if (ret == -EFAULT) {
> >  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> > -								eb, exec);
> > +								eb, exec, vm);
> >  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> >  		}
> >  		if (ret)
> > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  			goto err;
> >  	}
> >  
> > -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> > +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > +		args->batch_start_offset;
> >  	exec_len = args->batch_len;
> >  	if (cliprects) {
> >  		for (i = 0; i < args->num_cliprects; i++) {
> > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  
> >  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> >  
> > -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> > +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> >  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> >  
> >  err:
> > -	eb_destroy(eb);
> > +	eb_destroy(eb, vm);
> >  
> >  	mutex_unlock(&dev->struct_mutex);
> >  
> > @@ -1105,6 +1124,7 @@ int
> >  i915_gem_execbuffer(struct drm_device *dev, void *data,
> >  		    struct drm_file *file)
> >  {
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_execbuffer *args = data;
> >  	struct drm_i915_gem_execbuffer2 exec2;
> >  	struct drm_i915_gem_exec_object *exec_list = NULL;
> > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> >  	exec2.flags = I915_EXEC_RENDER;
> >  	i915_execbuffer2_set_context_id(exec2, 0);
> >  
> > -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> > +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> > +				     &dev_priv->gtt.base);
> >  	if (!ret) {
> >  		/* Copy the new buffer offsets back to the user's exec list. */
> >  		for (i = 0; i < args->buffer_count; i++)
> > @@ -1186,6 +1207,7 @@ int
> >  i915_gem_execbuffer2(struct drm_device *dev, void *data,
> >  		     struct drm_file *file)
> >  {
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_execbuffer2 *args = data;
> >  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
> >  	int ret;
> > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> >  		return -EFAULT;
> >  	}
> >  
> > -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> > +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> > +				     &dev_priv->gtt.base);
> >  	if (!ret) {
> >  		/* Copy the new buffer offsets back to the user's exec list. */
> >  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 298fc42..70ce2f6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> >  			    ppgtt->base.total);
> >  	}
> >  
> > +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> > +
> >  	return ret;
> >  }
> >  
> > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  			    struct drm_i915_gem_object *obj,
> >  			    enum i915_cache_level cache_level)
> >  {
> > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				   cache_level);
> > +	struct i915_address_space *vm = &ppgtt->base;
> > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > +
> > +	vm->insert_entries(vm, obj->pages,
> > +			   obj_offset >> PAGE_SHIFT,
> > +			   cache_level);
> >  }
> >  
> >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  			      struct drm_i915_gem_object *obj)
> >  {
> > -	ppgtt->base.clear_range(&ppgtt->base,
> > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				obj->base.size >> PAGE_SHIFT);
> > +	struct i915_address_space *vm = &ppgtt->base;
> > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > +
> > +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > +			obj->base.size >> PAGE_SHIFT);
> >  }
> >  
> >  extern int intel_iommu_gfx_mapped;
> > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  				       dev_priv->gtt.base.start / PAGE_SIZE,
> >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> >  
> > +	if (dev_priv->mm.aliasing_ppgtt)
> > +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > +
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> >  		i915_gem_clflush_object(obj);
> >  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  	 * aperture.  One page should be enough to keep any prefetching inside
> >  	 * of the aperture.
> >  	 */
> > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> >  	struct drm_mm_node *entry;
> >  	struct drm_i915_gem_object *obj;
> >  	unsigned long hole_start, hole_end;
> > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  	BUG_ON(mappable_end > end);
> >  
> >  	/* Subtract the guard page ... */
> > -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> > +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> >  	if (!HAS_LLC(dev))
> >  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
> >  
> >  	/* Mark any preallocated objects as occupied */
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> >  		int ret;
> >  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> >  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
> >  
> >  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> >  		if (ret)
> >  			DRM_DEBUG_KMS("Reservation failed\n");
> >  		obj->has_global_gtt_mapping = 1;
> > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  	dev_priv->gtt.base.total = end - start;
> >  
> >  	/* Clear any non-preallocated blocks */
> > -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> > -			     hole_start, hole_end) {
> > +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> >  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> >  			      hole_start, hole_end);
> > -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > -					       hole_start / PAGE_SIZE,
> > -					       count);
> > +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> >  	}
> >  
> >  	/* And finally clear the reserved guard page */
> > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > -				       end / PAGE_SIZE - 1, 1);
> > +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> >  }
> >  
> >  static bool
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index 245eb1d..bfe61fa 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> >  		return obj;
> >  
> > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > +	vma = i915_gem_vma_create(obj, vm);
> >  	if (!vma) {
> >  		drm_gem_object_unreference(&obj->base);
> >  		return NULL;
> > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	 */
> >  	vma->node.start = gtt_offset;
> >  	vma->node.size = size;
> > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > +	if (drm_mm_initialized(&vm->mm)) {
> > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> 
> These two hunks here for stolen look fishy - we only ever use the stolen
> preallocated stuff for objects with mappings in the global gtt. So keeping
> that explicit is imo the better approach. And tbh I'm confused where the
> local variable vm is from ...

If we don't create a vma for it, we potentially have to special case a
bunch of places, I think. I'm not actually sure of this, but the
overhead to do it is quite small.

Anyway, I'll look this over again and see what I think.
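
One middle ground (sketch only) would be to keep the GGTT spelled out in
the stolen path while still going through a vma, so nothing else needs a
special case:

	struct i915_address_space *ggtt = &dev_priv->gtt.base;

	vma = i915_gem_vma_create(obj, ggtt);
	/* node.start/node.size setup as in the hunk above */
	ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);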

> 
> >  		if (ret) {
> >  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
> >  			goto unref_out;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > index 92a8d27..808ca2a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
> >  
> >  		obj->map_and_fenceable =
> >  			!i915_gem_obj_ggtt_bound(obj) ||
> > -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> > +			(i915_gem_obj_ggtt_offset(obj) +
> > +			 obj->base.size <= dev_priv->gtt.mappable_end &&
> >  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
> >  
> >  		/* Rebind if we need a change of alignment */
> >  		if (!obj->map_and_fenceable) {
> > -			u32 unfenced_alignment =
> > +			struct i915_address_space *ggtt = &dev_priv->gtt.base;
> > +			u32 unfenced_align =
> >  				i915_gem_get_gtt_alignment(dev, obj->base.size,
> >  							    args->tiling_mode,
> >  							    false);
> > -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> > -				ret = i915_gem_object_unbind(obj);
> > +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> > +				ret = i915_gem_object_unbind(obj, ggtt);
> >  		}
> >  
> >  		if (ret == 0) {
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 79fbb17..28fa0ff 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
> >  		u32 acthd = I915_READ(ACTHD);
> >  
> > +		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
> > +			return NULL;
> > +
> >  		if (WARN_ON(ring->id != RCS))
> >  			return NULL;
> >  
> > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
> >  		return;
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > -		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
> > +		if ((error->ccid & PAGE_MASK) ==
> > +		    i915_gem_obj_ggtt_offset(obj)) {
> >  			ering->ctx = i915_error_object_create_sized(dev_priv,
> >  								    obj, 1);
> >  			break;
> > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> > index 7d283b5..3f019d3 100644
> > --- a/drivers/gpu/drm/i915/i915_trace.h
> > +++ b/drivers/gpu/drm/i915/i915_trace.h
> > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
> >  );
> >  
> >  TRACE_EVENT(i915_gem_object_bind,
> > -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> > -	    TP_ARGS(obj, mappable),
> > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > +		     struct i915_address_space *vm, bool mappable),
> > +	    TP_ARGS(obj, vm, mappable),
> >  
> >  	    TP_STRUCT__entry(
> >  			     __field(struct drm_i915_gem_object *, obj)
> > +			     __field(struct i915_address_space *, vm)
> >  			     __field(u32, offset)
> >  			     __field(u32, size)
> >  			     __field(bool, mappable)
> > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
> >  
> >  	    TP_fast_assign(
> >  			   __entry->obj = obj;
> > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > +			   __entry->size = i915_gem_obj_size(obj, vm);
> >  			   __entry->mappable = mappable;
> >  			   ),
> >  
> > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
> >  );
> >  
> >  TRACE_EVENT(i915_gem_object_unbind,
> > -	    TP_PROTO(struct drm_i915_gem_object *obj),
> > -	    TP_ARGS(obj),
> > +	    TP_PROTO(struct drm_i915_gem_object *obj,
> > +		     struct i915_address_space *vm),
> > +	    TP_ARGS(obj, vm),
> >  
> >  	    TP_STRUCT__entry(
> >  			     __field(struct drm_i915_gem_object *, obj)
> > +			     __field(struct i915_address_space *, vm)
> >  			     __field(u32, offset)
> >  			     __field(u32, size)
> >  			     ),
> >  
> >  	    TP_fast_assign(
> >  			   __entry->obj = obj;
> > -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> > -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> > +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> > +			   __entry->size = i915_gem_obj_size(obj, vm);
> >  			   ),
> >  
> >  	    TP_printk("obj=%p, offset=%08x size=%x",
> > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> > index f3c97e0..b69cc63 100644
> > --- a/drivers/gpu/drm/i915/intel_fb.c
> > +++ b/drivers/gpu/drm/i915/intel_fb.c
> > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
> >  		      fb->width, fb->height,
> >  		      i915_gem_obj_ggtt_offset(obj), obj);
> >  
> > -
> >  	mutex_unlock(&dev->struct_mutex);
> >  	vga_switcheroo_client_fb_set(dev->pdev, info);
> >  	return 0;
> > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> > index 81c3ca1..517e278 100644
> > --- a/drivers/gpu/drm/i915/intel_overlay.c
> > +++ b/drivers/gpu/drm/i915/intel_overlay.c
> > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
> >  		}
> >  		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
> >  	} else {
> > -		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> > +		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
> >  		if (ret) {
> >  			DRM_ERROR("failed to pin overlay register bo\n");
> >  			goto out_free_bo;
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 125a741..449e57c 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev)
> >  		return NULL;
> >  	}
> >  
> > -	ret = i915_gem_object_pin(ctx, 4096, true, false);
> > +	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
> >  	if (ret) {
> >  		DRM_ERROR("failed to pin power context: %d\n", ret);
> >  		goto err_unref;
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index bc4c11b..ebed61d 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -481,6 +481,7 @@ out:
> >  static int
> >  init_pipe_control(struct intel_ring_buffer *ring)
> >  {
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	struct pipe_control *pc;
> >  	struct drm_i915_gem_object *obj;
> >  	int ret;
> > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
> >  		goto err;
> >  	}
> >  
> > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > +					I915_CACHE_LLC);
> >  
> > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> >  	if (ret)
> >  		goto err_unref;
> >  
> > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
> >  static int init_status_page(struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_device *dev = ring->dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_object *obj;
> >  	int ret;
> >  
> > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
> >  		goto err;
> >  	}
> >  
> > -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> > +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > +					I915_CACHE_LLC);
> >  
> > -	ret = i915_gem_object_pin(obj, 4096, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> >  	if (ret != 0) {
> >  		goto err_unref;
> >  	}
> > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
> >  
> >  	ring->obj = obj;
> >  
> > -	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> > +	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
> >  	if (ret)
> >  		goto err_unref;
> >  
> > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> >  			return -ENOMEM;
> >  		}
> >  
> > -		ret = i915_gem_object_pin(obj, 0, true, false);
> > +		ret = i915_gem_ggtt_pin(obj, 0, true, false);
> >  		if (ret != 0) {
> >  			drm_gem_object_unreference(&obj->base);
> >  			DRM_ERROR("Failed to ping batch bo\n");
> > -- 
> > 1.8.3.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 05/11] drm/i915: Create VMAs
  2013-07-11 11:20   ` Imre Deak
@ 2013-07-12  2:23     ` Ben Widawsky
  0 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-12  2:23 UTC (permalink / raw)
  To: Imre Deak; +Cc: Intel GFX

On Thu, Jul 11, 2013 at 02:20:50PM +0300, Imre Deak wrote:
> On Mon, 2013-07-08 at 23:08 -0700, Ben Widawsky wrote:
> > Formerly: "drm/i915: Create VMAs (part 1)"
> > 
> > In a previous patch, the notion of a VM was introduced. A VMA describes
> > an area of part of the VM address space. A VMA is similar to the concept
> > in the linux mm. However, instead of representing regular memory, a VMA
> > is backed by a GEM BO. There may be many VMAs for a given object, one
> > for each VM the object is to be used in. This may occur through flink,
> > dma-buf, or a number of other transient states.
> > 
> > Currently the code depends on only 1 VMA per object, for the global GTT
> > (and aliasing PPGTT). The following patches will address this and make
> > the rest of the infrastructure more suited
> > 
> > v2: s/i915_obj/i915_gem_obj (Chris)
> > 
> > v3: Only move an object to the now global unbound list if there are no
> > more VMAs for the object which are bound into a VM (ie. the list is
> > empty).
> > 
> > v4: killed obj->gtt_space
> > some reworks due to rebase
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h        | 48 ++++++++++++++++++++++------
> >  drivers/gpu/drm/i915/i915_gem.c        | 57 +++++++++++++++++++++++++++++-----
> >  drivers/gpu/drm/i915/i915_gem_evict.c  | 12 ++++---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c    |  5 +--
> >  drivers/gpu/drm/i915/i915_gem_stolen.c | 14 ++++++---
> >  5 files changed, 110 insertions(+), 26 deletions(-)
> > 
> > [...]
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 525aa8f..058ad44 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2578,6 +2578,7 @@ int
> >  i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  {
> >  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	if (!i915_gem_obj_ggtt_bound(obj))
> > @@ -2615,11 +2616,20 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  	i915_gem_object_unpin_pages(obj);
> >  
> >  	list_del(&obj->mm_list);
> > -	list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
> >  	/* Avoid an unnecessary call to unbind on rebind. */
> >  	obj->map_and_fenceable = true;
> >  
> > -	drm_mm_remove_node(&obj->gtt_space);
> > +	vma = __i915_gem_obj_to_vma(obj);
> > +	list_del(&vma->vma_link);
> > +	drm_mm_remove_node(&vma->node);
> > +	i915_gem_vma_destroy(vma);
> > +
> > +	/* Since the unbound list is global, only move to that list if
> > +	 * no more VMAs exist.
> > +	 * NB: Until we have real VMAs there will only ever be one */
> > +	WARN_ON(!list_empty(&obj->vma_list));
> > +	if (list_empty(&obj->vma_list))
> > +		list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
> >  
> >  	return 0;
> >  }
> > @@ -3070,8 +3080,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  	bool mappable, fenceable;
> >  	size_t gtt_max = map_and_fenceable ?
> >  		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> > +	if (WARN_ON(!list_empty(&obj->vma_list)))
> > +		return -EBUSY;
> > +
> >  	fence_size = i915_gem_get_gtt_size(dev,
> >  					   obj->base.size,
> >  					   obj->tiling_mode);
> > @@ -3110,9 +3124,15 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  
> >  	i915_gem_object_pin_pages(obj);
> >  
> > +	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > +	if (vma == NULL) {
> > +		i915_gem_object_unpin_pages(obj);
> > +		return -ENOMEM;
> > +	}
> > +
> >  search_free:
> >  	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > -						  &obj->gtt_space,
> > +						  &vma->node,
> >  						  size, alignment,
> >  						  obj->cache_level, 0, gtt_max);
> >  	if (ret) {
> > @@ -3126,22 +3146,23 @@ search_free:
> >  		i915_gem_object_unpin_pages(obj);
> >  		return ret;
> >  	}
> > -	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &obj->gtt_space,
> > +	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
> >  					      obj->cache_level))) {
> >  		i915_gem_object_unpin_pages(obj);
> > -		drm_mm_remove_node(&obj->gtt_space);
> > +		drm_mm_remove_node(&vma->node);
> >  		return -EINVAL;
> >  	}
> >  
> >  	ret = i915_gem_gtt_prepare_object(obj);
> >  	if (ret) {
> >  		i915_gem_object_unpin_pages(obj);
> > -		drm_mm_remove_node(&obj->gtt_space);
> > +		drm_mm_remove_node(&vma->node);
> >  		return ret;
> >  	}
> 
> Freeing vma on the error path is missing.
> 
> With this and the issue in 1/5 addressed things look good to me, so on
> 1-5:
> 
> Reviewed-by: Imre Deak <imre.deak@intel.com>
> 
> --Imre

Nice catch. Rebase fail. I feel no shame in making an excuse that it was
correct in the original series.
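
For reference, a minimal sketch of what the fixed error path could look
like, mirroring the teardown already done in the unbind path (assuming
i915_gem_vma_destroy() is the right helper here):

	ret = i915_gem_gtt_prepare_object(obj);
	if (ret) {
		i915_gem_object_unpin_pages(obj);
		drm_mm_remove_node(&vma->node);
		i915_gem_vma_destroy(vma);	/* previously leaked */
		return ret;
	}

The earlier i915_gem_valid_gtt_space() failure path would need the same
treatment.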



-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-12  2:23     ` Ben Widawsky
@ 2013-07-12  6:26       ` Daniel Vetter
  2013-07-12 15:46         ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-12  6:26 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote:
> On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:

[snip]

> > > index 058ad44..21015cd 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -38,10 +38,12 @@
> > >  
> > >  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
> > >  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > -						    unsigned alignment,
> > > -						    bool map_and_fenceable,
> > > -						    bool nonblocking);
> > > +static __must_check int
> > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > +			    struct i915_address_space *vm,
> > > +			    unsigned alignment,
> > > +			    bool map_and_fenceable,
> > > +			    bool nonblocking);
> > >  static int i915_gem_phys_pwrite(struct drm_device *dev,
> > >  				struct drm_i915_gem_object *obj,
> > >  				struct drm_i915_gem_pwrite *args,
> > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> > >  static inline bool
> > >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> > >  {
> > > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> > >  }
> > >  
> > >  int
> > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> > >  		 * anyway again before the next pread happens. */
> > >  		if (obj->cache_level == I915_CACHE_NONE)
> > >  			needs_clflush = 1;
> > > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > > +		if (i915_gem_obj_bound_any(obj)) {
> > >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> > 
> > This is essentially a very convoluted version of "if there's gpu rendering
> > outstanding, please wait for it". Maybe we should switch this to
> > 
> > 	if (obj->active)
> > 		wait_rendering(obj, true);
> > 
> > Same for the shmem_pwrite case below. Would be a separate patch to prep
> > things though. Can I volunteer you for that? The ugly part is to review
> > whether any of the lru list updating that set_domain does in addition to
> > wait_rendering is required, but on a quick read that's not the case.
> 
> Just reading the comment above it says we need the clflush. I don't
> actually understand why we do that even after reading the comment, but
> meh. You tell me, I don't mind doing this as a prep first.

The comment right above is just for the needs_clflush = 1 assignment, the
set_to_gtt_domain call afterwards is just to sync up with the gpu. The
code is confusing and tricky and the lack of a white line in between the
two things plus a comment explaining that we only care about the
wait_rendering side-effect of set_to_gtt_domain doesn't help. If you do
the proposed conversion (and add a white line) that should help a lot in
unconfusing readers.
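
A rough sketch of that conversion in shmem_pread (assuming the existing
wait_rendering helper with the (obj, readonly) signature; illustrative
only, not the final patch):

	if (obj->cache_level == I915_CACHE_NONE)
		needs_clflush = 1;

	/* We only care about syncing with outstanding gpu rendering here;
	 * the domain tracking done by set_to_gtt_domain is not needed. */
	if (obj->active) {
		ret = i915_gem_object_wait_rendering(obj, true);
		if (ret)
			return ret;
	}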

[snip]

> > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >  	}
> > >  
> > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > -		ret = i915_gem_object_unbind(obj);
> > > +		ret = i915_gem_object_unbind(obj, vm);
> > >  		if (ret)
> > >  			return ret;
> > >  	}
> > >  
> > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		if (!i915_gem_obj_bound(obj, vm))
> > > +			continue;
> > 
> > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > on?
> > 
> > Self-correction: It exists already ... why can't we use this here?
> 
> Yes. That should work, I'll fix it and test it. It looks slightly worse
> IMO in terms of code clarity, but I don't mind the change.

Actually I think it'd gain in clarity, doing pte updates (which
set_cache_level does) on the vma instead of the (obj, vm) pair feels more
natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here.
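
A minimal sketch of that vma-based loop (illustrative only; the
finish_gpu/rebind steps are elided):

	struct i915_vma *vma;

	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		/* the pte update / color change now acts on one specific
		 * mapping, no (obj, vm) -> vma lookup needed */
		vma->node.color = cache_level;
	}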

> 
> > 
> > > +
> > >  		ret = i915_gem_object_finish_gpu(obj);
> > >  		if (ret)
> > >  			return ret;
> > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > >  					       obj, cache_level);
> > >  
> > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > >  	}
> > >  
> > >  	if (cache_level == I915_CACHE_NONE) {
> > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > >  			       struct drm_file *file)
> > >  {
> > >  	struct drm_i915_gem_caching *args = data;
> > > +	struct drm_i915_private *dev_priv;
> > >  	struct drm_i915_gem_object *obj;
> > >  	enum i915_cache_level level;
> > >  	int ret;
> > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > >  		ret = -ENOENT;
> > >  		goto unlock;
> > >  	}
> > > +	dev_priv = obj->base.dev->dev_private;
> > >  
> > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > +	/* FIXME: Add interface for specific VM? */
> > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > >  
> > >  	drm_gem_object_unreference(&obj->base);
> > >  unlock:
> > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  				     u32 alignment,
> > >  				     struct intel_ring_buffer *pipelined)
> > >  {
> > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > >  	u32 old_read_domains, old_write_domain;
> > >  	int ret;
> > >  
> > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > >  	 */
> > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > +					      I915_CACHE_NONE);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > >  	 * always use map_and_fenceable for all scanout buffers.
> > >  	 */
> > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > >  
> > >  int
> > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > +		    struct i915_address_space *vm,
> > >  		    uint32_t alignment,
> > >  		    bool map_and_fenceable,
> > >  		    bool nonblocking)
> > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > >  		return -EBUSY;
> > >  
> > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > 
> > WARN_ON, since presumably we can keep on going if we get this wrong
> > (albeit with slightly corrupted state, so render corruptions might
> > follow).
> 
> Can we make a deal, can we leave this as BUG_ON with a FIXME to convert
> it at the end of merging?

Adding a FIXME right above it will cause equal amounts of conflicts, so I
don't see the point that much ...

> 
> > 
> > > +
> > > +	if (i915_gem_obj_bound(obj, vm)) {
> > > +		if ((alignment &&
> > > +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> > >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> > >  			WARN(obj->pin_count,
> > >  			     "bo is already pinned with incorrect alignment:"
> > >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> > >  			     " obj->map_and_fenceable=%d\n",
> > > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > > +			     i915_gem_obj_offset(obj, vm), alignment,
> > >  			     map_and_fenceable,
> > >  			     obj->map_and_fenceable);
> > > -			ret = i915_gem_object_unbind(obj);
> > > +			ret = i915_gem_object_unbind(obj, vm);
> > >  			if (ret)
> > >  				return ret;
> > >  		}
> > >  	}
> > >  
> > > -	if (!i915_gem_obj_ggtt_bound(obj)) {
> > > +	if (!i915_gem_obj_bound(obj, vm)) {
> > >  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > >  
> > > -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> > > +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
> > >  						  map_and_fenceable,
> > >  						  nonblocking);
> > >  		if (ret)
> > > @@ -3684,7 +3739,7 @@ void
> > >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> > >  {
> > >  	BUG_ON(obj->pin_count == 0);
> > > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> > >  
> > >  	if (--obj->pin_count == 0)
> > >  		obj->pin_mappable = false;
> > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
> > >  	}
> > >  
> > >  	if (obj->user_pin_count == 0) {
> > > -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> > > +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
> > >  		if (ret)
> > >  			goto out;
> > >  	}
> > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > >  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > +	struct i915_vma *vma, *next;
> > >  
> > >  	trace_i915_gem_object_destroy(obj);
> > >  
> > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > >  		i915_gem_detach_phys_object(dev, obj);
> > >  
> > >  	obj->pin_count = 0;
> > > -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> > > -		bool was_interruptible;
> > > +	/* NB: 0 or 1 elements */
> > > +	WARN_ON(!list_empty(&obj->vma_list) &&
> > > +		!list_is_singular(&obj->vma_list));
> > > +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > > +		int ret = i915_gem_object_unbind(obj, vma->vm);
> > > +		if (WARN_ON(ret == -ERESTARTSYS)) {
> > > +			bool was_interruptible;
> > >  
> > > -		was_interruptible = dev_priv->mm.interruptible;
> > > -		dev_priv->mm.interruptible = false;
> > > +			was_interruptible = dev_priv->mm.interruptible;
> > > +			dev_priv->mm.interruptible = false;
> > >  
> > > -		WARN_ON(i915_gem_object_unbind(obj));
> > > +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
> > >  
> > > -		dev_priv->mm.interruptible = was_interruptible;
> > > +			dev_priv->mm.interruptible = was_interruptible;
> > > +		}
> > >  	}
> > >  
> > >  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
> > >  	INIT_LIST_HEAD(&ring->request_list);
> > >  }
> > >  
> > > +static void i915_init_vm(struct drm_i915_private *dev_priv,
> > > +			 struct i915_address_space *vm)
> > > +{
> > > +	vm->dev = dev_priv->dev;
> > > +	INIT_LIST_HEAD(&vm->active_list);
> > > +	INIT_LIST_HEAD(&vm->inactive_list);
> > > +	INIT_LIST_HEAD(&vm->global_link);
> > > +	list_add(&vm->global_link, &dev_priv->vm_list);
> > > +}
> > > +
> > >  void
> > >  i915_gem_load(struct drm_device *dev)
> > >  {
> > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
> > >  				  SLAB_HWCACHE_ALIGN,
> > >  				  NULL);
> > >  
> > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> > > +	INIT_LIST_HEAD(&dev_priv->vm_list);
> > > +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> > > +
> > >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> > >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> > >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > >  			     struct drm_i915_private,
> > >  			     mm.inactive_shrinker);
> > >  	struct drm_device *dev = dev_priv->dev;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > >  	struct drm_i915_gem_object *obj;
> > > -	int nr_to_scan = sc->nr_to_scan;
> > > +	int nr_to_scan;
> > >  	bool unlock = true;
> > >  	int cnt;
> > >  
> > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > >  		unlock = false;
> > >  	}
> > >  
> > > +	nr_to_scan = sc->nr_to_scan;
> > >  	if (nr_to_scan) {
> > >  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
> > >  		if (nr_to_scan > 0)
> > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > >  		if (obj->pages_pin_count == 0)
> > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > +
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > 
> > Isn't this now double-counting objects? In the shrinker we only care about
> > how much physical RAM an object occupies, not how much virtual space it
> > occupies. So just walking the bound list of objects here should be good
> > enough ...
> > 
> 
> Maybe I've misunderstood you. My code is wrong, but I think your idea
> requires a prep patch because it changes functionality, right?
> 
> So let me know if I've understood you.

Don't we have both the bound and unbound list? So we could just switch
over to counting the bound objects here ... Otherwise yes, we need a prep
patch to create the bound list first.
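
That would boil the counting down to something like (sketch; both lists
already exist in this series):

	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
		if (obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;

which counts each object once, no matter how many address spaces it is
bound into.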

> 
> > >  
> > >  	if (unlock)
> > >  		mutex_unlock(&dev->struct_mutex);
> > >  	return cnt;
> > >  }
> > > +
> > > +/* All the new VM stuff */
> > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > +				  struct i915_address_space *vm)
> > > +{
> > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > +	struct i915_vma *vma;
> > > +
> > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > +		vm = &dev_priv->gtt.base;
> > > +
> > > +	BUG_ON(list_empty(&o->vma_list));
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > 
> > Imo the vma list walking here and in the other helpers below indicates
> > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > is this again something that'll get fixed later on?
> > 
> > I just want to avoid diff churn, and it also makes reviewing easier if the
> > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > sprinkling of obj_to_vma in callers.
> 
> It's not something I fixed in the whole series. I think it makes sense
> conceptually, to keep some things as <obj,vm> and others as direct vma.
> 
> If you want me to change something, you need to be more specific since
> no action specifically comes to mind at this point in the series.

It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we
should try to avoid it whenever possible. Since the vma has both an obj
and a vm pointer, the vma is imo strictly better than the (obj, vm) pair.
And the look-up should be pushed down the callchain as much as possible.

So I think generally we want to pass the vma around to functions
everywhere, and the (obj, vm) pair would be the exception (which needs
special justification).
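
Roughly the difference in question, using the helpers introduced in this
patch:

	/* (obj, vm) style: every helper re-walks obj->vma_list internally */
	offset = i915_gem_obj_offset(obj, vm);
	size = i915_gem_obj_size(obj, vm);

	/* vma style: one lookup at the boundary, then direct access */
	vma = i915_gem_obj_to_vma(obj, vm);
	offset = vma->node.start;
	size = vma->node.size;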

> 
> > 
> > > +		if (vma->vm == vm)
> > > +			return vma->node.start;
> > > +
> > > +	}
> > > +	return -1;
> > > +}
> > > +
> > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> > > +{
> > > +	return !list_empty(&o->vma_list);
> > > +}
> > > +
> > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > +			struct i915_address_space *vm)
> > > +{
> > > +	struct i915_vma *vma;
> > > +
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > +		if (vma->vm == vm)
> > > +			return true;
> > > +	}
> > > +	return false;
> > > +}
> > > +
> > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > +				struct i915_address_space *vm)
> > > +{
> > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > +	struct i915_vma *vma;
> > > +
> > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > +		vm = &dev_priv->gtt.base;
> > > +	BUG_ON(list_empty(&o->vma_list));
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > +		if (vma->vm == vm)
> > > +			return vma->node.size;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > > +			    struct i915_address_space *vm,
> > > +			    enum i915_cache_level color)
> > > +{
> > > +	struct i915_vma *vma;
> > > +	BUG_ON(list_empty(&o->vma_list));
> > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > +		if (vma->vm == vm) {
> > > +			vma->node.color = color;
> > > +			return;
> > > +		}
> > > +	}
> > > +
> > > +	WARN(1, "Couldn't set color for VM %p\n", vm);
> > > +}
> > > +
> > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > +				     struct i915_address_space *vm)
> > > +{
> > > +	struct i915_vma *vma;
> > > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > +		if (vma->vm == vm)
> > > +			return vma;
> > > +
> > > +	return NULL;
> > > +}
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index 2074544..c92fd81 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
> > >  
> > >  	if (INTEL_INFO(dev)->gen >= 7) {
> > >  		ret = i915_gem_object_set_cache_level(ctx->obj,
> > > +						      &dev_priv->gtt.base,
> > >  						      I915_CACHE_LLC_MLC);
> > >  		/* Failure shouldn't ever happen this early */
> > >  		if (WARN_ON(ret))
> > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> > >  	 * default context.
> > >  	 */
> > >  	dev_priv->ring[RCS].default_context = ctx;
> > > -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > > +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > >  	if (ret) {
> > >  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> > >  		goto err_destroy;
> > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
> > >  static int do_switch(struct i915_hw_context *to)
> > >  {
> > >  	struct intel_ring_buffer *ring = to->ring;
> > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > >  	struct i915_hw_context *from = ring->last_context;
> > >  	u32 hw_flags = 0;
> > >  	int ret;
> > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
> > >  	if (from == to)
> > >  		return 0;
> > >  
> > > -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
> > >  	 */
> > >  	if (from != NULL) {
> > >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > > -		i915_gem_object_move_to_active(from->obj, ring);
> > > +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> > > +					       ring);
> > >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> > >  		 * whole damn pipeline, we don't need to explicitly mark the
> > >  		 * object dirty. The only exception is that the context must be
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > index df61f33..32efdc0 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > @@ -32,24 +32,21 @@
> > >  #include "i915_trace.h"
> > >  
> > >  static bool
> > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> > >  {
> > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > -
> > > -	if (obj->pin_count)
> > > +	if (vma->obj->pin_count)
> > >  		return false;
> > >  
> > > -	list_add(&obj->exec_list, unwind);
> > > +	list_add(&vma->obj->exec_list, unwind);
> > >  	return drm_mm_scan_add_block(&vma->node);
> > >  }
> > >  
> > >  int
> > > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > -			 unsigned alignment, unsigned cache_level,
> > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > > +			 int min_size, unsigned alignment, unsigned cache_level,
> > >  			 bool mappable, bool nonblocking)
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	struct list_head eviction_list, unwind_list;
> > >  	struct i915_vma *vma;
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > >  	 */
> > >  
> > >  	INIT_LIST_HEAD(&unwind_list);
> > > -	if (mappable)
> > > +	if (mappable) {
> > > +		BUG_ON(!i915_is_ggtt(vm));
> > >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> > >  					    alignment, cache_level, 0,
> > >  					    dev_priv->gtt.mappable_end);
> > > -	else
> > > +	} else
> > >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> > >  
> > >  	/* First see if there is a large enough contiguous idle region... */
> > >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > > -		if (mark_free(obj, &unwind_list))
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +		if (mark_free(vma, &unwind_list))
> > >  			goto found;
> > >  	}
> > >  
> > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > >  
> > >  	/* Now merge in the soon-to-be-expired objects... */
> > >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > -		if (mark_free(obj, &unwind_list))
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +		if (mark_free(vma, &unwind_list))
> > >  			goto found;
> > >  	}
> > >  
> > > @@ -109,7 +109,7 @@ none:
> > >  		obj = list_first_entry(&unwind_list,
> > >  				       struct drm_i915_gem_object,
> > >  				       exec_list);
> > > -		vma = __i915_gem_obj_to_vma(obj);
> > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > >  		ret = drm_mm_scan_remove_block(&vma->node);
> > >  		BUG_ON(ret);
> > >  
> > > @@ -130,7 +130,7 @@ found:
> > >  		obj = list_first_entry(&unwind_list,
> > >  				       struct drm_i915_gem_object,
> > >  				       exec_list);
> > > -		vma = __i915_gem_obj_to_vma(obj);
> > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > >  		if (drm_mm_scan_remove_block(&vma->node)) {
> > >  			list_move(&obj->exec_list, &eviction_list);
> > >  			drm_gem_object_reference(&obj->base);
> > > @@ -145,7 +145,7 @@ found:
> > >  				       struct drm_i915_gem_object,
> > >  				       exec_list);
> > >  		if (ret == 0)
> > > -			ret = i915_gem_object_unbind(obj);
> > > +			ret = i915_gem_object_unbind(obj, vm);
> > >  
> > >  		list_del_init(&obj->exec_list);
> > >  		drm_gem_object_unreference(&obj->base);
> > > @@ -158,13 +158,18 @@ int
> > >  i915_gem_evict_everything(struct drm_device *dev)
> > 
> > I suspect evict_everything eventually wants a address_space *vm argument
> > for those cases where we only want to evict everything in a given vm. Atm
> > we have two use-cases of this:
> > - Called from the shrinker as a last-ditch effort. For that it should move
> >   _every_ object onto the unbound list.
> > - Called from execbuf for badly-fragmented address spaces to clean up the
> >   mess. For that case we only care about one address space.
> 
> The current thing is more or less a result of Chris' suggestions. A
> non-posted iteration did plumb the vm, and after reworking to the
> suggestion made by Chris, the vm didn't make much sense anymore.
> 
> For point #1, it requires VM prioritization I think. I don't really see
> any other way to fairly manage it.

The shrinker will rip out objects in lru order by walking first the
unbound and then the bound objects. That's imo as fair as it gets, we don't need
priorities between vms.

> For point #2, I agree that it might be useful, but we can easily create
> a new function, and not call it "shrinker" to do it. 

Well my point was that this function is called
i915_gem_evict_everything(dev, vm) and for the first use case we simply
pass in vm = NULL. But essentially thrashing the vm should be rare enough
that for now we don't need to care.
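
A sketch of that interface, with vm == NULL meaning "all address spaces"
(hypothetical, not part of this series):

	struct i915_address_space *vm_iter;

	list_for_each_entry(vm_iter, &dev_priv->vm_list, global_link) {
		if (vm && vm_iter != vm)
			continue;
		list_for_each_entry_safe(obj, next, &vm_iter->inactive_list, mm_list)
			if (obj->pin_count == 0)
				WARN_ON(i915_gem_object_unbind(obj, vm_iter));
	}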


> 
> 
> > 
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > >  	struct drm_i915_gem_object *obj, *next;
> > > -	bool lists_empty;
> > > +	bool lists_empty = true;
> > >  	int ret;
> > >  
> > > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > > -		       list_empty(&vm->active_list));
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > > +			       list_empty(&vm->active_list));
> > > +		if (!lists_empty)
> > > +			lists_empty = false;
> > > +	}
> > > +
> > >  	if (lists_empty)
> > >  		return -ENOSPC;
> > >  
> > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> > >  	i915_gem_retire_requests(dev);
> > >  
> > >  	/* Having flushed everything, unbind() should never raise an error */
> > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > -		if (obj->pin_count == 0)
> > > -			WARN_ON(i915_gem_object_unbind(obj));
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > +			if (obj->pin_count == 0)
> > > +				WARN_ON(i915_gem_object_unbind(obj, vm));
> > > +	}
> > >  
> > >  	return 0;
> > >  }
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index 5aeb447..e90182d 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
> > >  }
> > >  
> > >  static void
> > > -eb_destroy(struct eb_objects *eb)
> > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
> > >  {
> > >  	while (!list_empty(&eb->objects)) {
> > >  		struct drm_i915_gem_object *obj;
> > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
> > >  static int
> > >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > >  				   struct eb_objects *eb,
> > > -				   struct drm_i915_gem_relocation_entry *reloc)
> > > +				   struct drm_i915_gem_relocation_entry *reloc,
> > > +				   struct i915_address_space *vm)
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct drm_gem_object *target_obj;
> > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > >  
> > >  static int
> > >  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > -				    struct eb_objects *eb)
> > > +				    struct eb_objects *eb,
> > > +				    struct i915_address_space *vm)
> > >  {
> > >  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
> > >  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > >  		do {
> > >  			u64 offset = r->presumed_offset;
> > >  
> > > -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> > > +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> > > +								 vm);
> > >  			if (ret)
> > >  				return ret;
> > >  
> > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > >  static int
> > >  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > >  					 struct eb_objects *eb,
> > > -					 struct drm_i915_gem_relocation_entry *relocs)
> > > +					 struct drm_i915_gem_relocation_entry *relocs,
> > > +					 struct i915_address_space *vm)
> > >  {
> > >  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > >  	int i, ret;
> > >  
> > >  	for (i = 0; i < entry->relocation_count; i++) {
> > > -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> > > +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> > > +							 vm);
> > >  		if (ret)
> > >  			return ret;
> > >  	}
> > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > >  }
> > >  
> > >  static int
> > > -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > > +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> > > +			     struct i915_address_space *vm)
> > >  {
> > >  	struct drm_i915_gem_object *obj;
> > >  	int ret = 0;
> > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > >  	 */
> > >  	pagefault_disable();
> > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > > -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> > > +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> > >  		if (ret)
> > >  			break;
> > >  	}
> > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
> > >  static int
> > >  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > >  				   struct intel_ring_buffer *ring,
> > > +				   struct i915_address_space *vm,
> > >  				   bool *need_reloc)
> > >  {
> > >  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > >  		obj->tiling_mode != I915_TILING_NONE;
> > >  	need_mappable = need_fence || need_reloc_mappable(obj);
> > >  
> > > -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> > > +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> > > +				  false);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > >  		obj->has_aliasing_ppgtt_mapping = 1;
> > >  	}
> > >  
> > > -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> > > -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> > > +		entry->offset = i915_gem_obj_offset(obj, vm);
> > >  		*need_reloc = true;
> > >  	}
> > >  
> > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > >  {
> > >  	struct drm_i915_gem_exec_object2 *entry;
> > >  
> > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > +	if (!i915_gem_obj_bound_any(obj))
> > >  		return;
> > >  
> > >  	entry = obj->exec_entry;
> > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > >  static int
> > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > >  			    struct list_head *objects,
> > > +			    struct i915_address_space *vm,
> > >  			    bool *need_relocs)
> > >  {
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > >  		list_for_each_entry(obj, objects, exec_list) {
> > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > >  			bool need_fence, need_mappable;
> > > +			u32 obj_offset;
> > >  
> > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > +			if (!i915_gem_obj_bound(obj, vm))
> > >  				continue;
> > 
> > I wonder a bit how we could avoid the multipler (obj, vm) -> vma lookups
> > here ... Maybe we should cache them in some pointer somewhere (either in
> > the eb object or by adding a new pointer to the object struct, e.g.
> > obj->eb_vma, similar to obj->eb_list).
> > 
> 
> I agree, and even did this in one unposted patch too. However, I think
> it's a premature optimization which risks code correctness. So I think
> somewhere a FIXME needs to happen to address that issue. (Or if Chris
> complains bitterly about some perf hit).

If you bring up code correctness I'd vote strongly in favour of using vmas
everywhere - vma has the (obj, vm) pair locked down, doing the lookup all
the time risks us mixing them up eventually and creating a hella lot of
confusion ;-)
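
A rough illustration of the caching idea (obj->eb_vma is hypothetical, not
a field in this series):

	/* in i915_gem_execbuffer_reserve_object(), once the pin succeeded */
	obj->eb_vma = i915_gem_obj_to_vma(obj, vm);

	/* later execbuf code then avoids the per-call list walk */
	if (entry->offset != obj->eb_vma->node.start) {
		entry->offset = obj->eb_vma->node.start;
		*need_reloc = true;
	}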

> 
> > >  
> > > +			obj_offset = i915_gem_obj_offset(obj, vm);
> > >  			need_fence =
> > >  				has_fenced_gpu_access &&
> > >  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> > >  				obj->tiling_mode != I915_TILING_NONE;
> > >  			need_mappable = need_fence || need_reloc_mappable(obj);
> > >  
> > > +			BUG_ON((need_mappable || need_fence) &&
> > > +			       !i915_is_ggtt(vm));
> > > +
> > >  			if ((entry->alignment &&
> > > -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> > > +			     obj_offset & (entry->alignment - 1)) ||
> > >  			    (need_mappable && !obj->map_and_fenceable))
> > > -				ret = i915_gem_object_unbind(obj);
> > > +				ret = i915_gem_object_unbind(obj, vm);
> > >  			else
> > > -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > >  			if (ret)
> > >  				goto err;
> > >  		}
> > >  
> > >  		/* Bind fresh objects */
> > >  		list_for_each_entry(obj, objects, exec_list) {
> > > -			if (i915_gem_obj_ggtt_bound(obj))
> > > +			if (i915_gem_obj_bound(obj, vm))
> > >  				continue;
> > >  
> > > -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > >  			if (ret)
> > >  				goto err;
> > >  		}
> > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > >  				  struct drm_file *file,
> > >  				  struct intel_ring_buffer *ring,
> > >  				  struct eb_objects *eb,
> > > -				  struct drm_i915_gem_exec_object2 *exec)
> > > +				  struct drm_i915_gem_exec_object2 *exec,
> > > +				  struct i915_address_space *vm)
> > >  {
> > >  	struct drm_i915_gem_relocation_entry *reloc;
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > >  		goto err;
> > >  
> > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > >  	if (ret)
> > >  		goto err;
> > >  
> > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > >  		int offset = obj->exec_entry - exec;
> > >  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> > > -							       reloc + reloc_offset[offset]);
> > > +							       reloc + reloc_offset[offset],
> > > +							       vm);
> > >  		if (ret)
> > >  			goto err;
> > >  	}
> > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
> > >  
> > >  static void
> > >  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > +				   struct i915_address_space *vm,
> > >  				   struct intel_ring_buffer *ring)
> > >  {
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > >  		obj->base.read_domains = obj->base.pending_read_domains;
> > >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> > >  
> > > -		i915_gem_object_move_to_active(obj, ring);
> > > +		i915_gem_object_move_to_active(obj, vm, ring);
> > >  		if (obj->base.write_domain) {
> > >  			obj->dirty = 1;
> > >  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> > > @@ -836,7 +853,8 @@ static int
> > >  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  		       struct drm_file *file,
> > >  		       struct drm_i915_gem_execbuffer2 *args,
> > > -		       struct drm_i915_gem_exec_object2 *exec)
> > > +		       struct drm_i915_gem_exec_object2 *exec,
> > > +		       struct i915_address_space *vm)
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > >  	struct eb_objects *eb;
> > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  
> > >  	/* Move the objects en-masse into the GTT, evicting if necessary. */
> > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > >  	if (ret)
> > >  		goto err;
> > >  
> > >  	/* The objects are in their final locations, apply the relocations. */
> > >  	if (need_relocs)
> > > -		ret = i915_gem_execbuffer_relocate(eb);
> > > +		ret = i915_gem_execbuffer_relocate(eb, vm);
> > >  	if (ret) {
> > >  		if (ret == -EFAULT) {
> > >  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> > > -								eb, exec);
> > > +								eb, exec, vm);
> > >  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> > >  		}
> > >  		if (ret)
> > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  			goto err;
> > >  	}
> > >  
> > > -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> > > +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > +		args->batch_start_offset;
> > >  	exec_len = args->batch_len;
> > >  	if (cliprects) {
> > >  		for (i = 0; i < args->num_cliprects; i++) {
> > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  
> > >  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> > >  
> > > -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> > > +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> > >  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> > >  
> > >  err:
> > > -	eb_destroy(eb);
> > > +	eb_destroy(eb, vm);
> > >  
> > >  	mutex_unlock(&dev->struct_mutex);
> > >  
> > > @@ -1105,6 +1124,7 @@ int
> > >  i915_gem_execbuffer(struct drm_device *dev, void *data,
> > >  		    struct drm_file *file)
> > >  {
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_gem_execbuffer *args = data;
> > >  	struct drm_i915_gem_execbuffer2 exec2;
> > >  	struct drm_i915_gem_exec_object *exec_list = NULL;
> > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> > >  	exec2.flags = I915_EXEC_RENDER;
> > >  	i915_execbuffer2_set_context_id(exec2, 0);
> > >  
> > > -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> > > +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> > > +				     &dev_priv->gtt.base);
> > >  	if (!ret) {
> > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > >  		for (i = 0; i < args->buffer_count; i++)
> > > @@ -1186,6 +1207,7 @@ int
> > >  i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > >  		     struct drm_file *file)
> > >  {
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_gem_execbuffer2 *args = data;
> > >  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
> > >  	int ret;
> > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > >  		return -EFAULT;
> > >  	}
> > >  
> > > -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> > > +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> > > +				     &dev_priv->gtt.base);
> > >  	if (!ret) {
> > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > >  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 298fc42..70ce2f6 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> > >  			    ppgtt->base.total);
> > >  	}
> > >  
> > > +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> > > +
> > >  	return ret;
> > >  }
> > >  
> > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			    struct drm_i915_gem_object *obj,
> > >  			    enum i915_cache_level cache_level)
> > >  {
> > > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -				   cache_level);
> > > +	struct i915_address_space *vm = &ppgtt->base;
> > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > +
> > > +	vm->insert_entries(vm, obj->pages,
> > > +			   obj_offset >> PAGE_SHIFT,
> > > +			   cache_level);
> > >  }
> > >  
> > >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			      struct drm_i915_gem_object *obj)
> > >  {
> > > -	ppgtt->base.clear_range(&ppgtt->base,
> > > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > -				obj->base.size >> PAGE_SHIFT);
> > > +	struct i915_address_space *vm = &ppgtt->base;
> > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > +
> > > +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > > +			obj->base.size >> PAGE_SHIFT);
> > >  }
> > >  
> > >  extern int intel_iommu_gfx_mapped;
> > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > >  				       dev_priv->gtt.base.start / PAGE_SIZE,
> > >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> > >  
> > > +	if (dev_priv->mm.aliasing_ppgtt)
> > > +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > > +
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > >  		i915_gem_clflush_object(obj);
> > >  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  	 * aperture.  One page should be enough to keep any prefetching inside
> > >  	 * of the aperture.
> > >  	 */
> > > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > >  	struct drm_mm_node *entry;
> > >  	struct drm_i915_gem_object *obj;
> > >  	unsigned long hole_start, hole_end;
> > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  	BUG_ON(mappable_end > end);
> > >  
> > >  	/* Subtract the guard page ... */
> > > -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> > > +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> > >  	if (!HAS_LLC(dev))
> > >  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
> > >  
> > >  	/* Mark any preallocated objects as occupied */
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> > >  		int ret;
> > >  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> > >  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
> > >  
> > >  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> > >  		if (ret)
> > >  			DRM_DEBUG_KMS("Reservation failed\n");
> > >  		obj->has_global_gtt_mapping = 1;
> > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > >  	dev_priv->gtt.base.total = end - start;
> > >  
> > >  	/* Clear any non-preallocated blocks */
> > > -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> > > -			     hole_start, hole_end) {
> > > +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> > >  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> > >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> > >  			      hole_start, hole_end);
> > > -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > -					       hole_start / PAGE_SIZE,
> > > -					       count);
> > > +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> > >  	}
> > >  
> > >  	/* And finally clear the reserved guard page */
> > > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > -				       end / PAGE_SIZE - 1, 1);
> > > +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> > >  }
> > >  
> > >  static bool
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > index 245eb1d..bfe61fa 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > >  		return obj;
> > >  
> > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > +	vma = i915_gem_vma_create(obj, vm);
> > >  	if (!vma) {
> > >  		drm_gem_object_unreference(&obj->base);
> > >  		return NULL;
> > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > >  	 */
> > >  	vma->node.start = gtt_offset;
> > >  	vma->node.size = size;
> > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > +	if (drm_mm_initialized(&vm->mm)) {
> > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > 
> > These two hunks here for stolen look fishy - we only ever use the stolen
> > preallocated stuff for objects with mappings in the global gtt. So keeping
> > that explicit is imo the better approach. And tbh I'm confused where the
> > local variable vm is from ...
> 
> If we don't create a vma for it, we potentially have to special case a
> bunch of places, I think. I'm not actually sure of this, but the
> overhead to do it is quite small.
> 
> Anyway, I'll look this over again and see what I think.

I'm not against the vma, I've just wondered why you do the
/dev_priv->gtt.base/vm/ replacement here since
- it's never gonna be used with another vm than ggtt
- this patch doesn't add the vm variable, so I'm even more confused where
  this started ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-12  6:26       ` Daniel Vetter
@ 2013-07-12 15:46         ` Ben Widawsky
  2013-07-12 16:46           ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-12 15:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote:
> On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote:
> > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> 
> [snip]
> 
> > > > index 058ad44..21015cd 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -38,10 +38,12 @@
> > > >  
> > > >  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
> > > >  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> > > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > > -						    unsigned alignment,
> > > > -						    bool map_and_fenceable,
> > > > -						    bool nonblocking);
> > > > +static __must_check int
> > > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > > > +			    struct i915_address_space *vm,
> > > > +			    unsigned alignment,
> > > > +			    bool map_and_fenceable,
> > > > +			    bool nonblocking);
> > > >  static int i915_gem_phys_pwrite(struct drm_device *dev,
> > > >  				struct drm_i915_gem_object *obj,
> > > >  				struct drm_i915_gem_pwrite *args,
> > > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> > > >  static inline bool
> > > >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> > > >  {
> > > > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > > > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> > > >  }
> > > >  
> > > >  int
> > > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> > > >  		 * anyway again before the next pread happens. */
> > > >  		if (obj->cache_level == I915_CACHE_NONE)
> > > >  			needs_clflush = 1;
> > > > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > > > +		if (i915_gem_obj_bound_any(obj)) {
> > > >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> > > 
> > > This is essentially a very convoluted version of "if there's gpu rendering
> > > outstanding, please wait for it". Maybe we should switch this to
> > > 
> > > 	if (obj->active)
> > > 		wait_rendering(obj, true);
> > > 
> > > Same for the shmem_pwrite case below. Would be a separate patch to prep
> > > things though. Can I volunteer you for that? The ugly part is to review
> > > whether any of the lru list updating that set_domain does in addition to
> > > wait_rendering is required, but on a quick read that's not the case.
> > 
> > Just reading the comment above it says we need the clflush. I don't
> > actually understand why we do that even after reading the comment, but
> > meh. You tell me, I don't mind doing this as a prep first.
> 
> The comment right above is just for the needs_clflush = 1 assignment, the
> set_to_gtt_domain call afterwards is just to sync up with the gpu. The
> code is confusing and tricky and the lack of a white line in between the
> two things plus a comment explaining that we only care about the
> wait_rendering side-effect of set_to_gtt_domain doesn't help. If you do
> the proposed conversion (and add a white line) that should help a lot in
> unconfusing readers.
> 
> [snip]
> 
> > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > >  	}
> > > >  
> > > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > > -		ret = i915_gem_object_unbind(obj);
> > > > +		ret = i915_gem_object_unbind(obj, vm);
> > > >  		if (ret)
> > > >  			return ret;
> > > >  	}
> > > >  
> > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		if (!i915_gem_obj_bound(obj, vm))
> > > > +			continue;
> > > 
> > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > > on?
> > > 
> > > Self-correction: It exists already ... why can't we use this here?
> > 
> > Yes. That should work, I'll fix it and test it. It looks slightly worse
> > IMO in terms of code clarity, but I don't mind the change.
> 
> Actually I think it'd gain in clarity, doing pte updates (which
> set_cache_level does) on the vma instead of the (obj, vm) pair feels more
> natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here.

That sounds good to me. Would you mind a patch on top?

> 
> > 
> > > 
> > > > +
> > > >  		ret = i915_gem_object_finish_gpu(obj);
> > > >  		if (ret)
> > > >  			return ret;
> > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > > >  					       obj, cache_level);
> > > >  
> > > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > > >  	}
> > > >  
> > > >  	if (cache_level == I915_CACHE_NONE) {
> > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > >  			       struct drm_file *file)
> > > >  {
> > > >  	struct drm_i915_gem_caching *args = data;
> > > > +	struct drm_i915_private *dev_priv;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	enum i915_cache_level level;
> > > >  	int ret;
> > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > >  		ret = -ENOENT;
> > > >  		goto unlock;
> > > >  	}
> > > > +	dev_priv = obj->base.dev->dev_private;
> > > >  
> > > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > > +	/* FIXME: Add interface for specific VM? */
> > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > > >  
> > > >  	drm_gem_object_unreference(&obj->base);
> > > >  unlock:
> > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  				     u32 alignment,
> > > >  				     struct intel_ring_buffer *pipelined)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > >  	u32 old_read_domains, old_write_domain;
> > > >  	int ret;
> > > >  
> > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > > >  	 */
> > > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > +					      I915_CACHE_NONE);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > >  	 * always use map_and_fenceable for all scanout buffers.
> > > >  	 */
> > > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > > >  
> > > >  int
> > > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > +		    struct i915_address_space *vm,
> > > >  		    uint32_t alignment,
> > > >  		    bool map_and_fenceable,
> > > >  		    bool nonblocking)
> > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > > >  		return -EBUSY;
> > > >  
> > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > > 
> > > WARN_ON, since presumably we can keep on going if we get this wrong
> > > (albeit with slightly corrupted state, so render corruptions might
> > > follow).
> > 
> > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert
> > it at the end of merging?
> 
> Adding a FIXME right above it will cause equal amounts of conflicts, so I
> don't see the point that much ...

I'm just really fearful that in doing the reworks I will end up hitting
this condition, and I am afraid I will miss it if it's a WARN_ON. It's
definitely more likely to be missed than a BUG.

Also, and we've disagreed on this a few times by now, this is an
internal interface, which I think justifies a fatal error for this
level of mistake.

In any case I've made the change locally. Will yell at you later if I
was right.
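
For reference, the WARN_ON variant would look roughly like this (just a
sketch of the idea, not necessarily the exact code that lands):

        /* map_and_fenceable only makes sense for the global GTT */
        if (WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)))
                return -EINVAL;

WARN_ON() returns the condition, so we still refuse the bogus pin, the
machine just keeps running instead of oopsing.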

> 
> > 
> > > 
> > > > +
> > > > +	if (i915_gem_obj_bound(obj, vm)) {
> > > > +		if ((alignment &&
> > > > +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> > > >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> > > >  			WARN(obj->pin_count,
> > > >  			     "bo is already pinned with incorrect alignment:"
> > > >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> > > >  			     " obj->map_and_fenceable=%d\n",
> > > > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > > > +			     i915_gem_obj_offset(obj, vm), alignment,
> > > >  			     map_and_fenceable,
> > > >  			     obj->map_and_fenceable);
> > > > -			ret = i915_gem_object_unbind(obj);
> > > > +			ret = i915_gem_object_unbind(obj, vm);
> > > >  			if (ret)
> > > >  				return ret;
> > > >  		}
> > > >  	}
> > > >  
> > > > -	if (!i915_gem_obj_ggtt_bound(obj)) {
> > > > +	if (!i915_gem_obj_bound(obj, vm)) {
> > > >  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > >  
> > > > -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> > > > +		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
> > > >  						  map_and_fenceable,
> > > >  						  nonblocking);
> > > >  		if (ret)
> > > > @@ -3684,7 +3739,7 @@ void
> > > >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> > > >  {
> > > >  	BUG_ON(obj->pin_count == 0);
> > > > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > > > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> > > >  
> > > >  	if (--obj->pin_count == 0)
> > > >  		obj->pin_mappable = false;
> > > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
> > > >  	}
> > > >  
> > > >  	if (obj->user_pin_count == 0) {
> > > > -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> > > > +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
> > > >  		if (ret)
> > > >  			goto out;
> > > >  	}
> > > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > > >  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > +	struct i915_vma *vma, *next;
> > > >  
> > > >  	trace_i915_gem_object_destroy(obj);
> > > >  
> > > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
> > > >  		i915_gem_detach_phys_object(dev, obj);
> > > >  
> > > >  	obj->pin_count = 0;
> > > > -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> > > > -		bool was_interruptible;
> > > > +	/* NB: 0 or 1 elements */
> > > > +	WARN_ON(!list_empty(&obj->vma_list) &&
> > > > +		!list_is_singular(&obj->vma_list));
> > > > +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > > > +		int ret = i915_gem_object_unbind(obj, vma->vm);
> > > > +		if (WARN_ON(ret == -ERESTARTSYS)) {
> > > > +			bool was_interruptible;
> > > >  
> > > > -		was_interruptible = dev_priv->mm.interruptible;
> > > > -		dev_priv->mm.interruptible = false;
> > > > +			was_interruptible = dev_priv->mm.interruptible;
> > > > +			dev_priv->mm.interruptible = false;
> > > >  
> > > > -		WARN_ON(i915_gem_object_unbind(obj));
> > > > +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
> > > >  
> > > > -		dev_priv->mm.interruptible = was_interruptible;
> > > > +			dev_priv->mm.interruptible = was_interruptible;
> > > > +		}
> > > >  	}
> > > >  
> > > >  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> > > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
> > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > >  }
> > > >  
> > > > +static void i915_init_vm(struct drm_i915_private *dev_priv,
> > > > +			 struct i915_address_space *vm)
> > > > +{
> > > > +	vm->dev = dev_priv->dev;
> > > > +	INIT_LIST_HEAD(&vm->active_list);
> > > > +	INIT_LIST_HEAD(&vm->inactive_list);
> > > > +	INIT_LIST_HEAD(&vm->global_link);
> > > > +	list_add(&vm->global_link, &dev_priv->vm_list);
> > > > +}
> > > > +
> > > >  void
> > > >  i915_gem_load(struct drm_device *dev)
> > > >  {
> > > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev)
> > > >  				  SLAB_HWCACHE_ALIGN,
> > > >  				  NULL);
> > > >  
> > > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> > > > -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> > > > +	INIT_LIST_HEAD(&dev_priv->vm_list);
> > > > +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> > > > +
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  			     struct drm_i915_private,
> > > >  			     mm.inactive_shrinker);
> > > >  	struct drm_device *dev = dev_priv->dev;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > >  	struct drm_i915_gem_object *obj;
> > > > -	int nr_to_scan = sc->nr_to_scan;
> > > > +	int nr_to_scan;
> > > >  	bool unlock = true;
> > > >  	int cnt;
> > > >  
> > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  		unlock = false;
> > > >  	}
> > > >  
> > > > +	nr_to_scan = sc->nr_to_scan;
> > > >  	if (nr_to_scan) {
> > > >  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
> > > >  		if (nr_to_scan > 0)
> > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > > >  		if (obj->pages_pin_count == 0)
> > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > > +
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > > 
> > > Isn't this now double-counting objects? In the shrinker we only care about
> > > how much physical RAM an object occupies, not how much virtual space it
> > > occupies. So just walking the bound list of objects here should be good
> > > enough ...
> > > 
> > 
> > Maybe I've misunderstood you. My code is wrong, but I think your idea
> > requires a prep patch because it changes functionality, right?
> > 
> > So let me know if I've understood you.
> 
> Don't we have both the bound and unbound list? So we could just switch
> over to counting the bound objects here ... Otherwise yes, we need a prep
> patch to create the bound list first.

Of course there is a bound list.

The old code automatically added the size of unbound objects with
unpinned pages, and unpinned inactive objects with unpinned pages.

The latter check, for inactive objects, needs to be done for all VMAs.
That was my point.
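
If we just count off the global lists, as you suggest, it collapses to
roughly this (sketch, keeping the existing pin checks):

        list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
                if (obj->pages_pin_count == 0)
                        cnt += obj->base.size >> PAGE_SHIFT;
        list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
                if (obj->pin_count == 0 && obj->pages_pin_count == 0)
                        cnt += obj->base.size >> PAGE_SHIFT;

which silently drops that inactive-only restriction -- that's the
functional change I think needs a prep patch.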

> 
> > 
> > > >  
> > > >  	if (unlock)
> > > >  		mutex_unlock(&dev->struct_mutex);
> > > >  	return cnt;
> > > >  }
> > > > +
> > > > +/* All the new VM stuff */
> > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > > +				  struct i915_address_space *vm)
> > > > +{
> > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > +	struct i915_vma *vma;
> > > > +
> > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > +		vm = &dev_priv->gtt.base;
> > > > +
> > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > 
> > > Imo the vma list walking here and in the other helpers below indicates
> > > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > > is this again something that'll get fixed later on?
> > > 
> > > I just want to avoid diff churn, and it also makes reviewing easier if the
> > > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > > sprinkling of obj_to_vma in callers.
> > 
> > It's not something I fixed in the whole series. I think it makes sense
> > conceptually, to keep some things as <obj,vm> and others as direct vma.
> > 
> > If you want me to change something, you need to be more specific since
> > no action specifically comes to mind at this point in the series.
> 
> It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we
> should try to avoid it whenever possible. Since the vma has both an obj
> and a vm pointer the vma is imo strictly better than the (obj, vm) pair.
> And the look-up should be pushed down the callchain as much as possible.
> 
> So I think generally we want to pass the vma around to functions
> everywhere, and the (obj, vm) pair would be the exception (which needs
> special justification).
> 

Without actually coding it, I am not sure. I think there are probably a
decent number of reasonable exceptions where we want the object (ie.
it's not really that much of a special case). In any case, I think we'll
find you have to do this list walk at some point in the call chain
anyway, but I can try to start changing around the code as a patch on
top of this. I really want to leave as much of this patch in place as
is, since it's decently tested (pre-rebase at least).
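
As a concrete example, with the lookup pushed down, a helper like the
offset one above collapses to something like this (sketch only, ignoring
the aliasing ppgtt special case):

        unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
                                          struct i915_address_space *vm)
        {
                struct i915_vma *vma = i915_gem_obj_to_vma(o, vm);

                return vma ? vma->node.start : -1;
        }

and callers that already hold a vma can just use vma->node.start
directly instead of redoing the list walk.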

> > 
> > > 
> > > > +		if (vma->vm == vm)
> > > > +			return vma->node.start;
> > > > +
> > > > +	}
> > > > +	return -1;
> > > > +}
> > > > +
> > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> > > > +{
> > > > +	return !list_empty(&o->vma_list);
> > > > +}
> > > > +
> > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> > > > +			struct i915_address_space *vm)
> > > > +{
> > > > +	struct i915_vma *vma;
> > > > +
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > +		if (vma->vm == vm)
> > > > +			return true;
> > > > +	}
> > > > +	return false;
> > > > +}
> > > > +
> > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> > > > +				struct i915_address_space *vm)
> > > > +{
> > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > +	struct i915_vma *vma;
> > > > +
> > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > +		vm = &dev_priv->gtt.base;
> > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > +		if (vma->vm == vm)
> > > > +			return vma->node.size;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > > > +			    struct i915_address_space *vm,
> > > > +			    enum i915_cache_level color)
> > > > +{
> > > > +	struct i915_vma *vma;
> > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > +		if (vma->vm == vm) {
> > > > +			vma->node.color = color;
> > > > +			return;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	WARN(1, "Couldn't set color for VM %p\n", vm);
> > > > +}
> > > > +
> > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > > > +				     struct i915_address_space *vm)
> > > > +{
> > > > +	struct i915_vma *vma;
> > > > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > > > +		if (vma->vm == vm)
> > > > +			return vma;
> > > > +
> > > > +	return NULL;
> > > > +}
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > index 2074544..c92fd81 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
> > > >  
> > > >  	if (INTEL_INFO(dev)->gen >= 7) {
> > > >  		ret = i915_gem_object_set_cache_level(ctx->obj,
> > > > +						      &dev_priv->gtt.base,
> > > >  						      I915_CACHE_LLC_MLC);
> > > >  		/* Failure shouldn't ever happen this early */
> > > >  		if (WARN_ON(ret))
> > > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> > > >  	 * default context.
> > > >  	 */
> > > >  	dev_priv->ring[RCS].default_context = ctx;
> > > > -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > > > +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> > > >  	if (ret) {
> > > >  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> > > >  		goto err_destroy;
> > > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring,
> > > >  static int do_switch(struct i915_hw_context *to)
> > > >  {
> > > >  	struct intel_ring_buffer *ring = to->ring;
> > > > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > > >  	struct i915_hw_context *from = ring->last_context;
> > > >  	u32 hw_flags = 0;
> > > >  	int ret;
> > > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to)
> > > >  	if (from == to)
> > > >  		return 0;
> > > >  
> > > > -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > > +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to)
> > > >  	 */
> > > >  	if (from != NULL) {
> > > >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > > > -		i915_gem_object_move_to_active(from->obj, ring);
> > > > +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> > > > +					       ring);
> > > >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> > > >  		 * whole damn pipeline, we don't need to explicitly mark the
> > > >  		 * object dirty. The only exception is that the context must be
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > index df61f33..32efdc0 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > @@ -32,24 +32,21 @@
> > > >  #include "i915_trace.h"
> > > >  
> > > >  static bool
> > > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > > > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> > > >  {
> > > > -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > > -
> > > > -	if (obj->pin_count)
> > > > +	if (vma->obj->pin_count)
> > > >  		return false;
> > > >  
> > > > -	list_add(&obj->exec_list, unwind);
> > > > +	list_add(&vma->obj->exec_list, unwind);
> > > >  	return drm_mm_scan_add_block(&vma->node);
> > > >  }
> > > >  
> > > >  int
> > > > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > > -			 unsigned alignment, unsigned cache_level,
> > > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > > > +			 int min_size, unsigned alignment, unsigned cache_level,
> > > >  			 bool mappable, bool nonblocking)
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	struct list_head eviction_list, unwind_list;
> > > >  	struct i915_vma *vma;
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > >  	 */
> > > >  
> > > >  	INIT_LIST_HEAD(&unwind_list);
> > > > -	if (mappable)
> > > > +	if (mappable) {
> > > > +		BUG_ON(!i915_is_ggtt(vm));
> > > >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> > > >  					    alignment, cache_level, 0,
> > > >  					    dev_priv->gtt.mappable_end);
> > > > -	else
> > > > +	} else
> > > >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> > > >  
> > > >  	/* First see if there is a large enough contiguous idle region... */
> > > >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > > > -		if (mark_free(obj, &unwind_list))
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > > +		if (mark_free(vma, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > >  
> > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > >  
> > > >  	/* Now merge in the soon-to-be-expired objects... */
> > > >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > > -		if (mark_free(obj, &unwind_list))
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > > +		if (mark_free(vma, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > >  
> > > > @@ -109,7 +109,7 @@ none:
> > > >  		obj = list_first_entry(&unwind_list,
> > > >  				       struct drm_i915_gem_object,
> > > >  				       exec_list);
> > > > -		vma = __i915_gem_obj_to_vma(obj);
> > > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > > >  		ret = drm_mm_scan_remove_block(&vma->node);
> > > >  		BUG_ON(ret);
> > > >  
> > > > @@ -130,7 +130,7 @@ found:
> > > >  		obj = list_first_entry(&unwind_list,
> > > >  				       struct drm_i915_gem_object,
> > > >  				       exec_list);
> > > > -		vma = __i915_gem_obj_to_vma(obj);
> > > > +		vma = i915_gem_obj_to_vma(obj, vm);
> > > >  		if (drm_mm_scan_remove_block(&vma->node)) {
> > > >  			list_move(&obj->exec_list, &eviction_list);
> > > >  			drm_gem_object_reference(&obj->base);
> > > > @@ -145,7 +145,7 @@ found:
> > > >  				       struct drm_i915_gem_object,
> > > >  				       exec_list);
> > > >  		if (ret == 0)
> > > > -			ret = i915_gem_object_unbind(obj);
> > > > +			ret = i915_gem_object_unbind(obj, vm);
> > > >  
> > > >  		list_del_init(&obj->exec_list);
> > > >  		drm_gem_object_unreference(&obj->base);
> > > > @@ -158,13 +158,18 @@ int
> > > >  i915_gem_evict_everything(struct drm_device *dev)
> > > 
> > > I suspect evict_everything eventually wants an address_space *vm argument
> > > for those cases where we only want to evict everything in a given vm. Atm
> > > we have two use-cases of this:
> > > - Called from the shrinker as a last-ditch effort. For that it should move
> > >   _every_ object onto the unbound list.
> > > - Called from execbuf for badly-fragmented address spaces to clean up the
> > >   mess. For that case we only care about one address space.
> > 
> > The current thing is more or less a result of Chris' suggestions. A
> > non-posted iteration did plumb the vm, and after reworking to the
> > suggestion made by Chris, the vm didn't make much sense anymore.
> > 
> > For point #1, it requires VM prioritization I think. I don't really see
> > any other way to fairly manage it.
> 
> The shrinker will rip out  objects in lru order by walking first unbound
> and then bound objects. That's imo as fair as it gets, we don't need
> priorities between vms.

If you pass in a vm, the semantics would be: evict everything for that
vm, right?

> 
> > For point #2, that I agree it might be useful, but we can easily create
> > a new function, and not call it "shrinker" to do it. 
> 
> Well my point was that this function is called
> i915_gem_evict_everything(dev, vm) and for the first use case we simply
> pass in vm = NULL. But essentially thrashing the vm should be rare enough
> that for now we don't need to care.
> 

IIRC, this is exactly how my original patch worked pre-Chris.
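Roughly this shape, i.e. vm == NULL meaning "evict from every address
space" (a sketch from memory, not the current code):

        list_for_each_entry(cur, &dev_priv->vm_list, global_link) {
                if (vm && cur != vm)
                        continue;
                list_for_each_entry_safe(obj, next, &cur->inactive_list, mm_list)
                        if (obj->pin_count == 0)
                                WARN_ON(i915_gem_object_unbind(obj, cur));
        }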

> 
> > 
> > 
> > > 
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > >  	struct drm_i915_gem_object *obj, *next;
> > > > -	bool lists_empty;
> > > > +	bool lists_empty = true;
> > > >  	int ret;
> > > >  
> > > > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > > > -		       list_empty(&vm->active_list));
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > > > +			       list_empty(&vm->active_list));
> > > > +		if (!lists_empty)
> > > > +			lists_empty = false;
> > > > +	}
> > > > +
> > > >  	if (lists_empty)
> > > >  		return -ENOSPC;
> > > >  
> > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  	i915_gem_retire_requests(dev);
> > > >  
> > > >  	/* Having flushed everything, unbind() should never raise an error */
> > > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > -		if (obj->pin_count == 0)
> > > > -			WARN_ON(i915_gem_object_unbind(obj));
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > +			if (obj->pin_count == 0)
> > > > +				WARN_ON(i915_gem_object_unbind(obj, vm));
> > > > +	}
> > > >  
> > > >  	return 0;
> > > >  }
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > index 5aeb447..e90182d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
> > > >  }
> > > >  
> > > >  static void
> > > > -eb_destroy(struct eb_objects *eb)
> > > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
> > > >  {
> > > >  	while (!list_empty(&eb->objects)) {
> > > >  		struct drm_i915_gem_object *obj;
> > > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > > >  				   struct eb_objects *eb,
> > > > -				   struct drm_i915_gem_relocation_entry *reloc)
> > > > +				   struct drm_i915_gem_relocation_entry *reloc,
> > > > +				   struct i915_address_space *vm)
> > > >  {
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	struct drm_gem_object *target_obj;
> > > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > > >  
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > > -				    struct eb_objects *eb)
> > > > +				    struct eb_objects *eb,
> > > > +				    struct i915_address_space *vm)
> > > >  {
> > > >  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
> > > >  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> > > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > >  		do {
> > > >  			u64 offset = r->presumed_offset;
> > > >  
> > > > -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> > > > +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> > > > +								 vm);
> > > >  			if (ret)
> > > >  				return ret;
> > > >  
> > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > > >  					 struct eb_objects *eb,
> > > > -					 struct drm_i915_gem_relocation_entry *relocs)
> > > > +					 struct drm_i915_gem_relocation_entry *relocs,
> > > > +					 struct i915_address_space *vm)
> > > >  {
> > > >  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > >  	int i, ret;
> > > >  
> > > >  	for (i = 0; i < entry->relocation_count; i++) {
> > > > -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> > > > +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> > > > +							 vm);
> > > >  		if (ret)
> > > >  			return ret;
> > > >  	}
> > > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> > > >  }
> > > >  
> > > >  static int
> > > > -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > > > +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> > > > +			     struct i915_address_space *vm)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	int ret = 0;
> > > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
> > > >  	 */
> > > >  	pagefault_disable();
> > > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > > > -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> > > > +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> > > >  		if (ret)
> > > >  			break;
> > > >  	}
> > > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
> > > >  static int
> > > >  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > > >  				   struct intel_ring_buffer *ring,
> > > > +				   struct i915_address_space *vm,
> > > >  				   bool *need_reloc)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > > >  		obj->tiling_mode != I915_TILING_NONE;
> > > >  	need_mappable = need_fence || need_reloc_mappable(obj);
> > > >  
> > > > -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> > > > +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> > > > +				  false);
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> > > >  		obj->has_aliasing_ppgtt_mapping = 1;
> > > >  	}
> > > >  
> > > > -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> > > > -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> > > > +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> > > > +		entry->offset = i915_gem_obj_offset(obj, vm);
> > > >  		*need_reloc = true;
> > > >  	}
> > > >  
> > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > >  {
> > > >  	struct drm_i915_gem_exec_object2 *entry;
> > > >  
> > > > -	if (!i915_gem_obj_ggtt_bound(obj))
> > > > +	if (!i915_gem_obj_bound_any(obj))
> > > >  		return;
> > > >  
> > > >  	entry = obj->exec_entry;
> > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > >  static int
> > > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > >  			    struct list_head *objects,
> > > > +			    struct i915_address_space *vm,
> > > >  			    bool *need_relocs)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > >  			bool need_fence, need_mappable;
> > > > +			u32 obj_offset;
> > > >  
> > > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > > +			if (!i915_gem_obj_bound(obj, vm))
> > > >  				continue;
> > > 
> > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> > > here ... Maybe we should cache them in some pointer somewhere (either in
> > > the eb object or by adding a new pointer to the object struct, e.g.
> > > obj->eb_vma, similar to obj->eb_list).
> > > 
> > 
> > I agree, and even did this at one unposted patch too. However, I think
> > it's a premature optimization which risks code correctness. So I think
> > somewhere a FIXME needs to happen to address that issue. (Or if Chris
> > complains bitterly about some perf hit).
> 
> If you bring up code correctness I'd vote strongly in favour of using vmas
> everywhere - vma has the (obj, vm) pair locked down, doing the lookup all
> the thing risks us mixing them up eventually and creating a hella lot of
> confusion ;-)

I think this is addressed with the previous comments.
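
(For the record, the caching you suggested would be something along
these lines -- eb_vma is a hypothetical field, nothing in this series
adds it:

        /* once, after the execbuf objects have been looked up */
        list_for_each_entry(obj, &eb->objects, exec_list)
                obj->eb_vma = i915_gem_obj_to_vma(obj, vm);

        /* the reserve/relocate passes then use obj->eb_vma->node.start
         * instead of walking obj->vma_list again via i915_gem_obj_offset() */

so it's purely an optimization on top, which is why I'd rather leave it
as a FIXME for now.)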

> 
> > 
> > > >  
> > > > +			obj_offset = i915_gem_obj_offset(obj, vm);
> > > >  			need_fence =
> > > >  				has_fenced_gpu_access &&
> > > >  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> > > >  				obj->tiling_mode != I915_TILING_NONE;
> > > >  			need_mappable = need_fence || need_reloc_mappable(obj);
> > > >  
> > > > +			BUG_ON((need_mappable || need_fence) &&
> > > > +			       !i915_is_ggtt(vm));
> > > > +
> > > >  			if ((entry->alignment &&
> > > > -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> > > > +			     obj_offset & (entry->alignment - 1)) ||
> > > >  			    (need_mappable && !obj->map_and_fenceable))
> > > > -				ret = i915_gem_object_unbind(obj);
> > > > +				ret = i915_gem_object_unbind(obj, vm);
> > > >  			else
> > > > -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > > +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > > >  			if (ret)
> > > >  				goto err;
> > > >  		}
> > > >  
> > > >  		/* Bind fresh objects */
> > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > > -			if (i915_gem_obj_ggtt_bound(obj))
> > > > +			if (i915_gem_obj_bound(obj, vm))
> > > >  				continue;
> > > >  
> > > > -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> > > > +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> > > >  			if (ret)
> > > >  				goto err;
> > > >  		}
> > > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > > >  				  struct drm_file *file,
> > > >  				  struct intel_ring_buffer *ring,
> > > >  				  struct eb_objects *eb,
> > > > -				  struct drm_i915_gem_exec_object2 *exec)
> > > > +				  struct drm_i915_gem_exec_object2 *exec,
> > > > +				  struct i915_address_space *vm)
> > > >  {
> > > >  	struct drm_i915_gem_relocation_entry *reloc;
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> > > >  		goto err;
> > > >  
> > > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > > >  	if (ret)
> > > >  		goto err;
> > > >  
> > > >  	list_for_each_entry(obj, &eb->objects, exec_list) {
> > > >  		int offset = obj->exec_entry - exec;
> > > >  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> > > > -							       reloc + reloc_offset[offset]);
> > > > +							       reloc + reloc_offset[offset],
> > > > +							       vm);
> > > >  		if (ret)
> > > >  			goto err;
> > > >  	}
> > > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
> > > >  
> > > >  static void
> > > >  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > > +				   struct i915_address_space *vm,
> > > >  				   struct intel_ring_buffer *ring)
> > > >  {
> > > >  	struct drm_i915_gem_object *obj;
> > > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > >  		obj->base.read_domains = obj->base.pending_read_domains;
> > > >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> > > >  
> > > > -		i915_gem_object_move_to_active(obj, ring);
> > > > +		i915_gem_object_move_to_active(obj, vm, ring);
> > > >  		if (obj->base.write_domain) {
> > > >  			obj->dirty = 1;
> > > >  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> > > > @@ -836,7 +853,8 @@ static int
> > > >  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  		       struct drm_file *file,
> > > >  		       struct drm_i915_gem_execbuffer2 *args,
> > > > -		       struct drm_i915_gem_exec_object2 *exec)
> > > > +		       struct drm_i915_gem_exec_object2 *exec,
> > > > +		       struct i915_address_space *vm)
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > >  	struct eb_objects *eb;
> > > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  
> > > >  	/* Move the objects en-masse into the GTT, evicting if necessary. */
> > > >  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> > > > -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> > > > +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> > > >  	if (ret)
> > > >  		goto err;
> > > >  
> > > >  	/* The objects are in their final locations, apply the relocations. */
> > > >  	if (need_relocs)
> > > > -		ret = i915_gem_execbuffer_relocate(eb);
> > > > +		ret = i915_gem_execbuffer_relocate(eb, vm);
> > > >  	if (ret) {
> > > >  		if (ret == -EFAULT) {
> > > >  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> > > > -								eb, exec);
> > > > +								eb, exec, vm);
> > > >  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> > > >  		}
> > > >  		if (ret)
> > > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  			goto err;
> > > >  	}
> > > >  
> > > > -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> > > > +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > +		args->batch_start_offset;
> > > >  	exec_len = args->batch_len;
> > > >  	if (cliprects) {
> > > >  		for (i = 0; i < args->num_cliprects; i++) {
> > > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  
> > > >  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> > > >  
> > > > -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> > > > +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> > > >  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> > > >  
> > > >  err:
> > > > -	eb_destroy(eb);
> > > > +	eb_destroy(eb, vm);
> > > >  
> > > >  	mutex_unlock(&dev->struct_mutex);
> > > >  
> > > > @@ -1105,6 +1124,7 @@ int
> > > >  i915_gem_execbuffer(struct drm_device *dev, void *data,
> > > >  		    struct drm_file *file)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct drm_i915_gem_execbuffer *args = data;
> > > >  	struct drm_i915_gem_execbuffer2 exec2;
> > > >  	struct drm_i915_gem_exec_object *exec_list = NULL;
> > > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> > > >  	exec2.flags = I915_EXEC_RENDER;
> > > >  	i915_execbuffer2_set_context_id(exec2, 0);
> > > >  
> > > > -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> > > > +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> > > > +				     &dev_priv->gtt.base);
> > > >  	if (!ret) {
> > > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > > >  		for (i = 0; i < args->buffer_count; i++)
> > > > @@ -1186,6 +1207,7 @@ int
> > > >  i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > > >  		     struct drm_file *file)
> > > >  {
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct drm_i915_gem_execbuffer2 *args = data;
> > > >  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
> > > >  	int ret;
> > > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> > > >  		return -EFAULT;
> > > >  	}
> > > >  
> > > > -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> > > > +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> > > > +				     &dev_priv->gtt.base);
> > > >  	if (!ret) {
> > > >  		/* Copy the new buffer offsets back to the user's exec list. */
> > > >  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > > index 298fc42..70ce2f6 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
> > > >  			    ppgtt->base.total);
> > > >  	}
> > > >  
> > > > +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> > > > +
> > > >  	return ret;
> > > >  }
> > > >  
> > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > > >  			    struct drm_i915_gem_object *obj,
> > > >  			    enum i915_cache_level cache_level)
> > > >  {
> > > > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > > > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > > -				   cache_level);
> > > > +	struct i915_address_space *vm = &ppgtt->base;
> > > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > > +
> > > > +	vm->insert_entries(vm, obj->pages,
> > > > +			   obj_offset >> PAGE_SHIFT,
> > > > +			   cache_level);
> > > >  }
> > > >  
> > > >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > > >  			      struct drm_i915_gem_object *obj)
> > > >  {
> > > > -	ppgtt->base.clear_range(&ppgtt->base,
> > > > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > > > -				obj->base.size >> PAGE_SHIFT);
> > > > +	struct i915_address_space *vm = &ppgtt->base;
> > > > +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > > > +
> > > > +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > > > +			obj->base.size >> PAGE_SHIFT);
> > > >  }
> > > >  
> > > >  extern int intel_iommu_gfx_mapped;
> > > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> > > >  				       dev_priv->gtt.base.start / PAGE_SIZE,
> > > >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> > > >  
> > > > +	if (dev_priv->mm.aliasing_ppgtt)
> > > > +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > > > +
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > >  		i915_gem_clflush_object(obj);
> > > >  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > > >  	 * aperture.  One page should be enough to keep any prefetching inside
> > > >  	 * of the aperture.
> > > >  	 */
> > > > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > > >  	struct drm_mm_node *entry;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	unsigned long hole_start, hole_end;
> > > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > > >  	BUG_ON(mappable_end > end);
> > > >  
> > > >  	/* Subtract the guard page ... */
> > > > -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> > > > +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> > > >  	if (!HAS_LLC(dev))
> > > >  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
> > > >  
> > > >  	/* Mark any preallocated objects as occupied */
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > > > -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> > > >  		int ret;
> > > >  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> > > >  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
> > > >  
> > > >  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> > > >  		if (ret)
> > > >  			DRM_DEBUG_KMS("Reservation failed\n");
> > > >  		obj->has_global_gtt_mapping = 1;
> > > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> > > >  	dev_priv->gtt.base.total = end - start;
> > > >  
> > > >  	/* Clear any non-preallocated blocks */
> > > > -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> > > > -			     hole_start, hole_end) {
> > > > +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> > > >  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> > > >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> > > >  			      hole_start, hole_end);
> > > > -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > > -					       hole_start / PAGE_SIZE,
> > > > -					       count);
> > > > +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> > > >  	}
> > > >  
> > > >  	/* And finally clear the reserved guard page */
> > > > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > > > -				       end / PAGE_SIZE - 1, 1);
> > > > +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> > > >  }
> > > >  
> > > >  static bool
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > index 245eb1d..bfe61fa 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > > >  		return obj;
> > > >  
> > > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > > +	vma = i915_gem_vma_create(obj, vm);
> > > >  	if (!vma) {
> > > >  		drm_gem_object_unreference(&obj->base);
> > > >  		return NULL;
> > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > >  	 */
> > > >  	vma->node.start = gtt_offset;
> > > >  	vma->node.size = size;
> > > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > +	if (drm_mm_initialized(&vm->mm)) {
> > > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > > 
> > > These two hunks here for stolen look fishy - we only ever use the stolen
> > > preallocated stuff for objects with mappings in the global gtt. So keeping
> > > that explicit is imo the better approach. And tbh I'm confused where the
> > > local variable vm is from ...
> > 
> > If we don't create a vma for it, we potentially have to special case a
> > bunch of places, I think. I'm not actually sure of this, but the
> > overhead to do it is quite small.
> > 
> > Anyway, I'll look this over again and see what I think.
> 
> I'm not against the vma, I've just wondered why you do the
> /dev_priv->gtt.base/vm/ replacement here since
> - it's never gonna be used with another vm than ggtt
> - this patch doesn't add the vm variable, so I'm even more confused where
>   this started ;-)

It started from the rebase. In the original series, I did that
"deferred_offset" thing, and having a vm variable made that code pass
checkpatch.pl. There wasn't a particular reason for naming it vm other
than that I had done it all over the place.

I've fixed this locally, keeping the vma and renaming the local variable
to ggtt. It still has 3 uses in this function, so it's a bit less typing.
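
So the stolen hunk now reads roughly like this locally (sketch, pending
the repost; error handling unchanged):

        struct i915_address_space *ggtt = &dev_priv->gtt.base;

        vma = i915_gem_vma_create(obj, ggtt);
        if (!vma) {
                drm_gem_object_unreference(&obj->base);
                return NULL;
        }

        vma->node.start = gtt_offset;
        vma->node.size = size;
        if (drm_mm_initialized(&ggtt->mm))
                ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);

i.e. keep the vma, but make it obvious this is only ever the global GTT.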

> 
> Cheers, Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-11 23:57     ` Ben Widawsky
@ 2013-07-12 15:59       ` Ben Widawsky
  0 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-12 15:59 UTC (permalink / raw)
  To: Imre Deak; +Cc: Intel GFX

On Thu, Jul 11, 2013 at 04:57:30PM -0700, Ben Widawsky wrote:
> On Thu, Jul 11, 2013 at 02:14:06PM +0300, Imre Deak wrote:
> > On Mon, 2013-07-08 at 23:08 -0700, Ben Widawsky wrote:
> > > The GTT and PPGTT can be thought of more generally as GPU address
> > > spaces. Many of their actions (insert entries), state (LRU lists) and
> > > many of their characteristics (size), can be shared. Do that.
> > > 
> > > The change itself doesn't actually impact most of the VMA/VM rework
> > > coming up, it just fits in with the grand scheme. GGTT will usually be a
> > > special case where we know an object must be in the GGTT (display
> > > engine, workarounds, etc.).
> > > 
> > > v2: Drop usage of i915_gtt_vm (Daniel)
> > > Make cleanup also part of the parent class (Ben)
> > > Modified commit msg
> > > Rebased
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
> > >  drivers/gpu/drm/i915/i915_dma.c     |   4 +-
> > >  drivers/gpu/drm/i915/i915_drv.h     |  57 ++++++-------
> > >  drivers/gpu/drm/i915/i915_gem.c     |   4 +-
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++----------------
> > >  5 files changed, 121 insertions(+), 110 deletions(-)
> > > 
> > >[...]
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 242d0f9..693115a 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
> > >  
> > >  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
> > >  {
> > > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > > +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
> > >  	gen6_gtt_pte_t __iomem *pd_addr;
> > >  	uint32_t pd_entry;
> > >  	int i;
> > > @@ -181,18 +181,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> > >  }
> > >  
> > >  /* PPGTT support for Sandybdrige/Gen6 and later */
> > > -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> > > +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
> > >  				   unsigned first_entry,
> > >  				   unsigned num_entries)
> > >  {
> > > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > > +	struct i915_hw_ppgtt *ppgtt =
> > > +		container_of(vm, struct i915_hw_ppgtt, base);
> > >  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
> > >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> > >  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> > >  	unsigned last_pte, i;
> > >  
> > > -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> > > -					I915_CACHE_LLC);
> > > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> > 
> > I only see ggtt's scratch page being initialized, but can't find the
> > corresponding init/teardown for ppgtt. Btw, why do we need separate
> > global/per-process scratch pages? (would be nice to add it to the commit
> > message)
> > 
> > --Imre
> > 
> 
> There is indeed a bug here; the fix existed at some point, but I mistakenly
> dropped it. Here is my local fix, which is what I had done previously.
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 552e4cb..c8130db 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -295,6 +295,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>         ppgtt->base.clear_range = gen6_ppgtt_clear_range;
>         ppgtt->base.bind_object = gen6_ppgtt_bind_object;
>         ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> +       ppgtt->base.scratch = dev_priv->gtt.base.scratch;
>         ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
>                                   GFP_KERNEL);
>         if (!ppgtt->pt_pages)
> 
> 
> Not sure what you mean, there should be only 1 scratch page now.
>
I've updated my commit message to address what we discussed on IRC. The
VM has the scratch structure because I intend to have a scratch page per
PPGTT when we have full PPGTT.
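
To spell out the split (the second half is just the plan, nothing in
this series does it yet):

        /* aliasing PPGTT today: share the GGTT scratch page */
        ppgtt->base.scratch = dev_priv->gtt.base.scratch;

        /* full PPGTT later: each address space allocates and maps its own
         * scratch page at init and releases it from vm->cleanup() */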

Thanks for the insightful question ;-)
>
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-12 15:46         ` Ben Widawsky
@ 2013-07-12 16:46           ` Daniel Vetter
  2013-07-16  3:57             ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-12 16:46 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Fri, Jul 12, 2013 at 08:46:48AM -0700, Ben Widawsky wrote:
> On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote:
> > On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote:
> > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:

[snip]

> > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > >  	}
> > > > >  
> > > > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > > > -		ret = i915_gem_object_unbind(obj);
> > > > > +		ret = i915_gem_object_unbind(obj, vm);
> > > > >  		if (ret)
> > > > >  			return ret;
> > > > >  	}
> > > > >  
> > > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > > +		if (!i915_gem_obj_bound(obj, vm))
> > > > > +			continue;
> > > > 
> > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > > > on?
> > > > 
> > > > Self-correction: It exists already ... why can't we use this here?
> > > 
> > > Yes. That should work, I'll fix it and test it. It looks slightly worse
> > > IMO in terms of code clarity, but I don't mind the change.
> > 
> > Actually I think it'd gain in clarity, doing pte updates (which
> > set_cache_level does) on the vma instead of the (obj, vm) pair feels more
> > natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here.
> 
> That sounds good to me. Would you mind a patch on top?

If you want, I guess we can refactor this after everything has settled.
That has the upside that it'll be much easier to assess whether vma or
(obj, vm) is the right choice. So fine with me.

> 
> > 
> > > 
> > > > 
> > > > > +
> > > > >  		ret = i915_gem_object_finish_gpu(obj);
> > > > >  		if (ret)
> > > > >  			return ret;
> > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > > > >  					       obj, cache_level);
> > > > >  
> > > > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > > > >  	}
> > > > >  
> > > > >  	if (cache_level == I915_CACHE_NONE) {
> > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > >  			       struct drm_file *file)
> > > > >  {
> > > > >  	struct drm_i915_gem_caching *args = data;
> > > > > +	struct drm_i915_private *dev_priv;
> > > > >  	struct drm_i915_gem_object *obj;
> > > > >  	enum i915_cache_level level;
> > > > >  	int ret;
> > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > >  		ret = -ENOENT;
> > > > >  		goto unlock;
> > > > >  	}
> > > > > +	dev_priv = obj->base.dev->dev_private;
> > > > >  
> > > > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > > > +	/* FIXME: Add interface for specific VM? */
> > > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > > > >  
> > > > >  	drm_gem_object_unreference(&obj->base);
> > > > >  unlock:
> > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > >  				     u32 alignment,
> > > > >  				     struct intel_ring_buffer *pipelined)
> > > > >  {
> > > > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > > >  	u32 old_read_domains, old_write_domain;
> > > > >  	int ret;
> > > > >  
> > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > > > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > > > >  	 */
> > > > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > > +					      I915_CACHE_NONE);
> > > > >  	if (ret)
> > > > >  		return ret;
> > > > >  
> > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > > >  	 * always use map_and_fenceable for all scanout buffers.
> > > > >  	 */
> > > > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > > > >  	if (ret)
> > > > >  		return ret;
> > > > >  
> > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > > > >  
> > > > >  int
> > > > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > > +		    struct i915_address_space *vm,
> > > > >  		    uint32_t alignment,
> > > > >  		    bool map_and_fenceable,
> > > > >  		    bool nonblocking)
> > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > > > >  		return -EBUSY;
> > > > >  
> > > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > > > 
> > > > WARN_ON, since presumably we can keep on going if we get this wrong
> > > > (albeit with slightly corrupted state, so render corruptions might
> > > > follow).
> > > 
> > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert
> > > it at the end of merging?
> > 
> > Adding a FIXME right above it will cause equal amounts of conflicts, so I
> > don't see the point that much ...
> 
> I'm just really fearful that in doing the reworks I will end up hitting
> this condition, and I am afraid I will miss it if it's a WARN_ON. It's
> definitely more likely to be missed than a BUG.
> 
> Also, and we've disagreed on this a few times by now, this is an
> internal interface, which I think justifies a fatal error for this
> level of mistake.

Ime every time I argue this with myself and state your case it ends up
biting me horribly because I'm regularly too incompetent and hit my very
own BUG_ONs ;-) Hence why I insist so much on using WARN_ON wherever
possible. Of course if people don't check their logs that's a different
matter (*cough* Jesse *cough*) ...

> In any case I've made the change locally. Will yell at you later if I
> was right.

Getting yelled at is part of my job, so bring it on ;-)

[snip]

> > > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > > > >  		if (obj->pages_pin_count == 0)
> > > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > > > +
> > > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > > > 
> > > > Isn't this now double-counting objects? In the shrinker we only care about
> > > > how much physical RAM an object occupies, not how much virtual space it
> > > > occupies. So just walking the bound list of objects here should be good
> > > > enough ...
> > > > 
> > > 
> > > Maybe I've misunderstood you. My code is wrong, but I think your idea
> > > requires a prep patch because it changes functionality, right?
> > > 
> > > So let me know if I've understood you.
> > 
> > Don't we have both the bound and unbound list? So we could just switch
> > over to counting the bound objects here ... Otherwise yes, we need a prep
> > patch to create the bound list first.
> 
> Of course there is a bound list.
> 
> The old code automatically added the size of unbound objects with
> unpinned pages, and unpinned inactive objects with unpinned pages.
> 
> The latter check, for inactive objects, needs to be done for all VMAs.
> That was my point.

Oh right. The thing is that technically there's no reason to not also scan
the active objects, i.e. just the unbound list. So yeah, sounds like we
need a prep patch to switch to the unbound list here first. My apologies
for being dense and not fully grasping this right away.

> 
> > 
> > > 
> > > > >  
> > > > >  	if (unlock)
> > > > >  		mutex_unlock(&dev->struct_mutex);
> > > > >  	return cnt;
> > > > >  }
> > > > > +
> > > > > +/* All the new VM stuff */
> > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > > > +				  struct i915_address_space *vm)
> > > > > +{
> > > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > > +	struct i915_vma *vma;
> > > > > +
> > > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > > +		vm = &dev_priv->gtt.base;
> > > > > +
> > > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > 
> > > > Imo the vma list walking here and in the other helpers below indicates
> > > > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > > > is this again something that'll get fixed later on?
> > > > 
> > > > I just want to avoid diff churn, and it also makes reviewing easier if the
> > > > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > > > sprinkling of obj_to_vma in callers.
> > > 
> > > It's not something I fixed in the whole series. I think it makes sense
> > > conceptually, to keep some things as <obj,vm> and others as direct vma.
> > > 
> > > If you want me to change something, you need to be more specific since
> > > no action specifically comes to mind at this point in the series.
> > 
> > It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we
> > should try to avoid it whenever possible. Since the vma has both an obj
> > and a vm pointer the vma is imo strictly better than the (obj, vm) pair.
> > And the look-up should be pushed down the callchain as much as possible.
> > 
> > So I think generally we want to pass the vma around to functions
> > everywhere, and the (obj, vm) pair would be the exception (which needs
> > special justification).
> > 
> 
> Without actually coding it, I am not sure. I think there are probably a
> decent number of reasonable exceptions where we want the object (ie.
> it's not really that much of a special case). In any case, I think we'll
> find you have to do this list walk at some point in the call chain
> anyway, but I can try to start changing around the code as a patch on
> top of this. I really want to leave as much as this patch in place as
> is, since it's decently tested (pre-rebase at least).

Ok, I can live with this if we clean things up afterwards. But imo
vma->obj isn't worse for readability than just obj, and passing pairs of
(obj,vm) around all the time just feels wrong conceptually. In C we have
structs for this, and since we already have a suitable one created we
might as well use it.
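
To put that in code, the difference looks roughly like this (untested sketch;
i915_gem_obj_offset is the helper from patch 6 minus the aliasing ppgtt
special case, i915_vma_offset is a hypothetical vma-based equivalent that
isn't in the series):

unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
				  struct i915_address_space *vm)
{
	struct i915_vma *vma;

	/* (obj, vm) pair: every call redoes the vma list walk */
	list_for_each_entry(vma, &o->vma_list, vma_link)
		if (vma->vm == vm)
			return vma->node.start;

	WARN_ON(1);
	return 0;
}

static inline unsigned long i915_vma_offset(struct i915_vma *vma)
{
	/* vma: the lookup happened exactly once, when the caller got the vma */
	return vma->node.start;
}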

Aside: I know that we have the (ring, seqno) pair splattered all over the
code. It's been on my todo to fix that ever since I've proposed to add a
i915_gpu_sync_cookie with the original multi-ring enabling. As you can see
I've been ignored, but I hope we can finally fix this with the dma_buf
fence rework.

And we did just recently discover a bug where such a (ring, seqno) pair
got mixed up, so imo not using vma is fraught with unnecessary peril.

[snip]

> > > > > @@ -158,13 +158,18 @@ int
> > > > >  i915_gem_evict_everything(struct drm_device *dev)
> > > > 
> > > > I suspect evict_everything eventually wants a address_space *vm argument
> > > > for those cases where we only want to evict everything in a given vm. Atm
> > > > we have two use-cases of this:
> > > > - Called from the shrinker as a last-ditch effort. For that it should move
> > > >   _every_ object onto the unbound list.
> > > > - Called from execbuf for badly-fragmented address spaces to clean up the
> > > >   mess. For that case we only care about one address space.
> > > 
> > > The current thing is more or less a result of Chris' suggestions. A
> > > non-posted iteration did plumb the vm, and after reworking to the
> > > suggestion made by Chris, the vm didn't make much sense anymore.
> > > 
> > > For point #1, it requires VM prioritization I think. I don't really see
> > > any other way to fairly manage it.
> > 
> > The shrinker will rip out  objects in lru order by walking first unbound
> > and then bound objects. That's imo as fair as it gets, we don't need
> > priorities between vms.
> 
> If you pass in a vm, the semantics would be, evict everything for the
> vm, right?

Yes.

> 
> > 
> > > For point #2, that I agree it might be useful, but we can easily create
> > > a new function, and not call it "shrinker" to do it. 
> > 
> > Well my point was that this function is called
> > i915_gem_evict_everything(dev, vm) and for the first use case we simply
> > pass in vm = NULL. But essentially thrashing the vm should be rare enough
> > that for now we don't need to care.
> > 
> 
> IIRC, this is exactly how my original patch worked pre-Chris.

Oops. Do you or Chris still know the argument for changing things? Maybe I
just don't see another facet of the issue at hand ...
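
Just so we're talking about the same interface, the semantics I'd expect are
roughly (untested sketch, and i915_gem_evict_vm is only a stand-in for
whatever ends up doing the actual per-vm eviction):

int i915_gem_evict_everything(struct drm_device *dev,
			      struct i915_address_space *vm)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	struct i915_address_space *cur;
	int ret;

	list_for_each_entry(cur, &dev_priv->vm_list, global_link) {
		/* vm == NULL: last-ditch shrinker call, evict everything
		 * everywhere. Otherwise only unclutter the given vm. */
		if (vm && cur != vm)
			continue;

		ret = i915_gem_evict_vm(cur);
		if (ret)
			return ret;
	}

	return 0;
}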

[snip]

> > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > > >  static int
> > > > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > > >  			    struct list_head *objects,
> > > > > +			    struct i915_address_space *vm,
> > > > >  			    bool *need_relocs)
> > > > >  {
> > > > >  	struct drm_i915_gem_object *obj;
> > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > > >  			bool need_fence, need_mappable;
> > > > > +			u32 obj_offset;
> > > > >  
> > > > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > > > +			if (!i915_gem_obj_bound(obj, vm))
> > > > >  				continue;
> > > > 
> > > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> > > > here ... Maybe we should cache them in some pointer somewhere (either in
> > > > the eb object or by adding a new pointer to the object struct, e.g.
> > > > obj->eb_vma, similar to obj->eb_list).
> > > > 
> > > 
> > > I agree, and even did this at one unposted patch too. However, I think
> > > it's a premature optimization which risks code correctness. So I think
> > > somewhere a FIXME needs to happen to address that issue. (Or if Chris
> > > complains bitterly about some perf hit).
> > 
> > If you bring up code correctness I'd vote strongly in favour of using vmas
> > everywhere - vma has the (obj, vm) pair locked down, doing the lookup all
> > the time risks us mixing them up eventually and creating a hella lot of
> > confusion ;-)
> 
> I think this is addressed with the previous comments.

See my example for (ring, seqno). I really strongly believe passing pairs
is the wrong thing and passing structs is the right thing. Especially if
we have one at hand. But I'm ok if you want to clean this up afterwards.

Ofc if the cleanup afterwards doesn't happend I'll be a bit pissed ;-)

[snip]

> > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > index 245eb1d..bfe61fa 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > > > >  		return obj;
> > > > >  
> > > > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > > > +	vma = i915_gem_vma_create(obj, vm);
> > > > >  	if (!vma) {
> > > > >  		drm_gem_object_unreference(&obj->base);
> > > > >  		return NULL;
> > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > > >  	 */
> > > > >  	vma->node.start = gtt_offset;
> > > > >  	vma->node.size = size;
> > > > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > > +	if (drm_mm_initialized(&vm->mm)) {
> > > > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > > > 
> > > > These two hunks here for stolen look fishy - we only ever use the stolen
> > > > preallocated stuff for objects with mappings in the global gtt. So keeping
> > > > that explicit is imo the better approach. And tbh I'm confused where the
> > > > local variable vm is from ...
> > > 
> > > If we don't create a vma for it, we potentially have to special case a
> > > bunch of places, I think. I'm not actually sure of this, but the
> > > overhead to do it is quite small.
> > > 
> > > Anyway, I'll look this over again and see what I think.
> > 
> > I'm not against the vma, I've just wondered why you do the
> > /dev_priv->gtt.base/vm/ replacement here since
> > - it's never gonna be used with another vm than ggtt
> > - this patch doesn't add the vm variable, so I'm even more confused where
> >   this started ;-)
> 
> It started from the rebase. In the original series, I did that
> "deferred_offset" thing, and having a vm variable made that code pass
> the checkpatch.pl. There wasn't a particular reason for naming it vm
> other than I had done it all over the place.
> 
> I've fixed this locally, leaving the vma, and renamed the local variable
> ggtt. It still has 3 uses in this function, so it's a bit less typing.

Yeah, makes sense to keep it then.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 12/15] [RFC] create vm->bind,unbind
  2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
                   ` (11 preceding siblings ...)
  2013-07-09  7:50 ` [PATCH 00/11] ppgtt: just the VMA Daniel Vetter
@ 2013-07-13  4:45 ` Ben Widawsky
  2013-07-13  4:45   ` [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
                     ` (2 more replies)
  12 siblings, 3 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-13  4:45 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In response to some of Daniel's requests on patch 6, I tried to clean up some
of our code via bind/unbind. I haven't done terribly thorough testing so far -
but basic tests are passing on IVB, and the code is a lot cleaner IMO. This
could be squashed into patch 6, but I would prefer to leave it as this small
series on top of the bunch.

I have tried a lesser version of this before in my earlier gtt/agp cleanups.
Daniel rejected it then in favor of his own version. I am trying again because
I think the latest PPGTT work provides an even stronger case for it.
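
To make the win concrete, a typical call site (the unbind path) goes from
roughly this (see patch 2 for the real diff):

	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
		i915_gem_gtt_unbind_object(obj);
	if (obj->has_aliasing_ppgtt_mapping) {
		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
		obj->has_aliasing_ppgtt_mapping = 0;
	}

to just:

	vm->unbind_object(vm, obj);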

References:
http://lists.freedesktop.org/archives/intel-gfx/2013-January/023920.html

Ben Widawsky (3):
  drm/i915: Add bind/unbind object functions to VM
  drm/i915: Use the new vm [un]bind functions
  drm/i915: eliminate vm->insert_entries()

 drivers/gpu/drm/i915/i915_drv.h            |  23 +++---
 drivers/gpu/drm/i915/i915_gem.c            |  36 +++++-----
 drivers/gpu/drm/i915/i915_gem_context.c    |   6 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  19 ++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 109 ++++++++++++++++++-----------
 5 files changed, 109 insertions(+), 84 deletions(-)

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM
  2013-07-13  4:45 ` [PATCH 12/15] [RFC] create vm->bind,unbind Ben Widawsky
@ 2013-07-13  4:45   ` Ben Widawsky
  2013-07-13  9:33     ` Daniel Vetter
  2013-07-13  4:45   ` [PATCH 2/3] drm/i915: Use the new vm [un]bind functions Ben Widawsky
  2013-07-13  4:45   ` [PATCH 3/3] drm/i915: eliminate vm->insert_entries() Ben Widawsky
  2 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-13  4:45 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

NOTE: At some point in the future, bringing back insert_entries may in
fact be desirable in order to use 1 bind/unbind for multiple generations
of PPGTT. For now however, it's just not necessary.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
 drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e6694ae..c2a9c98 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -484,9 +484,18 @@ struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_object)(struct i915_address_space *vm,
+			      struct drm_i915_gem_object *obj);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+	void (*bind_object)(struct i915_address_space *vm,
+			    struct drm_i915_gem_object *obj,
+			    enum i915_cache_level cache_level);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c0d0223..31ff971 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -45,6 +45,12 @@
 #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
 #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
 
+static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
+				   struct drm_i915_gem_object *obj,
+				   enum i915_cache_level cache_level);
+static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
+				     struct drm_i915_gem_object *obj);
+
 static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
 				      enum i915_cache_level level)
 {
@@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	}
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			   cache_level);
 }
 
+static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
+				   struct drm_i915_gem_object *obj,
+				   enum i915_cache_level cache_level)
+{
+	const unsigned long entry = i915_gem_obj_offset(obj, vm);
+
+	gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT,
+				  cache_level);
+	obj->has_aliasing_ppgtt_mapping = 1;
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -407,6 +426,16 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			obj->base.size >> PAGE_SHIFT);
 }
 
+static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
+				     struct drm_i915_gem_object *obj)
+{
+	const unsigned long entry = i915_gem_obj_offset(obj, vm);
+
+	gen6_ppgtt_clear_range(vm, entry >> PAGE_SHIFT,
+			       obj->base.size >> PAGE_SHIFT);
+	obj->has_aliasing_ppgtt_mapping = 0;
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -555,6 +584,18 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_bind_object(struct i915_address_space *vm,
+				  struct drm_i915_gem_object *obj,
+				  enum i915_cache_level cache_level)
+{
+	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vm));
+	intel_gtt_insert_sg_entries(obj->pages, entry, flags);
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -562,6 +603,24 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_object(struct i915_address_space *vm,
+				    struct drm_i915_gem_object *obj)
+{
+	const unsigned int first = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
+	const unsigned int size = obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vm));
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_object(struct i915_address_space *vm,
+				  struct drm_i915_gem_object *obj,
+				  enum i915_cache_level cache_level)
+{
+	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
+	gen6_ggtt_insert_entries(vm, obj->pages, entry, cache_level);
+	obj->has_global_gtt_mapping = 1;
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -590,6 +649,15 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unbind_object(struct i915_address_space *vm,
+				    struct drm_i915_gem_object *obj)
+{
+	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
+
+	gen6_ggtt_clear_range(vm, entry, obj->base.size >> PAGE_SHIFT);
+	obj->has_global_gtt_mapping = 0;
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -823,7 +891,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_object = gen6_ggtt_unbind_object;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_object = gen6_ggtt_bind_object;
 
 	return ret;
 }
@@ -855,7 +925,9 @@ static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_object = i915_ggtt_unbind_object;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_object = i915_ggtt_bind_object;
 
 	return 0;
 }
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 2/3] drm/i915: Use the new vm [un]bind functions
  2013-07-13  4:45 ` [PATCH 12/15] [RFC] create vm->bind,unbind Ben Widawsky
  2013-07-13  4:45   ` [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-07-13  4:45   ` Ben Widawsky
  2013-07-13  4:45   ` [PATCH 3/3] drm/i915: eliminate vm->insert_entries() Ben Widawsky
  2 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-13  4:45 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 10 -----
 drivers/gpu/drm/i915/i915_gem.c            | 36 +++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c    |  6 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 ++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 70 +++++++++---------------------
 5 files changed, 52 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c2a9c98..8f9569b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1923,18 +1923,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-/* FIXME: this is never okay with full PPGTT */
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 90d49fb..8e7a12d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2655,12 +2655,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	trace_i915_gem_object_unbind(obj, vm);
 
-	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+
+	vm->unbind_object(vm, obj);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3393,7 +3390,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
 	int ret;
 
@@ -3428,13 +3424,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
-
-		i915_gem_obj_set_color(obj, vma->vm, cache_level);
+		vm->bind_object(vm, obj, cache_level);
+		i915_gem_obj_set_color(obj, vm, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3716,6 +3707,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	int ret;
 
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
@@ -3741,20 +3733,24 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
 						  map_and_fenceable,
 						  nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
+		if (!dev_priv->mm.aliasing_ppgtt) {
+			dev_priv->gtt.base.bind_object(&dev_priv->gtt.base,
+						       obj,
+						       obj->cache_level);
+		}
 	}
 
-	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	if (!obj->has_global_gtt_mapping && map_and_fenceable) {
+		dev_priv->gtt.base.bind_object(&dev_priv->gtt.base,
+					       obj,
+					       obj->cache_level);
+	}
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c92fd81..177e42c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -424,8 +424,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		dev_priv->gtt.base.bind_object(&dev_priv->gtt.base,
+					       to->obj, to->obj->cache_level);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 725dd7f..9e9d955 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -197,8 +197,10 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		dev_priv->gtt.base.bind_object(&dev_priv->gtt.base,
+					       target_i915_obj,
+					       target_i915_obj->cache_level);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -438,10 +440,9 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 
 	/* Ensure ppgtt mapping exists if needed */
 	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
+		struct i915_address_space *appgtt;
+		appgtt = &dev_priv->mm.aliasing_ppgtt->base;
+		appgtt->bind_object(appgtt, obj, obj->cache_level);
 	}
 
 	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
@@ -456,7 +457,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 
 	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
 	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		dev_priv->gtt.base.bind_object(&dev_priv->gtt.base,
+					       obj, obj->cache_level);
 
 	return 0;
 }
@@ -1046,7 +1048,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
 	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+		dev_priv->gtt.base.bind_object(&dev_priv->gtt.base, batch_obj,
+					       batch_obj->cache_level);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 31ff971..31bffb9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -393,18 +393,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	struct i915_address_space *vm = &ppgtt->base;
-	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
-
-	vm->insert_entries(vm, obj->pages,
-			   obj_offset >> PAGE_SHIFT,
-			   cache_level);
-}
-
 static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
 				   struct drm_i915_gem_object *obj,
 				   enum i915_cache_level cache_level)
@@ -416,16 +404,6 @@ static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
 	obj->has_aliasing_ppgtt_mapping = 1;
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	struct i915_address_space *vm = &ppgtt->base;
-	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
-
-	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
-			obj->base.size >> PAGE_SHIFT);
-}
-
 static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
 				     struct drm_i915_gem_object *obj)
 {
@@ -489,7 +467,8 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		dev_priv->gtt.base.bind_object(&dev_priv->gtt.base,
+					       obj, obj->cache_level);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -620,33 +599,16 @@ static void gen6_ggtt_bind_object(struct i915_address_space *vm,
 	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
 	gen6_ggtt_insert_entries(vm, obj->pages, entry, cache_level);
 	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
 
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
+	/* GGTT bound buffers are special cases with aliasing PPGTT. Assume we
+	 * always want to do both */
+	if (obj->has_aliasing_ppgtt_mapping) {
+		struct drm_device *dev = obj->base.dev;
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		struct i915_address_space *appgtt;
+		appgtt = &dev_priv->mm.aliasing_ppgtt->base;
+		appgtt->bind_object(appgtt, obj, cache_level);
+	}
 }
 
 static void gen6_ggtt_unbind_object(struct i915_address_space *vm,
@@ -656,6 +618,16 @@ static void gen6_ggtt_unbind_object(struct i915_address_space *vm,
 
 	gen6_ggtt_clear_range(vm, entry, obj->base.size >> PAGE_SHIFT);
 	obj->has_global_gtt_mapping = 0;
+
+	/* GGTT bound buffers are special cases with aliasing PPGTT. Assume we
+	 * always want to do both */
+	if (obj->has_aliasing_ppgtt_mapping) {
+		struct drm_device *dev = obj->base.dev;
+		struct drm_i915_private *dev_priv = dev->dev_private;
+		struct i915_address_space *appgtt;
+		appgtt = &dev_priv->mm.aliasing_ppgtt->base;
+		appgtt->unbind_object(appgtt, obj);
+	}
 }
 
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 3/3] drm/i915: eliminate vm->insert_entries()
  2013-07-13  4:45 ` [PATCH 12/15] [RFC] create vm->bind,unbind Ben Widawsky
  2013-07-13  4:45   ` [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
  2013-07-13  4:45   ` [PATCH 2/3] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-07-13  4:45   ` Ben Widawsky
  2 siblings, 0 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-13  4:45 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range as well; however,
that's not totally easy at this point, since it's still used in a couple of
places that don't only deal in objects: setup, ppgtt init, and restoring
gtt mappings.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  4 ----
 drivers/gpu/drm/i915/i915_gem_gtt.c | 15 ---------------
 2 files changed, 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8f9569b..eb13399 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -496,10 +496,6 @@ struct i915_address_space {
 	void (*bind_object)(struct i915_address_space *vm,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level);
-	void (*insert_entries)(struct i915_address_space *vm,
-			       struct sg_table *st,
-			       unsigned int first_entry,
-			       enum i915_cache_level cache_level);
 	void (*cleanup)(struct i915_address_space *vm);
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 31bffb9..dd3d5e5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -294,7 +294,6 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
 	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
-	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
@@ -551,18 +550,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 }
 
 
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     unsigned int pg_start,
-				     enum i915_cache_level cache_level)
-{
-	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
-		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
-
-	intel_gtt_insert_sg_entries(st, pg_start, flags);
-
-}
-
 static void i915_ggtt_bind_object(struct i915_address_space *vm,
 				  struct drm_i915_gem_object *obj,
 				  enum i915_cache_level cache_level)
@@ -864,7 +851,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
 	dev_priv->gtt.base.unbind_object = gen6_ggtt_unbind_object;
-	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
 	dev_priv->gtt.base.bind_object = gen6_ggtt_bind_object;
 
 	return ret;
@@ -898,7 +884,6 @@ static int i915_gmch_probe(struct drm_device *dev,
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
 	dev_priv->gtt.base.unbind_object = i915_ggtt_unbind_object;
-	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
 	dev_priv->gtt.base.bind_object = i915_ggtt_bind_object;
 
 	return 0;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM
  2013-07-13  4:45   ` [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-07-13  9:33     ` Daniel Vetter
  2013-07-16  3:35       ` Ben Widawsky
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2013-07-13  9:33 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote:
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
> 
> NOTE: At some point in the future, bringing back insert_entries may in
> fact be desirable in order to use 1 bind/unbind for multiple generations
> of PPGTT. For now however, it's just not necessary.

I need to check the -internal tree again, but I'm rather sure that we need
->insert_entries. In that case I don't want to remove it here in the
upstream tree since I have no intention to carry the re-add patch in
-internal ;-)

> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e6694ae..c2a9c98 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -484,9 +484,18 @@ struct i915_address_space {
>  	/* FIXME: Need a more generic return type */
>  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
>  				     enum i915_cache_level level);
> +
> +	/** Unmap an object from an address space. This usually consists of
> +	 * setting the valid PTE entries to a reserved scratch page. */
> +	void (*unbind_object)(struct i915_address_space *vm,
> +			      struct drm_i915_gem_object *obj);

	void (*unbind_vma)(struct i915_vma *vma);
	void (*bind_vma)(struct i915_vma *vma,
			 enum i915_cache_level cache_level);

I think if you do this as a follow-up we might as well bikeshed the
interface a bit. Again (I know, broken record) for me it feels
semantically much cleaner to talk about binding/unbinding a vma instead
of an (obj, vm) pair ...

>  	void (*clear_range)(struct i915_address_space *vm,
>  			    unsigned int first_entry,
>  			    unsigned int num_entries);
> +	/* Map an object into an address space with the given cache flags. */
> +	void (*bind_object)(struct i915_address_space *vm,
> +			    struct drm_i915_gem_object *obj,
> +			    enum i915_cache_level cache_level);
>  	void (*insert_entries)(struct i915_address_space *vm,
>  			       struct sg_table *st,
>  			       unsigned int first_entry,
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index c0d0223..31ff971 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -45,6 +45,12 @@
>  #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
>  #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
>  
> +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> +				   struct drm_i915_gem_object *obj,
> +				   enum i915_cache_level cache_level);
> +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
> +				     struct drm_i915_gem_object *obj);
> +
>  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
>  				      enum i915_cache_level level)
>  {
> @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	}
>  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
>  	ppgtt->enable = gen6_ppgtt_enable;
> +	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
>  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> +	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
>  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
>  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
>  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  			   cache_level);
>  }
>  
> +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> +				   struct drm_i915_gem_object *obj,
> +				   enum i915_cache_level cache_level)
> +{
> +	const unsigned long entry = i915_gem_obj_offset(obj, vm);
> +
> +	gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT,
> +				  cache_level);
> +	obj->has_aliasing_ppgtt_mapping = 1;

Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a
bit wrong here. Either we do the ppgtt insert_entries call as part of the
global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special
global gtt binding call for execbuf.

Thinking about this some more we might need bind flags with

#define VMA_BIND_CPU  (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */
#define VMA_BIND_GPU  (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */

since otherwise we can't properly encapsulate the aliasing ppgtt binding
logic into vm->bind. So in the end we'd have

void ggtt_bind_vma(vma, bind_flags, cache_level)
{
	ggtt_vm = vma->vm;
	WARN_ON(ggtt_vm != &dev_priv->gtt.base);

	if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & VMA_BIND_CPU)) &&
	    !obj->has_global_gtt_mapping) {
		ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_level);
		vma->obj->has_global_gtt_mapping = true;
	}

	if ((ggtt_vm->aliasing_ppgtt && (bind_flags & VMA_BIND_GPU)) &&
	    !obj->has_ppgtt_mapping) {
		ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj,
							vma->node.start,
							cache_level);
		vma->obj->has_ppgtt_mapping = true;
	}
}

Obviously completely untested, but I hope I could get the idea across.
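
A caller in the execbuf reserve path would then be something like (equally
untested, with entry being the exec object entry as in
i915_gem_execbuffer_reserve_object):

	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
	u32 bind_flags = VMA_BIND_GPU;

	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
		bind_flags |= VMA_BIND_CPU;

	ggtt_bind_vma(vma, bind_flags, obj->cache_level);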

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM
  2013-07-13  9:33     ` Daniel Vetter
@ 2013-07-16  3:35       ` Ben Widawsky
  2013-07-16  4:00         ` Ben Widawsky
  2013-07-16  5:13         ` Daniel Vetter
  0 siblings, 2 replies; 50+ messages in thread
From: Ben Widawsky @ 2013-07-16  3:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote:
> On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote:
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> > 
> > NOTE: At some point in the future, bringing back insert_entries may in
> > fact be desirable in order to use 1 bind/unbind for multiple generations
> > of PPGTT. For now however, it's just not necessary.
> 
> I need to check the -internal tree again, but I'm rather sure that we need
> ->insert_entries. In that case I don't want to remove it here in the
> upstream tree since I have no intention to carry the re-add patch in
> -internal ;-)

We do use it for i915_ppgtt_bind_object(), however it should be easily
fixable since the mini-series is exactly about removing
i915_ppgtt_bind_object, and making it into vm->bind_object. I think it's
fair if you ask me to fix this up on -internal as well, before merging
it, but with that one exception - I still believe this is the right
direction to go in.

> 
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
> >  2 files changed, 81 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index e6694ae..c2a9c98 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -484,9 +484,18 @@ struct i915_address_space {
> >  	/* FIXME: Need a more generic return type */
> >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> >  				     enum i915_cache_level level);
> > +
> > +	/** Unmap an object from an address space. This usually consists of
> > +	 * setting the valid PTE entries to a reserved scratch page. */
> > +	void (*unbind_object)(struct i915_address_space *vm,
> > +			      struct drm_i915_gem_object *obj);
> 
> 	void (*unbind_vma)(struct i915_vma *vma);
> 	void (*bind_vma)(struct i915_vma *vma,
> 			 enum i915_cache_level cache_level);
> 
> I think if you do this as a follow-up we might as well bikeshed the
> interface a bit. Again (I know, broken record) for me it feels
> semantically much cleaner to talk about binding/unbinding a vma instead
> of an (obj, vm) pair ...
> 

So as mentioned (and I haven't yet responded to the other email, but
I'll be a broken record there also) - I don't disagree with you. My
argument is the performance difference should be negligible, and the code
as is, is decently tested. Changing this requires changing so much, I'd
rather do the conversion on top. See the other mail thread for more...

> >  	void (*clear_range)(struct i915_address_space *vm,
> >  			    unsigned int first_entry,
> >  			    unsigned int num_entries);
> > +	/* Map an object into an address space with the given cache flags. */
> > +	void (*bind_object)(struct i915_address_space *vm,
> > +			    struct drm_i915_gem_object *obj,
> > +			    enum i915_cache_level cache_level);
> >  	void (*insert_entries)(struct i915_address_space *vm,
> >  			       struct sg_table *st,
> >  			       unsigned int first_entry,
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index c0d0223..31ff971 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -45,6 +45,12 @@
> >  #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
> >  #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
> >  
> > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > +				   struct drm_i915_gem_object *obj,
> > +				   enum i915_cache_level cache_level);
> > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
> > +				     struct drm_i915_gem_object *obj);
> > +
> >  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
> >  				      enum i915_cache_level level)
> >  {
> > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  	}
> >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> >  	ppgtt->enable = gen6_ppgtt_enable;
> > +	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
> >  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > +	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
> >  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> >  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> >  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  			   cache_level);
> >  }
> >  
> > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > +				   struct drm_i915_gem_object *obj,
> > +				   enum i915_cache_level cache_level)
> > +{
> > +	const unsigned long entry = i915_gem_obj_offset(obj, vm);
> > +
> > +	gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT,
> > +				  cache_level);
> > +	obj->has_aliasing_ppgtt_mapping = 1;
> 
> Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a
> bit wrong here. Either we do the ppgtt insert_entries call as part of the
> global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special
> global gtt binding call for execbuf.
> 
> Thinking about this some more we might need bind flags with
> 
> #define VMA_BIND_CPU  (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */
> #define VMA_BIND_GPU  (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */
> 
> since otherwise we can't properly encapsulate the aliasing ppgtt binding
> logic into vm->bind. So in the end we'd have
> 
> void ggtt_bind_vma(vma, bind_flags, cache_level)
> {
> 	ggtt_vm = vma->vm;
> 	WARN_ON(ggtt_vm != &dev_priv->gtt.base);
> 
> 	if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & VMA_BIND_CPU)) &&
> 	    !obj->has_global_gtt_mapping) {
> 		ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_level);
> 		vma->obj->has_global_gtt_mapping = true;
> 	}
> 
> 	if ((ggtt_vm->aliasing_ppgtt && (bind_flags & VMA_BIND_GPU)) &&
> 	    !obj->has_ppgtt_mapping) {
> 		ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj,
> 							vma->node.start,
> 							cache_level);
> 		vma->obj->has_ppgtt_mapping = true;
> 	}
> }
> 
> Obviously completely untested, but I hope I could get the idea across.
> 
> Cheers, Daniel

To me, aliasing ppgtt is just a wart that doesn't fit well with
anything. As such, my plan was to hide as much of it as possible in ggtt
functions. Using some kind of flag on ggtt_bind() we can determine if
the user actually wants ggtt, and if so bind to both, else just use
aliasing ppgtt. None of that code appears here because I want to make
the diff churn as small as possible, and hadn't completely thought it
all through.

Now after typing that (and this really did happen), I just looked at
your function, and it seems to be more or less exactly what I just
typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you
really just want one flag which specifies "bind it in the global gtt,
sucka"

Now having just typed /that/, it was indeed my plan. So as long as
nothing really bothers you with the bind/unbind() stuff, I can move
forward with a patch on top to fix it.
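
To be concrete, roughly what I have in mind (untested, and GLOBAL_BIND is just
a name I made up on the spot):

#define GLOBAL_BIND (1<<0)

static void ggtt_bind_vma(struct i915_vma *vma, u32 flags,
			  enum i915_cache_level cache_level)
{
	struct drm_i915_private *dev_priv = vma->obj->base.dev->dev_private;
	struct drm_i915_gem_object *obj = vma->obj;
	struct i915_address_space *vm = vma->vm;

	/* No aliasing ppgtt, or the caller explicitly wants a global
	 * mapping: write the ggtt PTEs. */
	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
		vm->insert_entries(vm, obj->pages,
				   vma->node.start >> PAGE_SHIFT,
				   cache_level);
		obj->has_global_gtt_mapping = 1;
	}

	/* With aliasing ppgtt the GPU always goes through the ppgtt PTEs,
	 * so those get updated in any case. */
	if (dev_priv->mm.aliasing_ppgtt) {
		struct i915_address_space *appgtt =
			&dev_priv->mm.aliasing_ppgtt->base;

		appgtt->insert_entries(appgtt, obj->pages,
				       vma->node.start >> PAGE_SHIFT,
				       cache_level);
		obj->has_aliasing_ppgtt_mapping = 1;
	}
}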


-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-12 16:46           ` Daniel Vetter
@ 2013-07-16  3:57             ` Ben Widawsky
  2013-07-16  5:06               ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-16  3:57 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Fri, Jul 12, 2013 at 06:46:51PM +0200, Daniel Vetter wrote:
> On Fri, Jul 12, 2013 at 08:46:48AM -0700, Ben Widawsky wrote:
> > On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote:
> > > On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote:
> > > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> 
> [snip]
> 
> > > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > > >  	}
> > > > > >  
> > > > > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > > > > -		ret = i915_gem_object_unbind(obj);
> > > > > > +		ret = i915_gem_object_unbind(obj, vm);
> > > > > >  		if (ret)
> > > > > >  			return ret;
> > > > > >  	}
> > > > > >  
> > > > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > > > +		if (!i915_gem_obj_bound(obj, vm))
> > > > > > +			continue;
> > > > > 
> > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > > > > on?
> > > > > 
> > > > > Self-correction: It exists already ... why can't we use this here?
> > > > 
> > > > Yes. That should work, I'll fix it and test it. It looks slightly worse
> > > > IMO in terms of code clarity, but I don't mind the change.
> > > 
> > > Actually I think it'd gain in clarity, doing pte updatest (which
> > > set_cache_level does) on the vma instead of the (obj, vm) pair feels more
> > > natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here.
> > 
> > That sounds good to me. Would you mind a patch on top?
> 
> If you want I guess we can refactor this after everything has settled. Has
> the upside that assessing whether using vma or (obj, vm) is much easier.
> So fine with me.

I think our time to get these merged in is quickly evaporating. I am
sure you know this - just reminding you.

> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > >  		ret = i915_gem_object_finish_gpu(obj);
> > > > > >  		if (ret)
> > > > > >  			return ret;
> > > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > > > > >  					       obj, cache_level);
> > > > > >  
> > > > > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > > > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > > > > >  	}
> > > > > >  
> > > > > >  	if (cache_level == I915_CACHE_NONE) {
> > > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > > >  			       struct drm_file *file)
> > > > > >  {
> > > > > >  	struct drm_i915_gem_caching *args = data;
> > > > > > +	struct drm_i915_private *dev_priv;
> > > > > >  	struct drm_i915_gem_object *obj;
> > > > > >  	enum i915_cache_level level;
> > > > > >  	int ret;
> > > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > > >  		ret = -ENOENT;
> > > > > >  		goto unlock;
> > > > > >  	}
> > > > > > +	dev_priv = obj->base.dev->dev_private;
> > > > > >  
> > > > > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > > > > +	/* FIXME: Add interface for specific VM? */
> > > > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > > > > >  
> > > > > >  	drm_gem_object_unreference(&obj->base);
> > > > > >  unlock:
> > > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > > >  				     u32 alignment,
> > > > > >  				     struct intel_ring_buffer *pipelined)
> > > > > >  {
> > > > > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > > > >  	u32 old_read_domains, old_write_domain;
> > > > > >  	int ret;
> > > > > >  
> > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > > > > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > > > > >  	 */
> > > > > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > > > +					      I915_CACHE_NONE);
> > > > > >  	if (ret)
> > > > > >  		return ret;
> > > > > >  
> > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > > > >  	 * always use map_and_fenceable for all scanout buffers.
> > > > > >  	 */
> > > > > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > > > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > > > > >  	if (ret)
> > > > > >  		return ret;
> > > > > >  
> > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > > > > >  
> > > > > >  int
> > > > > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > > > +		    struct i915_address_space *vm,
> > > > > >  		    uint32_t alignment,
> > > > > >  		    bool map_and_fenceable,
> > > > > >  		    bool nonblocking)
> > > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > > > > >  		return -EBUSY;
> > > > > >  
> > > > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > > > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > > > > 
> > > > > WARN_ON, since presumably we can keep on going if we get this wrong
> > > > > (albeit with slightly corrupted state, so render corruptions might
> > > > > follow).
> > > > 
> > > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert
> > > > it at the end of merging?
> > > 
> > > Adding a FIXME right above it will cause equal amounts of conflicts, so I
> > > don't see the point that much ...
> > 
> > I'm just really fearful that in doing the reworks I will end up with
> > this condition, and I am afraid I will miss them if it's a WARN_ON.
> > Definitely it's more likely to miss than a BUG.
> > 
> > Also, and we've disagreed on this a few times by now, this is an
> > internal interface which I think should carry such a fatal error for
> > this level of mistake.
> 
> Ime every time I argue this with myself and state your case it ends up
> biting me horribly because I'm regularly too incompetent and hit my very
> own BUG_ONs ;-) Hence why I insist so much on using WARN_ON wherever
> possible. Of course if people don't check their logs that's a different
> matter (*cough* Jesse *cough*) ...
> 
> > In any case I've made the change locally. Will yell at you later if I
> > was right.
> 
> Getting yelled at is part of my job, so bring it on ;-)
> 
> [snip]
> 
> > > > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > > > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > > > > >  		if (obj->pages_pin_count == 0)
> > > > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > > > > +
> > > > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > > > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > > > > 
> > > > > Isn't this now double-counting objects? In the shrinker we only care about
> > > > > how much physical RAM an object occupies, not how much virtual space it
> > > > > occupies. So just walking the bound list of objects here should be good
> > > > > enough ...
> > > > > 
> > > > 
> > > > Maybe I've misunderstood you. My code is wrong, but I think your idea
> > > > requires a prep patch because it changes functionality, right?
> > > > 
> > > > So let me know if I've understood you.
> > > 
> > > Don't we have both the bound and unbound list? So we could just switch
> > > over to counting the bound objects here ... Otherwise yes, we need a prep
> > > patch to create the bound list first.
> > 
> > Of course there is a bound list.
> > 
> > The old code automatically added the size of unbound objects with
> > unpinned pages, and of unpinned inactive objects with unpinned pages.
> > 
> > The latter check, for inactive, needs to be done for all VMAs. That
> > was my point.
> 
> Oh right. The thing is that technically there's no reason to not also scan
> the active objects, i.e. just walk the whole bound list. So yeah, sounds
> like we need a prep patch to switch to the bound list here first. My
> apologies for being dense and not fully grasping this right away.
> 
> > 
> > > 
> > > > 
> > > > > >  
> > > > > >  	if (unlock)
> > > > > >  		mutex_unlock(&dev->struct_mutex);
> > > > > >  	return cnt;
> > > > > >  }
> > > > > > +
> > > > > > +/* All the new VM stuff */
> > > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > > > > +				  struct i915_address_space *vm)
> > > > > > +{
> > > > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > > > +	struct i915_vma *vma;
> > > > > > +
> > > > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > > > +		vm = &dev_priv->gtt.base;
> > > > > > +
> > > > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > > 
> > > > > Imo the vma list walking here and in the other helpers below indicates
> > > > > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > > > > is this again something that'll get fixed later on?
> > > > > 
> > > > > I just want to avoid diff churn, and it also makes reviewing easier if the
> > > > > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > > > > sprinkling of obj_to_vma in callers.
> > > > 
> > > > It's not something I fixed in the whole series. I think it makes sense
> > > > conceptually, to keep some things as <obj,vm> and others as direct vma.
> > > > 
> > > > If you want me to change something, you need to be more specific since
> > > > no action specifically comes to mind at this point in the series.
> > > 
> > > It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we
> > > should try to avoid it whenever possible. Since the vma has both an obj
> > > and a vm pointer, the vma is imo strictly better than the (obj, vm) pair.
> > > And the look-up should be pushed down the callchain as much as possible.
> > > 
> > > So I think generally we want to pass the vma around to functions
> > > everywhere, and the (obj, vm) pair would be the exception (which needs
> > > special justification).
> > > 
> > 
> > Without actually coding it, I am not sure. I think there are probably a
> > decent number of reasonable exceptions where we want the object (ie.
> > it's not really that much of a special case). In any case, I think we'll
> > find you have to do this list walk at some point in the call chain
> > anyway, but I can try to start changing around the code as a patch on
> > top of this. I really want to leave as much of this patch in place as
> > is, since it's decently tested (pre-rebase at least).
> 
> Ok, I can live with this if we clean things up afterwards. But imo
> vma->obj isn't worse for readability than just obj, and passing pairs of
> (obj,vm) around all the time just feels wrong conceptually. In C we have
> structs for this, and since we already have a suitable one created we
> might as well use it.
> 
> Aside: I know that we have the (ring, seqno) pair splattered all over the
> code. It's been on my todo to fix that ever since I've proposed to add a
> i915_gpu_sync_cookie with the original multi-ring enabling. As you can see
> I've been ignored, but I hope we can finally fix this with the dma_buf
> fence rework.
> 
> And we did just recently discover a bug where such a (ring, seqno) pair
> got mixed up, so imo not using vma is fraught with unnecessary peril.
> 
> [snip]

As I wrote this code, I would say things as, "do this operation on that
object, in so and so address space." That's more or less why I kept
everything as obj, vm pairs. Since GEM is about BOs, at some fundamental
level we're always talking about objects, and you need to translate an
<obj,VM> pair into a VMA. The other motivation was of course to make the
diffs more manageable and easier to write. It's way easier to add a new
VM argument, than it is to do even more plumbing. As stated now in several
places, I agree it very well might make sense to convert many functions
in the call chain to VMAs, and do the lookup at the entry point from
userspace, or whenever it's logical; and indeed in the long run I think
it will lead to less buggy code.
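
Just so we're talking about the same thing, the <obj,vm> -> vma
translation I keep referring to is nothing more than the list walk below.
The helper name is made up and this isn't part of the series as posted;
it's only here to illustrate what would get pushed down the call chain:

static struct i915_vma *
i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
		    struct i915_address_space *vm)
{
	struct i915_vma *vma;

	/* An object has at most one vma per address space, so the first
	 * match is the only match. */
	list_for_each_entry(vma, &obj->vma_list, vma_link)
		if (vma->vm == vm)
			return vma;

	return NULL;
}

Doing that lookup once at the entry points (or caching the result, eg. in
the execbuf path) and passing the vma around from there is exactly the
follow-up patch I'm proposing.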


> 
> > > > > > @@ -158,13 +158,18 @@ int
> > > > > >  i915_gem_evict_everything(struct drm_device *dev)
> > > > > 
> > > > > I suspect evict_everything eventually wants a address_space *vm argument
> > > > > for those cases where we only want to evict everything in a given vm. Atm
> > > > > we have two use-cases of this:
> > > > > - Called from the shrinker as a last-ditch effort. For that it should move
> > > > >   _every_ object onto the unbound list.
> > > > > - Called from execbuf for badly-fragmented address spaces to clean up the
> > > > >   mess. For that case we only care about one address space.
> > > > 
> > > > The current thing is more or less a result of Chris' suggestions. A
> > > > non-posted iteration did plumb the vm, and after reworking to the
> > > > suggestion made by Chris, the vm didn't make much sense anymore.
> > > > 
> > > > For point #1, it requires VM prioritization I think. I don't really see
> > > > any other way to fairly manage it.
> > > 
> > > The shrinker will rip out objects in LRU order by walking first unbound
> > > and then bound objects. That's imo as fair as it gets, we don't need
> > > priorities between vms.
> > 
> > If you pass in a vm, the semantics would be, evict everything for the
> > vm, right?
> 
> Yes.
> 
> > 
> > > 
> > > > For point #2, I agree it might be useful, but we can easily create
> > > > a new function (and not call it "shrinker") to do it.
> > > 
> > > Well my point was that this function is called
> > > i915_gem_evict_everything(dev, vm) and for the first use case we simply
> > > pass in vm = NULL. But essentially thrashing the vm should be rare enough
> > > that for now we don't need to care.
> > > 
> > 
> > IIRC, this is exactly how my original patch worked pre-Chris.
> 
> Oops. Do you or Chris still know the argument for changing things? Maybe I
> just don't see another facet of the issue at hand ...
> 
> [snip]
> 
> > > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > > > >  static int
> > > > > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > > > >  			    struct list_head *objects,
> > > > > > +			    struct i915_address_space *vm,
> > > > > >  			    bool *need_relocs)
> > > > > >  {
> > > > > >  	struct drm_i915_gem_object *obj;
> > > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > > > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > > > >  			bool need_fence, need_mappable;
> > > > > > +			u32 obj_offset;
> > > > > >  
> > > > > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > > > > +			if (!i915_gem_obj_bound(obj, vm))
> > > > > >  				continue;
> > > > > 
> > > > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups
> > > > > here ... Maybe we should cache them in some pointer somewhere (either in
> > > > > the eb object or by adding a new pointer to the object struct, e.g.
> > > > > obj->eb_vma, similar to obj->eb_list).
> > > > > 
> > > > 
> > > > I agree, and even did this at one unposted patch too. However, I think
> > > > it's a premature optimization which risks code correctness. So I think
> > > > somewhere a FIXME needs to happen to address that issue. (Or if Chris
> > > > complains bitterly about some perf hit).
> > > 
> > > If you bring up code correctness I'd vote strongly in favour of using vmas
> > > everywhere - vma has the (obj, vm) pair locked down, doing the lookup all
> > > the time risks us mixing them up eventually and creating a hella lot of
> > > confusion ;-)
> > 
> > I think this is addressed with the previous comments.
> 
> See my example for (ring, seqno). I really strongly believe passing pairs
> is the wrong thing and passing structs is the right thing. Especially if
> we have one at hand. But I'm ok if you want to clean this up afterwards.
> 
> Ofc if the cleanup afterwards doesn't happen I'll be a bit pissed ;-)
> 
> [snip]

I think it's fair to not merge anything until I do it. I'm just asking
that you let me do it as a patch on top of the series as opposed to
doing a bunch of rebasing which I've shown historically time and again
to screw up. Whether or not I should be better at rebasing isn't up for
debate.

> 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > > index 245eb1d..bfe61fa 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > > > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > > > > >  		return obj;
> > > > > >  
> > > > > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > > > > +	vma = i915_gem_vma_create(obj, vm);
> > > > > >  	if (!vma) {
> > > > > >  		drm_gem_object_unreference(&obj->base);
> > > > > >  		return NULL;
> > > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > > > >  	 */
> > > > > >  	vma->node.start = gtt_offset;
> > > > > >  	vma->node.size = size;
> > > > > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > > > +	if (drm_mm_initialized(&vm->mm)) {
> > > > > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > > > > 
> > > > > These two hunks here for stolen look fishy - we only ever use the stolen
> > > > > preallocated stuff for objects with mappings in the global gtt. So keeping
> > > > > that explicit is imo the better approach. And tbh I'm confused where the
> > > > > local variable vm is from ...
> > > > 
> > > > If we don't create a vma for it, we potentially have to special case a
> > > > bunch of places, I think. I'm not actually sure of this, but the
> > > > overhead to do it is quite small.
> > > > 
> > > > Anyway, I'll look this over again and see what I think.
> > > 
> > > I'm not against the vma, I've just wondered why you do the
> > > /dev_priv->gtt.base/vm/ replacement here since
> > > - it's never gonna be used with another vm than ggtt
> > > - this patch doesn't add the vm variable, so I'm even more confused where
> > >   this started ;-)
> > 
> > It started from the rebase. In the original series, I did that
> > "deferred_offset" thing, and having a vm variable made that code pass
> > the checkpatch.pl. There wasn't a particular reason for naming it vm
> > other than I had done it all over the place.
> > 
> > I've fixed this locally, leaving the vma, and renamed the local variable
> > to ggtt. It still has 3 uses in this function, so it's a bit less typing.
> 
> Yeah, makes sense to keep it then.
> 
> Cheers, Daniel

So here is the plan summing up from this mail thread and the other
one...

I'm calling the series done until we get feedback from Chris on the
eviction/shrinker code. Basically whatever he signs off on, I am willing
to implement (assuming you agree). I am confident enough in my rebase
abilities to at least fix that up when we get to it.
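
Just to capture the shrinker counting idea while it's fresh, what I
understand Daniel (and presumably Chris) to be suggesting for the count in
i915_gem_inactive_shrink() is roughly the below - untested sketch, only
the counting loops shown, using the existing bound/unbound lists:

	/* Count pages we could plausibly release: unbound objects with
	 * unpinned backing pages, plus bound objects that are neither
	 * pinned into the GTT nor have their pages pinned. No per-VM
	 * walk, so nothing gets counted twice. */
	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
		if (obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
			cnt += obj->base.size >> PAGE_SHIFT;

If that matches what Chris has in mind it's a trivial prep patch.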

On top of this series, I am going to smash the 3 patches which introduce
bind/unbind function pointers. I am doing this because all this code
already exists, plays nicely together, and has been tested.

Finally on top of this, I am going to have a [fairly large] patch to
squash the <obj,vm> pair into <vma> where I feel it's appropriate. My plan
for this is to basically look at a squashed version of all the patches
where I introduce both arguments in one call, consider whether it makes
more sense to think about the operation in terms of an object or a vma,
and then change as appropriate.

So the feedback I need from you:
1. Does that sound reasonable? If you want to call it weird, that's fine,
but I think the diff churn is acceptable, and the series should be quite
bisectable.
2. Are you aware of anything else I missed in other patch review which I
haven't mentioned? I cannot find anything, and I think except for this
patch I had updated everything locally.

You can see the 14 patches (11 here + 3 from the others series):
http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt2

Hopefully this was all taken in good humor. I am not upset or anything,
we just need to figure out how we can make some progress on this soon
without making total crap, and without taking all my time (since I have
a few other things to work on). I know you warned me about the latter a
few weeks ago, but well, reality.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM
  2013-07-16  3:35       ` Ben Widawsky
@ 2013-07-16  4:00         ` Ben Widawsky
  2013-07-16  5:10           ` Daniel Vetter
  2013-07-16  5:13         ` Daniel Vetter
  1 sibling, 1 reply; 50+ messages in thread
From: Ben Widawsky @ 2013-07-16  4:00 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote:
> On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote:
> > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote:
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > > 
> > > NOTE: At some point in the future, bringing back insert_entries may in
> > > fact be desirable in order to use 1 bind/unbind for multiple generations
> > > of PPGTT. For now however, it's just not necessary.
> > 
> > I need to check the -internal tree again, but I'm rather sure that we need
> > ->insert_entries. In that case I don't want to remove it here in the
> > upstream tree since I have no intention to carry the re-add patch in
> > -internal ;-)
> 
> We do use it for i915_ppgtt_bind_object(), however it should be easily
> fixable since the mini-series is exactly about removing
> i915_ppgtt_bind_object, and making it into vm->bind_object. I think it's
> fair if you ask me to fix this up on -internal as well, before merging
> it, but with that one exception - I still believe this is the right
> direction to go in.
> 
> > 
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 81 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index e6694ae..c2a9c98 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -484,9 +484,18 @@ struct i915_address_space {
> > >  	/* FIXME: Need a more generic return type */
> > >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > >  				     enum i915_cache_level level);
> > > +
> > > +	/** Unmap an object from an address space. This usually consists of
> > > +	 * setting the valid PTE entries to a reserved scratch page. */
> > > +	void (*unbind_object)(struct i915_address_space *vm,
> > > +			      struct drm_i915_gem_object *obj);
> > 
> > 	void (*unbind_vma)(struct i915_vma *vma);
> > 	void (*bind_vma)(struct i915_vma *vma,
> > 			 enum i915_cache_level cache_level);
> > 
> > I think if you do this as a follow-up we might as well bikeshed the
> > interface a bit. Again (I know, broken record) for me it feels
> > semantically much cleaner to talk about binding/unbinding a vma instead
> > of an (obj, vm) pair ...
> > 
> 
> So as mentioned (and I haven't yet responded to the other email, but
> I'll be a broken record there also) - I don't disagree with you. My
> argument is the performance difference should be negligible, and the code
> as is, is decently tested. Changing this requires changing so much, I'd
> rather do the conversion on top. See the other mail thread for more...
> 
> > >  	void (*clear_range)(struct i915_address_space *vm,
> > >  			    unsigned int first_entry,
> > >  			    unsigned int num_entries);
> > > +	/* Map an object into an address space with the given cache flags. */
> > > +	void (*bind_object)(struct i915_address_space *vm,
> > > +			    struct drm_i915_gem_object *obj,
> > > +			    enum i915_cache_level cache_level);
> > >  	void (*insert_entries)(struct i915_address_space *vm,
> > >  			       struct sg_table *st,
> > >  			       unsigned int first_entry,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index c0d0223..31ff971 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -45,6 +45,12 @@
> > >  #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
> > >  #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
> > >  
> > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > > +				   struct drm_i915_gem_object *obj,
> > > +				   enum i915_cache_level cache_level);
> > > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
> > > +				     struct drm_i915_gem_object *obj);
> > > +
> > >  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
> > >  				      enum i915_cache_level level)
> > >  {
> > > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  	}
> > >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> > >  	ppgtt->enable = gen6_ppgtt_enable;
> > > +	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
> > >  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > > +	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
> > >  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > >  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> > >  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			   cache_level);
> > >  }
> > >  
> > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > > +				   struct drm_i915_gem_object *obj,
> > > +				   enum i915_cache_level cache_level)
> > > +{
> > > +	const unsigned long entry = i915_gem_obj_offset(obj, vm);
> > > +
> > > +	gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT,
> > > +				  cache_level);
> > > +	obj->has_aliasing_ppgtt_mapping = 1;
> > 
> > Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a
> > bit wrong here. Either we do the ppgtt insert_entries call as part of the
> > global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special
> > global gtt binding call for execbuf.
> > 
> > Thinking about this some more we might need bind flags with
> > 
> > #define VMA_BIND_CPU  (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */
> > #define VMA_BIND_GPU  (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */
> > 
> > since otherwise we can't properly encapsulate the aliasing ppgtt binding
> > logic into vm->bind. So in the end we'd have
> > 
> > void ggtt_bind_vma(vma, bind_flags, cache_level)
> > {
> > 	ggtt_vm = vma->vm;
> > 	WARN_ON(ggtt_vm != &dev_priv->gtt.base);
> > 
> > 	if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & BIND_CPU)) &&
> > 	    !obj->has_global_gtt_mapping) {
> > 		ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_level);
> > 		vma->obj->has_global_gtt_mapping = true;
> > 	}
> > 
> > 	if ((ggtt_vm->aliasing_ppgtt && (bind_flags & BIND_GPU)) &&
> > 	    !obj->has_ppgtt_mapping) {
> > 		ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj,
> > 							vma->node.start,
> > 							cache_level);
> > 		vma->obj->has_ppgtt_mapping = true;
> > 	}
> > }
> > 
> > Obviously completely untested, but I hope I could get the idea across.
> > 
> > Cheers, Daniel
> 
> To me, aliasing ppgtt is just a wart that doesn't fit well with
> anything. As such, my plan was to hide as much of it as possible in ggtt
> functions. Using some kind of flag on ggtt_bind() we can determine if
> the user actually wants ggtt, and if so bind to both, else just use
> aliasing ppgtt. None of that code appears here because I want to make
> the diff churn as small as possible, and hadn't completely thought it
> all through.
> 
> Now after typing that (and this really did happen), I just looked at
> your function, and it seems to be more or less exactly what I just
> typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you
> really just want one flag which specifies "bind it in the global gtt,
> sucka"
> 
> Now having just typed /that/, it was indeed my plan. So as long as
> nothing really bothers you with the bind/unbind() stuff, I can move
> forward with a patch on top to fix it.
> 

I changed my mind already. A patch on top doesn't make sense. I'll try
to fix this one up as is.
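
To make that concrete, the shape I have in mind now is a single flag, with
the aliasing ppgtt handled entirely inside the ggtt bind. Completely
untested, and the flag/function names here are invented for illustration:

/* Sketch only: the caller sets GLOBAL_BIND when it really needs a global
 * GTT mapping (scanout, pwrite/pread, etc.). */
#define GLOBAL_BIND	(1<<0)

static void ggtt_bind_vma(struct i915_vma *vma,
			  enum i915_cache_level cache_level,
			  u32 flags)
{
	struct drm_i915_private *dev_priv = vma->obj->base.dev->dev_private;
	struct drm_i915_gem_object *obj = vma->obj;
	const unsigned long entry = vma->node.start >> PAGE_SHIFT;

	/* Without aliasing ppgtt everything lands in the global GTT; with
	 * it, only callers that really need a global mapping get one. */
	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
		vma->vm->insert_entries(vma->vm, obj->pages, entry,
					cache_level);
		obj->has_global_gtt_mapping = 1;
	}

	/* The GPU always goes through the aliasing ppgtt when it exists,
	 * so mirror the mapping there unconditionally. */
	if (dev_priv->mm.aliasing_ppgtt) {
		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;

		appgtt->base.insert_entries(&appgtt->base, obj->pages,
					    entry, cache_level);
		obj->has_aliasing_ppgtt_mapping = 1;
	}
}

The unbind side would be the mirror image using clear_range(). If that
looks sane I'll fold it into the bind/unbind patches rather than stacking
yet another fixup on top.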

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/11] drm/i915: plumb VM into object operations
  2013-07-16  3:57             ` Ben Widawsky
@ 2013-07-16  5:06               ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-16  5:06 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 15, 2013 at 08:57:13PM -0700, Ben Widawsky wrote:
> So here is the plan summing up from this mail thread and the other
> one...
> 
> I'm calling the series done until we get feedback from Chris on the
> eviction/shrinker code. Basically whatever he signs off on, I am willing
> to implement (assuming you agree). I am confident enough in my rebase
> abilities to at least fix that up when we get to it.

I guess we could shovel this around after merging, the shrinker gets frobbed
around once in a while anyway ;-)

> On top of this series, I am going to smash the 3 patches which introduce
> bind/unbind function pointers. I am doing this because all this code
> already exists, plays nicely together, and has been tested.
> 
> Finally on top of this, I am going to have a [fairly large] patch to
> squash the <obj,vm> pair into <vma> where I feel it's appropriate. My plan
> for this is to basically look at a squashed version of all the patches
> where I introduce both arguments in one call, consider whether it makes
> more sense to think about the operation in terms of an object or a vma,
> and then change as appropriate.
> 
> So the feedback I need from you:
> 1. Does that sound reasonable? If you want to call it weird, that's fine,
> but I think the diff churn is acceptable, and the series should be quite
> bisectable.

I guess I'll live ;-)

> 2. Are you aware of anything else I missed in other patch review which I
> haven't mentioned? I cannot find anything, and I think except for this
> patch I had updated everything locally.

Imo my review on the semantics of ->bind/unbind is fairly important, at
least for discussion. Atm you just add some new abstraction (and kill some
old one for which we've established some good use) without putting it to
real use. Haven't read any responses yet though, it's early.

> You can see the 14 patches (11 here + 3 from the others series):
> http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt2
> 
> Hopefully this was all taken in good humor. I am not upset or anything,
> we just need to figure out how we can make some progress on this soon
> without making total crap, and without taking all my time (since I have
> a few other things to work on). I know you warned me about the latter a
> few weeks ago, but well, reality.

Imo the first five patches should get the small reviews addressed and then
merged. Atm that's what I'm kinda stalling on ... Then we can tackle the
next little bit of this major work.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM
  2013-07-16  4:00         ` Ben Widawsky
@ 2013-07-16  5:10           ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-16  5:10 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 15, 2013 at 09:00:54PM -0700, Ben Widawsky wrote:
> On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote:
> > To me, aliasing ppgtt is just a wart that doesn't fit well with
> > anything. As such, my plan was to hide as much of it as possible in ggtt
> > functions. Using some kind of flag on ggtt_bind() we can determine if
> > the user actually wants ggtt, and if so bind to both, else just use
> > aliasing ppgtt. None of that code appears here because I want to make
> > the diff churn as small as possible, and hadn't completely thought it
> > all through.
> > 
> > Now after typing that (and this really did happen), I just looked at
> > your function, and it seems to be more or less exactly what I just
> > typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you
> > really just want one flag which specifies "bind it in the global gtt,
> > sucka"

Feel free to bikeshed the names, I tend to not be really good at that ;-)
I guess a flag to switch between binding to ppgtt or ggtt would also work,
I just slowly started to dislike bool arguments and favour explicitly
named flags/enums more. Better self-documenting code ...

> > Now having just typed /that/, it was indeed my plan. So as long as
> > nothing really bothers you with the bind/unbind() stuff, I can move
> > forward with a patch on top to fix it.
> > 
> 
> I changed my mind already. A patch on top doesn't make sense. I'll try
> to fix this one up as is.

Yeah, overall the thing that irks me is that you've added ppgtt bind
functions even though we don't yet bind anything into any real ppgtt
address space. I'd recommend to just implement the ggtt bind functions
(with the aliasing crap added) and then add the ppgtt bind functions once
we get nearer to actually using them for real.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM
  2013-07-16  3:35       ` Ben Widawsky
  2013-07-16  4:00         ` Ben Widawsky
@ 2013-07-16  5:13         ` Daniel Vetter
  1 sibling, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2013-07-16  5:13 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote:
> On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote:
> > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote:
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > > 
> > > NOTE: At some point in the future, bringing back insert_entries may in
> > > fact be desirable in order to use 1 bind/unbind for multiple generations
> > > of PPGTT. For now however, it's just not necessary.
> > 
> > I need to check the -internal tree again, but I'm rather sure that we need
> > ->insert_entries. In that case I don't want to remove it here in the
> > upstream tree since I have no intention to carry the re-add patch in
> > -internal ;-)
> 
> We do use it for i915_ppgtt_bind_object(), however it should be easily
> fixable since the mini-series is exactly about removing
> i915_ppgtt_bind_object, and making it into vm->bind_object. I think it's
> fair if you ask me to fix this up on -internal as well, before merging
> it, but with that one exception - I still believe this is the right
> direction to go in.

My idea behind ->bind was that we could use this to hide the aliasing ppgtt
stuff a bit, and otherwise keep things exactly as-is. I haven't actually
looked at -internal to check whether it's as ugly as I expect ;-)

So if you promise to fix up -internal if I come screaming around due to
rebase breakage I'm ok with either option.

> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 81 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index e6694ae..c2a9c98 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -484,9 +484,18 @@ struct i915_address_space {
> > >  	/* FIXME: Need a more generic return type */
> > >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > >  				     enum i915_cache_level level);
> > > +
> > > +	/** Unmap an object from an address space. This usually consists of
> > > +	 * setting the valid PTE entries to a reserved scratch page. */
> > > +	void (*unbind_object)(struct i915_address_space *vm,
> > > +			      struct drm_i915_gem_object *obj);
> > 
> > 	void (*unbind_vma)(struct i915_vma *vma);
> > 	void (*bind_vma)(struct i915_vma *vma,
> > 			 enum i915_cache_level cache_level);
> > 
> > I think if you do this as a follow-up we might as well bikeshed the
> > interface a bit. Again (I know, broken record) for me it feels
> > semantically much cleaner to talk about binding/unbinding a vma instead
> > of an (obj, vm) pair ...
> > 
> 
> So as mentioned (and I haven't yet responded to the other email, but
> I'll be a broken record there also) - I don't disagree with you. My
> argument is the performance difference should be negligible, and the code
> as is, is decently tested. Changing this requires changing so much, I'd
> rather do the conversion on top. See the other mail thread for more...

Yeah, I agree with the testing argument, sometimes I just want the Perfect
Patch a bit too much ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2013-07-16  5:13 UTC | newest]

Thread overview: 50+ messages
2013-07-09  6:08 [PATCH 00/11] ppgtt: just the VMA Ben Widawsky
2013-07-09  6:08 ` [PATCH 01/11] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
2013-07-09  6:37   ` Daniel Vetter
2013-07-10 16:36     ` Ben Widawsky
2013-07-10 17:03       ` Daniel Vetter
2013-07-11 11:14   ` Imre Deak
2013-07-11 23:57     ` Ben Widawsky
2013-07-12 15:59       ` Ben Widawsky
2013-07-09  6:08 ` [PATCH 02/11] drm/i915: Put the mm in the parent address space Ben Widawsky
2013-07-09  6:08 ` [PATCH 03/11] drm/i915: Create a global list of vms Ben Widawsky
2013-07-09  6:37   ` Daniel Vetter
2013-07-09  6:08 ` [PATCH 04/11] drm/i915: Move active/inactive lists to new mm Ben Widawsky
2013-07-09  6:08 ` [PATCH 05/11] drm/i915: Create VMAs Ben Widawsky
2013-07-11 11:20   ` Imre Deak
2013-07-12  2:23     ` Ben Widawsky
2013-07-09  6:08 ` [PATCH 06/11] drm/i915: plumb VM into object operations Ben Widawsky
2013-07-09  7:15   ` Daniel Vetter
2013-07-10 16:37     ` Ben Widawsky
2013-07-10 17:05       ` Daniel Vetter
2013-07-10 22:23         ` Ben Widawsky
2013-07-11  6:01           ` Daniel Vetter
2013-07-12  2:23     ` Ben Widawsky
2013-07-12  6:26       ` Daniel Vetter
2013-07-12 15:46         ` Ben Widawsky
2013-07-12 16:46           ` Daniel Vetter
2013-07-16  3:57             ` Ben Widawsky
2013-07-16  5:06               ` Daniel Vetter
2013-07-09  6:08 ` [PATCH 07/11] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
2013-07-09  7:16   ` Daniel Vetter
2013-07-10 16:39     ` Ben Widawsky
2013-07-10 17:08       ` Daniel Vetter
2013-07-09  6:08 ` [PATCH 08/11] drm/i915: mm_list is per VMA Ben Widawsky
2013-07-09  7:18   ` Daniel Vetter
2013-07-10 16:39     ` Ben Widawsky
2013-07-09  6:08 ` [PATCH 09/11] drm/i915: Update error capture for VMs Ben Widawsky
2013-07-09  6:08 ` [PATCH 10/11] drm/i915: create an object_is_active() Ben Widawsky
2013-07-09  6:08 ` [PATCH 11/11] drm/i915: Move active to vma Ben Widawsky
2013-07-09  7:45   ` Daniel Vetter
2013-07-10 16:39     ` Ben Widawsky
2013-07-10 17:13       ` Daniel Vetter
2013-07-09  7:50 ` [PATCH 00/11] ppgtt: just the VMA Daniel Vetter
2013-07-13  4:45 ` [PATCH 12/15] [RFC] create vm->bind,unbind Ben Widawsky
2013-07-13  4:45   ` [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-07-13  9:33     ` Daniel Vetter
2013-07-16  3:35       ` Ben Widawsky
2013-07-16  4:00         ` Ben Widawsky
2013-07-16  5:10           ` Daniel Vetter
2013-07-16  5:13         ` Daniel Vetter
2013-07-13  4:45   ` [PATCH 2/3] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-07-13  4:45   ` [PATCH 3/3] drm/i915: eliminate vm->insert_entries() Ben Widawsky
